Research Article | Open Access
A Newton-Like Trust Region Method for Large-Scale Unconstrained Nonconvex Minimization
We present a new Newton-like method for large-scale unconstrained nonconvex minimization. A new straightforward limited memory quasi-Newton update, based on the modified quasi-Newton equation, is derived to construct the trust region subproblem; it uses both function value and gradient information to build the approximate Hessian. The global convergence of the algorithm is proved. Numerical results indicate that the proposed method is competitive and efficient on some classical large-scale nonconvex test problems.
We consider the following unconstrained optimization problem: $$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$ where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable.
Trust region methods [1–14] are robust, can be applied to ill-conditioned problems, and have strong global convergence properties. Another advantage of trust region methods is that the approximate Hessian of the trust region subproblem is not required to be positive definite. Trust region methods are therefore important and efficient for nonconvex optimization problems [6–8, 10, 12, 14]. For a given iterate $x_k$, the main computation of trust region algorithms is solving the following quadratic subproblem: $$\min_{d \in \mathbb{R}^n} q_k(d) = g_k^T d + \tfrac{1}{2} d^T B_k d \quad \text{s.t. } \|d\| \le \Delta_k, \tag{2}$$ where $g_k$ is the gradient of $f$ at $x_k$, $B_k$ is the true Hessian or its approximation, $\Delta_k > 0$ is a trust region radius, and $\|\cdot\|$ refers to the Euclidean norm on $\mathbb{R}^n$. For a trial step $d_k$, which is generated by solving the subproblem (2), the agreement between the predicted reduction and the true variation of the objective function is measured by the ratio $$\rho_k = \frac{f(x_k) - f(x_k + d_k)}{q_k(0) - q_k(d_k)}. \tag{3}$$ Then the trust region radius is updated according to the value of $\rho_k$. Trust region methods that ensure at least a Cauchy (steepest descent-like) decrease on each iteration satisfy an evaluation complexity bound of the same order under identical conditions. It follows that Newton’s method globalized by trust region regularization satisfies the same evaluation upper bound; such a bound can also be shown to be tight, provided additionally that the Hessian on the path of the iterates for which pure Newton steps are taken is Lipschitz continuous.
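The Cauchy (steepest descent-like) decrease mentioned above can be illustrated with a minimal Python sketch. It assumes the usual quadratic model $g^T d + \frac{1}{2} d^T B d$ with trust region radius $\Delta$ and uses the textbook closed form for the Cauchy point; the helper names `cauchy_point` and `model` are hypothetical and are not taken from the paper.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(B, v):
    """Dense matrix-vector product; B is a list of rows."""
    return [dot(row, v) for row in B]


def model(g, B, d):
    """Quadratic model value g^T d + 0.5 d^T B d (model value at d = 0 is 0)."""
    return dot(g, d) + 0.5 * dot(d, matvec(B, d))


def cauchy_point(g, B, delta):
    """Minimizer of the model along -g inside the ball ||d|| <= delta.

    B may be indefinite: with nonpositive curvature along -g the step
    goes all the way to the trust region boundary.
    """
    gnorm = math.sqrt(dot(g, g))
    if gnorm == 0.0:
        return [0.0] * len(g)
    gBg = dot(g, matvec(B, g))
    if gBg <= 0.0:
        tau = 1.0
    else:
        tau = min(1.0, gnorm ** 3 / (delta * gBg))
    scale = tau * delta / gnorm
    return [-scale * gi for gi in g]


# Illustrative data (assumed): an indefinite B and a small radius.
B_indef = [[1.0, 0.0], [0.0, -2.0]]
g_example = [1.0, 1.0]
d_cauchy = cauchy_point(g_example, B_indef, 0.5)
```

Any step that reduces the model at least as much as `d_cauchy` does gives the kind of decrease that global convergence arguments for trust region methods typically rely on.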
Newton’s method has been efficiently safeguarded to ensure its global convergence to first- and even second-order critical points in the presence of local nonconvexity of the objective, using line search, trust region, or other regularization techniques [9, 13]. Many variants of these globalization techniques have been proposed. These generally retain fast local convergence under some nondegeneracy assumptions, are often suitable for large-scale problems, and sometimes allow approximate rather than true Hessians to be employed. Solving large-scale problems requires expensive computation and storage, so many researchers have studied limited memory techniques [15–24]. Limited memory techniques were first applied to line search methods. Liu and Nocedal [15, 16] proposed a limited memory BFGS method (L-BFGS) for unconstrained optimization and proved its global convergence. Byrd et al. [17] gave compact representations of the limited memory BFGS and SR1 formulas, which made it possible to combine limited memory techniques with trust region methods. Observing that the L-BFGS updating formula uses only gradient information and ignores the available function value information, Yang and Xu [19] derived a modified quasi-Newton formula with a limited memory compact representation, based on the modified quasi-Newton equation with a vector parameter. Recently, some researchers have combined limited memory techniques with trust region methods for solving large-scale unconstrained and constrained optimization problems [20–24].
In this paper, we deduce a new straightforward limited memory quasi-Newton updating based on the modified quasi-Newton equation, which uses both available gradient and function value information, to construct the trust region subproblem. Then the corresponding trust region method is proposed for large-scale unconstrained nonconvex minimization. The global convergence of the new algorithm is proved under some appropriate conditions.
The rest of the paper is organized as follows. In the next section, we deduce a new straightforward limited memory quasi-Newton updating. In Section 3, a Newton-like trust region method for large-scale unconstrained nonconvex minimization is proposed and the convergence property is proved under some reasonable assumptions. Some numerical results are given in Section 4.
2. The Modified Limited Memory Quasi-Newton Formula
In this section, we derive a straightforward limited memory quasi-Newton update based on the modified quasi-Newton equation. It employs both gradients and function values to construct the approximate Hessian, which compensates for the data discarded by limited memory techniques. We then apply the derived formula in a trust region method.
Consider the following modified quasi-Newton equation : where , , , and . The quasi-Newton updating matrix constructed by (4) achieves a higher order accuracy in approximating Hessian. Based on (4), the modified BFGS (MBFGS) updating is as follows: For twice continuously differentiable function, if converges to a point at which and is positive definite, then , and then . Moreover, if is sufficiently large, the MBFGS updating approaches to the BFGS updating.
Then formula (5) can be rewritten into the straightforward formula where and . Thus, can be recursively expressed as Let , and let . Then the above formula can be simply written as Formula (8) is called the whole memory quasi-Newton formula. For a given positive integer ( usually is taken for ), if we use the last pairs at the th iteration to update the starting matrix times, according to (8), we get the following limited memory MBFGS (L-MBFGS) formula: where , ; then where and .
Since the vectors and can be obtained and saved from the previous iterations, we only need to compute the vectors and to achieve the limited memory quasi-Newton updating matrix. Suppose , the computation of needs multiplications. Then we consider the computation of . If can be saved and multiplies by directly, the process needs multiplications. In this paper, we compute the product by (9). Consider So we need multiplications to achieve . Let ; then . It takes multiplications to compute . Ignoring lower order terms, it is a total of multiplications to obtain .
Note that the only difference between the limited memory quasi-Newton method and the standard quasi-Newton method is in the matrix updating. Instead of storing the matrices , we store only vector pairs that define implicitly. The product or is obtained by performing a sequence of inner products involving and the most recent vector pairs .
In the following, we discuss the computation of the products and , . As the situation of (11), we need multiplications to obtain . If has been computed, we only need to solve a vector product to obtain which needs multiplications. If has not been computed, we compute directly by using (9). Consider The whole computation only requires multiplications. Thus, multiplications are saved in contrast to the previous method.
If we take , and , have been obtained and saved from the previous iteration, from (11), there are multiplications to compute ; it is a considerable improvement on computation comparing with .
Algorithm 1. Compute and save , . For ,
Step 1. Compute .
Step 2. Compute .
Step 3. Compute .
Algorithm 2. Compute , . Let be the current iteration point; the vectors , and matrices , have been obtained in the previous iteration.
Step 1. Update , .
Step 2. Compute , .
We use the form of (9) to store . Instead of updating into , we update , into , .
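The idea of defining the approximate Hessian implicitly through stored vectors can be sketched in Python as follows. This is a minimal illustration that uses the standard unrolled BFGS form $B = \gamma I + \sum_i (b_i b_i^T - a_i a_i^T)$ from the literature; it may differ in detail from formula (9). The functions `build_ab` and `apply_B`, the scaled-identity starting matrix, and the argument name `y_hat` (standing in for whatever secant vector the modified quasi-Newton equation supplies) are all assumptions made for the example.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def apply_B(v, a_list, b_list, gamma=1.0):
    """Return B v for B = gamma*I + sum_i (b_i b_i^T - a_i a_i^T).

    Only inner products and vector updates are used, so the n-by-n
    matrix is never formed or stored.
    """
    out = [gamma * vi for vi in v]
    for a, b in zip(a_list, b_list):
        bv, av = dot(b, v), dot(a, v)
        out = [o + bv * bi - av * ai for o, bi, ai in zip(out, b, a)]
    return out


def build_ab(pairs, gamma=1.0):
    """Build the vectors a_i, b_i from (s, y_hat) pairs, oldest first.

    b_i = y_hat_i / sqrt(s_i^T y_hat_i)
    a_i = B_i s_i / sqrt(s_i^T B_i s_i), with B_i defined by earlier pairs.
    Pairs with s^T y_hat <= 0 are skipped here (an assumed safeguard that
    keeps the square roots real).
    """
    a_list, b_list = [], []
    for s, y_hat in pairs:
        sy = dot(s, y_hat)
        if sy <= 0.0:
            continue
        Bs = apply_B(s, a_list, b_list, gamma)
        sBs = dot(s, Bs)
        a_list.append([v / math.sqrt(sBs) for v in Bs])
        b_list.append([v / math.sqrt(sy) for v in y_hat])
    return a_list, b_list


# Illustrative pairs (assumed data), oldest first.
pairs = [([1.0, 0.0], [2.0, 0.5]), ([0.0, 1.0], [0.5, 3.0])]
a_vecs, b_vecs = build_ab(pairs)
Bs_last = apply_B(pairs[-1][0], a_vecs, b_vecs)
```

With `m` stored pairs, each call to `apply_B` costs on the order of `m * n` multiplications and needs only `2 * m` vectors of length `n`, which is the storage pattern the limited memory approach relies on.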
3. Newton-Like Trust Region Method
In this section, we present a Newton-like trust region method for large-scale unconstrained nonconvex minimization.
Algorithm 3. Step 0. Given , , , , , is a given matrix. Compute ; set .
Step 1. If , then stop.
Step 2. Solve the subproblem (2) to obtain .
Step 3. Compute
Step 4. Compute
Step 5. Update the trust region radius as the following:
Step 6. Implement Algorithm 1 to update , into , which updates into ; set ; go to Step 1.
In Step 2, the subproblem (2) is solved by the CG-Steihaug algorithm, which is suitable for large-scale unconstrained optimization. During the solve, the products and are computed by Algorithm 2, so that solving the subproblem requires only multiplications.
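The overall structure of such a method can be sketched in Python as a generic trust region loop with a Steihaug-type truncated conjugate gradient solver. This is a textbook-style illustration, not the authors' implementation: the acceptance and radius constants (`eta`, 0.25, 0.75, the shrink and growth factors) are assumed values, the function names are hypothetical, and the example uses exact Hessian-vector products on a small nonconvex test function where the paper would use the limited memory products of Algorithm 2.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def axpy(alpha, x, y):
    """Return alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def to_boundary(d, p, delta):
    """Positive tau with ||d + tau * p|| = delta (requires ||d|| < delta)."""
    a, b, c = dot(p, p), 2.0 * dot(d, p), dot(d, d) - delta ** 2
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


def steihaug_cg(g, hess_vec, delta, tol=1e-10, max_iter=50):
    """Approximately minimize g^T d + 0.5 d^T B d subject to ||d|| <= delta."""
    d = [0.0] * len(g)
    r = list(g)                       # residual B d + g at d = 0
    p = [-ri for ri in r]
    if math.sqrt(dot(r, r)) < tol:
        return d
    for _ in range(max_iter):
        Bp = hess_vec(p)
        pBp = dot(p, Bp)
        if pBp <= 0.0:                # nonpositive curvature: stop on the boundary
            return axpy(to_boundary(d, p, delta), p, d)
        alpha = dot(r, r) / pBp
        d_next = axpy(alpha, p, d)
        if math.sqrt(dot(d_next, d_next)) >= delta:
            return axpy(to_boundary(d, p, delta), p, d)
        r_next = axpy(alpha, Bp, r)
        if math.sqrt(dot(r_next, r_next)) < tol:
            return d_next
        beta = dot(r_next, r_next) / dot(r, r)
        p = axpy(beta, p, [-ri for ri in r_next])
        d, r = d_next, r_next
    return d


def trust_region(f, grad, hess_vec_at, x, delta=1.0, eps=1e-6,
                 max_iter=200, eta=0.1, delta_max=100.0):
    """Generic trust region iteration; all constants are assumed values."""
    for _ in range(max_iter):
        g = grad(x)
        if math.sqrt(dot(g, g)) <= eps:
            break
        hv = hess_vec_at(x)
        d = steihaug_cg(g, hv, delta)
        pred = -(dot(g, d) + 0.5 * dot(d, hv(d)))   # predicted reduction
        if pred <= 0.0:
            break
        x_trial = axpy(1.0, d, x)
        rho = (f(x) - f(x_trial)) / pred            # actual / predicted
        if rho < 0.25:
            delta *= 0.25
        elif rho > 0.75 and math.sqrt(dot(d, d)) >= 0.999 * delta:
            delta = min(2.0 * delta, delta_max)
        if rho > eta:
            x = x_trial
    return x


# Assumed nonconvex test function: f(x) = x1^4/4 - x1^2/2 + x2^2.
def f(x):
    return 0.25 * x[0] ** 4 - 0.5 * x[0] ** 2 + x[1] ** 2


def grad(x):
    return [x[0] ** 3 - x[0], 2.0 * x[1]]


def hess_vec_at(x):
    # Exact Hessian diag(3*x1^2 - 1, 2); it is indefinite near x1 = 0.
    return lambda v: [(3.0 * x[0] ** 2 - 1.0) * v[0], 2.0 * v[1]]


x_star = trust_region(f, grad, hess_vec_at, [0.1, 1.0])
```

Because the solver only calls `hess_vec`, the same loop works with any implicit representation of the approximate Hessian, including one that is indefinite.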
To give the convergence result, we need the following assumptions.
Assumption 4.
(H1) The level set is contained in a bounded convex set.
(H2) The gradient of the objective function is Lipschitz continuous in the neighborhood of ; that is, there is a constant such that
(H3) The solution of the subproblem (2) satisfies where .
(H4) The solution of subproblem (2) satisfies for .
Lemma 5. Suppose that (H1) holds and is positive definite; there exist constants such that for any with . Then matrices are uniformly bounded.
Proof. From Taylor expansion
From (19), we obtain that
It is obvious that
Since , and from (9) (in which ), we have then by (25) and being positive definite, we have
By the definition of Euclidean norm: , when is a symmetric matrix, . Obviously, is a symmetric matrix. Suppose the eigenvalues of are ; then So, is uniformly bounded.
The proof is similar to Theorem 4.7 in  and is omitted.
4. Numerical Results
In this section, we apply Algorithm 3 to nonconvex programming problems and report preliminary numerical results. Algorithm 3 is denoted by NLMTR. The comparison method, denoted by NTR, is the same as NLMTR except that the approximate Hessian is updated by the BFGS formula. All tests are implemented in Matlab R2008a on a PC with a 2.00 GHz CPU and 2.00 GB of RAM. The test problems for nonconvex unconstrained minimization are taken from Moré et al. [25] and the CUTEr collection [26, 27]. These problems are listed in Table 1.
All numerical results are listed in Table 2, in which iter stands for the number of iterations, which equals the number of gradient evaluations; nf stands for the number of objective function evaluations; Prob stands for the problem label; Dim stands for the number of variables of the tested problem; cpu denotes the CPU time for solving the problems; is the terminated gradient; and denotes the optimal value.
Note to Table 2: The algorithm fails.
We compare NLMTR with NTR. The trial step is computed by the CG-Steihaug algorithm. The matrix of NLMTR is updated by the straightforward L-MBFGS formula (9), with the parameters chosen as , . The matrix of NTR is updated by the BFGS formula. The iteration is terminated when or , where . The results are listed in Table 2.
From Table 2, we can see that, for small-scale problems, the optimal values and gradient norms obtained by NTR are more accurate than those of NLMTR. For medium-scale problems, the accuracy of NTR is higher, but the CPU time of NLMTR is shorter. For large-scale problems, the CPU time of NTR is much longer than that of NLMTR, and NTR fails on some problems, especially when . So NLMTR is suitable for solving large-scale nonconvex problems.
This work is supported in part by the NNSF (11171003) of China, the Key Project of Chinese Ministry of Education (no. 211039), and Natural Science Foundation of Jilin Province of China (no. 201215102).
- M. J. D. Powell, “A new algorithm for unconstrained optimization,” in Nonlinear Programming, J. B. Rosen, O. L. Mangasarian, and K. Ritter, Eds., pp. 31–65, Academic Press, New York, NY, USA, 1970.
- J. Nocedal and Y.-X. Yuan, “Combining trust region and line search techniques,” in Advances in Nonlinear Programming, Y. Yuan, Ed., vol. 14, pp. 153–175, Kluwer Academic, Dordrecht, The Netherlands, 1998.
- J. Nocedal and S. J. Wright, Numerical Optimization, Springer, New York, NY, USA, 1999.
- A. R. Conn, N. I. M. Gould, and P. L. Toint, Trust-Region Methods, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, Pa, USA, 2000.
- H. P. Wu and Q. Ni, “A new trust region algorithm with a conic model,” Numerical Mathematics, vol. 30, no. 1, pp. 57–67, 2008.
- Ph. L. Toint, “Global convergence of a class of trust-region methods for nonconvex minimization in Hilbert space,” IMA Journal of Numerical Analysis, vol. 8, no. 2, pp. 231–252, 1988.
- M. J. D. Powell and Y. Yuan, “A trust region algorithm for equality constrained optimization,” Mathematical Programming, vol. 49, no. 2, pp. 189–211, 1990.
- D.-H. Li and M. Fukushima, “A modified BFGS method and its global convergence in nonconvex minimization,” Journal of Computational and Applied Mathematics, vol. 129, no. 1-2, pp. 15–35, 2001.
- Y. Nesterov and B. T. Polyak, “Cubic regularization of Newton method and its global performance,” Mathematical Programming, vol. 108, no. 1, pp. 177–205, 2006.
- Q. Guo and J.-G. Liu, “Global convergence of a modified BFGS-type method for unconstrained non-convex minimization,” Journal of Applied Mathematics & Computing, vol. 24, no. 1-2, pp. 325–331, 2007.
- S. Gratton, A. Sartenaer, and P. L. Toint, “Recursive trust-region methods for multiscale nonlinear optimization,” SIAM Journal on Optimization, vol. 19, no. 1, pp. 414–444, 2008.
- C. Cartis, N. I. M. Gould, and P. L. Toint, “On the complexity of steepest descent, Newton's and regularized Newton's methods for nonconvex unconstrained optimization problems,” SIAM Journal on Optimization, vol. 20, no. 6, pp. 2833–2852, 2010.
- C. Cartis, N. I. M. Gould, and P. L. Toint, “Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results,” Mathematical Programming, vol. 127, no. 2, pp. 245–295, 2011.
- D. Xue, W. Sun, and H. He, “A structured trust region method for nonconvex programming with separable structure,” Numerical Algebra, Control and Optimization, vol. 3, no. 2, pp. 283–293, 2013.
- J. Nocedal, “Updating quasi-Newton matrices with limited storage,” Mathematics of Computation, vol. 35, no. 151, pp. 773–782, 1980.
- D. C. Liu and J. Nocedal, “On the limited memory BFGS method for large scale optimization,” Mathematical Programming, vol. 45, no. 3, pp. 503–528, 1989.
- R. H. Byrd, J. Nocedal, and R. B. Schnabel, “Representations of quasi-Newton matrices and their use in limited memory methods,” Mathematical Programming, vol. 63, no. 2, pp. 129–156, 1994.
- C. Xu and J. Zhang, “A survey of quasi-Newton equations and quasi-Newton methods for optimization,” Annals of Operations Research, vol. 103, pp. 213–234, 2001.
- Y. T. Yang and C. X. Xu, “A compact limited memory method for large scale unconstrained optimization,” European Journal of Operational Research, vol. 180, no. 1, pp. 48–56, 2007.
- Q. Ni and Y. Yuan, “A subspace limited memory quasi-Newton algorithm for large-scale nonlinear bound constrained optimization,” Mathematics of Computation, vol. 66, no. 220, pp. 1509–1520, 1997.
- O. P. Burdakov, J. M. Martínez, and E. A. Pilotta, “A limited-memory multipoint symmetric secant method for bound constrained optimization,” Annals of Operations Research, vol. 117, no. 1–4, pp. 51–70, 2002.
- Z. H. Wang, “A limited memory trust region method for unconstrained optimization and its implementation,” Mathematica Numerica Sinica, vol. 27, no. 4, pp. 395–404, 2005.
- P. E. Gill, W. Murray, and M. A. Saunders, “SNOPT: an SQP algorithm for large-scale constrained optimization,” SIAM Review, vol. 47, no. 1, pp. 99–131, 2005.
- H. Liu and Q. Ni, “New limited-memory symmetric secant rank one algorithm for large-scale unconstrained optimization,” Transactions of Nanjing University of Aeronautics and Astronautics, vol. 25, no. 3, pp. 235–239, 2008.
- J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software, vol. 7, no. 1, pp. 17–41, 1981.
- N. I. M. Gould, D. Orban, and P. L. Toint, “GALAHAD, a library of thread-safe Fortran 90 packages for large-scale nonlinear optimization,” ACM Transactions on Mathematical Software, vol. 29, no. 4, pp. 353–372, 2003.
- H. Y. Benson, “Cute models,” http://orfe.princeton.edu/~rvdb/ampl/nlmodels/cute/.
Copyright © 2013 Yang Weiwei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.