A Newton-Like Trust Region Method for Large-Scale Unconstrained Nonconvex Minimization

Weiwei, Yang; Yueting, Yang; Chenhui, Zhang; Mingyuan, Cao

doi:https://doi.org/10.1155/2013/478407

Abstract and Applied Analysis


Research Article | Open Access

Volume 2013 | Article ID 478407 | https://doi.org/10.1155/2013/478407


Yang Weiwei, Yang Yueting, Zhang Chenhui, and Cao Mingyuan

Academic Editor: Bo-Qing Dong

Received 08 Jun 2013

Accepted 04 Sept 2013

Published 21 Oct 2013

Abstract

We present a new Newton-like method for large-scale unconstrained nonconvex minimization. A new straightforward limited memory quasi-Newton update, based on the modified quasi-Newton equation, is derived to construct the trust region subproblem; it uses both function value and gradient information to build the approximate Hessian. The global convergence of the algorithm is proved. Numerical results indicate that the proposed method is competitive and efficient on some classical large-scale nonconvex test problems.

1. Introduction

We consider the following unconstrained optimization problem:

$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable.

Trust region methods [1–14] are robust, can be applied to ill-conditioned problems, and have strong global convergence properties. Another advantage is that the approximate Hessian in the trust region subproblem need not be positive definite, so trust region methods are important and efficient for nonconvex optimization problems [6–8, 10, 12, 14]. For a given iterate $x_k$, the main computation of a trust region algorithm is solving the following quadratic subproblem:

$$\min_{d \in \mathbb{R}^n} q_k(d) = g_k^T d + \tfrac{1}{2} d^T B_k d \quad \text{s.t. } \|d\| \le \Delta_k, \tag{2}$$

where $g_k$ is the gradient of $f$ at $x_k$, $B_k$ is the true Hessian or its approximation, $\Delta_k$ is the trust region radius, and $\|\cdot\|$ refers to the Euclidean norm on $\mathbb{R}^n$. For a trial step $d_k$ generated by solving the subproblem (2), the agreement between the predicted reduction and the true variation of the objective function is measured by the ratio

$$\rho_k = \frac{f(x_k) - f(x_k + d_k)}{q_k(0) - q_k(d_k)}. \tag{3}$$

The trust region radius $\Delta_k$ is then updated according to the value of $\rho_k$. Trust region methods that ensure at least a Cauchy (steepest descent-like) decrease at each iteration satisfy an evaluation complexity bound of the same order as steepest descent under identical conditions [11]. It follows that Newton’s method globalized by trust region regularization satisfies the same evaluation upper bound; such a bound can also be shown to be tight [12], provided additionally that the Hessian is Lipschitz continuous on the path of the iterates for which pure Newton steps are taken.
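The ratio test and radius update described above can be sketched as follows. This is a minimal illustration, not the authors' exact rule: the thresholds 0.25 and 0.75, the shrink and growth factors, the cap `delta_max`, and all function names are assumptions borrowed from common textbook variants.

```python
import numpy as np

def model_value(g, B, d):
    """Quadratic model q(d) = g^T d + 0.5 d^T B d (so q(0) = 0)."""
    return g @ d + 0.5 * d @ (B @ d)

def reduction_ratio(f, x, d, g, B):
    """Actual reduction of f divided by the reduction predicted by the model."""
    actual = f(x) - f(x + d)
    predicted = -model_value(g, B, d)  # q(0) - q(d); assumed positive for a useful step
    return actual / predicted

def update_radius(rho, step_norm, delta, delta_max=100.0):
    """Assumed textbook-style rule: shrink on poor agreement, grow on good boundary steps."""
    if rho < 0.25:
        return 0.25 * delta
    if rho > 0.75 and np.isclose(step_norm, delta):
        return min(2.0 * delta, delta_max)
    return delta
```

For a quadratic objective whose exact Hessian is used as the model matrix, the model reproduces the objective change exactly, so the ratio equals one and the radius is never shrunk.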

Newton’s method has been efficiently safeguarded to ensure its global convergence to first- and even second-order critical points in the presence of local nonconvexity of the objective, using line search [3], trust region [4], or other regularization techniques [9, 13]. Many variants of these globalization techniques have been proposed. They generally retain fast local convergence under some nondegeneracy assumptions, are often suitable for large-scale problems, and sometimes allow approximate rather than true Hessians to be employed. Solving large-scale problems is expensive in both computation and storage, so many researchers have studied limited memory techniques [15–24]. These techniques were first applied to line search methods. Liu and Nocedal [15, 16] proposed a limited memory BFGS method (L-BFGS) for unconstrained optimization and proved its global convergence. Byrd et al. [17] gave compact representations of the limited memory BFGS and SR1 formulas, which made it possible to combine limited memory techniques with trust region methods. Observing that the L-BFGS updating formula uses only gradient information and ignores the available function value information, Yang and Xu [19] derived a modified quasi-Newton formula with a limited memory compact representation, based on the modified quasi-Newton equation with a vector parameter [18]. Recently, some researchers have combined limited memory techniques with trust region methods for solving large-scale unconstrained and constrained optimization problems [20–24].

In this paper, we deduce a new straightforward limited memory quasi-Newton updating based on the modified quasi-Newton equation, which uses both available gradient and function value information, to construct the trust region subproblem. Then the corresponding trust region method is proposed for large-scale unconstrained nonconvex minimization. The global convergence of the new algorithm is proved under some appropriate conditions.

The rest of the paper is organized as follows. In the next section, we deduce a new straightforward limited memory quasi-Newton updating. In Section 3, a Newton-like trust region method for large-scale unconstrained nonconvex minimization is proposed and the convergence property is proved under some reasonable assumptions. Some numerical results are given in Section 4.

2. The Modified Limited Memory Quasi-Newton Formula

In this section, we derive a straightforward limited memory quasi-Newton update based on the modified quasi-Newton equation. It employs both gradients and function values to construct the approximate Hessian, which compensates for the data discarded by limited memory techniques. We then apply the derived formula in a trust region method.

Consider the following modified quasi-Newton equation [18]: where , , , and . The quasi-Newton updating matrix constructed by (4) achieves a higher order accuracy in approximating the Hessian. Based on (4), the modified BFGS (MBFGS) update is as follows: For a twice continuously differentiable function, if converges to a point at which and is positive definite, then , and then . Moreover, if is sufficiently large, the MBFGS update approaches the BFGS update.
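As an illustration of how function values can enter a BFGS-type update, the sketch below builds a modified difference vector and applies the usual BFGS formula to it. The specific choices here are assumptions taken from a common form in the quasi-Newton literature and may differ from equations (4) and (5): the correction term `theta = 6(f_k - f_{k+1}) + 3(g_k + g_{k+1})^T s_k`, the scaling `y_hat = (1 + theta / s^T y) y`, the safeguard thresholds, and the function name are all hypothetical.

```python
import numpy as np

def modified_bfgs_update(B, s, y, f_old, f_new, g_old, g_new, eps=1e-12):
    """BFGS-type update using a function-value-corrected vector y_hat (assumed form).

    Returns (B_new, y_hat). If a curvature denominator is not safely
    positive, the update is skipped and B is returned unchanged.
    """
    # Assumed correction term combining function values and gradients.
    theta = 6.0 * (f_old - f_new) + 3.0 * (g_old + g_new) @ s
    sy = s @ y
    # Fall back to the plain difference y when s^T y is too small to scale by.
    y_hat = (1.0 + theta / sy) * y if abs(sy) > eps else y.copy()

    Bs = B @ s
    sBs = s @ Bs
    s_yhat = s @ y_hat
    if sBs <= eps or s_yhat <= eps:
        return B, y_hat  # skip the update (a safeguard chosen for this sketch)

    B_new = B - np.outer(Bs, Bs) / sBs + np.outer(y_hat, y_hat) / s_yhat
    return B_new, y_hat
```

Whenever the update is not skipped, the new matrix satisfies the modified secant condition `B_new @ s == y_hat` up to rounding. For a quadratic objective the assumed `theta` vanishes, so the sketch reduces to the ordinary BFGS update.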

Then formula (5) can be rewritten into the straightforward formula where and . Thus, can be recursively expressed as Let , and let . Then the above formula can be simply written as Formula (8) is called the whole memory quasi-Newton formula. For a given positive integer ( usually is taken for ), if we use the last pairs at the th iteration to update the starting matrix times, according to (8), we get the following limited memory MBFGS (L-MBFGS) formula: where , ; then where and .

Since the vectors and can be obtained and saved from the previous iterations, we only need to compute the vectors and to achieve the limited memory quasi-Newton updating matrix. Suppose , the computation of needs multiplications. Then we consider the computation of . If can be saved and multiplies by directly, the process needs multiplications. In this paper, we compute the product by (9). Consider So we need multiplications to achieve . Let ; then . It takes multiplications to compute . Ignoring lower order terms, it is a total of multiplications to obtain .

Note that the only difference between the limited memory quasi-Newton method and the standard quasi-Newton method is the matrix update. Instead of storing the matrices , we store pairs of vectors that define implicitly. The product or is obtained by performing a sequence of inner products involving and the most recent vector pairs .
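The idea of defining the matrix implicitly through stored vectors can be sketched with the standard unrolled BFGS representation, B = B_0 + Σ (b_i b_iᵀ − a_i a_iᵀ). This sketch uses ordinary BFGS pairs (s_i, y_i) with y_iᵀ s_i > 0 and an initial matrix B_0 = γI, both of which are assumptions; it is meant to illustrate the storage scheme, not to reproduce formula (9). The function names are hypothetical.

```python
import numpy as np

def unrolled_vectors(S, Y, gamma=1.0):
    """Build vectors a_i, b_i so that B = gamma*I + sum_i (b_i b_i^T - a_i a_i^T).

    S and Y are lists of the stored pairs s_i and y_i (oldest first);
    each pair is assumed to satisfy y_i^T s_i > 0.
    """
    a_list, b_list = [], []
    for s, y in zip(S, Y):
        b = y / np.sqrt(y @ s)
        # Product of the current implicit matrix with s, using only inner products.
        Bs = gamma * s
        for a_j, b_j in zip(a_list, b_list):
            Bs += (b_j @ s) * b_j - (a_j @ s) * a_j
        a = Bs / np.sqrt(s @ Bs)
        a_list.append(a)
        b_list.append(b)
    return a_list, b_list

def implicit_product(v, a_list, b_list, gamma=1.0):
    """Return B @ v without ever forming the n-by-n matrix B."""
    Bv = gamma * v
    for a, b in zip(a_list, b_list):
        Bv += (b @ v) * b - (a @ v) * a
    return Bv
```

With m stored pairs, each product costs on the order of m inner products of length n, and only 2m vectors are kept in memory instead of a dense matrix.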

In the following, we discuss the computation of the products and , . As the situation of (11), we need multiplications to obtain . If has been computed, we only need to solve a vector product to obtain which needs multiplications. If has not been computed, we compute directly by using (9). Consider The whole computation only requires multiplications. Thus, multiplications are saved in contrast to the previous method.

If we take , and , have been obtained and saved from the previous iteration, from (11), there are multiplications to compute ; it is a considerable improvement on computation comparing with .

Algorithm 1. Compute and save , . For ,

Step 1. Compute .

Step 2. Compute .

Step 3. Compute .

Algorithm 2. Compute , . Let be the current iterate, and suppose the vectors , and matrices , have been obtained in the previous iteration.

Step 1. Update , .

Step 2. Compute , .

Step 3. Compute by (11); compute by (12).

We use the form of (9) to store . Instead of updating into , we update , into , .

3. Newton-Like Trust Region Method

In this section, we present a Newton-like trust region method for large-scale unconstrained nonconvex minimization.

Algorithm 3. Step 0. Given , , , , , is a given matrix. Compute ; set .
Step 1. If , then stop.
Step 2. Solve the subproblem (2) to obtain .
Step 3. Compute
Step 4. Compute
Step 5. Update the trust region radius as the following:
Step 6. By implementing Algorithm 1 to update , into , in order to update into , set ; go to Step 1.
In Step 2, the subproblem (2) is solved by the CG-Steihaug algorithm in [3], which is suitable for large-scale unconstrained optimization. During the solve, the products and are computed by Algorithm 2. The whole computation of solving the subproblem then requires only multiplications.
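A minimal sketch of the standard CG-Steihaug iteration, as described in textbooks such as [3], is given below. It accesses the model matrix only through a matrix-vector product callback, which is where a limited memory product routine would plug in. The function names, the tolerance, and the iteration cap are assumptions made for this sketch.

```python
import numpy as np

def _to_boundary(z, d, delta):
    """Return z + tau*d with tau >= 0 chosen so that the result has norm delta."""
    a = d @ d
    b = 2.0 * (z @ d)
    c = z @ z - delta**2  # nonpositive while z lies inside the region
    tau = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return z + tau * d

def cg_steihaug(hess_vec, g, delta, tol=1e-8, max_iter=None):
    """Approximately minimize g^T d + 0.5 d^T B d subject to ||d|| <= delta.

    hess_vec(v) must return B @ v; B may be indefinite.
    """
    n = g.size
    z = np.zeros(n)
    r = g.copy()
    d = -r
    if np.linalg.norm(r) < tol:
        return z
    for _ in range(max_iter or 2 * n):
        Bd = hess_vec(d)
        dBd = d @ Bd
        if dBd <= 0.0:
            # Negative or zero curvature: follow d to the boundary.
            return _to_boundary(z, d, delta)
        alpha = (r @ r) / dBd
        z_next = z + alpha * d
        if np.linalg.norm(z_next) >= delta:
            # The CG step leaves the region: stop on the boundary.
            return _to_boundary(z, d, delta)
        r_next = r + alpha * Bd
        if np.linalg.norm(r_next) < tol:
            return z_next
        beta = (r_next @ r_next) / (r @ r)
        d = -r_next + beta * d
        z, r = z_next, r_next
    return z
```

Because the first search direction is the negative gradient, the returned step achieves at least the Cauchy decrease, which is the kind of sufficient reduction that trust region convergence theory typically requires.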

To give the convergence result, we need the following assumptions.

Assumption 4.
(H1) The level set is contained in a bounded convex set.
(H2) The gradient of the objective function is Lipschitz continuous in the neighborhood of ; that is, there is a constant such that
(H3) The solution of the subproblem (2) satisfies where .
(H4) The solution of subproblem (2) satisfies for .

Lemma 5. Suppose that (H1) holds and is positive definite; there exist constants such that for any with . Then matrices are uniformly bounded.

Proof. From Taylor expansion we have Then From (19), we obtain that It is obvious that Thus,
Since , and from (9) (in which ), we have then by (25) and being positive definite, we have
By the definition of Euclidean norm: , when is a symmetric matrix, . Obviously, is a symmetric matrix. Suppose the eigenvalues of are ; then So, is uniformly bounded.

Theorem 6. Let in Algorithm 3. Suppose that Assumption 4 holds and for some constant . Let the sequence be generated by Algorithm 3. Then one has

The proof is similar to Theorem 4.7 in [3] and is omitted.

4. Numerical Results

In this section, we apply Algorithm 3 to nonconvex programming problems and report preliminary numerical results that illustrate its performance. Algorithm 3 is denoted by NLMTR. The comparison method, denoted by NTR, is the same as NLMTR except that the approximate Hessian is updated by the BFGS formula. All tests are implemented in Matlab R2008a on a PC with a 2.00 GHz CPU and 2.00 GB of RAM. The test problems for nonconvex unconstrained minimization are taken from Moré et al. [25] and the CUTEr collection [26, 27]. These problems are listed in Table 1.

All numerical results are listed in Table 2, in which iter stands for the number of iterations, which equals the number of gradient evaluations; nf stands for the number of objective function evaluations; Prob stands for the problem label; Dim stands for the number of variables of the tested problem; cpu denotes the CPU time for solving the problems; is the terminated gradient; and denotes the optimal value.

We compare NLMTR with NTR. In both methods the trial step is computed by the CG-Steihaug algorithm [3]. The approximate Hessian of NLMTR is updated by the straightforward modified L-MBFGS formula (9). Choosing , . The approximate Hessian of NTR is updated by the BFGS formula in [3]. The iteration is terminated by or , where . The related figures are listed in Table 2.

From Table 2, we can see that for small-scale problems, the optimal values and gradient norms obtained by NTR are more accurate than those of NLMTR. For medium-scale problems, NTR is still more accurate, but NLMTR takes less CPU time. For large-scale problems, NTR takes much more CPU time than NLMTR and fails on some problems, especially when . NLMTR is therefore suitable for solving large-scale nonconvex problems.

Acknowledgments

This work is supported in part by the NNSF (11171003) of China, the Key Project of Chinese Ministry of Education (no. 211039), and Natural Science Foundation of Jilin Province of China (no. 201215102).

References

[1] M. J. D. Powell, “A new algorithm for unconstrained optimization,” in Nonlinear Programming, J. B. Rosen, O. L. Mangasarian, and K. Ritter, Eds., pp. 31–65, Academic Press, New York, NY, USA, 1970.
[2] J. Nocedal and Y.-X. Yuan, “Combining trust region and line search techniques,” in Advances in Nonlinear Programming, Y. Yuan, Ed., vol. 14, pp. 153–175, Kluwer Academic, Dordrecht, The Netherlands, 1998.
[3] J. Nocedal and S. J. Wright, Numerical Optimization, Springer, New York, NY, USA, 1999.
[4] A. R. Conn, N. I. M. Gould, and P. L. Toint, Trust-Region Methods, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, Pa, USA, 2000.
[5] H. P. Wu and Q. Ni, “A new trust region algorithm with a conic model,” Numerical Mathematics, vol. 30, no. 1, pp. 57–67, 2008.
[6] Ph. L. Toint, “Global convergence of a class of trust-region methods for nonconvex minimization in Hilbert space,” IMA Journal of Numerical Analysis, vol. 8, no. 2, pp. 231–252, 1988.
[7] M. J. D. Powell and Y. Yuan, “A trust region algorithm for equality constrained optimization,” Mathematical Programming, vol. 49, no. 2, pp. 189–211, 1990.
[8] D.-H. Li and M. Fukushima, “A modified BFGS method and its global convergence in nonconvex minimization,” Journal of Computational and Applied Mathematics, vol. 129, no. 1-2, pp. 15–35, 2001.
[9] Y. Nesterov and B. T. Polyak, “Cubic regularization of Newton method and its global performance,” Mathematical Programming, vol. 108, no. 1, pp. 177–205, 2006.
[10] Q. Guo and J.-G. Liu, “Global convergence of a modified BFGS-type method for unconstrained non-convex minimization,” Journal of Applied Mathematics & Computing, vol. 24, no. 1-2, pp. 325–331, 2007.
[11] S. Gratton, A. Sartenaer, and P. L. Toint, “Recursive trust-region methods for multiscale nonlinear optimization,” SIAM Journal on Optimization, vol. 19, no. 1, pp. 414–444, 2008.
[12] C. Cartis, N. I. M. Gould, and P. L. Toint, “On the complexity of steepest descent, Newton's and regularized Newton's methods for nonconvex unconstrained optimization problems,” SIAM Journal on Optimization, vol. 20, no. 6, pp. 2833–2852, 2010.
[13] C. Cartis, N. I. M. Gould, and P. L. Toint, “Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results,” Mathematical Programming, vol. 127, no. 2, pp. 245–295, 2011.
[14] D. Xue, W. Sun, and H. He, “A structured trust region method for nonconvex programming with separable structure,” Numerical Algebra, Control and Optimization, vol. 3, no. 2, pp. 283–293, 2013.
[15] J. Nocedal, “Updating quasi-Newton matrices with limited storage,” Mathematics of Computation, vol. 35, no. 151, pp. 773–782, 1980.
[16] D. C. Liu and J. Nocedal, “On the limited memory BFGS method for large scale optimization,” Mathematical Programming, vol. 45, no. 3, pp. 503–528, 1989.
[17] R. H. Byrd, J. Nocedal, and R. B. Schnabel, “Representations of quasi-Newton matrices and their use in limited memory methods,” Mathematical Programming, vol. 63, no. 2, pp. 129–156, 1994.
[18] C. Xu and J. Zhang, “A survey of quasi-Newton equations and quasi-Newton methods for optimization,” Annals of Operations Research, vol. 103, pp. 213–234, 2001.
[19] Y. T. Yang and C. X. Xu, “A compact limited memory method for large scale unconstrained optimization,” European Journal of Operational Research, vol. 180, no. 1, pp. 48–56, 2007.
[20] Q. Ni and Y. Yuan, “A subspace limited memory quasi-Newton algorithm for large-scale nonlinear bound constrained optimization,” Mathematics of Computation, vol. 66, no. 220, pp. 1509–1520, 1997.
[21] O. P. Burdakov, J. M. Martínez, and E. A. Pilotta, “A limited-memory multipoint symmetric secant method for bound constrained optimization,” Annals of Operations Research, vol. 117, no. 1–4, pp. 51–70, 2002.
[22] Z. H. Wang, “A limited memory trust region method for unconstrained optimization and its implementation,” Mathematica Numerica Sinica, vol. 27, no. 4, pp. 395–404, 2005.
[23] P. E. Gill, W. Murray, and M. A. Saunders, “SNOPT: an SQP algorithm for large-scale constrained optimization,” SIAM Review, vol. 47, no. 1, pp. 99–131, 2005.
[24] H. Liu and Q. Ni, “New limited-memory symmetric secant rank one algorithm for large-scale unconstrained optimization,” Transactions of Nanjing University of Aeronautics and Astronautics, vol. 25, no. 3, pp. 235–239, 2008.
[25] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software, vol. 7, no. 1, pp. 17–41, 1981.
[26] N. I. M. Gould, D. Orban, and P. L. Toint, “GALAHAD, a library of thread-safe Fortran 90 packages for large-scale nonlinear optimization,” ACM Transactions on Mathematical Software, vol. 29, no. 4, pp. 353–372, 2003.
[27] H. Y. Benson, “Cute models,” http://orfe.princeton.edu/~rvdb/ampl/nlmodels/cute/.

Copyright

Copyright © 2013 Yang Weiwei et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
