Nonlinear Analysis: Optimization Methods, Convergence Theory, and Applications
Research Article, Open Access
A Limited Memory BFGS Method for Solving Large-Scale Symmetric Nonlinear Equations
Abstract
A limited memory BFGS (LBFGS) algorithm is presented for solving large-scale symmetric nonlinear equations, where a line search technique without derivative information is used. The global convergence of the proposed algorithm is established under some suitable conditions. Numerical results show that the given method is competitive with the normal BFGS method.
1. Introduction
Consider the system of nonlinear equations (1), in which the mapping is continuously differentiable, its Jacobian is symmetric at every point, and the dimension is large. It is not difficult to see that if the mapping is the gradient of some function, then problem (1) is the first-order necessary condition for the unconstrained minimization of that function. Furthermore, for an equality constrained problem whose constraint is a vector-valued function, the KKT conditions can be represented as the system (1) in the primal variables together with the vector of Lagrange multipliers. These two cases show that problem (1) can, in theory, come from an unconstrained problem or an equality constrained optimization problem. Moreover, other practical problems can also take the form of (1), such as the discretized two-point boundary value problem, the saddle point problem, and the discretized elliptic boundary value problem (see Chapter 1 of [1] for details). With the norm function defined through the Euclidean norm of the mapping, problem (1) is equivalent to the global optimization problem of minimizing the norm function.
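As a small illustration of the first case, the following sketch takes a hypothetical function (not one of the test problems of this paper), uses its gradient as the mapping in (1), and checks numerically that the Jacobian is symmetric and that the norm function vanishes at the solution. The names `g`, `theta`, and `fd_jacobian`, the chosen function, and the form of the norm function (one half of the squared Euclidean norm of the mapping) are assumptions made for this example only.

```python
import math

def g(x):
    """Gradient of the hypothetical function
    f(x) = sum(exp(x_i) - x_i) + 0.5 * sum((x_{i+1} - x_i)^2)."""
    n = len(x)
    out = []
    for i in range(n):
        gi = math.exp(x[i]) - 1.0
        if i > 0:
            gi += x[i] - x[i - 1]
        if i < n - 1:
            gi += x[i] - x[i + 1]
        out.append(gi)
    return out

def theta(x):
    """Assumed norm function: half the squared Euclidean norm of g(x)."""
    return 0.5 * sum(v * v for v in g(x))

def fd_jacobian(x, h=1e-6):
    """Forward-difference approximation of the Jacobian of g at x."""
    n = len(x)
    gx = g(x)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp = list(x)
        xp[j] += h
        gp = g(xp)
        for i in range(n):
            jac[i][j] = (gp[i] - gx[i]) / h
    return jac

x = [0.3, -0.2, 0.5, 0.1]
J = fd_jacobian(x)
# Largest difference between mirrored entries of the approximate Jacobian.
asymmetry = max(abs(J[i][j] - J[j][i]) for i in range(4) for j in range(4))
print("asymmetry of the Jacobian:", asymmetry)
print("norm function at x and at the solution:", theta(x), theta([0.0] * 4))
```

Because the mapping here is a gradient, its Jacobian is a Hessian, which is why the symmetry check succeeds; the zero vector solves the system, so the norm function attains its global minimum value there.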
In this paper we focus on the line search method for (1), whose iterative formula moves from the current iterate along a so-called search direction with a steplength. To begin with, we briefly review some methods for choosing the steplength.
(i) Normal Line Search (Brown and Saad [2]). The stepsize is determined by where and . The convergence is proved and some good results are obtained. It is well known that the nonmonotone idea is more attractive than the normal technique in many cases. Motivated by this, Zhu [3] presented a nonmonotone line search technique.
(ii) Nonmonotone Line Search (Zhu [3]). The stepsize is determined by , , (for , and is a nonnegative integer. The global convergence and the superlinear convergence are established under mild conditions. It is not difficult to see that, for the above two line search techniques, the Jacobian matrix must be computed at each iteration, which obviously increases the workload and the CPU time consumed. In order to avoid this drawback, Yuan and Lu [4] presented a new backtracking inexact technique.
(iii) A New Line Search (Yuan and Lu [4]). The stepsize is determined by where and . They established the global convergence and the superlinear convergence, and the numerical tests showed that the new line search technique is more effective than the normal line search technique. However, these three line search techniques cannot directly ensure the descent property of . Thus more interesting line search techniques are studied.
(iv) Approximate Monotone Line Search (Li and Fukushima [5]). The stepsize is determined by where , , is the smallest nonnegative integer satisfying (8), and are constants, and is such that The line search (8) can be rewritten as ; it is straightforward to see that as , the right-hand side of the above inequality goes to . Then it is not difficult to see that the sequence generated by an algorithm with line search (8) is approximately norm descent. In order to ensure that the sequence is norm descent, Gu et al. [6] presented the following line search.
(v) Monotone Descent Line Search (Gu et al. [6]). The stepsize is determined by where , , and are similar to (8).
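To make the idea of a derivative-free, norm-descent backtracking rule concrete, the following sketch tries the stepsizes 1, r, r^2, and so on, and accepts the first one for which the squared residual norm decreases by a margin proportional to the squared stepsize. The displayed conditions are not reproduced in this copy, so the acceptance test below is only an assumed form in the spirit of the monotone rule of Gu et al. [6]; the names `monotone_backtracking`, `sigma1`, `sigma2`, and `r`, and the default values, are likewise assumptions. No Jacobian is evaluated.

```python
def norm_sq(v):
    """Squared Euclidean norm of a vector stored as a list."""
    return sum(t * t for t in v)

def monotone_backtracking(g, x, d, sigma1=1e-4, sigma2=1e-4, r=0.5,
                          max_backtracks=50):
    """Return (alpha, trial point) for an assumed norm-descent condition:
    ||g(x + a d)||^2 - ||g(x)||^2 <= -(sigma1 ||g(x)||^2 + sigma2 ||d||^2) a^2.
    """
    base = norm_sq(g(x))
    d_sq = norm_sq(d)
    alpha = 1.0
    for _ in range(max_backtracks):
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        bound = -(sigma1 * base + sigma2 * d_sq) * alpha * alpha
        if norm_sq(g(trial)) - base <= bound:
            return alpha, trial
        alpha *= r  # shrink the stepsize and try again
    # Fallback of this sketch: return the last (very small) stepsize.
    return alpha, [xi + alpha * di for xi, di in zip(x, d)]

def g_demo(x):
    """Hypothetical linear mapping with the symmetric Jacobian diag(4, 1)."""
    return [4.0 * x[0], x[1]]

x0 = [1.0, 1.0]
d0 = [-v for v in g_demo(x0)]
alpha0, x1 = monotone_backtracking(g_demo, x0, d0)
print(alpha0, norm_sq(g_demo(x0)), norm_sq(g_demo(x1)))
```

In this example the full step overshoots and increases the residual, so it is rejected; the first reduced step is accepted because the residual norm then decreases.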
In the following, we present some techniques for computing the search direction.
(i) Newton Method. The search direction is defined by The Newton method is one of the most effective methods since it normally requires the fewest function evaluations and handles ill-conditioning very well. However, its efficiency largely depends on the possibility of efficiently solving the linear system (12) that arises when computing the search direction. Moreover, the exact solution of the system (12) may be too burdensome, or unnecessary, when the iterate is far from a solution [7]. Thus the quasi-Newton methods are proposed.
(ii) Quasi-Newton Method. The search direction is defined by where is the quasi-Newton update matrix. The quasi-Newton methods represent the basic approach underlying most of the Newton-type large-scale algorithms (see [3, 4, 8], etc.), where the famous BFGS method is one of the most effective quasi-Newton methods, generated by the following formula: where and with and . By (11) and (14), Yuan and Yao [9] proposed a BFGS method for nonlinear equations and some good results were obtained. Denote , and then (14) has the inverse update formula represented by Unfortunately, both the Newton method and the quasi-Newton method require much storage for a matrix at every iteration, which reduces the efficiency of the algorithm, especially for large-scale problems. Therefore a method that uses low-storage matrix information is needed.
(iii) Limited Memory Quasi-Newton Method. The search direction is defined by where is generated by a limited memory quasi-Newton method, the best known of which is the so-called limited memory BFGS method. The LBFGS method is an adaptation of the BFGS method for large-scale problems (see [10] for details), which often requires minimal storage and provides a fast rate of linear convergence. The LBFGS method has the following form: where , , is an integer, and is the unit matrix. Formula (17) shows that matrix is obtained by updating the basic matrix times using the BFGS formula with the previous iterations. By (17), together with (7) and (8), Yuan et al. [11] presented the LBFGS method for nonlinear equations and established its global convergence. At present, many methods have been proposed for (1) (see [6, 12–15], etc.).
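The storage saving of the limited memory approach can be illustrated with the standard two-loop recursion, which applies the limited memory inverse matrix to a vector using only the stored vector pairs and never forms a matrix. This is a sketch of the textbook technique rather than the code used in this paper; the names `lbfgs_direction` and `dense_inverse_bfgs` are hypothetical, the basic matrix is assumed to be the unit matrix, and the dense inverse BFGS update is included only to check the recursion on a tiny example.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def lbfgs_direction(grad, s_list, y_list):
    """Return -H * grad, where H is the inverse BFGS matrix built from the
    stored pairs (oldest first) starting from the unit matrix."""
    q = list(grad)
    coeffs = []
    # First loop: newest pair to oldest pair.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / dot(y, s)
        a = rho * dot(s, q)
        coeffs.append((a, rho))
        q = [qi - a * yi for qi, yi in zip(q, y)]
    # Basic matrix is the unit matrix, so q is left unchanged here.
    # Second loop: oldest pair to newest pair.
    for (a, rho), s, y in zip(reversed(coeffs), s_list, y_list):
        b = rho * dot(y, q)
        q = [qi + (a - b) * si for qi, si in zip(q, s)]
    return [-qi for qi in q]

def dense_inverse_bfgs(H, s, y):
    """One dense inverse BFGS update of a symmetric matrix H (check only)."""
    n = len(s)
    rho = 1.0 / dot(y, s)
    Hy = [dot(row, y) for row in H]
    c = rho * rho * dot(y, Hy) + rho
    return [[H[i][j] - rho * (s[i] * Hy[j] + Hy[i] * s[j]) + c * s[i] * s[j]
             for j in range(n)] for i in range(n)]

# Hypothetical stored pairs with positive curvature (y . s > 0).
s_list = [[1.0, 0.0, 0.5], [0.2, -0.3, 0.1]]
y_list = [[2.0, 0.1, 0.4], [0.5, -0.2, 0.3]]
grad = [1.0, -2.0, 0.5]

H = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
for s, y in zip(s_list, y_list):
    H = dense_inverse_bfgs(H, s, y)
d_dense = [-dot(row, grad) for row in H]
d_fast = lbfgs_direction(grad, s_list, y_list)
max_gap = max(abs(a - b) for a, b in zip(d_dense, d_fast))
print("largest difference between the two directions:", max_gap)
```

The recursion needs only the stored pairs and a few inner products per pair, whereas the dense update stores and modifies a full matrix, which is the cost the limited memory method avoids for large dimensions.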
In order to solve large-scale nonlinear equations effectively while retaining good theoretical properties, and based on the above discussions of the steplength and the search direction, we combine (11) and (16) and present an LBFGS method for (1), since (11) makes the norm function descend and (16) needs less storage. The main attributes of the new algorithm are stated as follows.
(i) An LBFGS method with (11) is presented.
(ii) The norm function is decreasing.
(iii) The global convergence is established under appropriate conditions.
(iv) Numerical results show that the given algorithm is more competitive than the normal algorithm for large-scale nonlinear equations.
This paper is organized as follows. In the next section, the backtracking inexact LBFGS algorithm is stated. Section 3 presents the global convergence of the algorithm under some reasonable conditions. Section 4 reports numerical experiments that test the performance of the algorithms.
2. Algorithms
This section will state the LBFGS method in association with the new backtracking line search technique (11) for solving (1).
Algorithm 1.
Step 0. Choose an initial point , an initial symmetric positive definite matrix , positive constants , , constants , and a positive integer . Let .
Step 1. Stop if .
Step 2. Determine by (16).
Step 3. If
then take and go to Step 5. Otherwise go to Step 4.
Step 4. Let be the smallest nonnegative integer such that (11) holds for . Let .
Step 5. Let the next iterate be .
Step 6. Let . Put and . Update for times to get by (17).
Step 7. Let . Go to Step 1.
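The steps of Algorithm 1 can be sketched as follows. Since the formulas (11), (16), (17), and (18) are not reproduced in this copy, the sketch rests on several assumptions that should be read as such: the Step 3 test accepts the unit step when the residual norm drops by the factor `rho`, the Step 4 test is the norm-descent condition with constants `sigma1` and `sigma2`, the basic matrix is the unit matrix, a pair is stored only when its curvature is positive, and the loop simply stops if backtracking fails. All names and default values are hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def two_loop(vec, pairs):
    """Apply the limited memory inverse matrix (unit basic matrix) to vec."""
    q = list(vec)
    coeffs = []
    for s, y in reversed(pairs):
        rho_i = 1.0 / dot(y, s)
        a = rho_i * dot(s, q)
        coeffs.append((a, rho_i))
        q = [qi - a * yi for qi, yi in zip(q, y)]
    for (a, rho_i), (s, y) in zip(reversed(coeffs), pairs):
        b = rho_i * dot(y, q)
        q = [qi + (a - b) * si for qi, si in zip(q, s)]
    return q

def lbfgs_equations(g, x0, m=5, sigma1=1e-4, sigma2=1e-4, r=0.5, rho=0.9,
                    tol=1e-8, max_iter=200, max_backtracks=50):
    x, gx, pairs = list(x0), g(x0), []
    for _ in range(max_iter):
        if norm(gx) <= tol:                          # Step 1
            break
        d = [-v for v in two_loop(gx, pairs)]        # Step 2
        trial = [xi + di for xi, di in zip(x, d)]
        g_trial = g(trial)
        if norm(g_trial) > rho * norm(gx):           # Step 3 fails: Step 4
            alpha, accepted = 1.0, False
            for _ in range(max_backtracks):
                trial = [xi + alpha * di for xi, di in zip(x, d)]
                g_trial = g(trial)
                bound = -(sigma1 * dot(gx, gx) + sigma2 * dot(d, d)) * alpha ** 2
                if dot(g_trial, g_trial) - dot(gx, gx) <= bound:
                    accepted = True
                    break
                alpha *= r
            if not accepted:
                break  # choice of this sketch: stop instead of forcing a step
        s = [ti - xi for ti, xi in zip(trial, x)]    # Step 6
        y = [a - b for a, b in zip(g_trial, gx)]
        if dot(y, s) > 1e-12:                        # keep positive curvature only
            pairs = (pairs + [(s, y)])[-m:]          # retain the m newest pairs
        x, gx = trial, g_trial                       # Steps 5 and 7
    return x, norm(gx)

def g_demo(x):
    """Hypothetical linear mapping with the symmetric Jacobian diag(1, 2, 3)."""
    return [(i + 1.0) * xi for i, xi in enumerate(x)]

start = [1.0, 1.0, 1.0]
x_final, res_final = lbfgs_equations(g_demo, start)
print("initial residual:", norm(g_demo(start)), "final residual:", res_final)
```

Every accepted step in this sketch reduces the residual norm, either by the factor in the Step 3 test or through the backtracking condition, which mirrors the norm-descent property claimed for the algorithm.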
In the following, to analyze the global convergence conveniently, we assume that the algorithm updates (the inverse of ) starting from a bounded and positive definite basic matrix (the inverse of ). Then Algorithm 1 with has the following steps.
Algorithm 2.
Step 2. Determine by
Step 6. Let . Put and . Update for times; that is, for compute where , and for all .
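For illustration, the repeated update in Step 6 of Algorithm 2 can be sketched with the standard direct BFGS formula applied once per stored pair, starting from a basic matrix. Formula (20) is not reproduced in this copy, so the use of the textbook direct update here is an assumption, as are the names `bfgs_direct_update` and `limited_memory_matrix` and the sample pairs.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(B, v):
    return [dot(row, v) for row in B]

def bfgs_direct_update(B, s, y):
    """Standard direct BFGS update of a symmetric matrix B with the pair (s, y)."""
    n = len(s)
    Bs = matvec(B, s)
    sBs = dot(s, Bs)
    ys = dot(y, s)
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys
             for j in range(n)] for i in range(n)]

def limited_memory_matrix(B0, pairs):
    """Update the basic matrix B0 once per stored pair, oldest pair first."""
    B = [row[:] for row in B0]
    for s, y in pairs:
        B = bfgs_direct_update(B, s, y)
    return B

# Hypothetical pairs with positive curvature (y . s > 0).
pairs = [([1.0, 0.0], [2.0, 0.5]), ([0.5, 1.0], [1.0, 3.0])]
B = limited_memory_matrix([[1.0, 0.0], [0.0, 1.0]], pairs)
s_last, y_last = pairs[-1]
# The updated matrix maps the newest step to the newest residual difference.
secant_gap = max(abs(a - b) for a, b in zip(matvec(B, s_last), y_last))
print(B, secant_gap)
```

When the basic matrix is symmetric positive definite and each pair has positive curvature, this update keeps the matrix symmetric positive definite, which is the property the convergence analysis relies on.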
Remark 3. Algorithms 1 and 2 are mathematically equivalent. Throughout this paper, Algorithm 2 is given only for the purpose of analysis, so we only discuss Algorithm 2 in theory. In the experiments, we implement Algorithm 1.
3. Global Convergence
Define the level set by In order to establish the global convergence of Algorithm 2, similar to [4, 11], we need the following assumptions.
Assumption A. is continuously differentiable on an open convex set containing . Moreover the Jacobian of is symmetric, bounded, and positive definite on ; namely, there exist positive constants satisfying
Assumption B. is a good approximation to ; that is, where is a small quantity.
Remark 4. Assumption A implies
The relations in (24) ensure that generated by (20) inherits the symmetry and positive definiteness of . Thus, (19) has a unique solution for each . Moreover, the following lemma holds.
Lemma 5 (see Theorem 2.1 in [16] or Lemma 3.4 of [11]). Let Assumption hold and let be generated by Algorithm 2. Then, for any and , there are positive constants , such that the following relations hold for at least values of .
By Assumption B, similar to [4, 9, 11, 15], it is easy to get the following lemma.
Lemma 6. Let Assumption hold and let be generated by Algorithm 2. Then is a descent direction for at ; that is, holds.
Based on the above lemma, by Assumption B, similar to Lemma 3.8 in [2], we can get the following lemma.
Lemma 7. Let Assumption B hold and let be generated by Algorithm 2. Then . Moreover, converges.
Lemma 8. Let Assumptions A and B hold. Then, in a finite number of backtracking steps, Algorithm 2 will produce an iterate .
Proof. It is sufficient for us to prove that the line search (11) is reasonable. By Lemma 3.8 in [2], we can deduce that, in a finite number of backtracking steps, is such that By (19), we get Thus By Assumption B, we have Using (19) again and , we obtain Setting and implies (11). This completes the proof.
Remark 9. The above lemma shows that Algorithm 2 is well defined. In a way similar to Lemma 3.2 and Corollary 3.4 in [5], it is not difficult to deduce that holds; we omit the proof. Now we establish the global convergence theorem.
Theorem 10. Let Assumptions A and B hold. Then the sequence generated by Algorithm 2 converges to the unique solution of (1).
Proof. Lemma 7 implies that converges. If
then every accumulation point of is a solution of (1). Assumption A means that (1) has only one solution. Moreover, since is bounded, has at least one accumulation point. Therefore itself converges to the unique solution of (1). Hence, it suffices to verify (32).
If (18) holds for infinitely many ’s, then (32) is trivial. Otherwise, if (18) holds for only finitely many ’s, we conclude that Step 4 is executed for all sufficiently large. By (11), we have
Since is bounded, by adding these inequalities, we get
Then we have
which together with (31) implies (32). This completes the proof.
4. Numerical Results
This section reports numerical results with Algorithm 1 and the normal BFGS algorithm. The test problems with the associated initial guesses are listed as follows.
Problem 1. Exponential function 1: Initial guess: .
Problem 2. Exponential function 2: Initial guess: .
Problem 3. Trigonometric function: Initial guess: .
Problem 4. Singular function: Initial guess: .
Problem 5. Logarithmic function: Initial guess: .
Problem 6. Broyden tridiagonal function [17, pages 471-472]: Initial guess: .
Problem 7. Trigexp function [17, page 473]: Initial guess: .
Problem 8. Strictly convex function 1 [18, page 29]: is the gradient of . Consider Initial guess: .
Problem 9. Linear function-full rank: Initial guess: .
Problem 10. Penalty function: Initial guess: .
Problem 11. Variable dimensioned function: Initial guess: .
Problem 12. Tridiagonal system [19]: Initial guess: .
Problem 13. Five-diagonal system [19]: Initial guess: .
Problem 14. Extended Freudenstein and Roth function ( is even) [20]: for Initial guess: .
Problem 15. Discrete boundary value problem [21]: Initial guess: .
Problem 16. Troesch problem [22]: Initial guess: .
In the experiments, the parameters in Algorithm 1 and the normal BFGS method were chosen as , , , , and is the unit matrix. All codes were written in MATLAB R2013b and run on a PC with a Core 2 6600 @ 2.40 GHz CPU, 4.00 GB of memory, and the Windows 7 operating system. We stopped the program when the condition was satisfied. Since the line search cannot always ensure the descent condition , an uphill search direction may occur in the numerical experiments. In this case, the line search rule may fail. To avoid this, the stepsize is accepted once the number of backtracking trials in the inner loop exceeds eight. We also stop the program if the iteration number reaches 1000. The columns of the tables have the following meaning. Dim: the dimension. NI: the total number of iterations. NG: the number of norm function evaluations. Time: the CPU time in seconds. GN: the norm value of when the program stops. NaN: not-a-number, implying that the code fails to get a real value. Inf: the IEEE arithmetic representation of positive infinity, which is also produced by operations such as dividing by zero.
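The safeguard described above, accepting the current stepsize once the backtracking trials are exhausted, can be sketched as a capped loop. The experiments were coded in MATLAB; this Python fragment only illustrates the control flow. The name `capped_backtracking`, the callback `accepts`, the reduction factor `r = 0.5`, and the reading of "larger than eight" as a cap of eight reductions are assumptions.

```python
def capped_backtracking(accepts, r=0.5, max_trials=8):
    """Shrink the stepsize until `accepts(alpha)` holds or the cap is reached.

    Returns the stepsize and the number of reductions performed. When the cap
    is hit, the current stepsize is accepted even though the test failed.
    """
    alpha, trials = 1.0, 0
    while not accepts(alpha) and trials < max_trials:
        alpha *= r
        trials += 1
    return alpha, trials

# A test that is never satisfied models an uphill search direction.
forced_alpha, forced_trials = capped_backtracking(lambda a: False)
# A test satisfied for small enough steps stops before the cap.
ok_alpha, ok_trials = capped_backtracking(lambda a: a <= 0.25)
print(forced_alpha, forced_trials, ok_alpha, ok_trials)
```

An outer iteration limit, 1000 in the experiments, plays the same role for the main loop: it guarantees termination when the stopping condition is never met.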
From the numerical results in Table 1, it is not difficult to see that the proposed method is more successful than the normal BFGS method. There are many problems that cannot be solved successfully by the normal BFGS method. Moreover, the normal BFGS method fails to get a real value for several problems. We therefore conclude that the presented method is more competitive than the normal BFGS method.

Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors would like to thank Professor Gonglin Yuan of Guangxi University for his suggestions on the organization of the paper and his help with the program codes, which saved them much time in completing this paper. The authors also thank the referees for valuable comments and the editor for suggestions on the ideas and the English of this paper, which improved the paper greatly. This work is supported by Guangxi NSF (Grant no. 2012GXNSFAA053002) and China NSF (Grant no. 11261006).
References
[1] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, NY, USA, 1970.
[2] P. N. Brown and Y. Saad, “Convergence theory of nonlinear Newton-Krylov algorithms,” SIAM Journal on Optimization, vol. 4, no. 2, pp. 297–330, 1994.
[3] D. Zhu, “Nonmonotone backtracking inexact quasi-Newton algorithms for solving smooth nonlinear equations,” Applied Mathematics and Computation, vol. 161, no. 3, pp. 875–895, 2005.
[4] G. Yuan and X. Lu, “A new backtracking inexact BFGS method for symmetric nonlinear equations,” Computers & Mathematics with Applications, vol. 55, no. 1, pp. 116–129, 2008.
[5] D. Li and M. Fukushima, “A globally and superlinearly convergent Gauss-Newton-based BFGS method for symmetric nonlinear equations,” SIAM Journal on Numerical Analysis, vol. 37, no. 1, pp. 152–172, 1999.
[6] G. Gu, D. Li, L. Qi, and S. Zhou, “Descent directions of quasi-Newton methods for symmetric nonlinear equations,” SIAM Journal on Numerical Analysis, vol. 40, no. 5, pp. 1763–1774, 2002.
[7] S. G. Nash, “A survey of truncated-Newton methods,” Journal of Computational and Applied Mathematics, vol. 124, no. 1-2, pp. 45–59, 2000.
[8] A. Griewank, “The “global” convergence of Broyden-like methods with a suitable line search,” Journal of the Australian Mathematical Society, Series B, vol. 28, no. 1, pp. 75–92, 1986.
[9] G. Yuan and S. Yao, “A BFGS algorithm for solving symmetric nonlinear equations,” Optimization, vol. 62, no. 1, pp. 85–99, 2013.
[10] R. H. Byrd, J. Nocedal, and R. B. Schnabel, “Representations of quasi-Newton matrices and their use in limited memory methods,” Mathematical Programming, vol. 63, no. 1–3, pp. 129–156, 1994.
[11] G. Yuan, Z. Wei, and S. Lu, “Limited memory BFGS method with backtracking for symmetric nonlinear equations,” Mathematical and Computer Modelling, vol. 54, no. 1-2, pp. 367–377, 2011.
[12] G. Yuan, “A new method with descent property for symmetric nonlinear equations,” Numerical Functional Analysis and Optimization, vol. 31, no. 7–9, pp. 974–987, 2010.
[13] G. Yuan and X. Li, “A rank-one fitting method for solving symmetric nonlinear equations,” Journal of Applied Functional Analysis, vol. 5, no. 4, pp. 389–407, 2010.
[14] G. Yuan, X. Lu, and Z. Wei, “BFGS trust-region method for symmetric nonlinear equations,” Journal of Computational and Applied Mathematics, vol. 230, no. 1, pp. 44–58, 2009.
[15] G. Yuan, Z. Wei, and X. Lu, “A BFGS trust-region method for nonlinear equations,” Computing, vol. 92, no. 4, pp. 317–333, 2011.
[16] R. H. Byrd and J. Nocedal, “A tool for the analysis of quasi-Newton methods with application to unconstrained minimization,” SIAM Journal on Numerical Analysis, vol. 26, no. 3, pp. 727–739, 1989.
[17] M. A. Gomes-Ruggiero, J. M. Martínez, and A. C. Moretti, “Comparing algorithms for solving sparse nonlinear systems of equations,” SIAM Journal on Scientific and Statistical Computing, vol. 13, no. 2, pp. 459–483, 1992.
[18] M. Raydan, “The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem,” SIAM Journal on Optimization, vol. 7, no. 1, pp. 26–33, 1997.
[19] G. Y. Li, “Successive column correction algorithms for solving sparse nonlinear systems of equations,” Mathematical Programming, vol. 43, no. 2, pp. 187–207, 1989.
[20] Y. Bing and G. Lin, “An efficient implementation of Merrill's method for sparse or partially separable systems of nonlinear equations,” SIAM Journal on Optimization, vol. 1, no. 2, pp. 206–221, 1991.
[21] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software, vol. 7, no. 1, pp. 17–41, 1981.
[22] S. M. Roberts and J. J. Shipman, “On the closed form solution of Troesch's problem,” Journal of Computational Physics, vol. 21, no. 3, pp. 291–304, 1976.
Copyright
Copyright © 2014 Xiangrong Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.