Abstract

A limited memory BFGS (L-BFGS) algorithm is presented for solving large-scale symmetric nonlinear equations, where a line search technique without derivative information is used. The global convergence of the proposed algorithm is established under suitable conditions. Numerical results show that the given method is competitive with the normal BFGS method.

1. Introduction

Consider the system of nonlinear equations
$$F(x) = 0, \quad x \in \mathbb{R}^n, \tag{1}$$
where $F:\mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable, the Jacobian $\nabla F(x)$ of $F$ is symmetric for all $x \in \mathbb{R}^n$, and $n$ denotes the (large-scale) dimension. It is not difficult to see that if $F$ is the gradient mapping of some function $f:\mathbb{R}^n \to \mathbb{R}$, that is, $F(x) = \nabla f(x)$, then problem (1) is the first-order necessary condition for the unconstrained problem $\min_{x \in \mathbb{R}^n} f(x)$. Furthermore, considering the equality constrained problem
$$\min f(x), \quad \text{s.t.} \ h(x) = 0, \tag{2}$$
where $h$ is a vector-valued function, the KKT conditions can be represented as the system (1) with $z = (x, v)$ and
$$F(z) = \begin{pmatrix} \nabla f(x) + \nabla h(x) v \\ h(x) \end{pmatrix},$$
where $v$ is the vector of Lagrange multipliers. The above two cases show that problem (1) can arise from an unconstrained problem or an equality constrained optimization problem. Moreover, there are other practical problems that also take the form of (1), such as the discretized two-point boundary value problem, the saddle point problem, and the discretized elliptic boundary value problem (see Chapter 1 of [1] for details). Let $\theta$ be the norm function $\theta(x) = \tfrac{1}{2}\|F(x)\|^2$; then problem (1) is equivalent to the following global optimization problem:
$$\min_{x \in \mathbb{R}^n} \theta(x), \tag{3}$$
where $\|\cdot\|$ is the Euclidean norm.
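As a concrete illustration of the second case above, the following Python sketch builds the KKT mapping $F(z)$ for a small equality constrained problem and checks numerically that its Jacobian is symmetric, so the system fits the framework of (1). The quadratic objective, the linear constraint, and all numerical values are illustrative assumptions, not data from the paper.

import numpy as np

# Illustrative problem: min 0.5*x'Ax - b'x  s.t.  Cx = d, with symmetric A;
# the KKT mapping is F(z) = (Ax - b + C'v, Cx - d) with z = (x, v).
A = np.array([[3.0, 1.0], [1.0, 2.0]])      # symmetric Hessian of f
b = np.array([1.0, 1.0])
C = np.array([[1.0, 1.0]])                  # one linear equality constraint
d = np.array([1.0])

def F(z):
    x, v = z[:2], z[2:]
    return np.concatenate([A @ x - b + C.T @ v, C @ x - d])

def jacobian_fd(F, z, h=1e-6):
    """Forward-difference Jacobian, used only to check symmetry numerically."""
    n = z.size
    J = np.zeros((n, n))
    Fz = F(z)
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        J[:, i] = (F(z + e) - Fz) / h
    return J

J = jacobian_fd(F, np.zeros(3))
print(np.allclose(J, J.T, atol=1e-4))       # True: the KKT Jacobian is symmetric

Note that such a KKT Jacobian is symmetric but in general indefinite; the positive definiteness required later in Assumption A is an additional restriction on the problem class.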

In this paper we focus on line search methods for (1), whose normal iterative formula is defined by
$$x_{k+1} = x_k + \alpha_k d_k, \tag{4}$$
where $d_k$ is the so-called search direction and $\alpha_k$ is a steplength along $d_k$. To begin with, we briefly review some techniques for determining $\alpha_k$.

(i) Normal Line Search (Brown and Saad [2]). The stepsize $\alpha_k$ is determined by
$$\theta(x_k + \alpha_k d_k) - \theta(x_k) \le \sigma \alpha_k \nabla\theta(x_k)^T d_k, \tag{5}$$
where $\sigma \in (0,1)$ and $\nabla\theta(x_k) = \nabla F(x_k) F(x_k)$ is the gradient of $\theta$ at $x_k$. The convergence is proved and some good results are obtained. It is well known that a nonmonotone strategy is often more effective than the monotone one in many cases; motivated by this, Zhu [3] presented a nonmonotone line search technique.

(ii) Nonmonotone Line Search (Zhu [3]). The stepsize $\alpha_k$ is determined by
$$\theta(x_k + \alpha_k d_k) \le \theta(x_{l(k)}) + \sigma \alpha_k \nabla\theta(x_k)^T d_k, \qquad \theta(x_{l(k)}) = \max_{0 \le j \le m(k)} \theta(x_{k-j}), \tag{6}$$
where $m(0) = 0$, $0 \le m(k) \le \min\{m(k-1)+1, M\}$ (for $k \ge 1$), and $M$ is a nonnegative integer. The global convergence and the superlinear convergence are established under mild conditions, respectively. It is not difficult to see that, for the above two line search techniques, the Jacobian matrix $\nabla F(x_k)$ must be computed at each iteration, which obviously increases the workload and the CPU time consumed. In order to avoid this drawback, Yuan and Lu [4] presented a new backtracking inexact technique.

(iii) A New Line Search (Yuan and Lu [4]). The stepsize $\alpha_k$ is determined by a backtracking condition that uses only values of $F$, so no Jacobian information is required. They established the global convergence and the superlinear convergence, and the numerical tests showed that the new line search technique is more effective than the normal line search technique. However, these three line search techniques cannot directly ensure the descent property of $\|F(x_k)\|$. Thus more interesting line search techniques have been studied.

(iv) Approximate Monotone Line Search (Li and Fukushima [5]). The stepsize $\alpha_k$ is determined by
$$\|F(x_k + \alpha_k d_k)\| - \|F(x_k)\| \le -\sigma_1\|\alpha_k d_k\|^2 - \sigma_2\|\alpha_k F(x_k)\|^2 + \varepsilon_k\|F(x_k)\|, \tag{8}$$
where $\alpha_k = r^{i_k}$, $r \in (0,1)$, $i_k$ is the smallest nonnegative integer $i$ satisfying (8) with $\alpha_k = r^i$, $\sigma_1$ and $\sigma_2$ are positive constants, and $\{\varepsilon_k\}$ is a positive sequence such that
$$\sum_{k=0}^{\infty} \varepsilon_k < \infty. \tag{9}$$
The line search (8) can be rewritten as
$$\|F(x_k + \alpha_k d_k)\| \le (1 + \varepsilon_k)\|F(x_k)\| - \sigma_1\|\alpha_k d_k\|^2 - \sigma_2\|\alpha_k F(x_k)\|^2; \tag{10}$$
it is straightforward to see that, as $\alpha_k \to 0$, the right-hand side of the above inequality goes to $(1 + \varepsilon_k)\|F(x_k)\| > \|F(x_k)\|$, so an acceptable stepsize always exists. Then it is not difficult to see that the sequence $\{\|F(x_k)\|\}$ generated by an algorithm with line search (8) is approximately norm descent. In order to ensure that the sequence $\{\|F(x_k)\|\}$ is norm descent, Gu et al. [6] presented the following line search.

(v) Monotone Descent Line Search (Gu et al. [6]). The stepsize $\alpha_k$ is determined by
$$\|F(x_k + \alpha_k d_k)\|^2 - \|F(x_k)\|^2 \le -\sigma_1\|\alpha_k d_k\|^2 - \sigma_2\|\alpha_k F(x_k)\|^2, \tag{11}$$
where $\alpha_k = r^{i_k}$, and $r$, $\sigma_1$, and $\sigma_2$ are similar to those in (8).
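To make the backtracking mechanics of these derivative-free rules concrete, the following Python sketch implements the monotone norm-descent condition (11); the rule (8) differs only in working with $\|F\|$ instead of $\|F\|^2$ and allowing the extra term $\varepsilon_k\|F(x_k)\|$. The residual map F, the direction d, and the parameter values are illustrative assumptions, not the authors' code.

import numpy as np

def monotone_backtracking(F, x, d, r=0.5, sigma1=1e-4, sigma2=1e-4, max_backtracks=30):
    """Find alpha = r**i, with i the smallest nonnegative integer such that
    ||F(x+alpha*d)||^2 - ||F(x)||^2 <= -sigma1*||alpha*d||^2 - sigma2*||alpha*F(x)||^2,
    i.e. the norm-descent condition (11). Parameter values are illustrative."""
    Fx = F(x)
    nFx2 = Fx @ Fx
    alpha = 1.0
    for _ in range(max_backtracks):
        Ft = F(x + alpha * d)
        lhs = Ft @ Ft - nFx2
        rhs = -sigma1 * alpha**2 * (d @ d) - sigma2 * alpha**2 * nFx2
        if lhs <= rhs:
            break
        alpha *= r
    return alpha

# Example: one backtracking step for F(x) = x + sin(x) starting from x = (1,...,1)
# F = lambda x: x + np.sin(x); x = np.ones(5); alpha = monotone_backtracking(F, x, -F(x))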

In the following, we present some techniques for determining the search direction $d_k$.

(i) Newton Method. The search direction $d_k$ is defined by
$$\nabla F(x_k) d_k = -F(x_k). \tag{12}$$
The Newton method is one of the most effective methods, since it normally requires the fewest function evaluations and is very good at handling ill-conditioning. However, its efficiency largely depends on the possibility of efficiently solving the linear system (12), which arises when computing $d_k$. Moreover, the exact solution of the system (12) can be too burdensome, or is not necessary when $x_k$ is far from a solution [7]. Thus quasi-Newton methods have been proposed.

(ii) Quasi-Newton Method. The search direction $d_k$ is defined by
$$B_k d_k = -F(x_k), \tag{13}$$
where $B_k$ is the quasi-Newton update matrix. The quasi-Newton methods represent the basic approach underlying most Newton-type large-scale algorithms (see [3, 4, 8], etc.), where the famous BFGS method is one of the most effective quasi-Newton methods, generated by the following formula:
$$B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}, \tag{14}$$
where $s_k = x_{k+1} - x_k$ and $y_k = F_{k+1} - F_k$, with $F_k = F(x_k)$ and $F_{k+1} = F(x_{k+1})$. By (11) and (14), Yuan and Yao [9] proposed a BFGS method for nonlinear equations and some good results were obtained. Denote $H_k = B_k^{-1}$; then (14) has the inverse update formula represented by
$$H_{k+1} = \left(I - \frac{s_k y_k^T}{y_k^T s_k}\right) H_k \left(I - \frac{y_k s_k^T}{y_k^T s_k}\right) + \frac{s_k s_k^T}{y_k^T s_k}. \tag{15}$$
Unfortunately, both the Newton method and the quasi-Newton method require a large amount of space to store the matrix $B_k$ (or $H_k$) at every iteration, which limits the efficiency of the algorithm, especially for large-scale problems. Therefore a method with low matrix storage is necessary.
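The updates (14) and (15) are easy to state in code. The Python sketch below uses the simple difference $y_k = F(x_{k+1}) - F(x_k)$ introduced above and adds a standard curvature safeguard; the safeguard and its tolerance are our assumptions for illustration, not a prescription of the paper.

import numpy as np

def bfgs_update(B, s, y):
    """One BFGS update of B_k as in (14); skipped when the curvature y's
    is not safely positive (a common safeguard, assumed here)."""
    ys = y @ s
    if ys <= 1e-12 * np.linalg.norm(y) * np.linalg.norm(s):
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / ys

def inverse_bfgs_update(H, s, y):
    """One update of H_k = B_k^{-1} as in (15)."""
    rho = 1.0 / (y @ s)
    n = H.shape[0]
    V = np.eye(n) - rho * np.outer(y, s)
    return V.T @ H @ V + rho * np.outer(s, s)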

(iii) Limited Memory Quasi-Newton Method. The search direction $d_k$ is defined by
$$d_k = -H_k F(x_k), \tag{16}$$
where $H_k$ is generated by a limited memory quasi-Newton method; the most famous limited memory quasi-Newton method is the so-called limited memory BFGS (L-BFGS) method. The L-BFGS method is an adaptation of the BFGS method for large-scale problems (see [10] for details), which often requires minimal storage and provides a fast rate of linear convergence. The L-BFGS method has the following form:
$$H_{k+1} = \left(V_k^T \cdots V_{k-\tilde m+1}^T\right) H_{k+1}^{0} \left(V_{k-\tilde m+1} \cdots V_k\right) + \rho_{k-\tilde m+1}\left(V_k^T \cdots V_{k-\tilde m+2}^T\right) s_{k-\tilde m+1} s_{k-\tilde m+1}^T \left(V_{k-\tilde m+2} \cdots V_k\right) + \cdots + \rho_k s_k s_k^T, \tag{17}$$
where $\rho_j = 1/(y_j^T s_j)$, $V_j = I - \rho_j y_j s_j^T$, $\tilde m = \min\{k+1, m\}$, $m$ is an integer, and $H_{k+1}^{0}$ is the unit matrix. Formula (17) shows that the matrix $H_{k+1}$ is obtained by updating the basic matrix $\tilde m$ times using the BFGS formula with the previous $\tilde m$ iterations. By (17), together with (7) and (8), Yuan et al. [11] presented the L-BFGS method for nonlinear equations and got the global convergence. At present, there are many papers proposed for (1) (see [6, 12–15], etc.).
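In practice the matrix (17) is never formed explicitly: the product $H_k F(x_k)$ can be obtained from the stored pairs $(s_i, y_i)$ by the standard two-loop recursion. The following Python sketch computes the direction $d_k = -H_k F(x_k)$ of (16); taking a scaled identity as the basic matrix is our illustrative assumption.

import numpy as np

def lbfgs_direction(Fx, s_list, y_list, gamma=1.0):
    """Compute d = -H*F(x) by the L-BFGS two-loop recursion, where H is the
    matrix (17) built from the stored pairs (s_i, y_i) (oldest first) and the
    basic matrix gamma*I."""
    q = Fx.astype(float).copy()
    rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
    alphas = []
    # first loop: newest pair to oldest
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * (s @ q)
        alphas.append(a)
        q = q - a * y
    r = gamma * q                       # apply the basic matrix H0 = gamma*I
    # second loop: oldest pair to newest
    for (s, y, rho), a in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        b = rho * (y @ r)
        r = r + (a - b) * s
    return -r                           # search direction d_k = -H_k F(x_k)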

In order to solve large-scale nonlinear equations effectively while retaining good theoretical properties, and based on the above discussions of the stepsize $\alpha_k$ and the search direction $d_k$, we combine (11) and (16) and present an L-BFGS method for (1), since (11) makes the norm function descend and the limited memory scheme behind (16) requires little storage. The main attributes of the new algorithm are stated as follows.
(i) An L-BFGS method with the line search (11) is presented.
(ii) The norm function $\|F(x_k)\|$ is descent.
(iii) The global convergence is established under appropriate conditions.
(iv) Numerical results show that the given algorithm is more competitive than the normal algorithm for large-scale nonlinear equations.

This paper is organized as follows. In the next section, the backtracking inexact L-BFGS algorithm is stated. Section 3 presents the global convergence of the algorithm under reasonable conditions. Numerical experiments testing the performance of the algorithms are reported in Section 4.

2. Algorithms

This section states the L-BFGS method combined with the backtracking line search technique (11) for solving (1).

Algorithm 1.
Step 0. Choose an initial point $x_0 \in \mathbb{R}^n$, an initial symmetric positive definite matrix $H_0 \in \mathbb{R}^{n \times n}$, positive constants $\sigma_1$, $\sigma_2$, constants $r, \rho \in (0,1)$, and a positive integer $m$. Let $k := 0$.
Step 1. Stop if $\|F(x_k)\| = 0$.
Step 2. Determine $d_k$ by (16).
Step 3. If
$$\|F(x_k + d_k)\| \le \rho\|F(x_k)\| - \sigma_2\|d_k\|^2, \tag{18}$$
then take $\alpha_k = 1$ and go to Step 5. Otherwise go to Step 4.
Step 4. Let $i_k$ be the smallest nonnegative integer $i$ such that (11) holds for $\alpha_k = r^i$. Let $\alpha_k = r^{i_k}$.
Step 5. Let the next iterate be $x_{k+1} = x_k + \alpha_k d_k$.
Step 6. Let $\tilde m = \min\{k+1, m\}$. Put $s_k = x_{k+1} - x_k$ and $y_k = F(x_{k+1}) - F(x_k)$. Update the basic matrix $\tilde m$ times to get $H_{k+1}$ by (17).
Step 7. Let $k := k + 1$. Go to Step 1.
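Putting Steps 0-7 together, a minimal Python sketch of Algorithm 1 might look as follows. It reuses the lbfgs_direction helper sketched in Section 1, and it implements the full-step test (18) and the backtracking rule (11) as reconstructed above; all parameter values (m, rho, r, sigma1, sigma2, the tolerance, the iteration cap) are illustrative choices, not the values used in the authors' experiments.

import numpy as np

def algorithm1(F, x0, m=5, rho=0.1, r=0.5, sigma1=1e-4, sigma2=1e-4,
               tol=1e-5, max_iter=1000):
    """A sketch of Algorithm 1: L-BFGS with the norm-descent line search (11)
    for F(x) = 0 with symmetric Jacobian; relies on lbfgs_direction above."""
    x = np.asarray(x0, dtype=float)
    s_list, y_list = [], []                       # limited memory pairs
    Fx = F(x)
    for _ in range(max_iter):
        nF = np.linalg.norm(Fx)
        if nF <= tol:                             # Step 1: stopping test
            break
        d = lbfgs_direction(Fx, s_list, y_list)   # Step 2: d_k from (16)
        if np.linalg.norm(F(x + d)) <= rho * nF - sigma2 * (d @ d):
            alpha = 1.0                           # Step 3: full step accepted by (18)
        else:                                     # Step 4: backtracking on (11)
            alpha = 1.0
            for _ in range(30):
                Ft = F(x + alpha * d)
                if Ft @ Ft - nF**2 <= -sigma1 * alpha**2 * (d @ d) - sigma2 * alpha**2 * nF**2:
                    break
                alpha *= r
        x_new = x + alpha * d                     # Step 5: next iterate
        F_new = F(x_new)
        s, y = x_new - x, F_new - Fx              # Step 6: newest pair (s_k, y_k)
        if y @ s > 1e-12:                         # keep the update well defined
            s_list.append(s)
            y_list.append(y)
            if len(s_list) > m:                   # keep only the last m pairs
                s_list.pop(0)
                y_list.pop(0)
        x, Fx = x_new, F_new                      # Step 7
    return x

# Example: solve the symmetric system F(x) = x + sin(x) = 0 (root x = 0)
# x_star = algorithm1(lambda x: x + np.sin(x), np.ones(1000))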

In the following, to conveniently analyze the global convergence, we assume that the algorithm updates $B_k$ (the inverse of $H_k$) with a bounded and positive definite basic matrix $B_k^{0}$ (the inverse of the basic matrix in (17)). Then Algorithm 1 stated in terms of $B_k$ has the following steps.

Algorithm 2.
Step 2. Determine $d_k$ by
$$B_k d_k = -F(x_k). \tag{19}$$

Step 6. Let $\tilde m = \min\{k+1, m\}$. Put $s_k = x_{k+1} - x_k$ and $y_k = F(x_{k+1}) - F(x_k)$. Update $B_k^{0}$ $\tilde m$ times; that is, for $i = 0, 1, \ldots, \tilde m - 1$ compute
$$B_{k+1}^{i+1} = B_{k+1}^{i} - \frac{B_{k+1}^{i} s_{j} s_{j}^T B_{k+1}^{i}}{s_{j}^T B_{k+1}^{i} s_{j}} + \frac{y_{j} y_{j}^T}{y_{j}^T s_{j}}, \tag{20}$$
where $B_{k+1}^{0} = B_k^{0}$ and $j = k - \tilde m + 1 + i$ for all $i$; finally, set $B_{k+1} = B_{k+1}^{\tilde m}$.

Remark 3. Algorithms 1 and 2 are mathematically equivalent. Throughout this paper, Algorithm 2 is given only for the purpose of analysis, so we only discuss Algorithm 2 in theory. In the experiments, we implement Algorithm 1.

3. Global Convergence

Define the level set by
$$\Omega = \{x \in \mathbb{R}^n : \|F(x)\| \le \|F(x_0)\|\}. \tag{21}$$
In order to establish the global convergence of Algorithm 2, similarly to [4, 11], we need the following assumptions.

Assumption A. $F$ is continuously differentiable on an open convex set $\Omega_1$ containing $\Omega$. Moreover, the Jacobian of $F$ is symmetric, bounded, and positive definite on $\Omega_1$; namely, there exist positive constants $M \ge m > 0$ satisfying
$$\|\nabla F(x)\| \le M, \qquad m\|d\|^2 \le d^T \nabla F(x) d, \quad \forall x \in \Omega_1, \ d \in \mathbb{R}^n. \tag{22}$$

Assumption B. $B_k$ is a good approximation to $\nabla F(x_k)$; that is,
$$\|(B_k - \nabla F(x_k)) d_k\| \le \epsilon \|F(x_k)\|, \tag{23}$$
where $\epsilon \in (0,1)$ is a small quantity.

Remark 4. Assumption A implies
$$m\|s_k\|^2 \le y_k^T s_k \le M\|s_k\|^2. \tag{24}$$

The relations in (24) ensure that $B_{k+1}$ generated by (20) inherits the symmetry and positive definiteness of the basic matrix $B_k^{0}$. Thus, (19) has a unique solution $d_k$ for each $k$. Moreover, the following lemma holds.

Lemma 5 (see Theorem 2.1 in [16] or Lemma 3.4 of [11]). Let Assumption A hold and let $\{x_k\}$ be generated by Algorithm 2. Then, for any $p \in (0,1)$ and any $k \ge 1$, there are positive constants $\beta_1$, $\beta_2$ such that the relations $\|B_j s_j\| \le \beta_1\|s_j\|$ and $s_j^T B_j s_j \ge \beta_2\|s_j\|^2$ hold for at least $\lceil pk \rceil$ values of $j \in \{1, 2, \ldots, k\}$.

By Assumption B, and similarly to [4, 9, 11, 15], it is easy to obtain the following lemma.

Lemma 6. Let Assumption B hold and let $\{x_k, d_k\}$ be generated by Algorithm 2. Then $d_k$ is a descent direction of $\theta$ at $x_k$; that is,
$$\nabla\theta(x_k)^T d_k \le -(1 - \epsilon)\|F(x_k)\|^2 < 0$$
holds.

Based on the above lemma and Assumption B, similarly to Lemma 3.8 in [2], we can obtain the following lemma.

Lemma 7. Let Assumption B hold and let $\{x_k\}$ be generated by Algorithm 2. Then $\{x_k\} \subset \Omega$. Moreover, $\{\|F(x_k)\|\}$ converges.

Lemma 8. Let Assumptions A and B hold. Then, in a finite number of backtracking steps, Algorithm 2 will produce an iterate $x_{k+1} = x_k + \alpha_k d_k$.

Proof. It is sufficient for us to prove that the line search (11) is reasonable. By Lemma 3.8 in [2], we can deduce that, in a finite number of backtracking steps, a stepsize $\alpha_k$ is obtained such that
$$\theta(x_k + \alpha_k d_k) - \theta(x_k) \le \sigma \alpha_k \nabla\theta(x_k)^T d_k.$$
By (19), we get $B_k d_k = -F(x_k)$; thus
$$\nabla\theta(x_k)^T d_k = F(x_k)^T \nabla F(x_k) d_k = F(x_k)^T \left(\nabla F(x_k) - B_k\right) d_k - \|F(x_k)\|^2.$$
By Assumption B, we have
$$\left|F(x_k)^T \left(\nabla F(x_k) - B_k\right) d_k\right| \le \epsilon \|F(x_k)\|^2,$$
and hence $\nabla\theta(x_k)^T d_k \le -(1-\epsilon)\|F(x_k)\|^2$. Using (19) again, together with the boundedness of the matrices $B_k$ and their inverses, we obtain $\|d_k\| \le c\|F(x_k)\|$ for some constant $c > 0$. Setting $\sigma_1$ and $\sigma_2$ such that $\sigma_1 c^2 + \sigma_2 \le 2\sigma(1-\epsilon)$ implies (11). This completes the proof.

Remark 9. The above lemma shows that Algorithm 2 is well defined. In a way similar to Lemma 3.2 and Corollary 3.4 in [5], it is not difficult to deduce that relation (31) holds; we do not prove it again. Now we establish the global convergence theorem.

Theorem 10. Let Assumptions A and B hold. Then the sequence $\{x_k\}$ generated by Algorithm 2 converges to the unique solution of (1).

Proof. Lemma 7 implies that $\{\|F(x_k)\|\}$ converges. If
$$\lim_{k \to \infty} \|F(x_k)\| = 0, \tag{32}$$
then every accumulation point of $\{x_k\}$ is a solution of (1). Assumption A means that (1) has only one solution. Moreover, since $\{x_k\}$ is bounded, it has at least one accumulation point. Therefore $\{x_k\}$ itself converges to the unique solution of (1). Therefore, it suffices to verify (32).
If (18) holds for infinitely many $k$'s, then $\|F(x_{k+1})\| \le \rho\|F(x_k)\|$ with $\rho \in (0,1)$ on those iterations and, since $\{\|F(x_k)\|\}$ is nonincreasing, (32) is trivial. Otherwise, if (18) holds for only finitely many $k$'s, we conclude that the line search of Step 4 is executed for all $k$ sufficiently large. By (11), we have
$$\sigma_1\|\alpha_k d_k\|^2 + \sigma_2\|\alpha_k F(x_k)\|^2 \le \|F(x_k)\|^2 - \|F(x_{k+1})\|^2.$$
Since $\{\|F(x_k)\|\}$ is bounded, by adding these inequalities, we get
$$\sum_{k}\left(\sigma_1\|\alpha_k d_k\|^2 + \sigma_2\|\alpha_k F(x_k)\|^2\right) < \infty.$$
Then we have
$$\lim_{k \to \infty}\alpha_k\|d_k\| = 0, \qquad \lim_{k \to \infty}\alpha_k\|F(x_k)\| = 0,$$
which together with (31) implies (32). This completes the proof.

4. Numerical Results

This section reports numerical results for Algorithm 1 and the normal BFGS algorithm. The test problems with the associated initial guesses are listed below, where $F(x) = (f_1(x), f_2(x), \ldots, f_n(x))^T$.

Problem 1. Exponential function 1: Initial guess: .

Problem 2. Exponential function 2: Initial guess: .

Problem 3. Trigonometric function: Initial guess: .

Problem 4. Singular function: Initial guess: .

Problem 5. Logarithmic function: Initial guess: .

Problem 6. Broyden tridiagonal function [17, pages 471-472]: Initial guess: .

Problem 7. Trigexp function [17, page 473]: Initial guess: .

Problem 8. Strictly convex function 1 [18, page 29]: is the gradient of . Consider Initial guess: .

Problem 9. Linear function-full rank: Initial guess: .

Problem 10. Penalty function: Initial guess: .

Problem 11. Variable dimensioned function: Initial guess: .

Problem 12. Tridiagonal system [19]: Initial guess: .

Problem 13. Five-diagonal system [19]: Initial guess: .

Problem 14. Extended Freudenstein and Roth function ($n$ is even) [20]: for Initial guess: .

Problem 15. Discrete boundary value problem [21]: Initial guess: .

Problem 16. Troesch problem [22]: Initial guess: .

In the experiments, the parameters in Algorithm 1 and the normal BFGS method were chosen to be the same, and the initial matrix was taken to be the unit matrix. All codes were written in MATLAB r2013b and run on a PC with a Core 2 CPU and 4.00 GB of memory under the Windows 7 operating system. We stopped the program when $\|F(x_k)\|$ fell below the prescribed tolerance. Since the line search cannot always ensure the descent condition $\nabla\theta(x_k)^T d_k < 0$, an uphill search direction may occur in the numerical experiments; in this case, the line search rule may fail. In order to avoid this, the stepsize is accepted if the number of backtracking steps exceeds eight in the inner cycle for the test problems. We also stop the program if the number of iterations reaches 1000. The columns of the tables have the following meaning.
Dim: the dimension of the problem.
NI: the total number of iterations.
NG: the number of norm function evaluations.
Time: the CPU time in seconds.
GN: the value of $\|F(x)\|$ when the program stops.
NaN: not a number, implying that the code fails to return a real value.
Inf: the IEEE arithmetic representation for positive infinity, which is also produced by operations such as dividing by zero.

From the numerical results in Table 1, it is not difficult to see that the proposed method is more successful than the normal BFGS method. Many problems cannot be successfully solved by the normal BFGS method; moreover, the normal BFGS method fails to return a real value for several problems. We therefore conclude that the presented method is more competitive than the normal BFGS method.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to thank Professor Gonglin Yuan of Guangxi University for his suggestions on the organization of the paper and his help with the codes, which saved them much time in completing this paper. The authors also thank the referees for valuable comments and the editor for suggestions on the ideas and the English of this paper, which improved the paper greatly. This work is supported by Guangxi NSF (Grant no. 2012GXNSFAA053002) and China NSF (Grant no. 11261006).