Global Convergence of a Spectral Conjugate Gradient Method for Unconstrained Optimization
A new nonlinear spectral conjugate descent method for solving unconstrained optimization problems is proposed on the basis of the CD method and the spectral conjugate gradient method. For any line search, the new method satisfies the sufficient descent condition. Moreover, we prove that the new method is globally convergent under the strong Wolfe line search. The numerical results show that the new method is more effective on the given test problems from the CUTE test problem library (Bongartz et al., 1995) in comparison with the well-known CD, FR, and PRP methods.
1. Introduction
Unconstrained optimization problems have extensive applications, for example, in petroleum exploration, aerospace, and transportation. However, the amount of computation required grows exponentially with the scale of the problem, so new methods are needed for large-scale unconstrained optimization problems. The primary objective of this paper is to study the global convergence properties and practical computational performance of a new nonlinear spectral conjugate gradient method for unconstrained optimization problems, without restarts and under suitable conditions.
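Consider the following unconstrained optimization problem:
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1.1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function and its gradient $g(x) = \nabla f(x)$ is available.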
Because it needs little computer memory, the conjugate gradient method is very appealing for solving (1.1) when the number of variables is large. The method can be described as follows:
$$x_{k+1} = x_k + \alpha_k d_k, \tag{1.2}$$
$$d_k = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k d_{k-1}, & k \ge 2, \end{cases} \tag{1.3}$$
where $k$ is the current iteration, $\alpha_k$ is the step-size determined by some line search, $d_k$ is the search direction, $g_k$ is the gradient of $f$ at the point $x_k$, and $\beta_k$ is a scalar whose choice distinguishes the different conjugate gradient methods [1, 2]. There are many well-known formulas for $\beta_k$, such as the Fletcher-Reeves (FR) [3], Polak-Ribiere-Polyak (PRP) [4], Hestenes-Stiefel (HS) [5], and conjugate-descent (CD) [6] formulas. The conjugate gradient method is a powerful line search method for solving optimization problems, and it remains very popular among engineers and mathematicians who are interested in solving large-scale problems. Like the steepest descent method, it avoids computing and storing the matrices associated with the Hessian of the objective function.
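The classical choices of $\beta_k$ named above can be written down compactly. The following is a minimal sketch using the textbook forms of the FR, PRP, HS, and CD parameters; the function names, the plain-list vector representation, and the sample vectors are illustrative assumptions rather than anything taken from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def beta_fr(g, g_prev, d_prev):
    """Fletcher-Reeves: ||g_k||^2 / ||g_{k-1}||^2."""
    return dot(g, g) / dot(g_prev, g_prev)

def beta_prp(g, g_prev, d_prev):
    """Polak-Ribiere-Polyak: g_k^T (g_k - g_{k-1}) / ||g_{k-1}||^2."""
    y = [a - b for a, b in zip(g, g_prev)]
    return dot(g, y) / dot(g_prev, g_prev)

def beta_hs(g, g_prev, d_prev):
    """Hestenes-Stiefel: g_k^T y / (d_{k-1}^T y), with y = g_k - g_{k-1}."""
    y = [a - b for a, b in zip(g, g_prev)]
    return dot(g, y) / dot(d_prev, y)

def beta_cd(g, g_prev, d_prev):
    """Conjugate descent: -||g_k||^2 / (d_{k-1}^T g_{k-1})."""
    return -dot(g, g) / dot(d_prev, g_prev)

def cg_direction(g, g_prev=None, d_prev=None, beta=beta_cd):
    """d_k = -g_k on the first iteration, otherwise -g_k + beta_k * d_{k-1}."""
    if d_prev is None:
        return [-gi for gi in g]
    b = beta(g, g_prev, d_prev)
    return [-gi + b * di for gi, di in zip(g, d_prev)]

# Made-up vectors: d_{k-1} = -g_{k-1} and g_k is orthogonal to both,
# as happens after one exact line search step along the negative gradient.
g_prev, d_prev, g = [1.0, 0.0], [-1.0, 0.0], [0.0, 2.0]
betas = [b(g, g_prev, d_prev) for b in (beta_fr, beta_prp, beta_hs, beta_cd)]
d = cg_direction(g, g_prev, d_prev)
```

In this configuration all four formulas return the same value, which reflects the fact that they coincide under exact line searches on quadratics and differ only when the line search is inexact or the function is not quadratic.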
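The original CD method was proposed by Fletcher [6], in which $\beta_k$ is defined by
$$\beta_k^{CD} = -\frac{\|g_k\|^2}{d_{k-1}^T g_{k-1}}, \tag{1.4}$$
where $\|\cdot\|$ denotes the Euclidean norm of vectors. An important property of the CD method is that it produces a descent direction under the strong Wolfe line search. In the strong Wolfe line search, the step-size $\alpha_k$ is required to satisfy
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k,$$
$$\left| g(x_k + \alpha_k d_k)^T d_k \right| \le -\sigma g_k^T d_k,$$
where $0 < \delta < \sigma < 1$. Some good results about the CD method have also been reported in recent years [7–10].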
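The two strong Wolfe inequalities are easy to test for a candidate step-size. The following sketch checks them in their standard form (sufficient decrease with parameter `delta`, curvature with parameter `sigma`); the function name, the default parameter values, and the one-dimensional test function are assumptions chosen for illustration and are not the settings used in the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def satisfies_strong_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Return True if alpha satisfies both strong Wolfe conditions along d.

    delta and sigma are illustrative defaults with 0 < delta < sigma < 1.
    """
    slope0 = dot(grad(x), d)  # g_k^T d_k, negative for a descent direction
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    sufficient_decrease = f(x_new) <= f(x) + delta * alpha * slope0
    curvature = abs(dot(grad(x_new), d)) <= -sigma * slope0
    return sufficient_decrease and curvature

# Hypothetical example: f(x) = x^2 / 2 starting at x = 1 along d = -1.
f = lambda x: 0.5 * x[0] ** 2
grad = lambda x: [x[0]]
x, d = [1.0], [-1.0]
checks = {alpha: satisfies_strong_wolfe(f, grad, x, d, alpha)
          for alpha in (0.1, 1.0, 2.0)}
```

Here the short step fails the curvature condition, the overly long step fails the sufficient decrease condition, and only the step that lands on the minimizer passes both.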
Another popular method for solving problem (1.1) is the spectral gradient method, which was developed originally by Barzilai and Borwein [11] in 1988. Raydan [12] further introduced the spectral gradient method for potentially large-scale unconstrained optimization problems. The main feature of this method is that only gradient directions are used at each line search, whereas a nonmonotone strategy guarantees global convergence. What is more, this method outperforms sophisticated conjugate gradient methods on many problems. Birgin and Martínez [13] proposed three kinds of spectral conjugate gradient methods. The direction $d_k$ is given by
$$d_k = -\theta_k g_k + \beta_k s_{k-1},$$
where the parameter $\beta_k$ is computed by one of three formulas given in [13], and $\theta_k$ is taken to be the spectral gradient parameter, computed by
$$\theta_k = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}},$$
where $s_{k-1} = x_k - x_{k-1}$ and $y_{k-1} = g_k - g_{k-1}$. The numerical results show that these methods are very effective. Unfortunately, they cannot guarantee descent directions. Based on the FR conjugate gradient method, Zhang et al. [14] modified the FR method so that the direction generated is always a descent direction. The direction $d_k$ is defined by
$$d_k = \begin{cases} -g_k, & k = 1, \\ -\theta_k g_k + \beta_k d_{k-1}, & k \ge 2, \end{cases}$$
where $\beta_k$ is the FR parameter and $\theta_k = d_{k-1}^T y_{k-1} / \|g_{k-1}\|^2$. They prove that this method always generates descent directions and is globally convergent.
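To make the role of the spectral parameter concrete, the sketch below computes the Barzilai-Borwein quotient $s^T s / s^T y$ and builds a direction of the form $-\theta_k g_k + \beta_k d_{k-1}$ using the modified FR choice $\theta_k = d_{k-1}^T y_{k-1} / \|g_{k-1}\|^2$ with the FR parameter, as that method is commonly stated. The function names and sample vectors are assumptions for illustration only.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def spectral_theta(s, y):
    """Barzilai-Borwein spectral quotient s^T s / s^T y."""
    return dot(s, s) / dot(s, y)

def mfr_direction(g, g_prev, d_prev):
    """Direction -theta*g + beta_FR*d_prev with theta = d_prev^T y / ||g_prev||^2."""
    y = [a - b for a, b in zip(g, g_prev)]
    theta = dot(d_prev, y) / dot(g_prev, g_prev)
    beta = dot(g, g) / dot(g_prev, g_prev)
    return [-theta * gi + beta * di for gi, di in zip(g, d_prev)]

# For a quadratic with Hessian 2*I, y = 2*s, so the quotient recovers 1/2.
theta_bb = spectral_theta([1.0, 0.0], [2.0, 0.0])

# Made-up vectors with g_prev^T d_prev = -||g_prev||^2 (true for d_1 = -g_1).
g_prev, d_prev, g = [1.0, 2.0], [-1.0, -2.0], [0.5, -1.5]
d = mfr_direction(g, g_prev, d_prev)
descent_gap = dot(g, d) + dot(g, g)  # zero when g^T d = -||g||^2
```

The vanishing `descent_gap` illustrates why this scaling yields a descent direction independently of how the step-size was chosen: the terms involving $g_k^T d_{k-1}$ cancel, leaving $g_k^T d_k = -\|g_k\|^2$ whenever the previous direction satisfied the same identity.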
In this paper, motivated by the success of the spectral gradient method, we propose a new spectral conjugate gradient method by combining the CD method and the spectral gradient method. The direction is given by the following way: where is specified by the following. Under some mild conditions, we establish the global convergence of the new spectral conjugate gradient method with the strong Wolfe line search.
This paper is organized as follows. In Section 2, we propose our algorithm, and global convergence analysis is provided under suitable conditions. Preliminary numerical results are presented in Section 3.
2. Global Convergence Analysis
In order to establish the global convergence of our method, we need the following assumption on the objective function, which has often been used in the literature to analyze the global convergence of nonlinear conjugate gradient methods and spectral conjugate gradient methods with inexact line searches.
Assumption 2.1. (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_1)\}$ is bounded, where $x_1$ is the starting point.
(ii) In some neighborhood $N$ of $\Omega$, the objective function is continuously differentiable, and its gradient is Lipschitz continuous; that is, there exists a constant $L > 0$ such that
$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in N. \tag{2.1}$$
Now we present the new spectral conjugate gradient method as follows.
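Algorithm 2.2 (SCD method). Step 1. Data: $x_1 \in \mathbb{R}^n$, $\varepsilon \ge 0$. Set $d_1 = -g_1$; if $\|g_1\| \le \varepsilon$, then stop.
Step 2. Compute $\alpha_k$ by some line search.
Step 3. Let $x_{k+1} = x_k + \alpha_k d_k$ and $g_{k+1} = g(x_{k+1})$; if $\|g_{k+1}\| \le \varepsilon$, then stop.
Step 4. Compute $\beta_{k+1}$ by (1.11), and generate $d_{k+1}$ by (1.10).
Step 5. Set $k = k + 1$; go to Step 2.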
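The control flow of these five steps can be sketched as a short loop. The sketch below takes the rules for the spectral parameter and the conjugacy parameter as callbacks, so it does not commit to any particular formula; the names `spectral_cg`, `theta_fn`, `beta_fn`, and `line_search` are hypothetical. The usage example plugs in $\theta_k = 1$ with the CD parameter and an exact line search on a made-up two-dimensional quadratic, which is an illustrative stand-in and not the SCD method itself.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def spectral_cg(grad, x1, line_search, theta_fn, beta_fn, eps=1e-8, max_iter=5000):
    """Iterate d = -theta*g + beta*d_prev with pluggable theta and beta rules."""
    x = list(x1)
    g = grad(x)
    d = [-gi for gi in g]                       # Step 1: start along -g_1
    iterations = 0
    while dot(g, g) ** 0.5 > eps and iterations < max_iter:
        alpha = line_search(x, d, g)            # Step 2: step-size from a line search
        x = [xi + alpha * di for xi, di in zip(x, d)]   # Step 3: new iterate
        g_new = grad(x)
        theta = theta_fn(g_new, g, d)           # Step 4: parameters, then new direction
        beta = beta_fn(g_new, g, d)
        d = [-theta * gi + beta * di for gi, di in zip(g_new, d)]
        g = g_new
        iterations += 1                         # Step 5: next iteration
    return x, iterations

# Hypothetical test problem: f(x) = 0.5 x^T A x - b^T x, minimizer (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

def matvec(M, v):
    return [dot(row, v) for row in M]

def grad(x):
    return [ai - bi for ai, bi in zip(matvec(A, x), b)]

def exact_line_search(x, d, g):
    """Exact minimizer along d, available in closed form for a quadratic."""
    return -dot(g, d) / dot(d, matvec(A, d))

def theta_one(g_new, g, d):
    return 1.0

def beta_cd(g_new, g, d):
    return -dot(g_new, g_new) / dot(d, g)

x_star, iterations = spectral_cg(grad, [0.0, 0.0], exact_line_search, theta_one, beta_cd)
```

With an exact line search on a quadratic, the CD choice reduces to the linear conjugate gradient method, so the two-dimensional example terminates after two iterations.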
The following theorem shows that Algorithm 2.2 satisfies the sufficient descent condition for any line search.
Theorem 2.3. Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.2, and let the step-size $\alpha_k$ be determined by any line search, then
Proof. We can prove the conclusion by induction. From , the conclusion (2.2) holds for . Now we assume that the conclusion is true for and , that is, . In the following, we need to prove that the conclusion holds for .
If , then . From (1.4), (1.10), and (1.12), we have If , then . From (1.10), (1.12), and our assumption: , we have From (2.3) and (2.4), we know that the conclusion (2.2) holds for .
The conclusion of the following lemma, often called the Zoutendijk condition, is used to prove the global convergence of nonlinear conjugate gradient methods. It was originally given by Zoutendijk [15].
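For orientation, the Zoutendijk condition is usually written in the standard form below. This is the textbook statement, given under the assumptions that Assumption 2.1 holds, that each $d_k$ is a descent direction, and that the step-size satisfies the Wolfe conditions. The constant $c$ in the second part is an assumed sufficient descent constant introduced only for illustration.

```latex
\sum_{k \ge 1} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < +\infty,
\qquad \text{and hence, if } g_k^T d_k \le -c\,\|g_k\|^2 \text{ with } c > 0, \qquad
\sum_{k \ge 1} \frac{\|g_k\|^4}{\|d_k\|^2} < +\infty .
```

The second sum follows because $(g_k^T d_k)^2 \ge c^2 \|g_k\|^4$ under the assumed descent inequality. Convergence proofs of this type typically argue that a gradient norm bounded away from zero would force this sum to diverge.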
Remark. The strong Wolfe line search is a special case of the Wolfe line search, so Lemma 2.6 also holds under the strong Wolfe line search. What is more, the same argument shows that the Zoutendijk condition holds for the spectral conjugate gradient method.
The following theorem establishes the global convergence of the new spectral conjugate gradient method with the strong Wolfe line search for general functions.
Proof. According to the given conditions, Lemma 2.6 all hold. In the following, we will obtain the conclusion (2.7) by contradiction. Suppose that there exists a positive constant such that holds for . On the one hand, rewriting (1.11) as follows and squaring both sides of it, we get From the above equation and Remark 2.5, we have Dividing the above inequality by , we have Using (2.12) recursively and noting that , we get Then we get from (2.13) and (2.8) that which indicates This contradicts the Zoutendijk condition (2.6). Therefore the conclusion (2.7) holds.
3. Numerical Experiments
In this section, we report some numerical results. Under the strong Wolfe line search, we compare the CPU time and the iteration number of the SCD method with those of the CD, FR, and PRP methods on test problems from the CUTE test problem library [16]. The parameters in the strong Wolfe line search meet the following requirements: and . We stop the iteration if the iteration number exceeds 5000 or the inequality is satisfied. All codes were written in FORTRAN 90 and run on a PC with a 2.0 GHz CPU, 512 MB of memory, and the Windows XP operating system.
In Table 1, the column “Problem” gives the problem’s name and “Dim” denotes the problem’s dimension. The detailed numerical results are listed in the form NI/CPU, where NI and CPU denote the iteration number and the CPU time in seconds, respectively. Some CPU times in Table 1 are zero because CPU times are recorded to only two digits in our numerical experiments. In order to compare these methods in terms of CPU time, we use a constant instead of when the CPU time of a problem is zero, where denotes the set of the test problems whose CPU time is not zero.
In this paper, we adopt the performance profiles of Dolan and Moré [17] to compare the SCD method with the CD, FR, and PRP methods. Figure 1 shows the performance profiles with respect to CPU time; that is, for each method, we plot the fraction of problems for which the method is within a factor τ of the best time. The left side of the figure gives the percentage of the test problems for which a method is the fastest; the right side gives the percentage of the test problems that are successfully solved by each of the methods. The top curve is the method that solved the most problems in a time that was within a factor τ of the best time. Using the same approach, we also compare the iteration numbers; see Figure 2.
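The performance profile described above can be computed in a few lines. The sketch below follows the usual Dolan-Moré construction: each solver's cost on a problem is divided by the best cost on that problem, and the profile at τ is the fraction of problems whose ratio is at most τ. The function name, the solver labels "A" and "B", and the cost values are made up for illustration and are not results from the paper.

```python
import math

def performance_profile(costs, taus):
    """Fraction of problems each solver handles within a factor tau of the best.

    costs maps a solver name to a list of positive per-problem costs,
    with math.inf marking a failure on that problem.
    """
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    # Best (smallest) cost achieved by any solver on each problem.
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]
    profile = {}
    for s in solvers:
        ratios = [costs[s][p] / best[p] for p in range(n_problems)]
        profile[s] = [sum(r <= tau for r in ratios) / n_problems for tau in taus]
    return profile

# Hypothetical costs for two solvers on three problems; solver B fails on the third.
costs = {"A": [1.0, 2.0, 4.0], "B": [2.0, 1.0, math.inf]}
profile = performance_profile(costs, taus=[1.0, 2.0, 4.0])
```

The value at τ = 1 is the share of problems on which a solver is the fastest, and the value for large τ is the share of problems it solves at all, which matches the reading of the left and right sides of the figures. Costs must be strictly positive for the ratios to be defined, which is why zero CPU times have to be replaced before profiling.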
From Figures 1–2, the SCD method performs a little worse than the well-known PRP method in terms of CPU time and iteration number. However, the SCD method has a clear advantage over the well-known CD and FR methods in both CPU time and iteration number. So the efficiency of the SCD method is encouraging.
Acknowledgments
The authors wish to express their heartfelt thanks to the anonymous referees and the editor for their detailed and helpful suggestions for revising the paper. This work was supported by The Nature Science Foundation of Chongqing Education Committee (KJ121112).
References
[1] G. H. Yu, Nonlinear self-scaling conjugate gradient methods for large-scale optimization problems [Ph.D. thesis], Sun Yat-Sen University, Guangzhou, China, 2007.
[2] G. Yuan and Z. Wei, “New line search methods for unconstrained optimization,” Journal of the Korean Statistical Society, vol. 38, no. 1, pp. 29–39, 2009.
[3] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” The Computer Journal, vol. 7, pp. 149–154, 1964.
[4] E. Polak and G. Ribière, “Note sur la convergence de méthodes de directions conjuguées,” Revue Française d'Informatique et de Recherche Opérationnelle, vol. 3, no. 16, pp. 35–43, 1969.
[5] M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, pp. 409–436, 1952.
[6] R. Fletcher, Practical Methods of Optimization: Unconstrained Optimization, vol. 1, John Wiley & Sons, New York, NY, USA, 1987.
[7] B. Qu, G.-F. Hu, and X.-C. Zhang, “A global convergence result for the conjugate descent method,” Journal of Dalian University of Technology, vol. 42, no. 1, pp. 13–16, 2002.
[8] X.-W. Du, L.-Q. Ye, and C.-X. Xu, “Global convergence of a class of unconstrained optimal methods that include the conjugate descent method,” Journal of Applied Mathematics, vol. 18, no. 2, pp. 120–122, 2001.
[9] C.-Y. Pan and L.-P. Chen, “A class of efficient new descent methods,” Acta Mathematicae Applicatae Sinica, vol. 30, no. 1, pp. 88–98, 2007.
[10] Y. Dai and Y. Yuan, “Convergence properties of the conjugate descent method,” Advances in Mathematics, vol. 25, no. 6, pp. 552–562, 1996.
[11] J. Barzilai and J. M. Borwein, “Two-point step size gradient methods,” IMA Journal of Numerical Analysis, vol. 8, no. 1, pp. 141–148, 1988.
[12] M. Raydan, “The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem,” SIAM Journal on Optimization, vol. 7, no. 1, pp. 26–33, 1997.
[13] E. G. Birgin and J. M. Martínez, “A spectral conjugate gradient method for unconstrained optimization,” Applied Mathematics and Optimization, vol. 43, no. 2, pp. 117–128, 2001.
[14] L. Zhang, W. Zhou, and D. Li, “Global convergence of a modified Fletcher-Reeves conjugate gradient method with Armijo-type line search,” Numerische Mathematik, vol. 104, no. 4, pp. 561–572, 2006.
[15] G. Zoutendijk, “Nonlinear programming, computational methods,” in Integer and Nonlinear Programming, pp. 37–86, North-Holland, Amsterdam, The Netherlands, 1970.
[16] I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environment,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
[17] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.
Copyright © 2012 Jinkui Liu and Youyi Jiang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.