Research Article | Open Access
A Truncated Descent HS Conjugate Gradient Method and Its Global Convergence
Recently, Zhang (2006) proposed a three-term modified HS (TTHS) method for unconstrained optimization problems. An attractive property of the TTHS method is that the direction generated by the method is always a descent direction. This property is independent of the line search used. In order to obtain the global convergence of the TTHS method, Zhang proposed a truncated TTHS method. A drawback is that the numerical performance of the truncated TTHS method is not ideal. In this paper, we prove that the TTHS method with the standard Armijo line search is globally convergent for uniformly convex problems. Moreover, we propose a new truncated TTHS method. Under suitable conditions, global convergence is obtained for the proposed method. Extensive numerical experiments show that the proposed method is very efficient for the test problems from the CUTE library.
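1. Introduction
Consider the unconstrained optimization problem
$$\min f(x), \quad x \in \mathbb{R}^n, \tag{1.1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. Conjugate gradient methods are very important for solving (1.1), especially when the dimension $n$ is large. The methods are of the form
$$x_{k+1} = x_k + \alpha_k d_k, \tag{1.2}$$
$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{1.3}$$
where $g_k$ denotes the gradient of $f$ at $x_k$, $\alpha_k$ is the step length obtained by a line search, and $\beta_k$ is a scalar. The strong Wolfe line search finds a step length $\alpha_k$ such that
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{1.4}$$
$$|g(x_k + \alpha_k d_k)^T d_k| \le -\sigma g_k^T d_k, \tag{1.5}$$
where $\delta \in (0, 1)$ and $\sigma \in (\delta, 1)$. For conjugate gradient methods it is also possible to use the Wolfe line search [1, 2], which calculates an $\alpha_k$ satisfying (1.4) and
$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k. \tag{1.6}$$
In particular, some conjugate gradient methods admit the Armijo line search; namely, the step length can be obtained by letting $\alpha_k = \max\{\rho^j : j = 0, 1, 2, \ldots\}$ satisfy
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta_1 \alpha_k g_k^T d_k, \tag{1.7}$$
where $\delta_1 \in (0, 1)$ and $\rho \in (0, 1)$. Varieties of this method differ in the way of selecting $\beta_k$. In this paper, we are interested in the HS method [3], namely,
$$\beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}. \tag{1.8}$$
Here and throughout the paper, unless otherwise specified, $\|\cdot\|$ denotes the Euclidean norm of vectors, and $y_{k-1} = g_k - g_{k-1}$.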
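To make the line search rules and the HS parameter concrete, the following Python sketch implements them for a generic smooth function. It assumes the usual textbook forms: Armijo backtracking over the step lengths $\rho^j$, the strong Wolfe pair of sufficient decrease and curvature conditions, and $\beta_k^{HS} = g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})$ with $y_{k-1} = g_k - g_{k-1}$. The function names, the default parameter values, and the cap on backtracking steps are assumptions of this sketch and are not taken from the paper.

```python
import numpy as np

def armijo_step(f, x, d, g, delta=1e-4, rho=0.5, max_backtracks=60):
    """Return the largest rho**j (j = 0, 1, ...) satisfying the Armijo condition."""
    fx = f(x)
    slope = float(g @ d)  # g_k^T d_k, negative for a descent direction
    alpha = 1.0
    for _ in range(max_backtracks):
        if f(x + alpha * d) <= fx + delta * alpha * slope:
            return alpha
        alpha *= rho
    raise RuntimeError("Armijo backtracking failed; is d a descent direction?")

def satisfies_strong_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Check the sufficient decrease and strong curvature conditions at alpha."""
    slope = float(grad(x) @ d)
    sufficient_decrease = f(x + alpha * d) <= f(x) + delta * alpha * slope
    curvature = abs(float(grad(x + alpha * d) @ d)) <= -sigma * slope
    return bool(sufficient_decrease and curvature)

def beta_hs(g_new, g_old, d_old):
    """Hestenes-Stiefel parameter g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})."""
    y = g_new - g_old
    return float(g_new @ y) / float(d_old @ y)

if __name__ == "__main__":
    # Hypothetical test function f(x) = sum(x_i^4), chosen only for illustration.
    f = lambda x: float(np.sum(x ** 4))
    grad = lambda x: 4.0 * x ** 3
    x = np.array([1.0])
    g = grad(x)
    alpha = armijo_step(f, x, -g, g)
    print("accepted step:", alpha)
    print("strong Wolfe holds:", satisfies_strong_wolfe(f, grad, x, -g, alpha))
```

The Armijo rule only needs function values along the ray, which is why it is cheaper per trial step than the Wolfe rules, which also evaluate the gradient at each trial point.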
We refer to a book [4] and a recent review paper [5] for progress on the global convergence of conjugate gradient methods. The study of the HS method has made great progress. In practical computation, the HS method is generally believed to be one of the most efficient conjugate gradient methods. Theoretically, the HS method has the property that the conjugacy condition $d_k^T y_{k-1} = 0$ always holds, independently of the line search used. Aiming at fast convergence, Dai and Liao [6] modified the numerator of the HS method to obtain the DL method by using the secant condition of quasi-Newton methods. Due to Powell's [7] example, the DL method may not converge with exact line search for general functions. Similar to the PRP+ method [8], Dai and Liao [6] proposed the DL+ method from the viewpoint of global convergence. In a further development of this update strategy, Yabe and Takano [9] used another modified secant condition from [10, 11] and proposed the YT and YT+ methods. Recently, Hager and Zhang [15] modified the HS method to propose a new conjugate gradient method called the CG_DESCENT method. A good property of the CG_DESCENT method is that the direction satisfies the sufficient descent property independently of the line search used. Hager and Zhang [15] proved that the CG_DESCENT method with the Wolfe line search is globally convergent even for nonconvex problems. Zhang [12] proposed the TTHS method. The sufficient descent property of the TTHS method is also independent of the line search used. In order to obtain the global convergence of the TTHS method, Zhang truncated the search direction of the TTHS method. Numerical experiments in [12] show that the truncated TTHS method is not very effective. In this paper, we further study the TTHS method. We prove that the TTHS method with the standard Armijo line search is globally convergent for uniformly convex problems. To improve the efficiency of the truncated TTHS method, we propose a new truncation strategy for the TTHS method. Under suitable conditions, global convergence is obtained for the proposed method. Numerical experiments show that the proposed method outperforms the known CG_DESCENT method.
2. Global Convergence Analysis
Recently, Zhang [12] proposed a three-term modified HS (TTHS) method as follows:
$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k^{HS} d_{k-1} - \theta_k y_{k-1}, & k \ge 1, \end{cases} \tag{2.1}$$
where $\theta_k = g_k^T d_{k-1} / (d_{k-1}^T y_{k-1})$. An attractive property of the TTHS method is that the direction always satisfies
$$g_k^T d_k = -\|g_k\|^2, \tag{2.2}$$
which is independent of the line search used. In order to obtain the global convergence of the TTHS method, Zhang truncated the search direction; the truncated direction depends on two positive constants. Zhang proved that the truncated TTHS method converges globally with the Wolfe line search (1.4) and (1.6). However, numerical results show that the truncated TTHS method is not very effective. In this paper, we study the TTHS method again. In the rest of this section, we establish two preliminary convergence results for the TTHS method:
(i) Uniformly convex functions: the method converges globally with the standard Armijo line search (1.7).
(ii) General functions: the method converges globally with the strong Wolfe line search (1.4) and (1.5) when a new truncation strategy is applied to the TTHS method.
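The descent property of the three-term direction can be checked numerically. The sketch below assumes the commonly stated form of the TTHS direction, $d_k = -g_k + \beta_k^{HS} d_{k-1} - \theta_k y_{k-1}$ with $\theta_k = g_k^T d_{k-1} / (d_{k-1}^T y_{k-1})$; the function name `tths_direction` and the random test vectors are illustrative assumptions. Because the $\beta_k^{HS}$ and $\theta_k$ terms cancel in $g_k^T d_k$, the identity $g_k^T d_k = -\|g_k\|^2$ holds for any vectors with $d_{k-1}^T y_{k-1} \neq 0$, which is why no line search condition is needed.

```python
import numpy as np

def tths_direction(g_new, g_old, d_old):
    """Three-term HS direction: -g_k + beta_k^HS d_{k-1} - theta_k y_{k-1}."""
    y = g_new - g_old
    denom = float(d_old @ y)  # d_{k-1}^T y_{k-1}
    if denom == 0.0:
        raise ZeroDivisionError("d_{k-1}^T y_{k-1} is zero; direction undefined")
    beta = float(g_new @ y) / denom
    theta = float(g_new @ d_old) / denom
    return -g_new + beta * d_old - theta * y

if __name__ == "__main__":
    # Arbitrary vectors: the identity does not depend on how they were produced.
    rng = np.random.default_rng(0)
    g_old, g_new, d_old = rng.standard_normal((3, 5))
    d_new = tths_direction(g_new, g_old, d_old)
    print("g^T d       :", float(g_new @ d_new))
    print("-||g||^2    :", -float(g_new @ g_new))
```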
In order to establish the global convergence of our method, we need the following assumption.
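Assumption 2.1. (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded.
(ii) In some neighborhood $N$ of $\Omega$, $f$ is continuously differentiable and its gradient is Lipschitz continuous; namely, there exists a constant $L > 0$ such that
$$\|g(x) - g(y)\| \le L \|x - y\| \quad \text{for all } x, y \in N. \tag{2.4}$$
Under Assumption 2.1, it is clear that there exist positive constants $B$ and $\gamma$ such that
$$\|x\| \le B, \tag{2.5}$$
$$\|g(x)\| \le \gamma \quad \text{for all } x \in \Omega. \tag{2.6}$$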
Proof. If , then Combining with yields On the other hand, if by the line search rule, then does not satisfy (1.7). This implies By the mean-value theorem, there exists such that This together with (2.11) implies Since is Lipschitz continuous, the last inequality shows That is This implies that there is a constant such that Inequality (2.10) together with (2.16) shows that with some constant . Summing these inequalities, we obtain (2.7).
The following theorem establishes the global convergence of the TTHS method with the standard Armijo line search (1.7) for uniformly convex problems.
Proof. We proceed by contradiction. If (2.18) does not hold, there exists a positive constant such that for all
From Lemma 2.2, we get
Since is a uniformly convex function, there exists a constant such that
By (2.1), (2.4), (2.6), and (2.22), one has This implies This yields a contradiction with (2.20).
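The convergence result for uniformly convex problems can be illustrated with a small experiment. The following sketch runs a three-term HS iteration with Armijo backtracking on a strictly convex quadratic. It is a minimal illustration under stated assumptions, not the authors' Fortran implementation: the direction is taken in the commonly stated form $d_k = -g_k + \beta_k^{HS} d_{k-1} - \theta_k y_{k-1}$, and the names `tths_armijo` and `armijo`, the parameter values, the lower bound on the step length, the restart safeguard, and the test quadratic are all hypothetical choices.

```python
import numpy as np

def armijo(f, x, d, g, delta, rho):
    """Backtrack from alpha = 1 until the Armijo condition holds."""
    alpha, fx, slope = 1.0, f(x), float(g @ d)
    # The lower bound on alpha is a safeguard of this sketch against rounding.
    while f(x + alpha * d) > fx + delta * alpha * slope and alpha > 1e-16:
        alpha *= rho
    return alpha

def tths_armijo(f, grad, x0, delta=1e-4, rho=0.5, tol=1e-6, max_iter=2000):
    """Three-term HS iteration with Armijo backtracking (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            return x, k
        alpha = armijo(f, x, d, g, delta, rho)
        x_new = x + alpha * d
        g_new = grad(x_new)
        y = g_new - g
        denom = float(d @ y)
        if denom > 0.0:
            beta = float(g_new @ y) / denom
            theta = float(g_new @ d) / denom
            d = -g_new + beta * d - theta * y
        else:
            # Safeguard of this sketch only: restart with steepest descent.
            d = -g_new
        x, g = x_new, g_new
    return x, max_iter

if __name__ == "__main__":
    A = np.diag([1.0, 10.0])
    b = np.array([1.0, 1.0])
    f = lambda x: 0.5 * float(x @ A @ x) - float(b @ x)
    grad = lambda x: A @ x - b
    x, iters = tths_armijo(f, grad, np.array([5.0, -3.0]))
    print("iterations:", iters, "gradient norm:", np.linalg.norm(grad(x)))
```

For a uniformly convex function, $d_{k-1}^T y_{k-1} > 0$ whenever the step is nonzero, so the restart branch should only be reached through rounding effects.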
We now investigate the global convergence of the TTHS method with the strong Wolfe line search (1.4) and (1.5). Similar to the PRP+ method [8], we restrict . In this case, the search direction (2.1) may not be a descent direction. Note that the search direction (2.1) can be rewritten as where . Since the term may be zero in practical computation, we consider the following search direction where is a positive constant and . It is clear that the relation (2.2) always holds. For simplicity, we refer to the method defined by (1.2) and (2.26) as the method (2.26).
Now, we describe a lemma for the search directions, which shows that they change slowly, asymptotically. The lemma is similar to [8, Lemma 3.4].
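Lemma 2.4. Suppose that Assumption 2.1 holds. Let $\{x_k\}$ be generated by the method (2.26), where $\alpha_k$ is obtained by the strong Wolfe line search (1.4) and (1.5). If there exists a constant $\varepsilon > 0$ such that
$$\|g_k\| \ge \varepsilon \tag{2.27}$$
for all $k$, then $d_k \neq 0$ and
$$\sum_{k \ge 1} \|u_k - u_{k-1}\|^2 < \infty, \tag{2.28}$$
where $u_k = d_k / \|d_k\|$.
Proof. Note that $d_k \neq 0$, for otherwise (2.2) would imply $g_k = 0$. Therefore, $u_k$ is well defined. Now, let us define and where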
From (2.26), we have
Since are unit vectors, we have
Since , it follows that
Then we have
Now, we evaluate the quantity . If by (1.5), we have
By the strong Wolfe condition (1.5) and the relation (2.2), we obtain Inequalities (2.34) and (2.35) yield This implies If , then The relation (2.37) also holds. It follows from the definition of , Lemma 2.2, (2.27) and (2.37) that By (2.33), we get the conclusion (2.28).
Proof. Assume that the conclusion (2.39) is not true. Then there exists a constant such that for all
The proof is divided into the following three steps.
Step 1. A bound for From (2.4), (2.6), and (2.34), we get
Step 2. A bound on the steps . This is a modified version of [8, Theorem 4.3]. Observe that for any Taking norms and applying the triangle inequality to the last equality, we get from (2.5) that Let be a positive integer, chosen large enough that where By Lemma 2.4, we can choose large enough that If and , then by (2.45) and the Cauchy-Schwarz inequality, we have Combining this with (2.43) yields where and .
Step 3. A bound on the direction determined by (2.26). If , from (2.26), (2.27), (2.35), and (2.41), we have If then we know that the relation (2.48) also holds. Defining , we conclude that for , Proceeding as in the proof of Case III of [15, Theorem 3.2], we get the conclusion.
3. Numerical Experiments
In this section, we report some numerical results. We tested 111 problems from the CUTE [13] library. We compared the performance of the method (2.26) with the CG_DESCENT method. The CG_DESCENT code can be obtained from Hager's web page at http://www.math.ufl.edu/hager/papers/CG.
In the numerical experiments, we used the latest version, Source code Fortran 77 Version 1.4 (November 14, 2005), with default parameters. We implemented the method (2.26) with the approximate Wolfe line search in [15]. Namely, the method (2.26) used the same line search and parameters as the CG_DESCENT method. The stopping criterion is that the inequality is satisfied or the iteration number exceeds . All codes were written in Fortran 77 and run on a PC with a PIII 866 MHz processor, 192 MB of RAM, and the Linux operating system. Detailed results are posted at the following web site: http://hi.814e.com/wanyoucheng/results.htm.
We adopt the performance profiles of Dolan and Moré [14] to compare the performance of the different methods. That is, for each method, we plot the fraction of problems for which the method is within a factor $\tau$ of the best time. The left side of the figure gives the percentage of the test problems for which a method is the fastest; the right side gives the percentage of the test problems that are successfully solved by each of the methods. The top curve is the method that solved the most problems in a time that is within a factor $\tau$ of the best time.
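The performance profile described above can be computed in a few lines. The sketch below is a minimal illustration under stated assumptions: the function name `performance_profile`, the layout of the `times` array (one row per problem, one column per method, `np.inf` marking a failure), and the sample timings are all hypothetical and are not the paper's data. It also assumes that every problem is solved by at least one method and that all recorded times are positive.

```python
import numpy as np

def performance_profile(times, taus):
    """Fraction of problems each method solves within a factor tau of the best time.

    times: array of shape (n_problems, n_methods); np.inf marks a failure.
    taus:  iterable of factors tau >= 1.
    Returns an array of shape (len(taus), n_methods).
    """
    times = np.asarray(times, dtype=float)
    best = times.min(axis=1, keepdims=True)  # best time on each problem
    ratios = times / best                    # performance ratios, >= 1
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

if __name__ == "__main__":
    # Made-up timings for three problems and two methods, for illustration only.
    times = [[1.0, 2.0],
             [3.0, 1.5],
             [2.0, np.inf]]  # the second method fails on the third problem
    profile = performance_profile(times, taus=[1.0, 2.0, 10.0])
    print(profile)
```

The first row (factor 1) gives the fraction of problems on which each method is fastest, and the last row, for a large enough factor, gives the fraction of problems each method solves at all, matching the left and right sides of the plot.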
The authors are indebted to the anonymous referee for his helpful suggestions which improved the quality of this paper. The authors are very grateful also to Professor W. W. Hager and Dr. H. Zhang for their CG_DESCENT code and line search code. This paper is supported by the NSF of China via Grant 10771057.
References
[1] P. Wolfe, “Convergence conditions for ascent methods,” SIAM Review, vol. 11, no. 2, pp. 226–235, 1969.
[2] P. Wolfe, “Convergence conditions for ascent methods. II: some corrections,” SIAM Review, vol. 13, no. 2, pp. 185–188, 1971.
[3] M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, pp. 409–436, 1952.
[4] Y.-H. Dai and Y. Yuan, Nonlinear Conjugate Gradient Methods, Shanghai Science and Technology, Shanghai, China, 2000.
[5] W. W. Hager and H. Zhang, “A survey of nonlinear conjugate gradient methods,” Pacific Journal of Optimization, vol. 2, no. 1, pp. 35–58, 2006.
[6] Y.-H. Dai and L.-Z. Liao, “New conjugacy conditions and related nonlinear conjugate gradient methods,” Applied Mathematics and Optimization, vol. 43, no. 1, pp. 87–101, 2001.
[7] M. J. D. Powell, “Convergence properties of algorithms for nonlinear optimization,” SIAM Review, vol. 28, no. 4, pp. 487–500, 1986.
[8] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[9] H. Yabe and M. Takano, “Global convergence properties of nonlinear conjugate gradient methods with modified secant condition,” Computational Optimization and Applications, vol. 28, no. 2, pp. 203–225, 2004.
[10] J. Z. Zhang, N. Y. Deng, and L. H. Chen, “New quasi-Newton equation and related methods for unconstrained optimization,” Journal of Optimization Theory and Applications, vol. 102, no. 1, pp. 147–167, 1999.
[11] J. Z. Zhang and C. Xu, “Properties and numerical performance of quasi-Newton methods with modified quasi-Newton equations,” Journal of Computational and Applied Mathematics, vol. 137, no. 2, pp. 269–278, 2001.
[12] L. Zhang, Nonlinear conjugate gradient methods for optimization problems, Ph.D. thesis, College of Mathematics and Econometrics, Hunan University, Changsha, China, 2006.
[13] I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environment,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
[14] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.
[15] W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
Copyright © 2009 Wanyou Cheng and Zongguo Zhang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.