A Truncated Descent HS Conjugate Gradient Method and Its Global Convergence

Cheng, Wanyou; Zhang, Zongguo

doi:https://doi.org/10.1155/2009/875097

Mathematical Problems in Engineering

Research Article | Open Access

Volume 2009 | Article ID 875097 | https://doi.org/10.1155/2009/875097

Academic Editor: Ekaterina Pavlovskaia

Received 02 Dec 2008

Accepted 08 Apr 2009

Published 15 Jul 2009

Abstract

Recently, Zhang (2006) proposed a three-term modified HS (TTHS) method for unconstrained optimization problems. An attractive property of the TTHS method is that the direction generated by the method is always a descent direction. This property is independent of the line search used. In order to obtain the global convergence of the TTHS method, Zhang proposed a truncated TTHS method. A drawback is that the numerical performance of the truncated TTHS method is not ideal. In this paper, we prove that the TTHS method with the standard Armijo line search is globally convergent for uniformly convex problems. Moreover, we propose a new truncated TTHS method. Under suitable conditions, global convergence is obtained for the proposed method. Extensive numerical experiments show that the proposed method is very efficient for the test problems from the CUTE library.

1. Introduction

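Consider the unconstrained optimization problem:

$$\min_{x \in \mathbb{R}^n} f(x), \tag{1.1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. Conjugate gradient methods are very important methods for solving (1.1), especially if the dimension $n$ is large. The methods are of the form

$$x_{k+1} = x_k + \alpha_k d_k, \tag{1.2}$$

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k > 0, \end{cases} \tag{1.3}$$

where $g_k$ denotes the gradient of $f$ at $x_k$, $\alpha_k$ is the step length obtained by a line search, and $\beta_k$ is a scalar. The strong Wolfe line search is to find a step length $\alpha_k$ such that

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{1.4}$$

$$\left| g(x_k + \alpha_k d_k)^T d_k \right| \le \sigma \left| g_k^T d_k \right|, \tag{1.5}$$

where $\delta \in (0, 1)$ and $\sigma \in (\delta, 1)$. In the conjugate gradient methods field, it is also possible to use the Wolfe line search [1, 2], which calculates an $\alpha_k$ satisfying (1.4) and

$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k. \tag{1.6}$$

In particular, some conjugate gradient methods admit the Armijo line search; namely, the step length $\alpha_k$ can be obtained by letting $\alpha_k = \max\{\rho^j : j = 0, 1, 2, \ldots\}$ satisfy

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta_1 \alpha_k g_k^T d_k, \tag{1.7}$$

where $\delta_1 \in (0, 1)$, and $\rho \in (0, 1)$. Varieties of this method differ in the way of selecting $\beta_k$. In this paper, we are interested in the HS method [3], namely,

$$\beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}. \tag{1.8}$$

Here and throughout the paper, without specification, we always use $\|\cdot\|$ to denote the Euclidean norm of vectors, and $y_{k-1} = g_k - g_{k-1}$.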
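To make the ingredients above concrete, the following is a minimal Python sketch of an Armijo backtracking step, a strong Wolfe check, and the HS parameter. It is an illustration under stated assumptions, not the authors' implementation: the function names, the default parameter values (`delta`, `rho`, `sigma`), and the quadratic test function are all chosen here for the example.

```python
import numpy as np

def beta_hs(g_new, g_old, d_old):
    """Hestenes-Stiefel parameter g_k^T y_{k-1} / d_{k-1}^T y_{k-1}."""
    y = g_new - g_old
    return float(g_new @ y) / float(d_old @ y)

def armijo_step(f, x, d, g, delta=1e-4, rho=0.5, max_backtracks=60):
    """Largest rho**j (j = 0, 1, ...) that satisfies the Armijo inequality."""
    fx = f(x)
    slope = float(g @ d)  # negative when d is a descent direction
    alpha = 1.0
    for _ in range(max_backtracks):
        if f(x + alpha * d) <= fx + delta * alpha * slope:
            return alpha
        alpha *= rho
    raise RuntimeError("no acceptable Armijo step found")

def satisfies_strong_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Check sufficient decrease and the strong curvature condition."""
    slope0 = float(grad(x) @ d)
    decrease = f(x + alpha * d) <= f(x) + delta * alpha * slope0
    curvature = abs(float(grad(x + alpha * d) @ d)) <= sigma * abs(slope0)
    return decrease and curvature

# Illustrative problem (an assumption of this sketch): f(x) = 0.5 x^T A x.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * float(x @ A @ x)
grad = lambda x: A @ x

x0 = np.array([1.0, 1.0])
g0 = grad(x0)
d0 = -g0                                 # first direction: steepest descent
alpha0 = armijo_step(f, x0, d0, g0)
x1 = x0 + alpha0 * d0
g1 = grad(x1)
d1 = -g1 + beta_hs(g1, g0, d0) * d0      # HS conjugate gradient direction
```

In this sketch the HS direction `d1` is orthogonal to `g1 - g0` by construction, whatever step the line search returned, which is the conjugacy property attributed to the HS method.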

We refer to the book [4] and a recent review paper [5] for progress on the global convergence of conjugate gradient methods. The study of the HS method has made great progress. In practical computation, the HS method is generally believed to be one of the most efficient conjugate gradient methods. Theoretically, the HS method has the property that the conjugacy condition $d_k^T y_{k-1} = 0$ always holds, independently of the line search used. Aiming at faster convergence, Dai and Liao [6] modified the numerator of the HS method to obtain the DL method by using the secant condition of quasi-Newton methods. Due to Powell's [7] example, the DL method may not converge with exact line search for general functions. Similar to the PRP+ method [8], Dai and Liao [6] proposed the DL+ method with global convergence in view. In a further development of this update strategy, Yabe and Takano [9] used another modified secant condition from [10, 11] and proposed the YT and YT+ methods. Recently, Hager and Zhang [5] modified the HS method to propose a new conjugate gradient method called the CG_DESCENT method. A good property of the CG_DESCENT method is that the direction satisfies the sufficient descent property independently of the line search used. Hager and Zhang [5] proved that the CG_DESCENT method with the Wolfe line search is globally convergent even for nonconvex problems. Zhang [12] proposed the TTHS method. The sufficient descent property of the TTHS method is also independent of the line search used. In order to obtain the global convergence of the TTHS method, Zhang truncated the search direction of the TTHS method. Numerical experiments in [12] show that the truncated TTHS method is not very effective. In this paper, we further study the TTHS method. We prove that the TTHS method with the standard Armijo line search is globally convergent for uniformly convex problems. To improve the efficiency of the truncated TTHS method, we propose a new truncation strategy for the TTHS method. Under suitable conditions, global convergence is obtained for the proposed method. Numerical experiments show that the proposed method outperforms the known CG_DESCENT method.

The paper is organized as follows. In Section 2, we propose our algorithm. Convergence analysis is provided under suitable conditions. Preliminary numerical results are presented in Section 3.

2. Global Convergence Analysis

Recently, Zhang [12] proposed a three-term modified HS method as follows:

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k^{HS} d_{k-1} - \theta_k y_{k-1}, & k > 0, \end{cases} \tag{2.1}$$

where $\theta_k = g_k^T d_{k-1} / (d_{k-1}^T y_{k-1})$. An attractive property of the TTHS method is that the direction always satisfies

$$g_k^T d_k = -\|g_k\|^2, \tag{2.2}$$

which is independent of the line search used. In order to obtain the global convergence of the TTHS method, Zhang truncated the TTHS method by a rule governed by two positive constants. Zhang proved that the truncated TTHS method converges globally with the Wolfe line search (1.4) and (1.6). However, numerical results show that the truncated TTHS method is not very effective. In this paper, we study the TTHS method further. In the rest of this section, we establish two preliminary convergence results for the TTHS method.
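The descent property of the three-term direction can be checked numerically. The sketch below assumes the commonly cited three-term HS form $d_k = -g_k + \beta_k^{HS} d_{k-1} - \theta_k y_{k-1}$ with $\theta_k = g_k^T d_{k-1} / d_{k-1}^T y_{k-1}$, and pairs it with a plain Armijo backtracking loop on a small uniformly convex quadratic. The function names, tolerances, parameter values, and test problem are assumptions made for illustration; this is not the authors' Fortran code and it omits any truncation.

```python
import numpy as np

def tths_direction(g_new, g_old, d_old):
    """Three-term HS direction (assumed form); satisfies g^T d = -||g||^2."""
    y = g_new - g_old
    denom = float(d_old @ y)
    beta = float(g_new @ y) / denom        # HS parameter
    theta = float(g_new @ d_old) / denom   # coefficient of the third term
    return -g_new + beta * d_old - theta * y

def tths_armijo(f, grad, x0, tol=1e-6, max_iter=10000, delta=0.1, rho=0.5):
    """TTHS iteration with Armijo backtracking from a unit trial step."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            return x, k
        fx, slope = f(x), float(g @ d)
        alpha = 1.0
        for _ in range(60):                # backtracking guard
            if f(x + alpha * d) <= fx + delta * alpha * slope:
                break
            alpha *= rho
        else:
            raise RuntimeError("Armijo backtracking failed")
        x = x + alpha * d
        g_new = grad(x)
        d = tths_direction(g_new, g, d)
        g = g_new
    return x, max_iter

# Illustrative uniformly convex quadratic (an assumption of this sketch).
A = np.diag([1.0, 2.0])
f = lambda x: 0.5 * float(x @ A @ x)
grad = lambda x: A @ x

x_min, iters = tths_armijo(f, grad, [1.0, 1.0])

# Descent identity on arbitrary vectors, independent of any line search.
g_old = np.array([1.0, 2.0])
d_old = np.array([-1.0, -2.0])
g_new = np.array([0.5, -1.0])
d_new = tths_direction(g_new, g_old, d_old)
identity_gap = float(g_new @ d_new + g_new @ g_new)
```

The quantity `identity_gap` is zero up to rounding because the second and third terms cancel in the inner product with `g_new`; that cancellation is why the property does not depend on how the step length was chosen.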

(i) Uniformly convex functions: the TTHS method converges globally with the standard Armijo line search (1.7).
(ii) General functions: the TTHS method with a new truncation strategy converges globally with the strong Wolfe line search (1.4) and (1.5).

In order to establish the global convergence of our method, we need the following assumption.

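Assumption 2.1. (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded.
(ii) In some neighborhood $N$ of $\Omega$, $f$ is continuously differentiable and its gradient is Lipschitz continuous, namely, there exists a constant $L > 0$ such that

$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in N. \tag{2.4}$$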

Under Assumption 2.1, it is clear that there exist positive constants bounding the iterates and the gradient norm on the level set; these bounds are referred to as (2.5) and (2.6).

Lemma 2.2. Suppose that Assumption 2.1 holds. Let $\{x_k\}$ be generated by the TTHS method, where $\alpha_k$ is obtained by the Armijo line search (1.7). Then (2.7) holds.

Proof. If , then Combining with yields On the other hand, if by the line search rule, then does not satisfy (1.7). This implies By the mean-value theorem, there exists such that This together with (2.11) implies Since is Lipschitz continuous, the last inequality shows That is This implies that there is a constant such that Inequality (2.10) together with (2.16) shows that with some constant . Summing these inequalities, we obtain (2.7).
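The structure of this argument can be traced in a few lines. The following is a sketch of the standard reasoning rather than a transcription of the numbered equations: it assumes an Armijo rule with parameter $\delta$ and backtracking factor $\rho$ started from a unit step, the identity $g_k^T d_k = -\|g_k\|^2$, and a Lipschitz constant $L$ for the gradient. The symbols $\xi_k$ and $c$ are labels introduced here.

```latex
% Case alpha_k < 1: the previous trial step alpha_k / rho was rejected.
\begin{align*}
f(x_k + \rho^{-1}\alpha_k d_k) - f(x_k)
  &> \delta \rho^{-1}\alpha_k \, g_k^T d_k, \\
f(x_k + \rho^{-1}\alpha_k d_k) - f(x_k)
  &= \rho^{-1}\alpha_k \, g(\xi_k)^T d_k
  && \text{(mean-value theorem)}, \\
\Longrightarrow \quad (g(\xi_k) - g_k)^T d_k
  &> -(1-\delta)\, g_k^T d_k = (1-\delta)\|g_k\|^2, \\
(g(\xi_k) - g_k)^T d_k
  &\le L \rho^{-1}\alpha_k \|d_k\|^2
  && \text{(Lipschitz continuity)}, \\
\Longrightarrow \quad \alpha_k
  &> \frac{\rho(1-\delta)}{L}\,\frac{\|g_k\|^2}{\|d_k\|^2}.
\end{align*}
% Substitute the step bound into the accepted Armijo inequality.
\begin{align*}
f(x_k) - f(x_{k+1})
  \ge -\delta \alpha_k \, g_k^T d_k
  = \delta \alpha_k \|g_k\|^2
  \ge c\,\frac{\|g_k\|^4}{\|d_k\|^2},
  \qquad c = \frac{\delta\rho(1-\delta)}{L}.
\end{align*}
% Case alpha_k = 1: since ||g_k||^2 = -g_k^T d_k <= ||g_k|| ||d_k||,
% we have ||g_k|| <= ||d_k||, hence
\begin{align*}
f(x_k) - f(x_{k+1}) \ge \delta \|g_k\|^2
  \ge \delta\,\frac{\|g_k\|^4}{\|d_k\|^2}.
\end{align*}
```

Because $f$ is bounded below on the bounded level set, summing either inequality over $k$ gives a Zoutendijk-type bound $\sum_k \|g_k\|^4 / \|d_k\|^2 < \infty$ under these assumptions.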

The following theorem establishes the global convergence of the TTHS method with the standard Armijo line search (1.7) for uniformly convex problems.

Theorem 2.3. Suppose that Assumption 2.1 holds and $f$ is a uniformly convex function. Consider the TTHS method, where $\alpha_k$ is obtained by the Armijo line search (1.7). Then (2.18) holds.

Proof. We proceed by contradiction. If (2.18) does not hold, there exists a positive constant such that for all From Lemma 2.2, we get Since $f$ is a uniformly convex function, there exists a constant such that This means By (2.1), (2.4), (2.6), and (2.22), one has This implies This yields a contradiction with (2.20).
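The key estimate behind this contradiction is a bound on the direction norm in terms of the gradient norm. The derivation below is a hedged sketch, assuming the three-term form $d_k = -g_k + \beta_k^{HS} d_{k-1} - \theta_k y_{k-1}$, a uniform convexity constant $\mu > 0$, and a Lipschitz constant $L$; the symbols $\mu$, $s_{k-1}$, and $\varepsilon$ are labels introduced here, and the constants need not match those in the original proof.

```latex
\begin{align*}
% Uniform convexity with s_{k-1} = x_k - x_{k-1} = alpha_{k-1} d_{k-1}:
y_{k-1}^T s_{k-1} \ge \mu \|s_{k-1}\|^2
  \quad &\Longrightarrow \quad
  d_{k-1}^T y_{k-1} \ge \mu\, \alpha_{k-1} \|d_{k-1}\|^2, \\
% Lipschitz continuity of the gradient:
\|y_{k-1}\| &\le L\, \alpha_{k-1} \|d_{k-1}\|, \\
% Triangle and Cauchy-Schwarz inequalities on the three-term direction:
\|d_k\| \le \|g_k\|
  + \frac{2\,\|g_k\|\,\|y_{k-1}\|\,\|d_{k-1}\|}{d_{k-1}^T y_{k-1}}
  &\le \left(1 + \frac{2L}{\mu}\right)\|g_k\|.
\end{align*}
% If ||g_k|| >= epsilon for all k, each term of the Zoutendijk-type sum
% is bounded below by a positive constant:
\begin{align*}
\frac{\|g_k\|^4}{\|d_k\|^2}
  \ge \frac{\|g_k\|^2}{(1 + 2L/\mu)^2}
  \ge \frac{\varepsilon^2}{(1 + 2L/\mu)^2} > 0.
\end{align*}
```

A series whose terms are bounded below by a positive constant diverges, which is incompatible with the summability obtained from Lemma 2.2.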

We now investigate the global convergence of the TTHS method with the strong Wolfe line search (1.4) and (1.5). Similar to the PRP+ method [8], we restrict $\beta_k^{HS}$ to be nonnegative. In this case, the search direction (2.1) may not be a descent direction. Note that the search direction (2.1) can be rewritten as where . Since the term may be zero in practical computation, we consider the following search direction where is a positive constant and . It is clear that the relation (2.2) always holds. For simplicity, we refer to the method defined by (1.2) and (2.26) as the method (2.26).

Now, we describe a lemma for the search directions, which shows that they change slowly, asymptotically. The lemma is similar to [8, Lemma 3.4].

Lemma 2.4. Suppose that Assumption 2.1 holds. Let $\{x_k\}$ be generated by the method (2.26), where $\alpha_k$ is obtained by the strong Wolfe line search (1.4) and (1.5). If there exists a constant such that (2.27) holds for all $k$, then $d_k \neq 0$ and (2.28) holds, where $u_k = d_k / \|d_k\|$.

Proof. Noting that , for otherwise (2.2) would imply Therefore, is well defined. Now, let us define and where From (2.26), we have Since are unit vectors, we have Since , it follows that Then we have
Now, we evaluate the quantity . If by (1.5), we have
By the strong Wolfe condition (1.5) and the relation (2.2), we obtain Inequalities (2.34) and (2.35) yield This implies If , then The relation (2.37) also holds. It follows from the definition of , Lemma 2.2, (2.27) and (2.37) that By (2.33), we get the conclusion (2.28).

The next theorem establishes the global convergence of method (2.26) with the strong Wolfe line search (1.4) and (1.5). The proof of the theorem is similar to [15, Theorem 3.2].

Theorem 2.5. Suppose that Assumption 2.1 holds. Let $\{x_k\}$ be generated by the method (2.26), where $\alpha_k$ is obtained by the strong Wolfe line search (1.4) and (1.5). Then (2.39) holds.

Proof. Assume that the conclusion (2.39) is not true. Then there exists a constant such that for all The proof is divided into the following three steps.
Step 1. A bound for From (2.4), (2.6), and (2.34), we get
Step 2. A bound on the steps . This is a modified version of [8, Theorem 4.3]. Observe that for any Taking norms and applying the triangle inequality to the last equality, we get from (2.5) that Let be a positive integer, chosen large enough that where By Lemma 2.4, we can choose large enough that If and , then by (2.45) and the Cauchy-Schwarz inequality, we have Combining this with (2.43) yields where and .
Step 3. A bound on the direction determined by (2.26). If , from (2.26), (2.27), (2.35), and (2.41), we have If then we know that the relation (2.48) also holds. Define , we conclude that for , Proceeding as in Case III of the proof of [15, Theorem 3.2], we get the conclusion.

3. Numerical Experiments

In this section, we report some numerical results. We tested 111 problems from the CUTE [13] library. We compared the performance of the method (2.26) with the CG_DESCENT method. The CG_DESCENT code can be obtained from Hager's web page at http://www.math.ufl.edu/hager/papers/CG.

In the numerical experiments, we used the latest version of the source code, Fortran 77 Version 1.4 (November 14, 2005), with default parameters. We implemented the method (2.26) with the approximate Wolfe line search in [5]. Namely, the method (2.26) used the same line search and parameters as the CG_DESCENT method. The stopping criterion is that the inequality is satisfied or the iteration number exceeds . All codes were written in Fortran 77 and run on a PC with a PIII 866 processor, 192 MB of RAM, and the Linux operating system. Detailed results are posted at the following web site: http://hi.814e.com/wanyoucheng/results.htm.

We adopt the performance profiles of Dolan and Moré [14] to compare the performance of the different methods. That is, for each method, we plot the fraction of problems for which the method is within a factor $\tau$ of the best time. The left side of the figure gives the percentage of the test problems for which a method is the fastest; the right side gives the percentage of the test problems that are successfully solved by each of the methods. The top curve is the method that solved the most problems in a time that is within a factor $\tau$ of the best time.
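The profile described above is straightforward to compute from a table of run times. The Python sketch below shows one way to do so, assuming a failure is recorded as `np.inf` and that at least one solver succeeds on every problem. The function name and the timing data are hypothetical and are not results from this paper.

```python
import numpy as np

def performance_profile(times, taus):
    """Fraction of problems each solver handles within a factor tau of the best.

    times: array of shape (n_problems, n_solvers); np.inf marks a failure.
    taus:  iterable of factors tau >= 1.
    Returns an array of shape (len(taus), n_solvers).
    """
    times = np.asarray(times, dtype=float)
    best = times.min(axis=1, keepdims=True)   # best time on each problem
    ratios = times / best                     # performance ratios
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

# Hypothetical timings: 4 problems (rows) and 2 solvers (columns).
times = [
    [1.0, 2.0],
    [3.0, 1.5],
    [2.0, 2.0],
    [np.inf, 4.0],   # the first solver fails on this problem
]
profile = performance_profile(times, taus=[1.0, 2.0, 4.0])
```

The row of `profile` for $\tau = 1$ corresponds to the left edge of a profile plot (how often each solver is fastest, ties included), and the row for the largest $\tau$ approximates the right edge (how many problems each solver eventually solves).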

The curves in Figures 1, 2, 3, and 4 have the following meaning:

(i) cg-descent: the CG_DESCENT method with the approximate Wolfe line search proposed by Hager and Zhang [15];
(ii) mhs+: the method (2.26) with the same line search as “cg-descent” and .

From Figures 1–4, it is clear that the “mhs+” method outperforms the “cg-descent” method.

Acknowledgments

The authors are indebted to the anonymous referee for his helpful suggestions which improved the quality of this paper. The authors are very grateful also to Professor W. W. Hager and Dr. H. Zhang for their CG_DESCENT code and line search code. This paper is supported by the NSF of China via Grant 10771057.

References

[1] P. Wolfe, “Convergence conditions for ascent methods,” SIAM Review, vol. 11, no. 2, pp. 226–235, 1969.
[2] P. Wolfe, “Convergence conditions for ascent methods. II: some corrections,” SIAM Review, vol. 13, no. 2, pp. 185–188, 1971.
[3] M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, pp. 409–436, 1952.
[4] Y.-H. Dai and Y. Yuan, Nonlinear Conjugate Gradient Methods, Shanghai Science and Technology, Shanghai, China, 2000.
[5] W. W. Hager and H. Zhang, “A survey of nonlinear conjugate gradient methods,” Pacific Journal of Optimization, vol. 2, no. 1, pp. 35–58, 2006.
[6] Y.-H. Dai and L.-Z. Liao, “New conjugacy conditions and related nonlinear conjugate gradient methods,” Applied Mathematics and Optimization, vol. 43, no. 1, pp. 87–101, 2001.
[7] M. J. D. Powell, “Convergence properties of algorithms for nonlinear optimization,” SIAM Review, vol. 28, no. 4, pp. 487–500, 1986.
[8] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[9] H. Yabe and M. Takano, “Global convergence properties of nonlinear conjugate gradient methods with modified secant condition,” Computational Optimization and Applications, vol. 28, no. 2, pp. 203–225, 2004.
[10] J. Z. Zhang, N. Y. Deng, and L. H. Chen, “New quasi-Newton equation and related methods for unconstrained optimization,” Journal of Optimization Theory and Applications, vol. 102, no. 1, pp. 147–167, 1999.
[11] J. Z. Zhang and C. Xu, “Properties and numerical performance of quasi-Newton methods with modified quasi-Newton equations,” Journal of Computational and Applied Mathematics, vol. 137, no. 2, pp. 269–278, 2001.
[12] L. Zhang, Nonlinear conjugate gradient methods for optimization problems, Ph.D. thesis, College of Mathematics and Econometrics, Hunan University, Changsha, China, 2006.
[13] I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environment,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
[14] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.
[15] W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.

Copyright

Copyright © 2009 Wanyou Cheng and Zongguo Zhang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
