Journal of Applied Mathematics
Volume 2013 (2013), Article ID 147025, 14 pages
A Globally Convergent Hybrid Conjugate Gradient Method and Its Numerical Behaviors
1Department of Mathematics, Xidian University, Xi’an 710071, China
2College of Mathematics Science, Chong Qing Normal University, Chong Qing 40047, China
Received 21 December 2012; Revised 12 March 2013; Accepted 27 March 2013
Academic Editor: Martin Weiser
Copyright © 2013 Yuan-Yuan Huang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
We consider a hybrid Dai-Yuan conjugate gradient method. We confirm that its numerical performance can be improved when the method uses a practical steplength rule developed by Dong, and we also analyze the associated convergence.
1. Introduction
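Consider the following problem of finding an $x^* \in \mathbb{R}^n$ such that
$$g(x^*) = 0, \tag{1}$$
where $\mathbb{R}^n$ is the $n$-dimensional Euclidean space and $g : \mathbb{R}^n \to \mathbb{R}^n$ is continuous. Throughout this paper, this problem corresponds to the optimality condition of a certain problem of minimizing $f : \mathbb{R}^n \to \mathbb{R}$, which may not be easy to calculate or cannot be expressed in terms of elementary functions. When the dimension $n$ is large, conjugate gradient methods can be efficient for solving problem (1). For any given starting point $x_0 \in \mathbb{R}^n$, a sequence $\{x_k\}$ is generated by the following recursive relation:
$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$
with
$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{3}$$
where $\alpha_k$ is a steplength, $d_k$ is a descent direction, $g_k$ stands for $g(x_k)$, and $\beta_k$ is a parameter. Different choices of $\beta_k$ result in different nonlinear conjugate gradient methods. The Dai-Yuan (DY) formula [1] and the Hestenes-Stiefel (HS) formula [2] are two famous ones, and are given by
$$\beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \qquad \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \tag{4}$$
respectively, where $\|\cdot\|$ means the 2-norm and $y_{k-1} = g_k - g_{k-1}$. Other well-known formulae for $\beta_k$ include the Fletcher-Reeves formula [3], the Polak-Ribière-Polyak formula [4, 5], and the Hager-Zhang formula [6]; see [7, 8] for a further survey.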
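As a small numerical illustration of the two parameter formulas and the direction update, the following Python sketch evaluates them on plain lists. It assumes the standard definitions $\beta^{DY} = \|g_k\|^2 / (d_{k-1}^T y_{k-1})$ and $\beta^{HS} = g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})$ with $y_{k-1} = g_k - g_{k-1}$. The function names and the sample vectors are hypothetical and are not taken from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def beta_dy(g_new, g_old, d_old):
    """Dai-Yuan parameter: ||g_k||^2 / (d_{k-1}^T y_{k-1})."""
    y = [a - b for a, b in zip(g_new, g_old)]
    return dot(g_new, g_new) / dot(d_old, y)


def beta_hs(g_new, g_old, d_old):
    """Hestenes-Stiefel parameter: g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})."""
    y = [a - b for a, b in zip(g_new, g_old)]
    return dot(g_new, y) / dot(d_old, y)


def next_direction(g_new, d_old, beta):
    """Direction update d_k = -g_k + beta_k * d_{k-1}."""
    return [-gi + beta * di for gi, di in zip(g_new, d_old)]


# Made-up data: previous gradient, previous direction, new gradient.
g_old = [1.0, 0.0]
d_old = [-1.0, 0.0]
g_new = [0.5, 1.0]

b_dy = beta_dy(g_new, g_old, d_old)          # 1.25 / 0.5 = 2.5
b_hs = beta_hs(g_new, g_old, d_old)          # 0.75 / 0.5 = 1.5
d_new = next_direction(g_new, d_old, b_dy)   # [-3.0, -1.0]
```

Both parameters share the denominator $d_{k-1}^T y_{k-1}$, so the sketch would fail with a division by zero if that quantity vanished; a practical code would need a safeguard for that case.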
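In [1], the steplength $\alpha_k$ is obtained by the following weak Wolfe line search:
$$f(x_k + \alpha_k d_k) - f(x_k) \le \delta \alpha_k g_k^T d_k, \tag{5}$$
$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{6}$$
where $0 < \delta < \sigma < 1$. With the same line search, two hybrid versions related to the DY method and the HS method were proposed in [9], which generate the parameter $\beta_k$ by
$$\beta_k^{DYHS} = \max\left\{ -\frac{1-\sigma}{1+\sigma}\, \beta_k^{DY},\ \min\left\{ \beta_k^{HS}, \beta_k^{DY} \right\} \right\}, \qquad \beta_k^{DYHS+} = \max\left\{ 0,\ \min\left\{ \beta_k^{HS}, \beta_k^{DY} \right\} \right\}, \tag{7}$$
respectively. Initial numerical results in [9] suggested that the two hybrid conjugate gradient methods (abbreviated as DYHS and DYHS+, resp.) are efficient, and that the DYHS+ method performed better.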
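The following Python sketch shows how a trial steplength could be checked against the weak Wolfe conditions and how the two hybrid parameters could be formed from $\beta^{HS}$ and $\beta^{DY}$. It assumes the usual form of the weak Wolfe conditions (sufficient decrease plus a curvature condition) and the hybrid formulas of Dai and Yuan with the constant $(1-\sigma)/(1+\sigma)$. The function names and the one-dimensional test function are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_weak_wolfe(f, grad, x, d, alpha, delta=0.01, sigma=0.1):
    """Check the weak Wolfe conditions for a trial steplength alpha.

    Assumed form: f(x + a d) - f(x) <= delta * a * g^T d  and
                  g(x + a d)^T d   >= sigma * g^T d,  with 0 < delta < sigma < 1.
    """
    slope0 = dot(grad(x), d)
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    sufficient_decrease = f(x_new) - f(x) <= delta * alpha * slope0
    curvature = dot(grad(x_new), d) >= sigma * slope0
    return sufficient_decrease and curvature


def beta_dyhs(b_hs, b_dy, sigma=0.1):
    """Hybrid parameter clipped to [-c * beta_DY, beta_DY], c = (1 - sigma) / (1 + sigma)."""
    c = (1.0 - sigma) / (1.0 + sigma)
    return max(-c * b_dy, min(b_hs, b_dy))


def beta_dyhs_plus(b_hs, b_dy):
    """Hybrid parameter clipped to [0, beta_DY]."""
    return max(0.0, min(b_hs, b_dy))


# Hypothetical example: f(x) = x^2 at x = 1 with direction d = -g = -2.
f = lambda x: x[0] ** 2
grad = lambda x: [2.0 * x[0]]
ok_half = satisfies_weak_wolfe(f, grad, [1.0], [-2.0], 0.5)    # True
ok_tiny = satisfies_weak_wolfe(f, grad, [1.0], [-2.0], 1e-6)   # False: curvature fails
ok_full = satisfies_weak_wolfe(f, grad, [1.0], [-2.0], 1.0)    # False: no decrease
```

Note that the first condition needs values of $f$, which is why this test cannot be applied when only the gradient mapping is available.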
The line search plays an important role in the efficiency of conjugate gradient methods. Hager and Zhang [6] showed that the first condition (5) of the weak Wolfe line search limits the accuracy of a conjugate gradient method to the order of the square root of the machine precision (see also [11, 12]); thus, in order to get higher precision, they proposed approximate Wolfe conditions [6, 7],
$$\sigma g_k^T d_k \le g(x_k + \alpha_k d_k)^T d_k \le (2\delta - 1) g_k^T d_k, \tag{8}$$
where $0 < \delta < 1/2$ and $\delta \le \sigma < 1$, which are usually used in combination with the weak Wolfe line search. However, there is no theory to guarantee convergence in [6–8]. Following a referee’s suggestion, we adapt the approximate Wolfe conditions to a Dai-Yuan hybrid conjugate gradient method and investigate its numerical performance.
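A check of the approximate Wolfe conditions needs only gradient values, as the following Python sketch illustrates. It assumes the Hager-Zhang form $\sigma g_k^T d_k \le g(x_k + \alpha d_k)^T d_k \le (2\delta - 1) g_k^T d_k$; the default values $\delta = 0.1$ and $\sigma = 0.9$ are illustrative choices, and the function name and test function are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_approximate_wolfe(grad, x, d, alpha, delta=0.1, sigma=0.9):
    """Gradient-only test: sigma*g^T d <= g(x + alpha d)^T d <= (2*delta - 1)*g^T d.

    Assumes d is a descent direction (g^T d < 0), 0 < delta < 1/2, delta <= sigma < 1.
    """
    slope0 = dot(grad(x), d)
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    slope = dot(grad(x_new), d)
    return sigma * slope0 <= slope <= (2.0 * delta - 1.0) * slope0


# Hypothetical example: g(x) = 2x (gradient of x^2), x = 1, d = -2, so g^T d = -4.
grad = lambda x: [2.0 * x[0]]
accept_mid = satisfies_approximate_wolfe(grad, [1.0], [-2.0], 0.5)     # True
accept_long = satisfies_approximate_wolfe(grad, [1.0], [-2.0], 1.0)    # False: slope 4 > 3.2
accept_short = satisfies_approximate_wolfe(grad, [1.0], [-2.0], 0.01)  # False: slope -3.92 < -3.6
```

In this example the upper bound rejects steps that overshoot the minimizer too far, and the lower bound rejects steps that are too short.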
More recently, Dong designed a practical Armijo-type steplength rule [13] that uses only gradient information (see [14] for a more conceptual version). The steplength is chosen by the following steps: choose , compute some appropriate initial steplength , determine a real number (see (12)) and find to be the largest such that where is a nonnegative integer and .
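The backtracking structure of such a rule (take the largest member of a geometric grid of trial steplengths that passes an acceptance test) can be sketched in Python as follows. The acceptance inequality (9) itself is not reproduced here, so the sketch takes the test as a user-supplied predicate. The example predicate, $g(x + \alpha d)^T d \le \sigma g^T d$, is only a simplified gradient-only stand-in and is an assumption, not Dong's rule; all names and parameter values are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def largest_backtracking_step(accept, alpha0=1.0, rho=0.5, max_trials=50):
    """Return the first alpha0 * rho**j (j = 0, 1, ...) that passes `accept`.

    Because the trials decrease, the first accepted value is the largest
    accepted grid point among those tried. Returns None if every trial fails.
    """
    alpha = alpha0
    for _ in range(max_trials):
        if accept(alpha):
            return alpha
        alpha *= rho
    return None


def make_gradient_only_test(grad, x, d, sigma=0.1):
    """Stand-in acceptance test (an assumption): g(x + a d)^T d <= sigma * g^T d."""
    slope0 = dot(grad(x), d)

    def accept(alpha):
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        return dot(grad(x_new), d) <= sigma * slope0

    return accept


# Hypothetical example: g(x) = 2x, x = 1, d = -2.
grad = lambda x: [2.0 * x[0]]
accept = make_gradient_only_test(grad, [1.0], [-2.0])
alpha = largest_backtracking_step(accept)   # trials 1.0 and 0.5 fail, 0.25 passes
```

No function values are used anywhere in the loop, which is the feature that distinguishes this kind of rule from the weak Wolfe line search.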
The main differences between the practical steplength rule and the weak Wolfe conditions are that the former does not require function evaluations, is highly accurate, and has a broader scope of application. The high accuracy is supported by the corresponding theoretical analysis in [6, 11]. Numerical results reported in [13] also indicate that the line search (9) is efficient and highly accurate. So, it is meaningful to embed the line search into the hybrid conjugate gradient method with parameter $\beta_k^{DYHS+}$ and to check its efficiency.
This paper aims to solve problem (1), which corresponds to the optimality condition of a certain problem of minimizing $f$. If the original function $f$ cannot be expressed in terms of elementary functions, the weak Wolfe conditions cannot be applied directly, while the practical steplength rule (9) and the approximate Wolfe conditions (8) can still be used to solve this kind of nonlinear unconstrained minimization problem. So, in order to investigate the numerical performance of the two modified methods with steplength rules (9) and (8) and to confirm their broader scope of application, two classes of test problems are selected. One class is composed of unconstrained nonlinear optimization problems from the CUTEr library, and the other class is composed of some boundary value problems.
The rest of this paper is organized as follows. In Section 2, we give some basic definitions and properties used in this paper. In Section 3, we describe two modified versions of the Dai-Yuan hybrid conjugate gradient method with the line search (9) and the approximate Wolfe conditions (8), and show that if $g$ is Lipschitz continuous, the former version is convergent in the sense that $\liminf_{k \to \infty} \|g_k\| = 0$. In Section 4, we test the modified hybrid conjugate gradient methods over two classes of standard test problems and compare them with the DYHS method and the DYHS+ method. Finally, some conclusions are given in Section 5.
2. Preliminaries
In this section, we give some basic definitions and related properties which will be used in the following discussions.
Assumption 1. Assume is -Lipschitz continuous in , that is, there exists a constant such that And its original function is bounded below in the level set .
Definition 2. A mapping is said to be -monotone on the set if there exists such that
By using the definition above, we know that if the gradient is -Lipschitz continuous, then for any given , the gradient must be -monotone along the ray for some . Then, how to evaluate such ? In , it is suggested to evaluate using the following approximation formula:
Lemma 3. For any given and for all , if is continuously differentiable and the given gradient is -monotone along the ray , then the following inequality holds:
Proof. Please see  for a detailed proof.
3. Algorithm and Its Convergence
In this section, we first formally describe the hybrid conjugate gradient method. Then, we give its two modified versions and show that the modified version with steplength rule (9) is convergent.
Algorithm 4. Step 0. Choose . Set and .
Step 1. If , then stop, otherwise find such that certain steplength rule holds.
Step 2. Compute the new iterate by (2) and the new search direction by
Set and go to Step 1.
Denote
(i) We abbreviate Algorithm 4 as MDYHS+ if the steplength rule in Step 1 is finding to be the largest () such that holds, where is defined in (12) and .
(ii) We abbreviate Algorithm 4 as MDYHS+1 if in Step 1 is located to satisfy the approximate Wolfe conditions (8). It should be noted that, on the face of it, the approximate Wolfe conditions only use gradient information to locate the steplength, yet they require function evaluations in the practical implementations in [6–8]; please refer to [8, pages 120–125] for details. So, the algorithm in [6–8] for generating a steplength satisfying the approximate Wolfe conditions (8) is not applicable here. Instead, we determine the steplength of the MDYHS+1 method following the inexact line search strategies of [15, Algorithm 2.6]. Detailed steps are described in Algorithm 6. The initial value of is taken to be .
Remark 5. The choice of comes from . Since it is important to use current information about the algorithm and the problem to make an initial guess of , the author of  uses the relation and to give an approximation to the optimal steplength through
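The overall loop of Algorithm 4 can be sketched in Python with the steplength rule passed in as a callable, so that either variant could be plugged in. The sketch assumes the direction update $d_{k+1} = -g_{k+1} + \beta_{k+1} d_k$ with $\beta = \max\{0, \min\{\beta^{HS}, \beta^{DY}\}\}$ and a stopping test on $\|g_k\|$; the tolerance, the restart safeguard, the constant steplength in the usage example, and all names are hypothetical and not taken from the paper.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def hybrid_cg(grad, x0, steplength, tol=1e-6, max_iter=1000):
    """Hybrid conjugate gradient loop that uses gradient values only.

    `steplength(grad, x, d, g)` must return a positive steplength.
    Returns the final iterate and the number of iterations performed.
    """
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                      # Step 0: d_0 = -g_0
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) <= tol:        # Step 1: stopping test (assumed form)
            return x, k
        alpha = steplength(grad, x, d, g)
        x = [xi + alpha * di for xi, di in zip(x, d)]   # Step 2: new iterate
        g_new = grad(x)
        y = [a - b for a, b in zip(g_new, g)]
        denom = dot(d, y)
        if denom > 0.0:
            b_dy = dot(g_new, g_new) / denom
            b_hs = dot(g_new, y) / denom
            beta = max(0.0, min(b_hs, b_dy))   # assumed DYHS+ parameter
        else:
            beta = 0.0                         # safeguard (not from the paper): restart
        d = [-gi + beta * di for gi, di in zip(g_new, d)]
        g = g_new
    return x, max_iter


# Hypothetical usage: g(x) = x - c with c = (1, -2), and a constant steplength
# standing in for a real steplength rule.
grad = lambda x: [x[0] - 1.0, x[1] + 2.0]
x_final, iters = hybrid_cg(grad, [0.0, 0.0], lambda grad, x, d, g: 0.5)
```

Keeping the steplength rule behind a callable mirrors the way the two variants differ only in Step 1; the constant step used above is just a placeholder and would not be adequate for general problems.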
Algorithm 6. Step 0. Set and . Choose . Set .
Step 1. If does not satisfy then set , and go to Step 2. If does not satisfy then set , and go to Step 3. Otherwise, set , and return.
Step 2. Set . Then go to Step 1.
Step 3. Set . Then go to Step 1.
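The formulas in Algorithm 6 are not legible in this text, so the following Python sketch only illustrates the general bracketing pattern of an inexact line search in the style of Lewis and Overton: keep a lower and an upper bound on the steplength, bisect once an upper bound is known, and double the trial value otherwise. It assumes the two approximate Wolfe inequalities in their usual Hager-Zhang form as the acceptance tests; the parameter values and all names are hypothetical.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def bracketing_line_search(grad, x, d, delta=0.1, sigma=0.9, t0=1.0, max_trials=50):
    """Gradient-only bracketing search for a steplength t with
    sigma * g^T d <= g(x + t d)^T d <= (2*delta - 1) * g^T d  (assumed form).

    Assumes d is a descent direction. Returns the last trial value if no
    acceptable steplength is found within max_trials.
    """
    slope0 = dot(grad(x), d)
    lower, upper = 0.0, math.inf
    t = t0
    for _ in range(max_trials):
        x_new = [xi + t * di for xi, di in zip(x, d)]
        slope = dot(grad(x_new), d)
        if slope > (2.0 * delta - 1.0) * slope0:
            upper = t                  # step too long: shrink the bracket from above
        elif slope < sigma * slope0:
            lower = t                  # step too short: raise the lower bound
        else:
            return t                   # both inequalities hold
        # Bisect once an upper bound exists; otherwise keep doubling.
        t = 0.5 * (lower + upper) if math.isfinite(upper) else 2.0 * lower
    return t


# Hypothetical example: g(x) = 2x, x = 1, d = -2.
grad = lambda x: [2.0 * x[0]]
t_from_one = bracketing_line_search(grad, [1.0], [-2.0], t0=1.0)     # 1.0 rejected, 0.5 accepted
t_from_small = bracketing_line_search(grad, [1.0], [-2.0], t0=0.01)  # doubles up to 0.08
```

The search touches only gradient values, which is the property needed when the original function is not available.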
Next, we analyze the convergence properties of the MDYHS+ method.
Lemma 7. Consider the previous MDYHS+ method. If for all , then the steplength is well-defined, namely, can be found to satisfy (16) after a finite number of trials. The search direction satisfies the sufficient descent condition Furthermore, .
Proof. We prove the desired results by induction. When , by using , we have We now show that the steplength will be determined within a finite number of trials. Otherwise, for any , the following inequality holds where . Since is continuous, taking the limits with respect to on the both sides of (23) yields this is a contradiction. Then, is well-defined and Assume that (21) holds for and . A similar discussion to the case yields is well-defined. Multiplying by we have that it follows from and that Then,
Proof. Since is -monotone, by using Lemma 3, we obtain
Combining (33) with (16) yields
which, together with (21), implies
Next, we follow  to consider two possible cases.
Case 1 . Consider Assumption 1, we have Thus, (30) holds.
Case 2 . From this case, we have . Since is the maximal number in such that (16) holds, then there must exist a natural number , when , there exists such that and violates (16), namely, then Combining it with the Lipschitz continuity of and the fact that yields which, together with (34), implies with . Using (40) recursively, we obtain Combining it with Assumption 1 yields namely,
Theorem 9. The MDYHS+ method is convergent in the sense that $\liminf_{k \to \infty} \|g_k\| = 0$.
4. Numerical Experiments
In this section, we report numerical experiments that test the performance of the MDYHS+ method and the MDYHS+1 method. One purpose is to compare them with the DYHS method and the DYHS+ method. The other is to confirm their broader scope of application by solving boundary value problems. Accordingly, two classes of test problems were selected. One class was drawn from the CUTEr library [17, 18], and the other class came from [19]. Details are given in the following subsections.
For the MDYHS+ method, we set and . For the MDYHS+1 method, we followed  to choose and . For the hybrid conjugate gradient methods DYHS and DYHS+, the values of $\delta$ and $\sigma$ in (5) and (6) were taken to be 0.01 and 0.1, respectively. The initial value of was taken to be for the first iteration and for (see ). For all the methods, the maximum number of steplength trials at each iteration was taken to be , and the stopping criterion used was . To examine the numerical performance of each method more closely, we did numerical experiments with . Our computations were carried out using Matlab R2012a on a desktop computer with an Intel(R) Xeon(R) 2.40 GHz CPU and 6.00 GB of RAM. The operating system was Ubuntu 8.04 (Linux).
4.1. Tested by Some Problems in the CUTEr Library
In this subsection, we implemented four different hybrid conjugate gradient methods and compared their numerical performance. Because the DYHS method and the DYHS+ method require the original function underlying (1), we selected a collection of test problems from the CUTEr library and listed them in Table 1. The first column “Prob.” denotes the problem number, and the columns “Name” and “” denote the name and the dimension of the problem, respectively. Since we were interested in large-scale problems, we only considered problems with size at least . The largest dimension was set to 10,000. Moreover, we accessed CUTEr functions from within Matlab R2012a through the Matlab interface.
Our numerical results are reported in Tables 2, 3, 4, 5, and 6 in the form of , where , , and stand for the number of iterations, the total number of line search trials, and the CPU time elapsed, respectively. For the DYHS+ and the DYHS methods, we let and be the number of function evaluations and the number of gradient evaluations, respectively, and set by automatic differentiation (see [10, 20] for details). Moreover, “—” means that the method failed to achieve the prescribed accuracy before the number of iterations exceeded , and the test problems are represented in the form of #Pro.().
The performance of the four methods, relative to CPU time, was evaluated using the profiles of Dolan and Moré [21]. That is, for the four methods, we plotted the fraction of problems for which each of the methods was within a factor of the best time. Figures 1, 2, and 3 show the performance profiles for CPU time, the number of iterations, and the total number of line search trials, respectively. These figures reveal that the MDYHS+ method and the MDYHS+1 method performed better than the DYHS method and the DYHS+ method. The performance profiles also show that the MDYHS+ method and the MDYHS+1 method were comparable and solved almost all of the test problems up to . However, no convergence guarantee is available for the latter.
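A performance profile of this kind can be computed with a few lines of Python. For each problem, every solver's cost is divided by the best cost on that problem, and the profile value at a factor $\tau$ is the fraction of problems whose ratio is at most $\tau$. The sketch below is a minimal illustration under the assumptions that all costs are positive and that failures are marked with `math.inf`; the solver labels and the cost data are made up and are not the results reported in the paper.

```python
import math


def performance_profile(costs, taus):
    """Dolan-More style performance profile.

    costs: dict mapping solver name -> list of positive per-problem costs,
           with math.inf marking a failure.
    taus:  list of factors tau >= 1.
    Returns a dict mapping solver name -> list of fractions, one per tau.
    """
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]
    profile = {}
    for s in solvers:
        ratios = []
        for p in range(n_problems):
            c = costs[s][p]
            # A failure never counts as solved, even if every solver failed.
            ratios.append(math.inf if math.isinf(c) else c / best[p])
        profile[s] = [sum(r <= tau for r in ratios) / n_problems for tau in taus]
    return profile


# Made-up CPU times for two hypothetical solvers on three problems.
costs = {"A": [1.0, 2.0, math.inf], "B": [2.0, 1.0, 4.0]}
profile = performance_profile(costs, [1.0, 2.0])
# profile["A"] is [1/3, 2/3]; profile["B"] is [2/3, 1.0]
```

The value at $\tau = 1$ is the share of problems on which a solver was the fastest, and the limit for large $\tau$ is the share of problems it solved at all.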
4.2. Tested by Some Boundary Value Problems
In this subsection, we implemented the MDYHS+ method and the MDYHS+1 method to solve some boundary value problems. See [22, Chapter 1] for the background of the boundary value problems.
In order to confirm the efficiency of the MDYHS+ method and the MDYHS+1 method in solving this class of problems, we drew a set of 11 boundary value problems from [19] and listed them in Table 7, where the test problems are expressed by #Pro.() (#Pro. denotes the problem number in [19] and denotes the dimension), and the test results are listed in the form of .
From Table 7, we can see that both the MDYHS+ method and the MDYHS+1 method are efficient in solving boundary value problems. The MDYHS+1 method performs slightly better but lacks a convergence guarantee.
5. Conclusions
This paper has studied two modified versions of a Dai-Yuan hybrid conjugate gradient method with two different line searches that use only gradient information, and has proven that, with the line search (9), the method is convergent in the sense that $\liminf_{k \to \infty} \|g_k\| = 0$. We then investigated the numerical behavior of the two modified versions over two classes of standard test problems. From the numerical results, we conclude that the two modified hybrid conjugate gradient methods are more efficient (especially at high precision) in solving large-scale nonlinear unconstrained minimization problems and have a broader scope of application. For example, they can be used to solve some boundary value problems in which the objective functions are not explicit.
Acknowledgments
The authors are very grateful to the associate editor and the referees for their valuable suggestions. The first author is also very grateful to Yunda Dong for suggesting that she write this paper and add Section 4.2 to the revised version. This work was supported by National Science Foundation of China, no. 60974082.
References
1. Y. H. Dai and Y. Yuan, “A nonlinear conjugate gradient method with a strong global convergence property,” SIAM Journal on Optimization, vol. 10, no. 1, pp. 177–182, 1999.
2. M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, pp. 409–436, 1952.
3. R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” The Computer Journal, vol. 7, pp. 149–154, 1964.
4. E. Polak and G. Ribière, “Note sur la convergence de méthodes de directions conjuguées,” Revue française d’informatique et de recherche opérationnelle, série rouge, vol. 3, no. 16, pp. 35–43, 1969.
5. B. T. Polyak, “The conjugate gradient method in extremal problems,” USSR Computational Mathematics and Mathematical Physics, vol. 9, no. 4, pp. 94–112, 1969.
6. W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
7. W. W. Hager and H. Zhang, “A survey of nonlinear conjugate gradient methods,” Pacific Journal of Optimization, vol. 2, no. 1, pp. 35–58, 2006.
8. W. W. Hager and H. Zhang, “Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent,” ACM Transactions on Mathematical Software, vol. 32, no. 1, pp. 113–137, 2006.
9. Y. H. Dai and Y. Yuan, “An efficient hybrid conjugate gradient method for unconstrained optimization,” Annals of Operations Research, vol. 103, pp. 33–47, 2001.
10. Y.-H. Dai and Q. Ni, “Testing different conjugate gradient methods for large-scale unconstrained optimization,” Journal of Computational Mathematics, vol. 21, no. 3, pp. 311–320, 2003.
11. R. Pytlak, Conjugate Gradient Algorithms in Nonconvex Optimization, vol. 89 of Nonconvex Optimization and its Applications, Springer, Berlin, Germany, 2009.
12. Y.-H. Dai and C.-X. Kou, “A nonlinear conjugate gradient algorithm with an optimal property and an improved Wolfe line search,” SIAM Journal on Optimization, vol. 23, no. 1, pp. 296–320, 2013.
13. Y. Dong, “A practical PR+ conjugate gradient method only using gradient,” Applied Mathematics and Computation, vol. 219, no. 4, pp. 2041–2052, 2012.
14. Y. Dong, “New step lengths in conjugate gradient methods,” Computers & Mathematics with Applications, vol. 60, no. 3, pp. 563–571, 2010.
15. A. S. Lewis and M. L. Overton, “Nonsmooth optimization via quasi-Newton methods,” Mathematical Programming A, 2012.
16. C. Lemarechal, “A view of line search,” Optimization and Optimal Control, vol. 30, pp. 59–78, 1981, Lecture Notes on Control and Information Sciences.
17. I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environment,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
18. N. I. M. Gould, D. Orban, and P. L. Toint, “CUTEr and SifDec: a constrained and unconstrained testing environment, revisited,” ACM Transactions on Mathematical Software, vol. 29, pp. 373–394, 2003.
19. E. Spedicato and Z. Huang, “Numerical experience with Newton-like methods for nonlinear algebraic systems,” Computing, vol. 58, no. 1, pp. 69–89, 1997.
20. A. Griewank, “On automatic differentiation,” in Mathematical Programming: Recent Developments and Applications, M. Iri and K. Tanabe, Eds., pp. 83–108, Kluwer Academic, 1989.
21. E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming A, vol. 91, no. 2, pp. 201–213, 2002.
22. J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, NY, USA, 1970.