Global Convergence of a Nonlinear Conjugate Gradient Method

Jin-kui, Liu; Li-min, Zou; Xiao-qian, Song

doi:https://doi.org/10.1155/2011/463087

Mathematical Problems in Engineering

Research Article | Open Access

Volume 2011 | Article ID 463087 | https://doi.org/10.1155/2011/463087

Liu Jin-kui,¹ Zou Li-min,¹ and Song Xiao-qian¹

Academic Editor: Piermarco Cannarsa

Received 05 Mar 2011

Revised 21 May 2011

Accepted 04 Jun 2011

Published 21 Jul 2011

Abstract

A modified PRP nonlinear conjugate gradient method to solve unconstrained optimization problems is proposed. The important property of the proposed method is that the sufficient descent property is guaranteed independent of any line search. By the use of the Wolfe line search, the global convergence of the proposed method is established for nonconvex minimization. Numerical results show that the proposed method is effective and promising by comparing with the VPRP, CG-DESCENT, and DL⁺ methods.

1. Introduction

The nonlinear conjugate gradient method is one of the most efficient methods in solving unconstrained optimization problems. It comprises a class of unconstrained optimization algorithms which is characterized by low memory requirements and simplicity.

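Consider the unconstrained optimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1.1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable, and its gradient $g(x) = \nabla f(x)$ is available.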

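The iterates of the conjugate gradient method for solving (1.1) are given by
$$x_{k+1} = x_k + \alpha_k d_k, \tag{1.2}$$
where the stepsize $\alpha_k$ is positive and computed by a certain line search, and the search direction $d_k$ is defined by
$$d_k = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k d_{k-1}, & k \ge 2, \end{cases} \tag{1.3}$$
where $g_k = \nabla f(x_k)$ and $\beta_k$ is a scalar. Some well-known conjugate gradient methods include the Polak-Ribière-Polyak (PRP) method [1, 2], the Hestenes-Stiefel (HS) method [3], the Hager-Zhang (HZ) method [4], and the Dai-Liao (DL) method [5]. The parameters $\beta_k$ of these methods are specified as follows:
$$\beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{HZ} = \left( y_{k-1} - 2 d_{k-1} \frac{\|y_{k-1}\|^2}{d_{k-1}^T y_{k-1}} \right)^T \frac{g_k}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{DL} = \frac{g_k^T (y_{k-1} - t s_{k-1})}{d_{k-1}^T y_{k-1}}, \tag{1.4}$$
where $\|\cdot\|$ is the Euclidean norm, $y_{k-1} = g_k - g_{k-1}$, $s_{k-1} = x_k - x_{k-1}$, and $t \ge 0$ is a parameter. We know that if $f$ is a strictly convex quadratic function, the above methods are equivalent when an exact line search is used. If $f$ is nonconvex, their behaviors may differ considerably.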
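As a hedged illustration of the equivalence claim, the following minimal sketch (plain Python, no external libraries) runs conjugate gradient iterations with an exact line search on a small strictly convex quadratic $f(x) = \tfrac{1}{2} x^T A x - b^T x$ and compares the PRP and HS parameters, $g_k^T y_{k-1} / \|g_{k-1}\|^2$ and $g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})$ with $y_{k-1} = g_k - g_{k-1}$. The matrix `A`, the vector `b`, and all function names are assumptions chosen for this example; none of them come from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, x) for row in A]


def beta_prp(g, g_prev):
    """PRP parameter: g_k^T y_{k-1} / ||g_{k-1}||^2."""
    y = [a - b for a, b in zip(g, g_prev)]
    return dot(g, y) / dot(g_prev, g_prev)


def beta_hs(g, g_prev, d_prev):
    """HS parameter: g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})."""
    y = [a - b for a, b in zip(g, g_prev)]
    return dot(g, y) / dot(d_prev, y)


def compare_on_quadratic(A, b, x, steps):
    """Run CG with an exact line search on f(x) = 0.5 x^T A x - b^T x.

    Returns one (beta_prp, beta_hs) pair per iteration.
    """
    g = [a - c for a, c in zip(matvec(A, x), b)]  # gradient A x - b
    d = [-gi for gi in g]
    pairs = []
    for _ in range(steps):
        Ad = matvec(A, d)
        alpha = -dot(g, d) / dot(d, Ad)  # exact minimizer along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        bp = beta_prp(g_new, g)
        bh = beta_hs(g_new, g, d)
        pairs.append((bp, bh))
        d = [-gn + bp * di for gn, di in zip(g_new, d)]
        g = g_new
    return pairs


# Illustrative symmetric positive definite data (an assumption).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
PAIRS = compare_on_quadratic(A, b, [0.0, 0.0, 0.0], steps=2)
for bp, bh in PAIRS:
    print(f"PRP = {bp:.12f}   HS = {bh:.12f}")
```

With an exact line search, $g_k^T d_{k-1} = 0$ and $g_{k-1}^T d_{k-1} = -\|g_{k-1}\|^2$, so $d_{k-1}^T y_{k-1} = \|g_{k-1}\|^2$ and the two printed values should agree up to rounding. With an inexact line search this identity is lost, which is where the methods start to differ.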

In the past few years, the PRP method has been regarded as the most efficient conjugate gradient method in practical computation. One remarkable property of the PRP method is that it essentially performs a restart if a bad direction occurs (see [6]). Powell [7] constructed an example which showed that the PRP method can cycle infinitely without approaching any stationary point even if an exact line search is used. This counterexample also indicates that the PRP method has the drawback that it may not be globally convergent when the objective function is nonconvex. Powell [8] suggested that the parameter $\beta_k$ should not be negative in the PRP method and defined it as
$$\beta_k^{PRP+} = \max\{0, \beta_k^{PRP}\}. \tag{1.5}$$
Gilbert and Nocedal [9] considered Powell’s suggestion and proved the global convergence of the modified PRP method for nonconvex functions under an appropriate line search. In addition, there has been much research on the convergence properties of the PRP method (see [10–12]).

In recent years, much effort has been devoted to creating new methods that not only possess global convergence properties for general functions but are also superior to the original methods from a computational point of view. For example, Yu et al. [13] proposed a new nonlinear conjugate gradient method in which the parameter is defined on the basis of such as where (in this paper, we call this method the VPRP method). They proved the global convergence of the VPRP method with the Wolfe line search. Hager and Zhang [4] discussed the global convergence of the HZ method for strongly convex functions under the Wolfe line search and the Goldstein line search. In order to prove global convergence for general functions, Hager and Zhang modified the parameter $\beta_k^{HZ}$ as
$$\bar{\beta}_k^{HZ} = \max\{\beta_k^{HZ}, \eta_k\}, \tag{1.7}$$
where
$$\eta_k = \frac{-1}{\|d_{k-1}\| \min\{\eta, \|g_{k-1}\|\}}, \quad \eta > 0. \tag{1.8}$$
The corresponding method of (1.7) is the famous CG-DESCENT method.

Dai and Liao [5] proposed a new conjugacy condition, that is,
$$d_k^T y_{k-1} = -t\, g_k^T s_{k-1}, \quad t \ge 0. \tag{1.9}$$
Under the new conjugacy condition, they proved the global convergence of the DL conjugate gradient method for uniformly convex functions. Following Powell’s suggestion, Dai and Liao gave the modified parameter
$$\beta_k^{DL+} = \max\left\{ \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, 0 \right\} - t\, \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}. \tag{1.10}$$
The corresponding method of (1.10) is the famous DL⁺ method. Under the strong Wolfe line search, they studied the global convergence of the DL⁺ method for general functions. Zhang et al. [14] proposed a modified DL conjugate gradient method and proved its global convergence. Moreover, some researchers have been studying a new type of method called the spectral conjugate gradient method (see [15–17]).

This paper is organized as follows. In the next section, we propose a modified PRP method and prove its sufficient descent property. In Section 3, the global convergence of the method with the Wolfe line search is established. In Section 4, numerical results are reported. Conclusions are given in the last section.

2. Modified PRP Method

In this section, we propose a modified PRP conjugate gradient method in which the parameter is defined on the basis of as follows: in which . We introduce the modified PRP method as follows.

2.1. Modified PRP (MPRP) Method

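Step 1. Set $x_1 \in \mathbb{R}^n$, $\varepsilon \ge 0$, and $d_1 = -g_1$; if $\|g_1\| \le \varepsilon$, then stop.

Step 2. Compute $\alpha_k$ by some inexact line search.

Step 3. Let $x_{k+1} = x_k + \alpha_k d_k$, $g_{k+1} = g(x_{k+1})$; if $\|g_{k+1}\| \le \varepsilon$, then stop.

Step 4. Compute $\beta_{k+1}$ by (2.1), and generate $d_{k+1}$ by (1.3).

Step 5. Set $k := k + 1$, and go to Step 2.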
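The loop structure of Steps 1 to 5 can be sketched in plain Python as below. This is only an illustration under stated assumptions: the paper's formula (2.1) is not reproduced here, so the sketch plugs in the well-known truncated PRP rule $\max\{0, \beta_k^{PRP}\}$ as a stand-in; the inexact line search is a simple Armijo backtracking; and a restart to the steepest descent direction is added as a safeguard because the stand-in rule does not guarantee descent. All function names, default tolerances, and the test function are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def beta_prp_plus(g_new, g_old):
    """Stand-in rule max(0, PRP); NOT the paper's formula (2.1)."""
    y = [a - b for a, b in zip(g_new, g_old)]
    return max(0.0, dot(g_new, y) / dot(g_old, g_old))


def backtracking(f, x, d, gd, delta=1e-4, shrink=0.5, max_halvings=60):
    """Armijo backtracking, used here as a simple inexact line search."""
    fx, alpha = f(x), 1.0
    for _ in range(max_halvings):
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        if f(trial) <= fx + delta * alpha * gd:
            break
        alpha *= shrink
    return alpha


def cg_minimize(f, grad, x, eps=1e-6, max_iter=10000, beta_rule=beta_prp_plus):
    """Generic nonlinear CG loop; returns (final point, iterations used)."""
    g = grad(x)                      # Step 1: initial gradient and direction
    d = [-gi for gi in g]
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) <= eps:   # stopping test of Steps 1 and 3
            return x, k
        alpha = backtracking(f, x, d, dot(g, d))        # Step 2
        x = [xi + alpha * di for xi, di in zip(x, d)]   # Step 3
        g_new = grad(x)
        beta = beta_rule(g_new, g)                      # Step 4
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        # Safeguard (assumption): restart if d is not a clear descent direction.
        if dot(g_new, d) > -0.01 * math.sqrt(dot(g_new, g_new) * dot(d, d)):
            d = [-gn for gn in g_new]
        g = g_new                                       # Step 5
    return x, max_iter


def f_demo(x):
    """Hypothetical test function with minimizer (1, -2)."""
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def grad_demo(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


X_STAR, ITERS = cg_minimize(f_demo, grad_demo, [0.0, 0.0])
print(X_STAR, ITERS)
```

Passing the rule as `beta_rule` keeps the loop independent of the particular formula, so any parameter choice from Section 1 can be substituted without touching Steps 1 to 5.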

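In the convergence analyses and implementations of conjugate gradient methods, one often requires the inexact line search to be the Wolfe line search or the strong Wolfe line search. The Wolfe line search is to find $\alpha_k$ such that
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{2.2}$$
$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{2.3}$$
where $0 < \delta < \sigma < 1$. The strong Wolfe line search consists of (2.2) and the following strengthened version of (2.3):
$$|g(x_k + \alpha_k d_k)^T d_k| \le -\sigma g_k^T d_k. \tag{2.4}$$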
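To make the three line search tests concrete, the sketch below checks a trial stepsize against the standard Wolfe conditions: sufficient decrease $f(x + \alpha d) \le f(x) + \delta \alpha g^T d$, the curvature condition $g(x + \alpha d)^T d \ge \sigma g^T d$, and its strong version $|g(x + \alpha d)^T d| \le -\sigma g^T d$, with $0 < \delta < \sigma < 1$. The function name `wolfe_conditions`, the default values of `delta` and `sigma`, and the one-dimensional test function are assumptions made for the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def wolfe_conditions(f, grad, x, d, alpha, delta=0.01, sigma=0.1):
    """Return (sufficient_decrease, curvature, strong_curvature) for alpha.

    Wolfe line search    = sufficient_decrease and curvature.
    Strong Wolfe search  = sufficient_decrease and strong_curvature.
    """
    gd = dot(grad(x), d)  # directional derivative at x (negative for descent)
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    gd_new = dot(grad(x_new), d)  # directional derivative at the trial point
    sufficient_decrease = f(x_new) <= f(x) + delta * alpha * gd
    curvature = gd_new >= sigma * gd
    strong_curvature = abs(gd_new) <= -sigma * gd
    return sufficient_decrease, curvature, strong_curvature


def f_quad(x):
    """Hypothetical test function f(x) = x^2 in one dimension."""
    return x[0] ** 2


def grad_quad(x):
    return [2.0 * x[0]]


X0, D0 = [1.0], [-2.0]  # D0 is the steepest descent direction at X0
for a in (0.01, 0.5, 0.9, 1.0):
    print(a, wolfe_conditions(f_quad, grad_quad, X0, D0, a))
```

On this example the step 0.01 is too short for the curvature condition, 0.5 passes all three tests, 0.9 passes the Wolfe conditions but overshoots for the strong version, and 1.0 fails sufficient decrease.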

Moreover, in most references, the sufficient descent condition
$$g_k^T d_k \le -c \|g_k\|^2, \quad c > 0, \tag{2.5}$$
is assumed, and it plays a vital role in guaranteeing the global convergence properties of conjugate gradient methods. In this paper, however, $d_k$ satisfies (2.5) without any line search.

Theorem 2.1. Consider any method (1.2)-(1.3), where . If for all , then

Proof. Multiplying (1.3) by $g_k^T$, we get
$$g_k^T d_k = -\|g_k\|^2 + \beta_k g_k^T d_{k-1}. \tag{2.7}$$
If $\beta_k = 0$, from (2.7), we know that the conclusion (2.6) holds. If $\beta_k \ne 0$, the proof is divided into two cases in the following.
Firstly, if , then from (2.1) and (2.7), one has
Secondly, if , then from (2.7), we also have From the above, the conclusion (2.6) holds under any line search.

3. Global Convergences of the Modified PRP Method

In order to prove the global convergence of the modified PRP method, we assume that the objective function satisfies the following assumption.

Assumption H
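(i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_1)\}$ is bounded, that is, there exists a positive constant $B$ such that $\|x\| \le B$ for all $x \in \Omega$.
(ii) In a neighborhood $N$ of $\Omega$, $f$ is continuously differentiable and its gradient $g$ is Lipschitz continuous, namely, there exists a constant $L > 0$ such that
$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in N. \tag{3.1}$$
Under these assumptions on $f$, there exists a constant $\bar{\gamma} > 0$ such that
$$\|g(x)\| \le \bar{\gamma}, \quad \forall x \in \Omega. \tag{3.2}$$

The conclusion of the following lemma, often called the Zoutendijk condition, is used to prove the global convergence properties of nonlinear conjugate gradient methods. It was originally given by Zoutendijk [18].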

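Lemma 3.1. Suppose that Assumption H holds. Consider any iteration of (1.2)-(1.3), where $d_k$ satisfies $g_k^T d_k < 0$ for all $k$ and $\alpha_k$ satisfies the Wolfe line search. Then
$$\sum_{k \ge 1} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < +\infty. \tag{3.3}$$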
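The way the Zoutendijk condition $\sum_{k} (g_k^T d_k)^2 / \|d_k\|^2 < +\infty$ is typically used can be written out in two lines. The following worked step is a standard argument, given here as a hedged illustration: $c > 0$ denotes a generic sufficient descent constant and $r > 0$ a generic lower bound on the gradient norm, both introduced only for this example.

```latex
% Sufficient descent: g_k^T d_k <= -c ||g_k||^2 with c > 0, hence
% (g_k^T d_k)^2 >= c^2 ||g_k||^4, and the Zoutendijk condition gives
\[
  c^2 \sum_{k \ge 1} \frac{\|g_k\|^4}{\|d_k\|^2}
  \;\le\; \sum_{k \ge 1} \frac{(g_k^T d_k)^2}{\|d_k\|^2} \;<\; +\infty .
\]
% If the gradients were bounded away from zero, ||g_k|| >= r > 0 for all k, then
\[
  \sum_{k \ge 1} \frac{1}{\|d_k\|^2}
  \;\le\; \frac{1}{r^4} \sum_{k \ge 1} \frac{\|g_k\|^4}{\|d_k\|^2} \;<\; +\infty ,
\]
% so ||d_k|| must tend to infinity; a bound on the growth of ||d_k||
% then yields the contradiction used in convergence proofs of this type.
```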

Lemma 3.2. Suppose that Assumption H holds. Consider the method (1.2)-(1.3), where , and satisfies the Wolfe line search and (2.6). If there exists a constant , such that then one has where .

Proof. From (2.1) and (3.4), we get By (2.6) and (3.6), we know that for each .
Define the quantities By (1.3), one has Since is unit vector, we get From and the above equation, one has By (2.1), (3.4), and (3.6), one has From (3.3), (2.6), (3.4), and (3.11), one has so By (3.10) and the above inequality, one has

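Lemma 3.3. Suppose that Assumption H holds. If (3.4) holds, then the parameter $\beta_k$ of the MPRP method has property (*), that is, (1) there exists a constant $b > 1$ such that $|\beta_k| \le b$; (2) there exists a constant $\lambda > 0$ such that $\|x_k - x_{k-1}\| \le \lambda$ implies $|\beta_k| \le 1/(2b)$.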

Proof. From Assumption (ii), we know that (3.2) holds. By (2.1), (3.2), and (3.4), one has Define . If , then from (2.1), (3.1), (3.2), and (3.4), one has

Lemma 3.4 (see [19]). Suppose that Assumption H holds. Let and be generated by (1.2)-(1.3), in which satisfies the Wolfe line search and (2.6). If has the property (*) and (3.4) holds, then there exists , for any and , for all , such that where , denotes the number of the .

Theorem 3.5. Suppose that Assumption H holds. Let and be generated by (1.2)-(1.3), in which satisfies the Wolfe line search and (2.6), , then one has

Proof. To obtain this result, we proceed by contradiction. Suppose that (3.18) does not hold, which means that there exists such that so, we know that Lemmas 3.2 and 3.4 hold.
We also define , then for all , one has where , that is, From Assumption H, we know that there exists a constant such that From (3.21) and the above inequality, one has Let be a positive integer and where has been defined in Lemma 3.4. From Lemma 3.2, we know that there exists such that From the Cauchy-Schwarz inequality and (3.24), , one has By Lemma 3.4, we know that there exists such that It follows from (3.23), (3.25), and (3.26) that From (3.27), one has , which contradicts the definition of . Hence, which completes the proof.

4. Numerical Results

In this section, we compare the modified PRP conjugate gradient method, denoted the MPRP method, with the VPRP method, the CG-DESCENT method, and the DL⁺ method under the strong Wolfe line search on the test problems from [20] with the given initial points and dimensions. The parameters are chosen as follows: , , , , and . If is satisfied, we will stop the program. The program will also be stopped if the number of iterations exceeds ten thousand. All codes were written in Matlab 7.0 and run on a PC with a 2.0 GHz CPU, 512 MB of memory, and the Windows XP operating system.

The numerical results of our tests with respect to the MPRP method, VPRP method, CG-DESCENT method, and DL⁺ method are reported in Tables 1, 2, 3, and 4, respectively. In the tables, the column “Problem” represents the problem’s name in [20], and “CPU,” “NI,” “NF,” and “NG” denote the CPU time in seconds, the number of iterations, the number of function evaluations, and the number of gradient evaluations, respectively. “Dim” denotes the dimension of the tested problem. If the iteration limit was exceeded, the run was stopped, and this is indicated by NaN.

In this paper, we adopt the performance profiles of Dolan and Moré [21] to compare the MPRP method with the VPRP method, CG-DESCENT method, and DL⁺ method in terms of CPU time, the number of iterations, the number of function evaluations, and the number of gradient evaluations, respectively (see Figures 1, 2, 3, 4).

Figures 1–4 show the performance of the four methods relative to CPU time, the number of iterations, the number of function evaluations, and the number of gradient evaluations, respectively. For example, the performance profile with respect to CPU time means that, for each method, we plot the fraction of problems for which the method is within a factor $\tau$ of the best time. The left side of the figure gives the percentage of the test problems for which a method is the fastest; the right side gives the percentage of the test problems that are successfully solved by each of the methods. The top curve is the method that solved the most problems in a time that was within a factor $\tau$ of the best time.
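The performance profile described above can be computed in a few lines. In the sketch below, for each solver $s$ and problem $p$ the ratio of the solver's cost to the best cost on that problem is formed, and $\rho_s(\tau)$ is the fraction of problems whose ratio is at most $\tau$, following the definition of Dolan and Moré [21]. The function name, the use of `math.inf` to mark a failed run, the assumption that all costs are strictly positive, and the sample numbers are illustrative only; the data are made up and are not the paper's results.

```python
import math


def performance_profile(costs, taus):
    """Dolan-More performance profile.

    costs: dict mapping solver name -> list of per-problem costs
           (strictly positive; math.inf marks a failed run).
    taus:  list of factors tau >= 1.
    Returns a dict mapping solver name -> [rho_s(tau) for tau in taus].
    """
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    # Best (smallest) cost achieved by any solver on each problem.
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]
    profile = {}
    for s in solvers:
        ratios = [
            costs[s][p] / best[p] if math.isfinite(costs[s][p]) else math.inf
            for p in range(n_problems)
        ]
        profile[s] = [
            sum(r <= tau for r in ratios) / n_problems for tau in taus
        ]
    return profile


# Made-up CPU times for two hypothetical solvers on four problems.
COSTS = {
    "A": [1.0, 2.0, 4.0, math.inf],  # solver A fails on the last problem
    "B": [2.0, 2.0, 1.0, 8.0],
}
PROFILE = performance_profile(COSTS, taus=[1.0, 2.0, 4.0])
print(PROFILE)
```

The value at $\tau = 1$ is the fraction of problems on which a solver is (tied for) the fastest, and the value for large $\tau$ approaches the fraction of problems the solver manages to solve, which matches the reading of the left and right sides of the figures.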

Figure 1 shows that the MPRP method outperforms the VPRP method, CG-DESCENT method, and DL⁺ method on the given test problems in terms of CPU time. Figures 2–4 show that the MPRP method also has the best performance with respect to the number of iterations and the numbers of function and gradient evaluations, since it corresponds to the top curve. So, the MPRP method is computationally efficient.

5. Conclusions

We have proposed a modified PRP method on the basis of the PRP method, which generates sufficient descent directions independently of the line search. Moreover, we proved that the proposed method converges globally for general nonconvex functions under the Wolfe line search. The performance profiles showed that the proposed method is also efficient on the tested problems.

Acknowledgments

The authors wish to express their heartfelt thanks to the referees and Professor Piermarco Cannarsa for their detailed and helpful suggestions for revising the paper. This work was supported by The Nature Science Foundation of Chongqing Education Committee (KJ091104) and Chongqing Three Gorge University (09ZZ-060).

References

[1] E. Polak and G. Ribière, “Note sur la convergence de méthodes de directions conjuguées,” Revue Française d'Informatique et de Recherche Opérationnelle, 3e Année, vol. 16, pp. 35–43, 1969.
[2] B. T. Polyak, “The conjugate gradient method in extremal problems,” USSR Computational Mathematics and Mathematical Physics, vol. 9, pp. 94–112, 1969.
[3] M. Al-Baali, “Descent property and global convergence of the Fletcher-Reeves method with inexact line search,” IMA Journal of Numerical Analysis, vol. 5, no. 1, pp. 121–124, 1985.
[4] W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
[5] Y. H. Dai and L. Z. Liao, “New conjugacy conditions and related nonlinear conjugate gradient methods,” Applied Mathematics and Optimization, vol. 43, no. 1, pp. 87–101, 2001.
[6] W. W. Hager and H. Zhang, “A survey of nonlinear conjugate gradient methods,” Pacific Journal of Optimization, vol. 2, no. 1, pp. 35–58, 2006.
[7] M. J. D. Powell, “Nonconvex minimization calculations and the conjugate gradient method,” in Numerical Analysis (Dundee, 1983), vol. 1066 of Lecture Notes in Mathematics, pp. 122–141, Springer, Berlin, Germany, 1984.
[8] M. J. D. Powell, “Convergence properties of algorithms for nonlinear optimization,” SIAM Review, vol. 28, no. 4, pp. 487–500, 1986.
[9] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[10] L. Zhang, W. Zhou, and D. Li, “A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence,” IMA Journal of Numerical Analysis, vol. 26, no. 4, pp. 629–640, 2006.
[11] Z. Wei, G. Y. Li, and L. Qi, “Global convergence of the Polak-Ribière-Polyak conjugate gradient method with an Armijo-type inexact line search for nonconvex unconstrained optimization problems,” Mathematics of Computation, vol. 77, no. 264, pp. 2173–2193, 2008.
[12] L. Grippo and S. Lucidi, “A globally convergent version of the Polak-Ribière conjugate gradient method,” Mathematical Programming, vol. 78, no. 3, pp. 375–391, 1997.
[13] G. Yu, Y. Zhao, and Z. Wei, “A descent nonlinear conjugate gradient method for large-scale unconstrained optimization,” Applied Mathematics and Computation, vol. 187, no. 2, pp. 636–643, 2007.
[14] J. Zhang, Y. Xiao, and Z. Wei, “Nonlinear conjugate gradient methods with sufficient descent condition for large-scale unconstrained optimization,” Mathematical Problems in Engineering, vol. 2009, Article ID 243290, 16 pages, 2009.
[15] Y. Xiao, Q. Wang, and D. Wang, “Notes on the Dai-Yuan-Yuan modified spectral gradient method,” Journal of Computational and Applied Mathematics, vol. 234, no. 10, pp. 2986–2992, 2010.
[16] Z. Wan, Z. Yang, and Y. Wang, “New spectral PRP conjugate gradient method for unconstrained optimization,” Applied Mathematics Letters, vol. 24, no. 1, pp. 16–22, 2011.
[17] L. Zhang, W. Zhou, and D. Li, “Global convergence of a modified Fletcher-Reeves conjugate gradient method with Armijo-type line search,” Numerische Mathematik, vol. 104, no. 4, pp. 561–572, 2006.
[18] G. Zoutendijk, “Nonlinear programming, computational methods,” in Integer and Nonlinear Programming, J. Abadie, Ed., pp. 37–86, North-Holland, Amsterdam, Netherlands, 1970.
[19] Y. H. Dai and Y. Yuan, Nonlinear Conjugate Gradient Method, vol. 10, Shanghai Scientific & Technical, Shanghai, China, 2000.
[20] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” Association for Computing Machinery Transactions on Mathematical Software, vol. 7, no. 1, pp. 17–41, 1981.
[21] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.

Copyright

Copyright © 2011 Liu Jin-kui et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
