Journal of Applied Mathematics
Volume 2012, Article ID 569795, 10 pages
http://dx.doi.org/10.1155/2012/569795
Research Article

A Mixed Spectral CD-DY Conjugate Gradient Method

1School of Mathematics and Statistics, Chongqing Three Gorges University, Wanzhou 404100, China
2College of Mathematics and Physics, Chongqing University, Chongqing 401331, China

Received 14 November 2011; Revised 22 January 2012; Accepted 23 January 2012

Academic Editor: Shan Zhao

Copyright © 2012 Liu Jinkui et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

A mixed spectral CD-DY conjugate gradient method for solving unconstrained optimization problems is proposed, which combines the advantages of the spectral conjugate gradient method, the CD method, and the DY method. Under the Wolfe line search, the proposed method generates a descent direction in each iteration, and its global convergence can also be guaranteed. Numerical results show that the new method is efficient and stable compared with the CD (Fletcher 1987) method, the DY (Dai and Yuan 1999) method, and the SFR (Du and Chen 2008) method, so it can be widely used in scientific computation.

1. Introduction

The purpose of this paper is to study the global convergence properties and practical computational performance of a mixed spectral CD-DY conjugate gradient method for unconstrained optimization, without restarts and under appropriate conditions.

Consider the following unconstrained optimization problem:
$$\min_{x \in R^n} f(x), \qquad (1.1)$$
where $f: R^n \to R$ is continuously differentiable and its gradient $g(x) = \nabla f(x)$ is available. Generally, we use an iterative method to solve (1.1), and the iterative formula is given by
$$x_{k+1} = x_k + \alpha_k d_k, \qquad (1.2)$$
where $x_k$ is the current iterate, $\alpha_k$ is a positive scalar called the step-size, which is determined by some line search, and $d_k$ is the search direction defined by
$$d_k = \begin{cases} -g_k, & \text{for } k = 1, \\ -g_k + \beta_k d_{k-1}, & \text{for } k \ge 2, \end{cases} \qquad (1.3)$$
where $g_k = \nabla f(x_k)$ and $\beta_k$ is a scalar which determines the different conjugate gradient methods [1, 2].
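For readers who want to experiment, the scheme (1.2)-(1.3) can be sketched in a few lines of code. The paper's own experiments were run in MATLAB 7.0; the sketch below uses Python/NumPy instead, and the helpers beta_rule and line_search_alpha are hypothetical placeholders for a concrete choice of $\beta_k$ and a concrete line search.

import numpy as np

def conjugate_gradient(f, grad, x1, beta_rule, line_search_alpha, tol=1e-6, max_iter=9999):
    """Generic nonlinear conjugate gradient scheme (1.2)-(1.3):
    x_{k+1} = x_k + alpha_k d_k, with d_1 = -g_1 and d_k = -g_k + beta_k d_{k-1}."""
    x = np.asarray(x1, dtype=float)
    g = grad(x)
    d = -g                                        # d_1 = -g_1
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        alpha = line_search_alpha(f, grad, x, d)  # step-size from some line search
        x = x + alpha * d                         # iteration (1.2)
        g_new = grad(x)
        beta = beta_rule(g_new, g, d)             # the choice of beta_k defines the method
        d = -g_new + beta * d                     # direction update (1.3)
        g = g_new
    return x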

There are many kinds of iterative methods, including the steepest descent method, Newton's method, and the conjugate gradient method. The conjugate gradient method is a commonly used and effective method in optimization, and it only needs the information of the first derivative. It overcomes the slow convergence of the steepest descent method and avoids the storage and second-derivative computations that Newton's method requires.

The original CD method was proposed by Fletcher [3], in which $\beta_k$ is defined by
$$\beta_k^{CD} = \frac{\|g_k\|^2}{-d_{k-1}^T g_{k-1}}, \qquad (1.4)$$
where $\|\cdot\|$ denotes the Euclidean norm of vectors. An important property of the CD method is that it produces a descent direction in each iteration under the strong Wolfe line search:
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \qquad (1.5)$$
$$\left|g(x_k + \alpha_k d_k)^T d_k\right| \le -\sigma g_k^T d_k, \qquad (1.6)$$
where $0 < \delta < \sigma < 1$. Dai and Yuan [4] first proposed the DY method, in which $\beta_k$ is defined by
$$\beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T (g_k - g_{k-1})}. \qquad (1.7)$$
Dai and Yuan [4] also strictly proved that the DY method produces a descent direction in each iteration under the Wolfe line search, that is, (1.5) together with
$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k. \qquad (1.8)$$
Some good results about the CD method and the DY method have also been reported in recent years [5–11].
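As a concrete reading of (1.4)-(1.8), the following Python helpers (a sketch, not part of the paper) evaluate $\beta_k^{CD}$ and $\beta_k^{DY}$ and test whether a candidate step-size satisfies the strong Wolfe conditions (1.5)-(1.6) or the Wolfe conditions (1.5) and (1.8); the values $\delta = 0.01$ and $\sigma = 0.1$ are the ones used later in Section 4.

import numpy as np

def beta_cd(g_new, g_old, d_old):
    # beta_k^{CD} = ||g_k||^2 / (-d_{k-1}^T g_{k-1}), formula (1.4)
    return np.dot(g_new, g_new) / (-np.dot(d_old, g_old))

def beta_dy(g_new, g_old, d_old):
    # beta_k^{DY} = ||g_k||^2 / (d_{k-1}^T (g_k - g_{k-1})), formula (1.7)
    return np.dot(g_new, g_new) / np.dot(d_old, g_new - g_old)

def wolfe_satisfied(f, grad, x, d, alpha, delta=0.01, sigma=0.1, strong=True):
    """Check (1.5) together with (1.6) if strong=True, or with (1.8) if strong=False."""
    g = grad(x)
    g_new = grad(x + alpha * d)
    decrease = f(x + alpha * d) <= f(x) + delta * alpha * np.dot(g, d)   # condition (1.5)
    if strong:
        curvature = abs(np.dot(g_new, d)) <= -sigma * np.dot(g, d)       # condition (1.6)
    else:
        curvature = np.dot(g_new, d) >= sigma * np.dot(g, d)             # condition (1.8)
    return decrease and curvature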

Quite recently, Birgin and Martinez [12] proposed a spectral conjugate gradient method by combining the conjugate gradient method and the spectral gradient method. Unfortunately, the spectral conjugate gradient method [12] is not guaranteed to generate descent directions. So, based on the FR formula, Zhang et al. [13] modified the FR method such that the generated direction is always a descent direction. Based on the modified FR conjugate gradient method [13], Du and Chen [14] proposed a new spectral conjugate gradient method:
$$d_k = \begin{cases} -g_k, & \text{for } k = 1, \\ -\theta_k g_k + \beta_k^{FR} d_{k-1}, & \text{for } k \ge 2, \end{cases} \qquad (1.9)$$
where
$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \qquad \theta_k = \frac{d_{k-1}^T (g_k - g_{k-1})}{\|g_{k-1}\|^2}. \qquad (1.10)$$
They proved the global convergence of this modified spectral FR method (in this paper we call it the SFR method) under mild conditions.
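For comparison with the method proposed below, the SFR direction (1.9)-(1.10) could be computed as in the following Python sketch (vector inputs assumed; this is an illustration, not Du and Chen's code):

import numpy as np

def sfr_direction(g_new, g_old, d_old):
    """SFR direction of Du and Chen, (1.9)-(1.10): d_k = -theta_k g_k + beta_k^{FR} d_{k-1}."""
    beta_fr = np.dot(g_new, g_new) / np.dot(g_old, g_old)        # beta_k^{FR} in (1.10)
    theta = np.dot(d_old, g_new - g_old) / np.dot(g_old, g_old)  # theta_k in (1.10)
    return -theta * g_new + beta_fr * d_old                      # direction (1.9)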

The observation of the above formulas motivates us to construct a new formula, which combines the advantages of the spectral gradient method, the CD method, and the DY method, as follows:
$$d_k = \begin{cases} -g_k, & \text{for } k = 1, \\ -\theta_k g_k + \beta_k d_{k-1}, & \text{for } k \ge 2, \end{cases} \qquad (1.11)$$
where $\beta_k$ is specified by
$$\beta_k = \beta_k^{CD} + \min\left\{0, \varphi_k \beta_k^{CD}\right\}, \qquad (1.12)$$
$$\theta_k = 1 - \frac{g_k^T d_{k-1}}{g_{k-1}^T d_{k-1}}, \qquad (1.13)$$
$$\varphi_k = -\frac{g_k^T d_{k-1}}{d_{k-1}^T (g_k - g_{k-1})}. \qquad (1.14)$$
Under some mild conditions, we give the global convergence of the mixed spectral CD-DY conjugate gradient method with the Wolfe line search.
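The new direction (1.11)-(1.14) is equally simple to compute. The Python sketch below mirrors the case analysis used later in Lemma 2.3: when $g_k^T d_{k-1} \le 0$ the min-term vanishes and $\beta_k = \beta_k^{CD}$; otherwise $\beta_k$ reduces to $\beta_k^{DY}$.

import numpy as np

def mixed_cd_dy_direction(g_new, g_old, d_old):
    """Mixed spectral CD-DY direction (1.11)-(1.14) for k >= 2."""
    beta_cd = np.dot(g_new, g_new) / (-np.dot(d_old, g_old))      # beta_k^{CD}, (1.4)
    phi = -np.dot(g_new, d_old) / np.dot(d_old, g_new - g_old)    # phi_k, (1.14)
    beta = beta_cd + min(0.0, phi * beta_cd)                      # beta_k, (1.12)
    theta = 1.0 - np.dot(g_new, d_old) / np.dot(g_old, d_old)     # theta_k, (1.13)
    return -theta * g_new + beta * d_old                          # d_k, (1.11)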

This paper is organized as follows. In Section 2, we propose the corresponding algorithm and give some assumptions and lemmas, which are usually used in the proof of the global convergence properties of nonlinear conjugate gradient methods. In Section 3, global convergence analysis is provided with suitable conditions. Preliminary numerical results are presented in Section 4.

2. Algorithm and Lemmas

In order to establish the global convergence of the proposed method, we need the following assumption on the objective function, which has often been used in the literature to analyze the global convergence of nonlinear conjugate gradient methods with inexact line searches.

Assumption 2.1. (i) The level set $\Omega = \{x : f(x) \le f(x_1)\}$ is bounded, where $x_1$ is the starting point.
(ii) In some neighborhood $N$ of $\Omega$, the objective function $f$ is continuously differentiable, and its gradient is Lipschitz continuous; that is, there exists a constant $L > 0$ such that
$$\|g(x) - g(y)\| \le L \|x - y\|, \qquad \forall x, y \in N. \qquad (2.1)$$

Now we give the mixed spectral CD-DY conjugate gradient method as follows.

Algorithm 2.2.
Step 1. Data: $x_1 \in R^n$, $\varepsilon \ge 0$. Set $d_1 = -g_1$; if $\|g_1\| \le \varepsilon$, then stop.
Step 2. Compute $\alpha_k$ by the Wolfe line search (1.5) and (1.8).
Step 3. Let $x_{k+1} = x_k + \alpha_k d_k$ and $g_{k+1} = g(x_{k+1})$; if $\|g_{k+1}\| \le \varepsilon$, then stop.
Step 4. Compute $\beta_{k+1}$ by (1.12), and generate $d_{k+1}$ by (1.11).
Step 5. Set $k = k + 1$; go to Step 2.
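Although the paper's experiments were coded in MATLAB 7.0, Algorithm 2.2 is easy to prototype. The following is a minimal Python/NumPy sketch, not the authors' implementation; it borrows SciPy's scipy.optimize.line_search, which enforces the strong Wolfe conditions and therefore in particular the Wolfe conditions (1.5) and (1.8), and it omits the safeguards a production code would need. The Rosenbrock test function at the end is only a usage example.

import numpy as np
from scipy.optimize import line_search

def mixed_spectral_cd_dy(f, grad, x1, eps=1e-6, max_iter=9999, delta=0.01, sigma=0.1):
    """Sketch of Algorithm 2.2 (mixed spectral CD-DY conjugate gradient method)."""
    x = np.asarray(x1, dtype=float)
    g = grad(x)
    d = -g                                           # Step 1: d_1 = -g_1
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:                 # stopping test of Steps 1 and 3
            break
        # Step 2: SciPy's line search enforces the strong Wolfe rules,
        # hence also the Wolfe conditions (1.5) and (1.8).
        alpha = line_search(f, grad, x, d, gfk=g, c1=delta, c2=sigma)[0]
        if alpha is None:                            # line search failed; no safeguards here
            break
        x = x + alpha * d                            # Step 3
        g_new = grad(x)
        beta_cd = np.dot(g_new, g_new) / (-np.dot(d, g))
        phi = -np.dot(g_new, d) / np.dot(d, g_new - g)
        beta = beta_cd + min(0.0, phi * beta_cd)     # Step 4: beta_{k+1} by (1.12)
        theta = 1.0 - np.dot(g_new, d) / np.dot(g, d)
        d = -theta * g_new + beta * d                # Step 4: d_{k+1} by (1.11)
        g = g_new                                    # Step 5: k := k + 1
    return x

# Usage example: minimize the two-dimensional Rosenbrock function.
if __name__ == "__main__":
    rosen = lambda x: (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2
    rosen_grad = lambda x: np.array([-2.0 * (1.0 - x[0]) - 400.0 * x[0] * (x[1] - x[0]**2),
                                     200.0 * (x[1] - x[0]**2)])
    print(mixed_spectral_cd_dy(rosen, rosen_grad, np.array([-1.2, 1.0])))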

The following lemma shows that Algorithm 2.2 produces a descent direction in each iteration with the Wolfe line search.

Lemma 2.3. Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.2, and let the step-size $\alpha_k$ be determined by the Wolfe line search (1.5) and (1.8). Then
$$g_k^T d_k < 0. \qquad (2.2)$$

Proof. The conclusion can be proved by induction. Since $g_1^T d_1 = -\|g_1\|^2$, the conclusion holds for $k = 1$. Now we assume that the conclusion is true for $k - 1$, $k \ge 2$. Then from (1.8), we have
$$d_{k-1}^T (g_k - g_{k-1}) \ge (\sigma - 1)\, g_{k-1}^T d_{k-1} > 0. \qquad (2.3)$$
If $g_k^T d_{k-1} \le 0$, then from (1.14) and (2.3), we have $\beta_k = \beta_k^{CD}$.
From (1.4), (1.11), and (1.13), we have
$$g_k^T d_k = -\left(1 - \frac{g_k^T d_{k-1}}{g_{k-1}^T d_{k-1}}\right)\|g_k\|^2 - \frac{\|g_k\|^2}{g_{k-1}^T d_{k-1}}\, g_k^T d_{k-1} = -\|g_k\|^2 < 0. \qquad (2.4)$$
If $g_k^T d_{k-1} > 0$, then from (1.14) and (2.3), we have $\beta_k = \beta_k^{CD} + \varphi_k \beta_k^{CD} = \beta_k^{DY}$.
From (1.11), (1.7), and (1.13), we have
$$g_k^T d_k = -\theta_k \|g_k\|^2 + \beta_k^{DY} g_k^T d_{k-1} = \beta_k^{DY}\left(-\theta_k d_{k-1}^T (g_k - g_{k-1}) + g_k^T d_{k-1}\right) = \beta_k^{DY}\left(g_{k-1}^T d_{k-1} + \frac{g_k^T d_{k-1}}{g_{k-1}^T d_{k-1}}\, d_{k-1}^T (g_k - g_{k-1})\right) \le \beta_k^{DY} g_{k-1}^T d_{k-1} < 0. \qquad (2.5)$$
From the above inequalities (2.4) and (2.5), we obtain that the conclusion holds for $k$.

The conclusion of the following lemma, often called the Zoutendijk condition, is used to prove the global convergence properties of nonlinear conjugate gradient methods. It was originally given by Zoutendijk [15].

Lemma 2.4 (see [15]). Suppose that Assumption 2.1 holds. Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.2, let the step-size $\alpha_k$ be determined by the Wolfe line search (1.5) and (1.8), and suppose that Lemma 2.3 holds. Then
$$\sum_{k \ge 1} \frac{\left(g_k^T d_k\right)^2}{\|d_k\|^2} < +\infty. \qquad (2.6)$$

Lemma 2.5. Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.2, let the step-size $\alpha_k$ be determined by the Wolfe line search (1.5) and (1.8), and suppose that Lemma 2.3 holds. Then
$$\beta_k \le \frac{g_k^T d_k}{g_{k-1}^T d_{k-1}}. \qquad (2.7)$$

Proof. If $g_k^T d_{k-1} \le 0$, then from Lemma 2.3, we have $\beta_k = \beta_k^{CD}$. From (1.11), (1.4), and (1.13), we have
$$g_k^T d_k = -\left(1 - \frac{g_k^T d_{k-1}}{g_{k-1}^T d_{k-1}}\right)\|g_k\|^2 + \beta_k^{CD} g_k^T d_{k-1} = \beta_k^{CD}\left(g_{k-1}^T d_{k-1} - g_k^T d_{k-1}\right) + \beta_k^{CD} g_k^T d_{k-1} = \beta_k^{CD} g_{k-1}^T d_{k-1}. \qquad (2.8)$$
From the above equation, we have
$$\beta_k^{CD} = \frac{g_k^T d_k}{g_{k-1}^T d_{k-1}}. \qquad (2.9)$$
If $g_k^T d_{k-1} > 0$, from (2.5), we have
$$\beta_k^{DY} \le \frac{g_k^T d_k}{g_{k-1}^T d_{k-1}}. \qquad (2.10)$$
From (1.12), (2.9), and (2.10), we obtain that the conclusion (2.7) holds.

3. Global Convergence Property

The following theorem proves the global convergence of the mixed spectral CD-DY conjugate gradient method with the Wolfe line search.

Theorem 3.1. Suppose that Assumption 2.1 holds. Let the sequences $\{g_k\}$ and $\{d_k\}$ be generated by Algorithm 2.2, let the step-size $\alpha_k$ be determined by the Wolfe line search (1.5) and (1.8), and suppose that Lemma 2.3 holds. Then
$$\liminf_{k \to +\infty} \|g_k\| = 0. \qquad (3.1)$$

Proof. Suppose by contradiction that there exists a positive constant $\rho > 0$ such that
$$\|g_k\| \ge \rho \qquad (3.2)$$
holds for all $k \ge 1$.

From (1.11), we have $d_k + \theta_k g_k = \beta_k d_{k-1}$, and by squaring both sides, we get
$$\|d_k\|^2 = \beta_k^2 \|d_{k-1}\|^2 - 2\theta_k g_k^T d_k - \theta_k^2 \|g_k\|^2. \qquad (3.3)$$
From the above equation and Lemma 2.5, we have
$$\|d_k\|^2 \le \left(\frac{g_k^T d_k}{g_{k-1}^T d_{k-1}}\right)^2 \|d_{k-1}\|^2 - 2\theta_k g_k^T d_k - \theta_k^2 \|g_k\|^2. \qquad (3.4)$$
Dividing the above inequality by $\left(g_k^T d_k\right)^2$, we have
$$\frac{\|d_k\|^2}{\left(g_k^T d_k\right)^2} \le \frac{\|d_{k-1}\|^2}{\left(g_{k-1}^T d_{k-1}\right)^2} - \frac{2\theta_k}{g_k^T d_k} - \frac{\theta_k^2 \|g_k\|^2}{\left(g_k^T d_k\right)^2} = \frac{\|d_{k-1}\|^2}{\left(g_{k-1}^T d_{k-1}\right)^2} - \left(\frac{\theta_k \|g_k\|}{g_k^T d_k} + \frac{1}{\|g_k\|}\right)^2 + \frac{1}{\|g_k\|^2} \le \frac{\|d_{k-1}\|^2}{\left(g_{k-1}^T d_{k-1}\right)^2} + \frac{1}{\|g_k\|^2}. \qquad (3.5)$$
Using (3.5) recursively and noting that $\|d_1\|^2 = -g_1^T d_1 = \|g_1\|^2$, we get
$$\frac{\|d_k\|^2}{\left(g_k^T d_k\right)^2} \le \sum_{i=1}^{k} \frac{1}{\|g_i\|^2}. \qquad (3.6)$$
Then from (3.2) and (3.6), we have
$$\frac{\left(g_k^T d_k\right)^2}{\|d_k\|^2} \ge \frac{\rho^2}{k}, \qquad (3.7)$$
which indicates
$$\sum_{k \ge 1} \frac{\left(g_k^T d_k\right)^2}{\|d_k\|^2} = +\infty. \qquad (3.8)$$
The above equation contradicts the conclusion of Lemma 2.4. Therefore, the conclusion (3.1) holds.

4. Numerical Experiments

In this section, we report some numerical results. We used MATLAB 7.0 to test some problems from [16] and to compare the performance of the mixed spectral CD-DY method (Algorithm 2.2) with that of the CD method, the DY method, and the SFR method. The global convergence of the CD method has still not been proved under the Wolfe line search, so our line search subroutine computes $\alpha_k$ such that the strong Wolfe conditions hold with $\delta = 0.01$ and $\sigma = 0.1$. We use the condition $\|g_k\| \le 10^{-6}$ or It-max $> 9999$ as the stopping criterion, where It-max denotes the maximal number of iterations.

The numerical results of our tests are reported in Table 1. The first column "Problem" gives the name of the tested problem from [16], and "Dim" denotes its dimension. The detailed numerical results are listed in the form NI/NF/NG, where NI, NF, and NG denote the numbers of iterations, function evaluations, and gradient evaluations, respectively. If the limit of 9999 function evaluations was exceeded, the run was stopped; this is indicated by "—".

In order to rank the average performance of all the above methods, one can compute the total number of function and gradient evaluations by the formula
$$N_{\mathrm{total}} = \mathrm{NF} + l \cdot \mathrm{NG}, \qquad (4.1)$$
where $l$ is some integer. According to the results on automatic differentiation [17, 18], the value of $l$ can be set to 5:
$$N_{\mathrm{total}} = \mathrm{NF} + 5\,\mathrm{NG}. \qquad (4.2)$$
That is to say, one gradient evaluation is equivalent to five function evaluations if automatic differentiation is used.

By making use of (4.2), we compare the mixed spectral CD-DY method with the other methods as follows: for the $i$th problem, compute the total numbers of function and gradient evaluations required by the CD method, the DY method, the SFR method, and the mixed spectral CD-DY method by formula (4.2), and denote them by $N_{\mathrm{total},i}(\mathrm{CD})$, $N_{\mathrm{total},i}(\mathrm{DY})$, $N_{\mathrm{total},i}(\mathrm{SFR})$, and $N_{\mathrm{total},i}(\mathrm{CD\text{-}DY})$; then calculate the ratios
$$\gamma_i(\mathrm{CD}) = \frac{N_{\mathrm{total},i}(\mathrm{CD})}{N_{\mathrm{total},i}(\mathrm{CD\text{-}DY})}, \qquad \gamma_i(\mathrm{DY}) = \frac{N_{\mathrm{total},i}(\mathrm{DY})}{N_{\mathrm{total},i}(\mathrm{CD\text{-}DY})}, \qquad \gamma_i(\mathrm{SFR}) = \frac{N_{\mathrm{total},i}(\mathrm{SFR})}{N_{\mathrm{total},i}(\mathrm{CD\text{-}DY})}. \qquad (4.3)$$

From Table 1, we know that some problems cannot be solved by some of the methods. So, if the $i_0$th problem cannot be solved by a given method, we use the constant $\tau = \max\{\gamma_i(\text{the given method}) : i \in S_1\}$ in place of $\gamma_{i_0}(\text{the given method})$, where $S_1$ denotes the set of test problems that can be solved by the given method.

Table 1: The performance of the CD method, DY method, CD-DY method, and SFR method.

The geometric means of these ratios for the CD method, the DY method, and the SFR method over all the test problems are defined by
$$\gamma(\mathrm{CD}) = \left(\prod_{i \in S} \gamma_i(\mathrm{CD})\right)^{1/|S|}, \qquad \gamma(\mathrm{DY}) = \left(\prod_{i \in S} \gamma_i(\mathrm{DY})\right)^{1/|S|}, \qquad \gamma(\mathrm{SFR}) = \left(\prod_{i \in S} \gamma_i(\mathrm{SFR})\right)^{1/|S|}, \qquad (4.4)$$
where $S$ denotes the set of the test problems and $|S|$ denotes the number of elements in $S$. One advantage of this rule is that the comparison is relative and hence is not dominated by a few problems for which the method requires a great number of function and gradient evaluations.
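The ranking rule (4.1)-(4.4), including the substitution of the constant $\tau$ for problems a method fails to solve, can be reproduced with a short script. The Python sketch below uses made-up placeholder counts, not the paper's data.

import numpy as np

def n_total(nf, ng, l=5):
    # N_total = NF + l * NG with l = 5, formulas (4.1)-(4.2)
    return nf + l * ng

def relative_efficiency(n_method, n_cddy):
    """Ratios gamma_i = N_total,i(method) / N_total,i(CD-DY) as in (4.3); failed runs
    (None) are replaced by the worst observed ratio tau; returns the geometric mean (4.4)."""
    ratios = [nm / nc if nm is not None else None for nm, nc in zip(n_method, n_cddy)]
    tau = max(r for r in ratios if r is not None)   # substitution rule for unsolved problems
    ratios = [tau if r is None else r for r in ratios]
    return float(np.exp(np.mean(np.log(ratios))))

# Hypothetical counts for three test problems (None marks a failed run):
n_cd = [n_total(120, 60), None, n_total(300, 150)]
n_cddy = [n_total(90, 45), n_total(200, 100), n_total(250, 120)]
print(relative_efficiency(n_cd, n_cddy))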

According to the above rule, it is clear that $\gamma(\mathrm{CD\text{-}DY}) = 1$. From Table 2, we can see that the average performance of the mixed spectral CD-DY conjugate gradient method (Algorithm 2.2) is the best. So, the mixed spectral CD-DY conjugate gradient method has some practical value.

Table 2: Relative efficiency of the CD, DY, SFR, and the mixed spectral CD-DY methods.

Acknowledgments

The authors wish to express their heartfelt thanks to the referees and the editor for their detailed and helpful suggestions for revising the paper. This work was supported by The Nature Science Foundation of Chongqing Education Committee (KJ091104).

References

  1. G. H. Yu, Nonlinear self-scaling conjugate gradient methods for large-scale optimization problems, Doctoral thesis, Sun Yat-Sen University, Guangzhou, China, 2007.
  2. G. Yuan and Z. Wei, “New line search methods for unconstrained optimization,” Journal of the Korean Statistical Society, vol. 38, no. 1, pp. 29–39, 2009.
  3. R. Fletcher, Practical Methods of Optimization. Volume 1: Unconstrained Optimization, John Wiley & Sons, New York, NY, USA, 2nd edition, 1987.
  4. Y. H. Dai and Y. Yuan, “A nonlinear conjugate gradient method with a strong global convergence property,” SIAM Journal on Optimization, vol. 10, no. 1, pp. 177–182, 1999.
  5. B. Qu, G. F. Hu, and X. C. Zhang, “A global convergence result for the conjugate descent method,” Journal of Dalian University of Technology, vol. 42, no. 1, pp. 13–16, 2002.
  6. X. W. Du, L. Q. Ye, and C. X. Xu, “Global convergence of a class of unconstrained optimal methods include the conjugate descent method,” Journal of Engineering Mathematics, vol. 18, no. 2, pp. 120–122, 2001.
  7. C. Y. Pan and L. P. Chen, “A class of efficient new descent methods,” Acta Mathematicae Applicatae Sinica, vol. 30, no. 1, pp. 88–98, 2007.
  8. Y. Dai and Y. Yuan, “Convergence properties of the conjugate descent method,” Advances in Mathematics, vol. 25, no. 6, pp. 552–562, 1996.
  9. Y. H. Dai, “New properties of a nonlinear conjugate gradient method,” Numerische Mathematik, vol. 89, no. 1, pp. 83–98, 2001.
  10. Y. H. Dai and Y. Yuan, “A class of globally convergent conjugate gradient methods,” Research Report ICM-98-030, Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, Beijing, China, 1998.
  11. Y.-H. Dai, “Convergence of nonlinear conjugate gradient methods,” Journal of Computational Mathematics, vol. 19, no. 5, pp. 539–548, 2001.
  12. E. G. Birgin and J. M. Martinez, “A spectral conjugate gradient method for unconstrained optimization,” Applied Mathematics and Computation, vol. 180, pp. 46–52, 2006.
  13. L. Zhang, W. Zhou, and D. Li, “Global convergence of a modified Fletcher-Reeves conjugate gradient method with Armijo-type line search,” Numerische Mathematik, vol. 104, no. 4, pp. 561–572, 2006.
  14. S.-Q. Du and Y.-Y. Chen, “Global convergence of a modified spectral FR conjugate gradient method,” Applied Mathematics and Computation, vol. 202, no. 2, pp. 766–770, 2008.
  15. G. Zoutendijk, “Nonlinear programming, computational methods,” in Integer and Nonlinear Programming, J. Abadie, Ed., pp. 37–86, North-Holland, 1970.
  16. J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software, vol. 7, no. 1, pp. 17–41, 1981.
  17. Y.-H. Dai and Q. Ni, “Testing different conjugate gradient methods for large-scale unconstrained optimization,” Journal of Computational Mathematics, vol. 21, no. 3, pp. 311–320, 2003.
  18. A. Griewank, “On automatic differentiation,” in Mathematical Programming: Recent Developments and Applications, M. Iri and K. Tanabe, Eds., pp. 84–108, Kluwer Academic Publishers, 1989.