Abstract

Two modified three-term conjugate gradient algorithms that satisfy both the descent condition and the Dai-Liao type conjugacy condition are presented for unconstrained optimization. The first algorithm modifies the Hager-Zhang type algorithm so that the search direction is a descent direction and satisfies the Dai-Liao type conjugacy condition. The second is a simple three-term conjugate gradient method that generates sufficient descent directions at every iteration; moreover, this property is independent of the line search used to compute the steplength. The algorithms can also be viewed as modifications of the MBFGS method, but with a different choice of the modified vector. Under some mild conditions, the given methods are globally convergent for general functions under the Wolfe line search. Numerical experiments show that the proposed methods are robust and efficient.

1. Introduction

We consider the following unconstrained optimization problem:
$$\min_{x \in \mathbb{R}^{n}} f(x),$$
where $f : \mathbb{R}^{n} \to \mathbb{R}$ is a continuously differentiable function whose gradient is $\nabla f(x)$; $\nabla f(x_k)$ is denoted by $g_k$.

The conjugate gradient method is very efficient for large-scale optimization problems. In general, the method generates a sequence of iterates
$$x_{k+1} = x_k + \alpha_k d_k, \qquad k = 0, 1, 2, \ldots,$$
where the step size $\alpha_k$ is obtained by some line search rule. The line search in conjugate gradient algorithms is often based on the following Wolfe conditions:
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^{T} d_k, \qquad g(x_k + \alpha_k d_k)^{T} d_k \ge \sigma g_k^{T} d_k,$$
where $0 < \delta < \sigma < 1$. The direction $d_k$ is defined by
$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \beta_k d_k,$$
where $\beta_k$ is the conjugate gradient parameter.
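For readers who prefer code, the following Python sketch implements one generic iteration scheme of the form above; the objective `f`, gradient `grad`, parameter rule `beta_rule`, and the use of SciPy's strong Wolfe line search are illustrative choices of ours, not part of the paper.

```python
import numpy as np
from scipy.optimize import line_search  # strong Wolfe line search

def cg_minimize(f, grad, x0, beta_rule, tol=1e-6, max_iter=1000):
    """Generic nonlinear CG loop: x_{k+1} = x_k + alpha_k d_k and
    d_{k+1} = -g_{k+1} + beta_k d_k, where beta_rule(g_new, g, d)
    returns any of the classical parameters (FR, PRP, HS, ...)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        alpha = line_search(f, grad, x, d, gfk=g)[0]  # Wolfe step size
        if alpha is None:        # line search failed; stop the sketch here
            break
        x_new = x + alpha * d
        g_new = grad(x_new)
        d = -g_new + beta_rule(g_new, g, d) * d
        x, g = x_new, g_new
    return x
```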

Well-known conjugate gradient methods include the Fletcher-Reeves (FR), Hestenes-Stiefel (HS), Polak-Ribiere-Polyak (PRP), conjugate descent (CD), Liu-Storey (LS), and Dai-Yuan (DY) methods [28]. These methods differ only in the choice of the parameter $\beta_k$, which is specified as follows:
$$\beta_k^{FR} = \frac{\|g_{k+1}\|^{2}}{\|g_k\|^{2}}, \quad \beta_k^{HS} = \frac{g_{k+1}^{T} y_k}{d_k^{T} y_k}, \quad \beta_k^{PRP} = \frac{g_{k+1}^{T} y_k}{\|g_k\|^{2}}, \quad \beta_k^{CD} = -\frac{\|g_{k+1}\|^{2}}{g_k^{T} d_k}, \quad \beta_k^{LS} = -\frac{g_{k+1}^{T} y_k}{g_k^{T} d_k}, \quad \beta_k^{DY} = \frac{\|g_{k+1}\|^{2}}{d_k^{T} y_k},$$
where $y_k = g_{k+1} - g_k$ and $s_k = x_{k+1} - x_k$. Throughout this paper, $\|\cdot\|$ denotes the Euclidean norm.
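The parameter formulas above translate directly into code; the helper functions below (names are ours) can be plugged into the `cg_minimize` sketch given earlier.

```python
import numpy as np

# Classical CG parameters; g_new = g_{k+1}, g = g_k, d = d_k, y = g_{k+1} - g_k.
def beta_fr(g_new, g, d):
    return (g_new @ g_new) / (g @ g)

def beta_hs(g_new, g, d):
    y = g_new - g
    return (g_new @ y) / (d @ y)

def beta_prp(g_new, g, d):
    y = g_new - g
    return (g_new @ y) / (g @ g)

def beta_cd(g_new, g, d):
    return -(g_new @ g_new) / (g @ d)

def beta_ls(g_new, g, d):
    y = g_new - g
    return -(g_new @ y) / (g @ d)

def beta_dy(g_new, g, d):
    y = g_new - g
    return (g_new @ g_new) / (d @ y)
```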

2. Motivation

In this paper, the motivation is as follows. First, we review some modified three-term conjugate gradient methods. One of the first three-term conjugate gradient methods was proposed by Beale [9] as
$$d_{k+1} = -g_{k+1} + \beta_k d_k + \gamma_k d_r,$$
where $\beta_k$ and $\gamma_k$ are scalars and $d_r$ is a restart direction. McGuire and Wolfe [10] and Powell [11] studied Beale's three-term conjugate gradient method further. Dai and Yuan [12] studied the general three-term conjugate gradient method of the same form, where $r$ is the index of the most recent restart iteration, and showed that under some mild conditions the algorithm is globally convergent. As is well known, Dai and Liao [13] extended the classical conjugacy condition $d_{k+1}^{T} y_k = 0$ by suggesting the following one:
$$d_{k+1}^{T} y_k = -t\, g_{k+1}^{T} s_k,$$
where $t \ge 0$ is a scalar. Recently, Andrei [14, 15] developed two simple three-term conjugate gradient methods for unconstrained optimization problems. In [14], the search direction of the three-term conjugate gradient algorithm satisfied the descent condition and Dai-Liao's type conjugacy condition at every step; the direction was computed as a combination of $-g_{k+1}$, $s_k$, and $y_k$ with two scalar parameters. Similarly, Andrei [15] presented another projection three-term conjugate gradient algorithm. The search directions of this class have three terms and are computed as modifications of the classical conjugate gradient directions; they also satisfy both the descent property and Dai-Liao's type conjugacy condition. Yao and Qin [16] proposed a hybrid of the DL and WYL conjugate gradient methods. The given method [17] possesses the sufficient descent condition under the Wolfe-Powell line search and is globally convergent for general functions. Thereafter, they proposed a new conjugacy condition and a corresponding nonlinear conjugate gradient method; the given method [17] is globally convergent under the strong Wolfe-Powell line search. The methods above, which satisfy a Dai-Liao type conjugacy condition, were obtained by modifying the HS method or the Hager-Zhang method [18].
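As a small illustration of the two properties that recur throughout this discussion, the following diagnostic (our own helper, not from the paper) numerically checks whether a candidate direction is a descent direction and whether it satisfies the Dai-Liao type conjugacy condition for a given scalar t.

```python
import numpy as np

def check_direction(d_new, g_new, y, s, t, rtol=1e-10):
    """Return (is_descent, satisfies_dai_liao) for a candidate d_{k+1}:
    descent means g_{k+1}^T d_{k+1} < 0, and the Dai-Liao type condition
    is d_{k+1}^T y_k = -t * g_{k+1}^T s_k."""
    is_descent = float(g_new @ d_new) < 0.0
    residual = d_new @ y + t * (g_new @ s)
    scale = max(1.0, abs(d_new @ y))
    return is_descent, abs(residual) <= rtol * scale
```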

In this paper, we present two modified simple three-term conjugate gradient methods, which are obtained from a modified BFGS (MBFGS) updating scheme for the inverse approximation of the Hessian of $f$, restarted as the identity matrix at every step. First, in order to explain the idea of this paper, it is necessary to recall the MBFGS method [1]. If the objective function $f$ is nonconvex, the classical Newton direction may not be a descent direction of $f$ at $x_k$, since the Hessian matrix $\nabla^{2} f(x_k)$ is not necessarily positive definite. To overcome this drawback, Li and Fukushima [1] generated a direction from
$$\bigl(\nabla^{2} f(x_k) + E_k I\bigr) d_k = -g_k,$$
where $I$ is the unit matrix and the positive constant $E_k$ is chosen so that $\nabla^{2} f(x_k) + E_k I$ is a positive definite matrix. To obtain global convergence, the sequence $\{E_k\}$ must be bounded above; for superlinear convergence, the positive constants $E_k$ should tend to zero as $k \to \infty$; and for quadratic convergence, they should be of the order of $\|g_k\|$ for all $k$. In other words, the global and superlinear convergence of the MBFGS method depends on the choice of $E_k$, so it is important to select $E_k$ appropriately so that it is practicable and satisfies the above conditions; choosing the optimal $E_k$ is, however, difficult. The MBFGS method itself is a modified quasi-Newton method for nonconvex unconstrained optimization problems; to introduce our method, we briefly recall it. The direction $d_k$ in the MBFGS method is given by
$$B_k d_k = -g_k,$$
where $B_k$ is obtained by the MBFGS formula, a BFGS update in which the vector $y_k$ is replaced by a modified vector $\bar y_k$; this vector is computed in two different cases, each involving $s_k$, $y_k$, $g_k$, and a positive constant.
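The following sketch shows the structure of a BFGS-type update in which $y_k$ is replaced by a modified vector; the particular modification used here (adding a multiple of $s_k$ that depends on $\|g_k\|$) is only an illustration in the spirit of [1], not the exact MBFGS rule.

```python
import numpy as np

def mbfgs_like_update(B, s, y, g, C=1e-2):
    """One BFGS-type update of the Hessian approximation B with y replaced
    by a modified vector y_bar.  The choice below forces y_bar^T s > 0 even
    for nonconvex f; the exact rule of [1] differs in its details."""
    r = C * np.linalg.norm(g) + max(0.0, -(y @ s) / (s @ s))
    y_bar = y + r * s                 # now y_bar^T s >= C*||g||*||s||^2 > 0
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y_bar, y_bar) / (y_bar @ s)
```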

The MBFGS updating of the inverse approximation $H_k$ of the Hessian of $f$ is
$$H_{k+1} = \Bigl(I - \frac{s_k \bar y_k^{T}}{\bar y_k^{T} s_k}\Bigr) H_k \Bigl(I - \frac{\bar y_k s_k^{T}}{\bar y_k^{T} s_k}\Bigr) + \frac{s_k s_k^{T}}{\bar y_k^{T} s_k}.$$

In order to accomplish our objective, we make a small modification of the inverse MBFGS matrix, namely, we restart it as the identity matrix, $H_k = I$; then (16) keeps the same form but with a different modified vector $\bar y_k$. In this paper, we compute $\bar y_k$ from the current iterates in a new way (the precise formula is given with the algorithm in the next section). This choice does not involve the constants of the MBFGS method, so we need not choose those parameters; furthermore, $\bar y_k$ carries more information, for instance, the gradient of $f$ at $x_{k+1}$, that is, $g_{k+1}$, together with certain constants. The most important property is that the resulting direction satisfies the Dai-Liao type condition
$$d_{k+1}^{T} y_k = -t_k\, g_{k+1}^{T} s_k,$$
where $t_k > 0$. Moreover, $\bar y_k$ satisfies a further inequality, which differs from the one that Li and Fukushima [1] proved for the MBFGS vector; the purpose of this paper is to overcome these drawbacks. Observe that the direction $d_{k+1} = -H_{k+1} g_{k+1}$ with $H_k = I$ can then be written as a sum of three terms; therefore, the three-term conjugate gradient algorithm is given by (2), where the direction $d_{k+1}$ is a combination of $-g_{k+1}$, $s_k$, and $\bar y_k$ whose coefficients are specified in (25) below.
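To make the three-term structure explicit, the display below carries out the standard memoryless BFGS expansion: set $H_k = I$ in a BFGS-type inverse update built from a (generic) modified vector $\bar y_k$ and apply it to $-g_{k+1}$. This is only meant to exhibit the form of the direction; the coefficients of the paper's own formula (25) may differ.
$$\begin{aligned} H_{k+1} &= \Bigl(I - \frac{s_k \bar y_k^{T}}{\bar y_k^{T} s_k}\Bigr)\Bigl(I - \frac{\bar y_k s_k^{T}}{\bar y_k^{T} s_k}\Bigr) + \frac{s_k s_k^{T}}{\bar y_k^{T} s_k}, \\ d_{k+1} &= -H_{k+1} g_{k+1} = -g_{k+1} + \frac{\bar y_k^{T} g_{k+1}}{\bar y_k^{T} s_k}\, s_k + \frac{s_k^{T} g_{k+1}}{\bar y_k^{T} s_k}\, \bar y_k - \Bigl(1 + \frac{\|\bar y_k\|^{2}}{\bar y_k^{T} s_k}\Bigr) \frac{s_k^{T} g_{k+1}}{\bar y_k^{T} s_k}\, s_k. \end{aligned}$$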

We organize the paper as follows. In the next section, we describe the first three-term conjugate gradient method and its global convergence. In Section 4, we discuss another modified sufficient descent three-term conjugate gradient method for unconstrained optimization problems and its global convergence under the Wolfe line search. In Section 5, some numerical results are given. Conclusions and future work are presented in the last section.

3. Three-Term Conjugate Gradient Method and Its Global Convergence

In this section, we will introduce the three-term type conjugate gradient method.

Algorithm 1.
Step 1 (initialization and data). Choose an initial point $x_0$ and the required constants; set $d_0 = -g_0$ and $k = 0$.
Step 2. Test a criterion for stopping the iterations. If the test criterion is satisfied, then STOP! Otherwise, continue with Step 3.
Step 3. The direction $d_{k+1}$ is computed by formula (25), in which the remaining quantities are constants.
Step 4. Perform the Wolfe line search to find a stepsize $\alpha_k$ satisfying (28), where the constants satisfy $0 < \delta < \sigma < 1$.
Step 5. Update the variables: $x_{k+1} = x_k + \alpha_k d_k$, set $k := k + 1$, and then go to Step 2.
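A schematic implementation of Algorithm 1 could look as follows; since the concrete direction formula (25) is the paper's own, it is left here as a user-supplied callable `direction_25`, and SciPy's strong Wolfe search stands in for Step 4.

```python
import numpy as np
from scipy.optimize import line_search

def algorithm1(f, grad, x0, direction_25, tol=1e-6, max_iter=5000):
    """Skeleton of Algorithm 1.  direction_25(g_new, g, s, y) must return
    the three-term direction of (25); it is not reproduced here."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                      # Step 1
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:            # Step 2: stopping test
            break
        alpha = line_search(f, grad, x, d, gfk=g)[0]  # Step 4: Wolfe search
        if alpha is None:
            break
        x_new = x + alpha * d                   # Step 5: update the iterate
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        d = direction_25(g_new, g, s, y)        # Step 3: next direction
        x, g = x_new, g_new
    return x
```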

3.1. Global Convergence

Throughout this section, we assume that (H1) the level set $\Omega = \{x \in \mathbb{R}^{n} : f(x) \le f(x_0)\}$ is bounded and (H2) in a neighborhood $N$ of $\Omega$ the function $f$ is continuously differentiable and its gradient is Lipschitz continuous; namely, there exists a constant $L > 0$ such that
$$\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|, \qquad \forall x, y \in N.$$
Since $\{f(x_k)\}$ is decreasing, it is clear that the sequence $\{x_k\}$ generated by Algorithm 1 is contained in $\Omega$. In addition, we can obtain that there is a constant $\gamma > 0$ such that $\|g(x)\| \le \gamma$ for all $x \in \Omega$.

Lemma 2. Let assumptions (H1) and (H2) hold and let the line search satisfy the Wolfe conditions (28). Then one has

Proof. By direct computation,

Lemma 3. Let assumptions (H1) and (H2) hold, let the line search satisfy the Wolfe conditions (28), and let the search direction be computed by (25). Then one has

Proof. We will divide (25) into two cases as follows.
Case 1. If , then Lemma 3 is true.
Case 2. If , combined with Lemma 2, then we have The proof of Lemma 3 is completed.

Lemma 4. Let assumptions (H1) and (H2) hold, let the line search satisfy the Wolfe conditions (28), and let the search direction be computed by (25). Then the Dai-Liao type conjugacy condition
$$d_{k+1}^{T} y_k = -t_k\, g_{k+1}^{T} s_k$$
holds, where $t_k > 0$.

Proof. By direct computation, we get where since .

Remark 5. Observe that the direction $d_{k+1}$ satisfies the descent property. Moreover, it satisfies the Dai-Liao type conjugacy condition with $t_k > 0$ at every iteration.

Theorem 6. Let assumptions (H1) and (H2) hold, let the line search satisfy the Wolfe conditions (28), and let the search direction be computed by (25). Then one has

Proof. By direct computation, we have where . Consider where . Consider where . The proof is completed.

Theorem 7. Let assumptions (H1) and (H2) hold, let the line search satisfy the Wolfe conditions (28), and let the search direction be computed by (25). Then one has

Proof. According to Algorithm 1, we will divide (28) into two cases as follows.
Case 1. If , we get , and then where .
Case 2. If , we have and then the desired estimate follows. This completes the proof.

Theorem 8. Let assumption (H1) hold and let $\{x_k\}$ and $\{d_k\}$ be generated by the three-term conjugate gradient method (Algorithm 1). If the stepsize $\alpha_k$ is obtained by the Wolfe line search, then one has
$$\liminf_{k \to \infty} \|g_k\| = 0.$$

Proof. The result follows from Theorem 7 by the same argument as for Theorem 4.1 in [19]; we omit the details here.

4. Another Modified Three-Term Type Conjugate Gradient Method and Its Global Convergence

Recently, Zhang et al. proposed a sufficient descent modified PRP conjugate gradient method with three terms [20],
$$d_{k+1} = -g_{k+1} + \beta_k^{PRP} d_k - \frac{g_{k+1}^{T} d_k}{\|g_k\|^{2}}\, y_k,$$
and a sufficient descent modified HS conjugate gradient method with three terms [21],
$$d_{k+1} = -g_{k+1} + \beta_k^{HS} d_k - \frac{g_{k+1}^{T} d_k}{d_k^{T} y_k}\, y_k.$$
A property of these methods is that they produce sufficient descent directions; that is,
$$g_k^{T} d_k = -\|g_k\|^{2}.$$
In the same context, Sun et al. proposed another sufficient descent conjugate gradient method [22, 23], whose direction has a similar three-term form with suitably chosen scalar coefficients.
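As a quick numerical sanity check of the sufficient descent identity $g_{k+1}^{T} d_{k+1} = -\|g_{k+1}\|^{2}$, the snippet below evaluates it for the modified PRP three-term direction written above on random data; variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
g_new, g, d = rng.normal(size=8), rng.normal(size=8), rng.normal(size=8)
y = g_new - g

# Modified PRP three-term direction (as written above)
beta_prp = (g_new @ y) / (g @ g)
theta = (g_new @ d) / (g @ g)
d_new = -g_new + beta_prp * d - theta * y

# Sufficient descent identity: the two printed numbers coincide
print(g_new @ d_new, -(g_new @ g_new))
```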

Similar to the methods of Zhang et al. and Sun et al., in order to obtain the sufficient descent property, we propose a modified three-term type conjugate gradient method; its direction is given by (54), in which the scalar coefficients are defined accordingly.

Algorithm 9 (a sufficient descent three-term conjugate gradient method).
Step 1 (initialization and data). The same as Step 1 of Algorithm 1.
Step 2 (termination test and direction computation). If the stopping criterion is satisfied, stop; otherwise, compute the direction $d_{k+1}$ by (54).
Step 3. Perform the Wolfe line search to find a stepsize $\alpha_k$.
Step 4. Update the variables, $x_{k+1} = x_k + \alpha_k d_k$, set $k := k + 1$, and then go to Step 2.

Lemma 10. Suppose that $x_0$ is a starting point for which assumptions (H1) and (H2) hold, and let $\{d_k\}$ be generated by Algorithm 9. Then the directions satisfy the sufficient descent property; that is, one has

Proof. According to Algorithm 9, we will divide (54) into two cases as follows.
Case 1. If , then Lemma 10 is true.
Case 2. If , we again obtain the required bound, so Lemma 10 is true. Note that the sufficient descent property is independent of the line search.

Lemma 11. Suppose that $x_0$ is a starting point for which assumptions (H1) and (H2) hold, and let $\{x_k\}$ be generated by Algorithm 9. In addition, assume that there exist constants for which the stated bounds hold; then one has

Proof. By direct computation, we get where

Theorem 12. Let assumption (H1) hold and let $\{x_k\}$ and $\{d_k\}$ be generated by the three-term conjugate gradient Algorithm 9. If the stepsize $\alpha_k$ is obtained by the Wolfe line search, then one has
$$\liminf_{k \to \infty} \|g_k\| = 0.$$

Proof. The result follows from Lemmas 10 and 11 by the same argument as for Theorem 4.1 in [19]; we omit the details here.

5. Numerical Experiments

We now report some numerical results obtained with our sufficient descent conjugate gradient methods. We compare the performance of Algorithms 1 and 9 with that of the MBFGS method. The algorithms are implemented in MATLAB 7.0 in double precision arithmetic, and the tests are performed on a PC with a Pentium 4 CPU (2.40 GHz) running the Windows XP operating system.

On the one hand, the type of objective function and the characteristics of the problems tested are listed in Tables 1, 2, and 3. In the experiments, to make comparison with other codes easy, we use the gradient norm to measure the quality of the solutions; that is, we force the iteration to stop when $\|g(x_k)\| \le \epsilon$, where $g$ is the gradient of the objective function and $\epsilon$ is a prescribed tolerance. In the tables, the columns report the number of dimensions, the number of the test function, the number of function evaluations (Nf), the number of gradient evaluations (Ng), the actual CPU time consumed in the run, and the approximate solutions accepted within the allowed error range; the problems are from [24]. We compare Algorithm 1, Algorithm 9, and the MBFGS method, and the numerical reports and comparison results are listed in Tables 1-3. The test problems with the given initial points can be found at http://camo.ici.ro/neculai/ansoft.htm, collected by Neculai Andrei; the problems of [22] are also solved by our method. Table 1 shows the numerical results of the three-term conjugate gradient Algorithm 1. In Table 2, obtained with Algorithm 9, these problems have better solutions. Table 3 shows the numerical results of the MBFGS method. In Table 2, the CPU time is less than 130 seconds for every problem and less than 70 seconds for most of them. In Table 1, Penalty function II and Power singular require more than 110 iterations, whereas in Table 2 they are solved in fewer than 86 iterations and in less than 72 seconds. Table 3 shows the performance of the MBFGS method with respect to CPU time, the number of iterations, the number of function evaluations, and the number of gradient evaluations, respectively. Comparing Tables 1-3, the MBFGS method (Table 3) performs better than Algorithm 1 (Table 1) but worse than Algorithm 9 (Table 2) with respect to the number of iterations and CPU time. Tables 1 and 2 also show that the three-term conjugate gradient methods perform well with respect to the number of iterations and gradient evaluations. From Tables 1, 2, and 3, we conclude that Algorithm 9 is better than Algorithm 1 and the MBFGS method; that is, the sufficient descent direction is most important for unconstrained optimization.

On the other hand, some of the test problems are from the CUTE collection established by Bongartz, Conn, Gould, and Toint. In Figure 1, we adopt the performance profiles proposed by Dolan and Moré [25] to compare the CPU time of Algorithm 1, Algorithm 9, the MBFGS method, Dai's method, and the MPRP method. That is, for each method, we plot the fraction of problems for which the method is within a given factor of the best time. The left side of the figure gives the percentage of the test problems for which a method is the fastest; the right side gives the percentage of the test problems that are successfully solved by each method. The top curve corresponds to the method that solved the most problems in a time within a given factor of the best time. From Figure 1, we can see that Algorithm 1 and Algorithm 9 perform better than the MBFGS method, Dai's method, and the MPRP method of [1, 19, 20]. Hence, the proposed methods not only possess global convergence but are also superior to the methods of [1, 19, 20] in numerical performance.
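For readers who wish to reproduce such a comparison, a minimal Dolan-Moré performance profile can be computed as follows; the timing matrix, solver names, and plotting details are placeholders of ours.

```python
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(times, labels, tau_max=10.0):
    """Dolan-More performance profile.  `times` is an (n_problems, n_solvers)
    array of CPU times with np.inf marking failures; rho_s(tau) is the
    fraction of problems a solver finishes within a factor tau of the best."""
    ratios = times / np.min(times, axis=1, keepdims=True)
    taus = np.linspace(1.0, tau_max, 200)
    for j, name in enumerate(labels):
        rho = [np.mean(ratios[:, j] <= tau) for tau in taus]
        plt.step(taus, rho, where="post", label=name)
    plt.xlabel("tau"); plt.ylabel("fraction of problems solved"); plt.legend()
    plt.show()

# Example call with made-up timings (rows: problems, columns: solvers):
# performance_profile(np.array([[1.0, 1.2, 2.0], [3.0, np.inf, 2.5]]),
#                     ["Algorithm 1", "Algorithm 9", "MBFGS"])
```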

6. Conclusions

In this paper, on the one hand, we have improved a three-term conjugate gradient method obtained from the MBFGS update. On the other hand, we have presented another sufficient descent three-term conjugate gradient method. In addition, under appropriate conditions, we have shown that the two methods are globally convergent. Finally, numerical experiments demonstrate the efficiency of the proposed methods.

In the future, we should further investigate more useful, powerful, and practical algorithms for solving large-scale unconstrained optimization problems, for instance, hybrid conjugate gradient-GA and conjugate gradient-PSO methods.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors wish to thank the reviewers for their constructive and pertinent suggestions for improving the presentation of the work. The authors would also like to thank Dr. Zhifeng Dai, Boshi Tian, and Xiaoliang Dong for several helpful discussions regarding the numerical experiments. This work is supported by the National High Technology Research and Development Program of China (863 Program), Grant no. 2006AA04Z251, the National Natural Fund Project, Grant no. 60974067, and the Funds of Jilin Province Science and Technology, Grant nos. 2013577, 2013267, and 2013287.