Abstract

The conjugate gradient method is very effective for solving large-scale unconstrained optimization problems. In this paper, on the basis of the conjugate parameter of the conjugate descent (CD) method and the second inequality of the strong Wolfe line search, two new conjugate parameters are devised. Using the strong Wolfe line search to obtain the step lengths, two modified conjugate gradient methods are proposed for general unconstrained optimization. Under standard assumptions, the two presented methods are proved to possess the sufficient descent property and to be globally convergent. Finally, preliminary numerical results are reported, showing that the proposed methods are promising.

1. Introduction

The conjugate gradient method (CGM for short) plays an important role in obtaining numerical solutions of optimal control problems for nonlinear dynamic systems and other mathematical models [1, 2]. In this paper, we study nonlinear CGMs for the following unconstrained optimization problem:

$$\min_{x \in \mathbb{R}^{n}} f(x), \qquad (1)$$

where $f : \mathbb{R}^{n} \to \mathbb{R}$ is smooth and its gradient $g(x) := \nabla f(x)$ is available.

The iterates of the classic CGMs for solving problem (1) are generated by $x_{k+1} = x_k + \alpha_k d_k$, $k = 0, 1, 2, \ldots$. First, the step length $\alpha_k$ is usually yielded by a suitable inexact line search along the search direction $d_k$, such as the Wolfe line search

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^{T} d_k, \quad g(x_k + \alpha_k d_k)^{T} d_k \ge \sigma g_k^{T} d_k, \qquad (2)$$

or the strong Wolfe line search

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^{T} d_k, \quad |g(x_k + \alpha_k d_k)^{T} d_k| \le -\sigma g_k^{T} d_k, \qquad (3)$$

where the parameters $\delta$ and $\sigma$ satisfy $0 < \delta < \sigma < 1$. Second, the search direction $d_k$ is computed by

$$d_k = -g_k + \beta_k d_{k-1} \ \text{for} \ k \ge 1, \quad d_0 = -g_0, \qquad (4)$$

where $g_k = g(x_k)$ and $\beta_k$ is the conjugate parameter. Different choices of the scalar $\beta_k$ lead to different CGMs. In particular, the well-known formulas for $\beta_k$ include the Hestenes–Stiefel (HS) [3], Fletcher–Reeves (FR) [4], Polak–Ribière (PRP) [5], conjugate descent (CD) [6], Liu–Storey (LS) [7], and Dai–Yuan (DY) [8] formulas, which are specified as follows:

$$\beta_k^{HS} = \frac{g_k^{T} y_{k-1}}{d_{k-1}^{T} y_{k-1}}, \quad \beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{PRP} = \frac{g_k^{T} y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{CD} = \frac{\|g_k\|^2}{-g_{k-1}^{T} d_{k-1}}, \quad \beta_k^{LS} = \frac{g_k^{T} y_{k-1}}{-g_{k-1}^{T} d_{k-1}}, \quad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^{T} y_{k-1}}, \qquad (5)$$

where $y_{k-1} = g_k - g_{k-1}$ and $\|\cdot\|$ stands for the Euclidean norm. A great number of CGMs with good convergence properties and effective numerical performance have been derived from the six methods above; see, e.g., [9–15]. In particular, Jiang et al. [12] studied a CGM called the JMJ method, whose conjugate parameter $\beta_k^{JMJ}$ is specified by the piecewise formula (6).

This piecewise parameter can further be integrated and rewritten in a single compact form. It is obvious that the abovementioned formula reduces to the DY formula under the exact line search. Moreover, the JMJ method keeps the descent property at each iteration and converges globally for general nonconvex functions under the Wolfe line search.
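For concreteness, the six classic conjugate parameters in (5) can be written as small functions of $(g_k, g_{k-1}, d_{k-1})$. The following is a minimal Python sketch; the function names and calling convention are ours, not from the paper.

```python
# Minimal sketch of the six classic conjugate parameters in (5).
# Arguments: g_new = g_k, g_old = g_{k-1}, d_old = d_{k-1}.
import numpy as np

def beta_hs(g_new, g_old, d_old):
    y = g_new - g_old                        # y_{k-1} = g_k - g_{k-1}
    return (g_new @ y) / (d_old @ y)

def beta_fr(g_new, g_old, d_old):
    return (g_new @ g_new) / (g_old @ g_old)

def beta_prp(g_new, g_old, d_old):
    return (g_new @ (g_new - g_old)) / (g_old @ g_old)

def beta_cd(g_new, g_old, d_old):
    return (g_new @ g_new) / -(g_old @ d_old)

def beta_ls(g_new, g_old, d_old):
    return (g_new @ (g_new - g_old)) / -(g_old @ d_old)

def beta_dy(g_new, g_old, d_old):
    y = g_new - g_old
    return (g_new @ g_new) / (d_old @ y)
```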

In this paper, we focus our attention on the ideas of the JMJ and CD methods as well as the strong Wolfe line search. In particular, by the second inequality of the strong Wolfe line search, it follows that $d_{k-1}^{T} y_{k-1} \ge -(1 - \sigma) g_{k-1}^{T} d_{k-1} > 0$ if $g_{k-1}^{T} d_{k-1} < 0$. Thus, based on formula (6), making full use of the characteristics of the JMJ and CD methods, and aiming both to ensure that our proposed methods possess nice convergence properties and to improve their numerical performance, two new formulas for $\beta_k$ are constructed in this paper. The first one is generated by replacing the corresponding term in (6) with the CD formula $\beta_k^{CD}$, which gives formula (7).
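The inequality just used follows from the second strong Wolfe condition in (3); a short derivation, assuming $d_{k-1}$ is a descent direction, reads:

```latex
% Assume g_{k-1}^T d_{k-1} < 0. The second strong Wolfe inequality gives
% g_k^T d_{k-1} >= sigma * g_{k-1}^T d_{k-1}, and hence
\begin{align*}
d_{k-1}^{T} y_{k-1}
  &= g_{k}^{T} d_{k-1} - g_{k-1}^{T} d_{k-1}
   \ge \sigma\, g_{k-1}^{T} d_{k-1} - g_{k-1}^{T} d_{k-1} \\
  &= -(1-\sigma)\, g_{k-1}^{T} d_{k-1} > 0,
   \qquad \text{since } 0 < \sigma < 1 .
\end{align*}
```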

In contrast, replacing the denominator in (7) with the FR-type denominator $\|g_{k-1}\|^2$, the second one is presented as formula (8).

From (7) and (8), it is not difficult to see that the former formula reduces to the DY formula when the exact line search is used and, in the same setting, the latter one reduces to the FR formula.
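These reductions rest on two standard identities; a brief check under the exact line search (so that $g_k^{T} d_{k-1} = 0$ for every $k$) is as follows:

```latex
% Under the exact line search, g_k^T d_{k-1} = 0 for every k, hence
\begin{align*}
d_{k-1}^{T} y_{k-1} &= g_{k}^{T} d_{k-1} - g_{k-1}^{T} d_{k-1} = -g_{k-1}^{T} d_{k-1},\\
-g_{k-1}^{T} d_{k-1} &= -g_{k-1}^{T}\bigl(-g_{k-1} + \beta_{k-1} d_{k-2}\bigr) = \|g_{k-1}\|^{2},
\end{align*}
% so the DY, CD, and FR denominators all coincide, which explains why (7)
% collapses to the DY formula and (8) to the FR formula in this setting.
```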

The rest of the paper is organized as follows. In Section 2, two modified methods and the sufficient descent properties are presented. Global convergence properties of the proposed methods are analyzed in Section 3. Some numerical results are reported in Section 4. Finally, we draw a conclusion in Section 5.

2. Methods and Sufficient Descent Properties

In this section, we first describe the details of the two proposed methods, which, for convenience, are called the LMYCD1 and LMYCD2 methods and are stated as Algorithm 1 and Algorithm 2, respectively; a code sketch of their shared skeleton follows the listings.

Initialization. Given constants $\delta$ and $\sigma$ with $0 < \delta < \sigma < 1$, a tolerance $\epsilon \ge 0$, and an initial point $x_0 \in \mathbb{R}^{n}$. Let $d_0 = -g_0$ and $k := 0$.
Step 1. If $\|g_k\| \le \epsilon$, then stop. Otherwise, go to Step 2.
Step 2. Determine a step length $\alpha_k$ by the strong Wolfe line search (3).
Step 3. Let $x_{k+1} = x_k + \alpha_k d_k$; compute $g_{k+1} = g(x_{k+1})$ and $\beta_{k+1}^{LMYCD1}$ by (7).
Step 4. Compute $d_{k+1} = -g_{k+1} + \beta_{k+1}^{LMYCD1} d_k$. Set $k := k + 1$, and go back to Step 1.
Initialization. Given constants $\delta$ and $\sigma$ with $0 < \delta < \sigma < 1$, a tolerance $\epsilon \ge 0$, and an initial point $x_0 \in \mathbb{R}^{n}$. Let $d_0 = -g_0$ and $k := 0$.
Steps 1 and 2 are the same as Steps 1 and 2 of the LMYCD1 method.
Step 3. Let $x_{k+1} = x_k + \alpha_k d_k$; compute $g_{k+1} = g(x_{k+1})$ and $\beta_{k+1}^{LMYCD2}$ by (8).
Step 4 is the same as Step 4 of the LMYCD1 method.
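Because the displayed formulas (7) and (8) are not reproduced above, the sketch below leaves the conjugate parameter as a user-supplied callable `beta_fn`; everything else follows Steps 1–4 shared by the two algorithms. The strong Wolfe step comes from SciPy's `line_search`, and the test problem, tolerances, and parameter values are illustrative assumptions.

```python
# Hedged sketch of the Steps 1-4 skeleton shared by the LMYCD1/LMYCD2
# methods; plug formula (7) or (8) in as `beta_fn` (not reproduced here).
import numpy as np
from scipy.optimize import line_search, rosen, rosen_der

def cg_driver(f, grad, x0, beta_fn, eps=1e-6, max_iter=2000):
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                           # Initialization: d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:                 # Step 1: stopping test
            break
        # Step 2: strong Wolfe step length (3); c1, c2 play the roles of delta, sigma
        alpha = line_search(f, grad, x, d, gfk=g, c1=1e-4, c2=0.1)[0]
        if alpha is None:                            # line search failed: restart
            d = -g
            alpha = line_search(f, grad, x, d, gfk=g, c1=1e-4, c2=0.1)[0] or 1e-8
        x_new = x + alpha * d                        # Step 3: x_{k+1} and g_{k+1}
        g_new = grad(x_new)
        d = -g_new + beta_fn(g_new, g, d) * d        # Step 4: d_{k+1}
        x, g = x_new, g_new
    return x

# Usage with the CD parameter from (5) as a stand-in for (7)/(8):
beta_cd = lambda gn, go, do: (gn @ gn) / -(go @ do)
print(cg_driver(rosen, rosen_der, np.zeros(8), beta_cd))
```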
2.1. Sufficient Descent Condition

If there exists a constant $c > 0$ such that $g_k^{T} d_k \le -c \|g_k\|^2$ for all $k \ge 0$, then we say that the search directions $\{d_k\}$ of the method satisfy the sufficient descent condition, which is often used to analyze the convergence properties of CGMs for problem (1) under inexact line searches; see, e.g., [10–14].
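As a sanity check one might run alongside either algorithm, the snippet below tests this condition numerically for a single pair $(g_k, d_k)$; the constant $c$ here is an illustrative assumption, not the specific constant established in the lemmas below.

```python
# Hedged sketch: numerically verify the sufficient descent condition
# g^T d <= -c * ||g||^2 for one iterate; c is an assumed constant.
import numpy as np

def satisfies_sufficient_descent(g, d, c=1e-3):
    """True if d is a sufficient descent direction at gradient g."""
    return float(g @ d) <= -c * float(g @ g)

# Example: d = -g always satisfies the condition for any c <= 1.
g = np.array([1.0, -2.0, 0.5])
print(satisfies_sufficient_descent(g, -g, c=1.0))  # True
```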

The next lemmas show that the search directions yielded by the two proposed methods always satisfy the sufficient descent condition.

Lemma 1. Suppose that the search direction $d_k$ is generated by the LMYCD1 method and $g_k \ne 0$ for all $k \ge 0$. Then, the sufficient descent relation (9) holds.

Proof. We prove (9) by induction. For $k = 0$, it is easy to know from $d_0 = -g_0$ that $g_0^{T} d_0 = -\|g_0\|^2$, so (9) holds. Suppose that (9) is satisfied for $k - 1$, namely, that relation (10) holds; then, $\beta_k^{LMYCD1} > 0$ obviously holds by its definition. Furthermore, from the strong Wolfe line search, we have relation (11) and $d_{k-1}^{T} y_{k-1} > 0$. This, together with relation (10), further implies that $\beta_k^{LMYCD1}$ is well defined and positive. Now, we prove that (9) holds for $k$ via the following three cases:
(i) If the first case in the definition of $\beta_k^{LMYCD1}$ in Algorithm 1 occurs, then we have relation (12) and hence (9).
(ii) If the second case occurs, then, similarly from the definitions of $\beta_k^{LMYCD1}$ and $d_k$, together with relations (11) and (10), we obtain relation (13) and hence (9).
(iii) If the third case occurs, then, using the definitions of $\beta_k^{LMYCD1}$ and $d_k$, we also obtain relation (14) and hence (9).
Therefore, the assertion is satisfied for all $k \ge 0$.

Lemma 2. Let the search direction $d_k$ be yielded by the LMYCD2 method and $g_k \ne 0$ for all $k \ge 0$. Then, relation (15) holds. Moreover, the sufficient descent condition holds.

Proof. For $k = 0$, one has $g_0^{T} d_0 = -\|g_0\|^2$, so (15) clearly holds. Suppose that relation (15) is satisfied for $k - 1$. Now, we continue to prove that (15) holds for $k$. By the strong Wolfe line search and the induction hypothesis, it is clear that $g_{k-1}^{T} d_{k-1} < 0$, and hence relation (16) holds. Thus, using the definition of $d_k$, we obtain relation (17). Furthermore, recalling the definition of $\beta_k^{LMYCD2}$ in Algorithm 2 and relation (16), one has relation (18). This, together with the right-hand side of (16), further shows that relation (19) holds. Next, based on the strong Wolfe line search, from the left-hand side of (19) and assertion (15) for $k - 1$, we have relation (20). Similarly, from the right-hand side of (19), we also obtain relation (21). Combining (20) and (21) yields (15) for $k$. Thus, the proof is completed.

3. Convergence Results

Throughout this paper, we make the following elementary assumptions on the objective function:
(H1) The level set $\Omega = \{x \in \mathbb{R}^{n} : f(x) \le f(x_0)\}$ is bounded.
(H2) In a neighborhood $\mathcal{N}$ of $\Omega$, $f$ is continuously differentiable and its gradient is Lipschitz continuous; namely, there exists a constant $L > 0$ such that $\|g(x) - g(y)\| \le L \|x - y\|$ for all $x, y \in \mathcal{N}$.

To proceed, the well-known Zoutendijk condition [16] is reviewed in the following.

3.1. Zoutendijk Condition

Suppose that assumptions (H1) and (H2) hold, the search direction $d_k$ is a descent direction, and the step length $\alpha_k$ satisfies the Wolfe line search conditions; then, we have $\sum_{k \ge 0} \frac{(g_k^{T} d_k)^2}{\|d_k\|^2} < +\infty$. In particular, if the sufficient descent condition is satisfied, then $\sum_{k \ge 0} \frac{\|g_k\|^4}{\|d_k\|^2} < +\infty$.
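The "in particular" part follows by bounding each summand from below; a compact check, assuming the sufficient descent condition with constant $c > 0$, reads:

```latex
% With g_k^T d_k <= -c ||g_k||^2 (c > 0), each Zoutendijk summand satisfies
\[
\frac{(g_k^{T} d_k)^2}{\|d_k\|^{2}} \;\ge\; \frac{c^{2}\,\|g_k\|^{4}}{\|d_k\|^{2}},
\qquad\text{so}\qquad
\sum_{k\ge 0} \frac{(g_k^{T} d_k)^2}{\|d_k\|^{2}} < +\infty
\;\Longrightarrow\;
\sum_{k\ge 0} \frac{\|g_k\|^{4}}{\|d_k\|^{2}} < +\infty .
\]
```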

Now, before establishing the global convergence of the LMYCD1 method, we show that the LMYCD1 method enjoys a property similar to that of the DY method, which is very important for analyzing the global convergence of the method.

Lemma 3. Let the sequence $\{d_k\}$ be generated by the LMYCD1 method. Then, the relations $0 < \beta_k^{LMYCD1} \le \frac{g_k^{T} d_k}{g_{k-1}^{T} d_{k-1}}$ always hold for all $k \ge 1$.

Proof. From relation (11), it is clear that $d_{k-1}^{T} y_{k-1} > 0$, and hence $\beta_k^{LMYCD1} > 0$. If the first case in the definition of $\beta_k^{LMYCD1}$ occurs, then, using relation (10), one has $\beta_k^{LMYCD1} \le \frac{g_k^{T} d_k}{g_{k-1}^{T} d_{k-1}}$ directly. In the remaining situation, we prove the upper bound via the following two cases:
(i) If the second case occurs, from relation (13), we have $g_k^{T} d_k \le \beta_k^{LMYCD1} g_{k-1}^{T} d_{k-1}$. Then, dividing this inequality by the negative term $g_{k-1}^{T} d_{k-1}$ and combining with (11), we have $\beta_k^{LMYCD1} \le \frac{g_k^{T} d_k}{g_{k-1}^{T} d_{k-1}}$.
(ii) If the third case occurs, from the definitions of $\beta_k^{LMYCD1}$ and $d_k$, by the strong Wolfe line search, we obtain the analogous estimate, and hence $\beta_k^{LMYCD1} \le \frac{g_k^{T} d_k}{g_{k-1}^{T} d_{k-1}}$ since $g_{k-1}^{T} d_{k-1} < 0$.
This implies that $0 < \beta_k^{LMYCD1} \le \frac{g_k^{T} d_k}{g_{k-1}^{T} d_{k-1}}$, and the proof is completed.

Subsequently, based on Lemma 1 and Lemma 3, we can prove the global convergence of the LMYCD1 method.

Theorem 1. Suppose that assumptions (H1) and (H2) hold. Let the sequence $\{x_k\}$ be generated by the LMYCD1 method. Then, the LMYCD1 method is globally convergent in the sense that $\liminf_{k \to \infty} \|g_k\| = 0$.

Proof. By contradiction, we suppose that the conclusion is not true; then, there exists a constant $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k \ge 0$. Since $d_k + g_k = \beta_k^{LMYCD1} d_{k-1}$, it follows from Lemma 3 that

$$\|d_k\|^2 = (\beta_k^{LMYCD1})^2 \|d_{k-1}\|^2 - 2 g_k^{T} d_k - \|g_k\|^2 \le \left(\frac{g_k^{T} d_k}{g_{k-1}^{T} d_{k-1}}\right)^2 \|d_{k-1}\|^2 - 2 g_k^{T} d_k - \|g_k\|^2.$$

Dividing this by $(g_k^{T} d_k)^2$, it is easy to get that

$$\frac{\|d_k\|^2}{(g_k^{T} d_k)^2} \le \frac{\|d_{k-1}\|^2}{(g_{k-1}^{T} d_{k-1})^2} - \left(\frac{1}{\|g_k\|} + \frac{\|g_k\|}{g_k^{T} d_k}\right)^2 + \frac{1}{\|g_k\|^2} \le \frac{\|d_{k-1}\|^2}{(g_{k-1}^{T} d_{k-1})^2} + \frac{1}{\|g_k\|^2}.$$

Notice that $\frac{\|d_0\|^2}{(g_0^{T} d_0)^2} = \frac{1}{\|g_0\|^2}$; using the abovementioned formula recursively, we have

$$\frac{\|d_k\|^2}{(g_k^{T} d_k)^2} \le \sum_{i=0}^{k} \frac{1}{\|g_i\|^2} \le \frac{k+1}{\varepsilon^2}.$$

Thus, $\frac{(g_k^{T} d_k)^2}{\|d_k\|^2} \ge \frac{\varepsilon^2}{k+1}$, which further implies that $\sum_{k \ge 0} \frac{(g_k^{T} d_k)^2}{\|d_k\|^2} = +\infty$. This contradicts the Zoutendijk condition, and it follows that $\liminf_{k \to \infty} \|g_k\| = 0$.

Next, based on Lemma 2 and the Zoutendijk condition, we also obtain the global convergence for the LMYCD2 method.

Theorem 2. Suppose that assumptions (H1) and (H2) hold. Let the sequence $\{x_k\}$ be generated by the LMYCD2 method. Then, the LMYCD2 method is globally convergent in the sense that $\liminf_{k \to \infty} \|g_k\| = 0$.

Proof. Suppose by contradiction that the assertion is not true; then, there exists a constant $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon$ for all $k \ge 0$. Squaring both sides of $d_k + g_k = \beta_k^{LMYCD2} d_{k-1}$, one gets

$$\|d_k\|^2 = (\beta_k^{LMYCD2})^2 \|d_{k-1}\|^2 - 2 g_k^{T} d_k - \|g_k\|^2.$$

From Lemma 2 and the strong Wolfe line search, we have $(\beta_k^{LMYCD2})^2 \le \frac{\|g_k\|^4}{\|g_{k-1}\|^4}$ and $-g_k^{T} d_k \le c_1 \|g_k\|^2$ for some constant $c_1 > 0$. Thus, we obtain $\|d_k\|^2 \le \frac{\|g_k\|^4}{\|g_{k-1}\|^4} \|d_{k-1}\|^2 + (2 c_1 - 1) \|g_k\|^2$. Again, dividing this inequality by $\|g_k\|^4$, we have

$$\frac{\|d_k\|^2}{\|g_k\|^4} \le \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} + \frac{2 c_1 - 1}{\|g_k\|^2}.$$

Taking $\|g_k\| \ge \varepsilon$ and $\frac{\|d_0\|^2}{\|g_0\|^4} = \frac{1}{\|g_0\|^2}$, and utilizing the abovementioned formula recursively, one has

$$\frac{\|d_k\|^2}{\|g_k\|^4} \le c_2 \frac{k+1}{\varepsilon^2}, \quad \text{where } c_2 := \max\{1, 2 c_1 - 1\}.$$

This implies that $\frac{\|g_k\|^4}{\|d_k\|^2} \ge \frac{\varepsilon^2}{c_2 (k+1)}$; thus, $\sum_{k \ge 0} \frac{\|g_k\|^4}{\|d_k\|^2} = +\infty$, which also contradicts the Zoutendijk condition. Therefore, we can conclude that the LMYCD2 method is globally convergent in the sense that $\liminf_{k \to \infty} \|g_k\| = 0$.

4. Numerical Results

In this section, we study the computational efficiency of the LMYCD1 and LMYCD2 methods, in contrast to the NHS [11] and NPRP [11] methods, the JMJ method [12], the hybrid Dai–Yuan (hDY) method [10], and a family of modified Dai–Liao CGMs [17] including the MDL1, MDL2, MDL3, and MDL4 methods. The 171 test problems in all are unconstrained problems with dimensions ranging from 2 to 1,500,000; some are taken from the CUTEr library [18], including test examples 1–46 in Table 1 and 1–49 in Table 2, and the others come from [19, 20]. Moreover, the step length is yielded by the strong Wolfe line search. All the considered methods were coded in Matlab R2016a and run on a PC with a 3.6 GHz CPU, 8 GB of RAM, and the Windows 10 operating system. We terminated the iteration when one of the following conditions was satisfied: (i) the gradient norm $\|g_k\|$ fell below the prescribed tolerance, or (ii) the total number of iterations exceeded the prescribed maximum. If condition (ii) occurred, the method was deemed to fail on the corresponding test problem, which is denoted by "NaN."

Table 1 lists the detailed numerical results of the first group of methods, which contains the LMYCD1, hDY, NHS, JMJ, MDL1, and MDL2 methods; here, the line search parameters $\delta$ and $\sigma$ are set to fixed values for all methods in this group. Notice that "n," "Itr," "NF," "NG," and "Tcpu" denote the dimension of the test problem, the total number of iterations, the number of function evaluations, the number of gradient evaluations, and the CPU time in seconds, respectively. For the second group of methods, consisting of the LMYCD2, NPRP, JMJ, MDL3, and MDL4 methods, the corresponding numerical results are reported in Table 2, with the parameters $\delta$ and $\sigma$ again fixed for all methods in the group.

In addition, we make use of the performance profiles of Dolan and Moré [21] to compare the performance of the tested methods listed above; readers may refer to [21] for a detailed introduction to performance profiles. It is worth noting that the left side of each performance profile indicates the percentage of the test problems on which a method is the fastest among the compared methods, whereas the right side gives the percentage of the test problems that are successfully solved by each method. The higher a curve lies, the better the corresponding method performs in contrast to the other methods.
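As an illustration of how such profiles are computed, the following sketch implements the Dolan–Moré ratio-and-curve construction; the array layout, failure marker, and example data are assumptions for illustration, not the authors' actual result files.

```python
# Minimal sketch of the Dolan-More performance profile [21]. `costs` is a
# (num_problems x num_solvers) array of one metric (Itr, NF, NG, or Tcpu),
# with np.nan marking failures; each problem is assumed solved by at least
# one solver.
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(costs, solver_names, num_taus=200):
    costs = np.asarray(costs, dtype=float)
    best = np.nanmin(costs, axis=1, keepdims=True)   # best cost per problem
    ratios = costs / best                            # r_{p,s}; nan if failed
    ratios = np.where(np.isnan(ratios), np.inf, ratios)
    finite = ratios[np.isfinite(ratios)]
    taus = np.linspace(1.0, finite.max(), num_taus)
    for s, name in enumerate(solver_names):
        # rho_s(tau): fraction of problems solved within factor tau of the best
        rho = [(ratios[:, s] <= t).mean() for t in taus]
        plt.step(taus, rho, where="post", label=name)
    plt.xlabel(r"$\tau$"); plt.ylabel(r"$\rho_s(\tau)$"); plt.legend()
    plt.show()

# Toy example with three hypothetical solvers on three problems:
costs = np.array([[10.0, 12.0, np.nan],
                  [50.0, 40.0, 55.0],
                  [3.0,  3.0,  4.0]])
performance_profile(costs, ["LMYCD1", "hDY", "NHS"])
```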

Figures 1 and 2 illustrate the performance profiles of the LMYCD1, hDY, NHS, JMJ, MDL1, and MDL2 methods in terms of the total number of iterations, function evaluations, gradient evaluations, and the CPU time (s). In Figures 3 and 4, the performance profiles of the LMYCD2, NPRP, JMJ, MDL3, and MDL4 methods are depicted for the same four measures.

Observing Figures 1–4, the LMYCD1 and LMYCD2 methods are competitive, and each outperforms the other tested methods in its group with respect to Itr, NF, NG, and Tcpu. In addition, the two proposed methods ultimately solve a large proportion of their respective test problems successfully. All numerical results show that the efficiency of the LMYCD1 and LMYCD2 methods is encouraging.

5. Conclusion

In this paper, we construct two new formulas for the conjugate parameter $\beta_k$ by substantially exploiting the information of the JMJ and CD formulas as well as the second inequality of the strong Wolfe line search (3). Under the usual assumptions, the presented methods are proved to possess the sufficient descent property and to be globally convergent. Preliminary numerical experiments demonstrate that the two proposed methods perform effectively.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

The study was carried out in collaboration with all authors. All authors read and approved the final manuscript.

Acknowledgments

The authors were supported financially by the Natural Science Foundation of Guangxi Province (2018GXNSFAA281099), Middle-aged and Young Teachers’ Basic Ability Promotion Project of Guangxi (2017KY0537 and 2018KY0700), and Open Project of Guangxi Colleges and Universities Key Laboratory of Complex System Optimization and Big Data Processing (2017CSOBDP0105).