Research Article  Open Access
Bakhtawar Baluch, Zabidin Salleh, Ahmad Alhawarat, "A New Modified Three-Term Hestenes–Stiefel Conjugate Gradient Method with Sufficient Descent Property and Its Global Convergence", Journal of Optimization, vol. 2018, Article ID 5057096, 13 pages, 2018. https://doi.org/10.1155/2018/5057096
A New Modified Three-Term Hestenes–Stiefel Conjugate Gradient Method with Sufficient Descent Property and Its Global Convergence
Abstract
This paper describes a modified three-term Hestenes–Stiefel (HS) method. The original HS method is the earliest conjugate gradient method. Although the HS method achieves global convergence using an exact line search, this is not guaranteed in the case of an inexact line search. In addition, the HS method does not usually satisfy the descent property. Our modified three-term conjugate gradient method possesses a sufficient descent property regardless of the type of line search and guarantees global convergence using the inexact Wolfe–Powell line search. The numerical efficiency of the modified three-term HS method is checked using 75 standard test functions. It is known that three-term conjugate gradient methods are numerically more efficient than two-term conjugate gradient methods. Importantly, this paper quantifies how much better three-term methods perform compared with two-term methods. Thus, in the numerical results, we compare our new modification with an efficient two-term conjugate gradient method, and also with a state-of-the-art three-term HS method. Finally, we conclude that our proposed modification is globally convergent and numerically efficient.
1. Introduction
In the field of optimization, conjugate gradient (CG) methods are a well-known approach for solving large-scale unconstrained optimization problems. CG methods are simple and have relatively modest storage requirements. This class of methods has a vast number of applications in different areas, especially in engineering [1–3].
Consider the unconstrained optimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \quad (1)$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and its gradient is $g(x) = \nabla f(x)$. Normally, CG methods generate a sequence $\{x_k\}$ defined by
$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots \quad (2)$$
In (2), $\alpha_k > 0$ is a step size determined by a general line search and $d_k$ is a search direction given by
$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \geq 1, \end{cases} \quad (3)$$
where $g_k = g(x_k)$ and $\beta_k$ is a parameter of the CG method. The six pioneering forms of $\beta_k$ are defined in [4–10].
Line searches may be exact or inexact. Exact line searches are time consuming, computationally expensive, and difficult, and they require large amounts of storage [11–13]. Thus, inexact line search techniques are often adopted because of their efficiency and global convergence properties. Well-known inexact line search methods include the Wolfe and strong Wolfe techniques. The Wolfe (Wolfe–Powell) conditions can be written as
$$f(x_k + \alpha_k d_k) \leq f(x_k) + \delta \alpha_k g_k^T d_k, \qquad g(x_k + \alpha_k d_k)^T d_k \geq \sigma g_k^T d_k, \quad (4)$$
where $0 < \delta < \sigma < 1$, and the strong Wolfe conditions replace the second inequality of (4) with
$$|g(x_k + \alpha_k d_k)^T d_k| \leq \sigma |g_k^T d_k|. \quad (5)$$

Recently, Alhawarat and Salleh [14], Salleh and Alhawarat [15], and Alhawarat et al. [16, 17] proposed efficient CG and hybrid CG methods that fulfill the required global convergence properties. To improve the existing methods, a three-term CG technique has been introduced, and several researchers have suggested various modifications of the three-term CG method. For instance, Beale [18] and Nazareth [19] proposed CG methods based on three terms that possess the finite termination property, but these do not perform well in practice [20, 21]. Furthermore, reports by McGuire and Wolfe [22], Deng and Li [23], Zhang et al. [24, 25], Cheng [26], Al-Bayati and Sharif [27], Zhang, Xiao, and Wei [28], Andrei [29–31], Sugiki et al. [32], Narushima et al. [33], Babaie-Kafaki and Ghanbari [34], Al-Baali et al. [35], Sun and Liu [36], and Baluch et al. [37] discuss the global convergence and numerical results of modified three-term CG methods.
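The two Wolfe–Powell inequalities in (4) are straightforward to check numerically. The following sketch (our own illustration, not code from the paper; the function name and defaults for $\delta$ and $\sigma$ are assumptions) tests whether a candidate step size satisfies the weak Wolfe conditions:

```python
import numpy as np

def satisfies_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.9):
    """Weak Wolfe-Powell test, cf. (4):
       sufficient decrease: f(x + a d) <= f(x) + delta * a * g^T d
       curvature:           g(x + a d)^T d >= sigma * g^T d
       with 0 < delta < sigma < 1 and d a descent direction at x."""
    g0_d = grad(x) @ d                  # initial directional derivative (negative)
    x_new = x + alpha * d
    armijo = f(x_new) <= f(x) + delta * alpha * g0_d
    curvature = grad(x_new) @ d >= sigma * g0_d
    return bool(armijo and curvature)
```

For example, on the quadratic $f(x) = \tfrac12\|x\|^2$ with $d = -g$, the full step $\alpha = 1$ lands on the minimizer and satisfies both inequalities, while a very small step fails the curvature condition.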
In this paper, a modified three-term Hestenes–Stiefel (HS) method is proposed. The general formula of the HS method [4] is
$$\beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \qquad y_{k-1} = g_k - g_{k-1}. \quad (6)$$
This was the first of all the CG parameters to be proposed. The HS method is globally convergent under an exact line search. A nice property of the HS method is that it satisfies the conjugacy condition, regardless of whether the line search is exact or inexact [38]. However, this method does not satisfy the global convergence property when used with an inexact line search.
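For illustration, a minimal two-term HS iteration using (2), (3), and (6) can be sketched as follows. This is our own Python sketch, not the authors' MATLAB implementation; it uses SciPy's strong Wolfe line search, and the steepest descent restart with a small fixed step on line search failure is an ad hoc choice of this sketch:

```python
import numpy as np
from scipy.optimize import line_search

def cg_hs(f, grad, x0, tol=1e-6, max_iter=1000):
    """Two-term Hestenes-Stiefel CG:
       d_k = -g_k + beta_HS * d_{k-1},
       beta_HS = g_k^T y_{k-1} / (d_{k-1}^T y_{k-1}), y_{k-1} = g_k - g_{k-1}."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:        # stopping test on the gradient norm
            break
        alpha = line_search(f, grad, x, d)[0]   # strong Wolfe step size
        if alpha is None:                   # line search failed: restart
            d = -g
            alpha = 1e-4
        x_new = x + alpha * d
        g_new = grad(x_new)
        y = g_new - g
        denom = d @ y
        beta = (g_new @ y) / denom if abs(denom) > 1e-12 else 0.0
        d = -g_new + beta * d               # HS search direction
        x, g = x_new, g_new
    return x
```

On a strictly convex quadratic the loop drives the gradient norm below the tolerance in a handful of iterations.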
In this paper, the method of Zhang et al. [25] is modified with the help of another efficient CG parameter proposed by Wei et al. [39]. An attractive feature of the new three-term HS method is that it satisfies the sufficient descent condition regardless of the line search used. Furthermore, our modification is globally convergent for both convex and nonconvex functions when using an inexact line search. Numerical experiments show that the new modification is more efficient and robust than the MTTHS algorithm proposed by Zhang et al. [25]. The second aim of this paper is to quantify the improvement of three-term CG methods over two-term approaches. To do this, we consider the efficient two-term DHS method of Dai and Wen [40]. The DHS method is one of the more efficient two-term CG techniques, as it possesses the sufficient descent property and offers global convergence under Wolfe–Powell line search conditions; the numerical results given by this method are also convincing. Therefore, this two-term CG method is compared with our new modification to quantify the improvement offered by three-term CG methods.
The remainder of this paper is organized as follows. In Section 2, the motivation for and construction of the three-term HS CG method are discussed, and the general form is presented in Algorithm A. Section 3 is divided into two subsections, with Section 3.1 covering the sufficient descent condition and the global convergence properties for convex and nonconvex functions and Section 3.2 presenting detailed numerical results to evaluate the proposed method. Finally, Section 4 concludes this paper.
2. Motivation and Formulas
Zhang et al. [25] proposed the first three-term HS (TTHS) method, whose search direction can be written as
$$d_k = -g_k + \beta_k^{HS} d_{k-1} - \theta_k y_{k-1}, \quad (8)$$
where $\theta_k = \dfrac{g_k^T d_{k-1}}{d_{k-1}^T y_{k-1}}$ and $d_0 = -g_0$.
TTHS satisfies the descent property, and if an exact line search is used, it reduces to the original HS method. Further, to guarantee the global convergence of the search direction given by (8), a modified algorithm (MTTHS) was introduced with a truncated form of this search direction (see [25] for its precise form).
Since MTTHS was introduced to establish global convergence in place of the direction in (8), the question arises as to why (8) itself is not used to prove the global convergence properties. Rather than discarding (8), it should be made efficient and globally convergent. Thus, there is room to modify (8) so that it satisfies the global convergence properties, and such a modification can be expected to outperform the MTTHS algorithm numerically.
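For reference, the TTHS direction (8) can be sketched in a few lines; by construction it satisfies the descent identity $g_k^T d_k = -\|g_k\|^2$ for any step size, which a quick numerical check confirms. This is our own illustrative code (the safeguard returning $-g$ when the denominator is tiny is an assumption of this sketch, not taken from [25]):

```python
import numpy as np

def tths_direction(g, d_prev, y_prev):
    """TTHS search direction of Zhang et al., cf. (8):
         d_k = -g_k + beta_HS * d_{k-1} - theta_k * y_{k-1},
       with beta_HS = g^T y / (d^T y) and theta = g^T d / (d^T y).
       Expanding g^T d_k shows the beta and theta terms cancel,
       so g_k^T d_k = -||g_k||^2 holds regardless of the line search."""
    denom = d_prev @ y_prev
    if abs(denom) < 1e-12:          # safeguard: fall back to steepest descent
        return -g
    beta = (g @ y_prev) / denom
    theta = (g @ d_prev) / denom
    return -g + beta * d_prev - theta * y_prev
```

The cancellation is easy to verify: $g^T d = -\|g\|^2 + \beta\, g^T d_{k-1} - \theta\, g^T y_{k-1}$, and the last two terms are equal by the definitions of $\beta$ and $\theta$.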
Wei et al. [39] proposed an efficient CG parameter (10). In this parameter, a term in the denominator plays an important role in satisfying the sufficient descent and global convergence properties. Thus, we take this term from the denominator of (10) and use it with (8) to construct a new modified three-term HS method, whose parameters are given in (11). It is known that the HS method does not converge globally when the objective function is nonconvex. Further, Gilbert and Nocedal [41] showed that the parameter $\beta_k$ must be nonnegative to achieve convergence for nonconvex or nonlinear functions, i.e.,
$$\beta_k^+ = \max\{\beta_k, 0\}. \quad (12)$$
Applying the same truncation to our parameter gives (13), where the third-term parameter is defined in (14). If the line search is exact, then our parameters reduce to the original HS parameter [4], its nonnegative version [41], and the TTHS direction [25], respectively. The procedure of our proposed three-term CG method is described in Algorithm A.
Algorithm A.
Step 0. Choose an initial point $x_0 \in \mathbb{R}^n$, and set $d_0 = -g_0$, $k = 0$.
Step 1. For convergence, if $\|g_k\| \leq \epsilon$, then the algorithm terminates; otherwise, go to Step 2.
Step 2. Compute the search direction
$$d_k = -g_k + \beta_k d_{k-1} - \theta_k y_{k-1}, \quad (15)$$
where the parameters $\beta_k$ and $\theta_k$ are given in (11) and (14).
Step 3. Determine the step size $\alpha_k$ by the Wolfe line search (4).
Step 4. Compute the new point $x_{k+1} = x_k + \alpha_k d_k$.
Step 5. Set $k := k + 1$ and go to Step 1.
3. Results and Discussion
This section contains a theoretical discussion and numerical results. The first subsection considers the global convergence properties of our proposed method and the second presents the results from numerical computations.
3.1. Global Convergence Properties
Assumptions
(A1) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \leq f(x_0)\}$ is bounded.
(A2) In some neighborhood $N$ of $\Omega$, the gradient is Lipschitz continuous on an open convex set that contains $\Omega$; i.e., there exists a positive constant $L$ such that
$$\|g(x) - g(y)\| \leq L \|x - y\|, \quad \forall x, y \in N. \quad (16)$$
Assumptions (A1) and (A2) imply that there exist positive constants $D$ and $\gamma$ such that
$$\|x - y\| \leq D \quad (17) \qquad \text{and} \qquad \|g(x)\| \leq \gamma \quad (18)$$
for all $x, y \in \Omega$. We now prove the sufficient descent condition independently of the line search. Substituting the parameters (11) and (14) into the search direction (15) and taking the inner product with $g_k$, we obtain
$$g_k^T d_k \leq -c \|g_k\|^2 \quad (19)$$
for a positive constant $c$. Hence, the sufficient descent condition holds regardless of the line search. Taking the modulus of the remaining terms and applying the Schwarz inequality, $\|d_k\|$ can in turn be bounded above by a constant multiple of $\|g_k\|$:
$$\|d_k\| \leq \bar{c}\, \|g_k\| \quad (26)$$
for some constant $\bar{c} > 0$. The HS method is well known for its conjugacy condition,
$$d_k^T y_{k-1} = 0. \quad (27)$$
By [15], CG methods that inherit (27) are more efficient than CG parameters that do not inherit this property. Dai and Liao [42] proposed the following conjugacy condition for an inexact line search:
$$d_k^T y_{k-1} = -t\, g_k^T s_{k-1}, \quad t \geq 0, \quad (28)$$
where $s_{k-1} = x_k - x_{k-1}$. With an exact line search, $g_k^T s_{k-1} = 0$, so (28) reduces to the conjugacy condition in (27).
Lemma 1 (see [43]). Suppose there is an initial point $x_0$ for which Assumptions (A1) and (A2) hold. Consider a method of the form (2) in which $d_k$ is a descent direction and $\alpha_k$ satisfies the Wolfe line search condition (4). Then
$$\sum_{k \geq 0} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty. \quad (29)$$
This is known as Zoutendijk's condition and is used for proving the global convergence of a CG method. This condition together with (26) shows that
$$\sum_{k \geq 0} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty. \quad (30)$$
Definition 2. The function $f$ is called uniformly convex [36] on $\Omega$ if there exists a positive constant $\mu$ such that
$$(g(x) - g(y))^T (x - y) \geq \mu \|x - y\|^2, \quad \forall x, y \in \Omega. \quad (31)$$
We now show the global convergence of Algorithm A for uniformly convex functions.
Lemma 3. Let the sequences $\{x_k\}$ and $\{d_k\}$ be generated by Algorithm A, and suppose that (31) holds. Then
$$\lim_{k \to \infty} \|s_k\| = 0, \quad (32)$$
where $s_k = x_{k+1} - x_k = \alpha_k d_k$.
Proof. For details, see Lemma 2.1 of [44].
Theorem 4. Let the conditions in Assumptions (A1) and (A2) hold and the function $f$ be uniformly convex. Then
$$\lim_{k \to \infty} \|g_k\| = 0. \quad (33)$$
Proof. By uniform convexity (31) and the definition of $y_{k-1}$, together with the second Wolfe condition (4) and the sufficient descent condition, the quantity $d_{k-1}^T y_{k-1}$ is bounded below in terms of $\|s_{k-1}\|$ and $\|d_{k-1}\|$. From (11), (32), this bound, and Assumption (A2), the parameters $\beta_k$ and $\theta_k$ are bounded, and combining these bounds with the search direction (15) yields a constant $c_2 > 0$ such that $\|d_k\| \leq c_2 \|g_k\|$ for all sufficiently large $k$. Hence
$$\sum_{k} \|g_k\|^2 \leq c_2^2 \sum_{k} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty$$
by (30), which implies $\lim_{k \to \infty} \|g_k\| = 0$.
We are now going to prove the global convergence of Algorithm A for nonconvex functions.
Lemma 5. Suppose that Assumptions (A1) and (A2) hold. Let the sequence $\{x_k\}$ be generated by Algorithm A. If there exists a constant $\epsilon > 0$ such that $\|g_k\| \geq \epsilon$ for every $k$, then
$$\sum_{k \geq 1} \|u_k - u_{k-1}\|^2 < \infty, \quad (44)$$
where $u_k = d_k / \|d_k\|$.
Proof. As $g_k^T d_k < 0$ and $\|g_k\| \geq \epsilon$ for all $k$, we have $d_k \neq 0$ for all $k$. Hence, $u_k$ is well defined. Splitting the search direction (15) and dividing by $\|d_k\|$, we can write $u_k = r_k + \delta_k u_{k-1}$, where $r_k = (-g_k - \theta_k y_{k-1})/\|d_k\|$ and $\delta_k = \beta_k \|d_{k-1}\| / \|d_k\| \geq 0$. Since $u_k$ and $u_{k-1}$ are unit vectors and $\delta_k \geq 0$, it follows that $\|u_k - u_{k-1}\| \leq 2\|r_k\|$. Now, from Assumption (A2), (14), (17), and (18), there exists a constant $c_3 > 0$ such that
$$|\beta_k| \|d_{k-1}\| + |\theta_k| \|y_{k-1}\| \leq c_3 \|g_k\|, \quad (49)$$
so that $\|r_k\| \leq (1 + c_3)\|g_k\| / \|d_k\|$. From (30) and (49), we obtain $\sum_{k \geq 1} \|r_k\|^2 < \infty$. Combining this with (44) completes the proof.
Theorem 6. Let Assumptions (A1) and (A2) hold. Then, the sequence $\{x_k\}$ generated by Algorithm A satisfies
$$\liminf_{k \to \infty} \|g_k\| = 0. \quad (51)$$
Proof. Suppose, for contradiction, that $\liminf_{k \to \infty} \|g_k\| \neq 0$. Then, there exists a constant $\epsilon > 0$ such that
$$\|g_k\| \geq \epsilon \quad \forall k. \quad (52)$$
The proof has two parts.
Part 1. See Theorem 2.2, step 1 in [36].
Part 2. From (15) and (49), the norm of the search direction is uniformly bounded; that is, there exist a positive constant $\gamma_1$ and some $k_0$ such that $\|d_k\| \leq \gamma_1$ for all $k \geq k_0$. At the beginning of the proof, we supposed that $\|g_k\| \geq \epsilon$ for all $k$. Thus,
$$\sum_{k \geq k_0} \frac{\|g_k\|^4}{\|d_k\|^2} \geq \sum_{k \geq k_0} \frac{\epsilon^4}{\gamma_1^2} = \infty,$$
which contradicts Assumption (A2), (30), and (52). Therefore, $\liminf_{k \to \infty} \|g_k\| = 0$.
3.2. Numerical Discussion
We now report the results of several numerical experiments. Zhang et al. [25] demonstrated the superior numerical efficiency of the MTTHS algorithm with respect to PRP+ [41], CG_DESCENT [45], and L-BFGS [46] using the Wolfe line search, while Dai and Wen [40] reported the numerical efficiency of the DHS method. Thus, we compare the efficient three-term HS method proposed in this paper (named the Bakhtawar–Zabidin–Ahmad method, BZA) with MTTHS [25] and DHS [40]. The BZA method was implemented using the Wolfe–Powell line search (4).
All codes were written in MATLAB 7.1 and run on an Intel Core i5 system with 8.0 GB RAM and a 2.60 GHz processor. Table 1 lists the numerical results given by BZA, MTTHS, and DHS for a number of test functions. In Table 1, NI, CT, GE, and FE denote the number of iterations, the CPU time, the number of gradient evaluations, and the number of function evaluations, respectively.

According to Moré et al. [47], the efficiency of any method can be determined by its performance on a number of test functions. The number of test functions should not be too large or too small, with 75 considered ideal for testing the efficiency of any method. The test functions in Table 1 were taken from Andrei’s test function collection [48] with standard initial points and dimensions ranging from 2 to 10000.
If the solution had not converged after 500 seconds, the program was terminated. Generally, convergence was achieved within this time limit; functions for which the time limit was exceeded are denoted by “F” for Fail in Table 1.
The Sigma plotting software was used to graph the data. We adopt the performance profiles given by Dolan and Moré [49]. Thus, MTTHS, DHS, and BZA are compared in terms of NI, CT, GE, and FE in Figures 1–4. For each method, we plot the fraction of problems solved within a factor $t$ of the best performance; in the figures, the uppermost curve corresponds to the method that solves the most problems within a factor $t$ of the best time. From Table 1 and Figures 1–4, the BZA method outperforms the MTTHS algorithm and the DHS method in terms of NI, CT, GE, and FE.
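As a sketch of how such profiles are constructed (our own illustrative NumPy code, not the scripts used for Figures 1–4), the Dolan–Moré profile value $\rho_s(t)$ is the fraction of problems that solver $s$ solves within a factor $t$ of the best solver on each problem:

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile.

    T[i, s]: cost (e.g. NI, CT, GE, or FE) of solver s on problem i,
    with np.inf marking a failure. Returns rho with
    rho[s, j] = fraction of problems solver s solves within a factor
    taus[j] of the best solver on that problem."""
    T = np.asarray(T, dtype=float)
    best = T.min(axis=1, keepdims=True)   # best cost per problem
    ratios = T / best                     # performance ratios r_{i,s}
    return np.array([[np.mean(ratios[:, s] <= t) for t in taus]
                     for s in range(T.shape[1])])
```

Plotting `rho[s]` against `taus` for each solver reproduces the familiar profile curves: the curve that lies uppermost solves the largest fraction of problems within any given factor of the best.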
The BZA method solves around 99.5% of the problems, and the performance of BZA is 85% better than that of DHS and 77% better than that of MTTHS. We can also conclude that, on average, the three-term conjugate gradient methods considered here outperform the two-term DHS method by about 85%.
4. Conclusion
We have proposed a modified three-term HS conjugate gradient method. An attractive property of the proposed method is that it satisfies a sufficient descent condition regardless of the line search. The global convergence of the proposed method has been established under the Wolfe line search conditions. Numerical results show that the proposed method is more efficient and robust than the state-of-the-art three-term (MTTHS) and two-term (DHS) CG methods.
Data Availability
No data were used to support this study.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
References
[1] E. Polak, Optimization: Algorithms and Consistent Approximations, Springer, New York, NY, USA, 1997.
[2] J. Nocedal, “Conjugate gradient methods and nonlinear optimization,” in Linear and Nonlinear Conjugate Gradient Related Methods, L. Adams and J. L. Nazareth, Eds., pp. 9–23, SIAM, Philadelphia, PA, USA, 1995.
[3] R. Ziadi, R. Ellaia, and A. Bencherif-Madani, “Global optimization through a stochastic perturbation of the Polak–Ribière conjugate gradient method,” Journal of Computational and Applied Mathematics, vol. 317, pp. 672–684, 2017.
[4] M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, pp. 409–436, 1952.
[5] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” The Computer Journal, vol. 7, pp. 149–154, 1964.
[6] E. Polak and G. Ribière, “Note sur la convergence de méthodes de directions conjuguées,” Revue Française d'Informatique et de Recherche Opérationnelle, vol. 3, no. 16, pp. 35–43, 1969.
[7] B. T. Polyak, “The conjugate gradient method in extreme problems,” USSR Computational Mathematics and Mathematical Physics, vol. 9, pp. 94–112, 1969.
[8] R. Fletcher, Practical Methods of Optimization, vol. I: Unconstrained Optimization, John Wiley & Sons, New York, NY, USA, 2nd edition, 1987.
[9] Y. Liu and C. Storey, “Efficient generalized conjugate gradient algorithms, Part 1,” Journal of Optimization Theory and Applications, vol. 69, no. 1, pp. 129–137, 1991.
[10] Y. H. Dai and Y. Yuan, “A nonlinear conjugate gradient method with a strong global convergence property,” SIAM Journal on Optimization, vol. 10, no. 1, pp. 177–182, 1999.
[11] Z.-J. Shi and J. Guo, “A new family of conjugate gradient methods,” Journal of Computational and Applied Mathematics, vol. 224, no. 1, pp. 444–457, 2009.
[12] G. Yuan, X. Lu, and Z. Wei, “A conjugate gradient method with descent direction for unconstrained optimization,” Journal of Computational and Applied Mathematics, vol. 233, no. 2, pp. 519–530, 2009.
[13] Z.-J. Shi, S. Wang, and Z. Xu, “The convergence of conjugate gradient method with nonmonotone line search,” Applied Mathematics and Computation, vol. 217, no. 5, pp. 1921–1932, 2010.
[14] A. Alhawarat and Z. Salleh, “Modification of nonlinear conjugate gradient method with weak Wolfe–Powell line search,” Abstract and Applied Analysis, Article ID 7238134, 6 pages, 2017.
[15] Z. Salleh and A. Alhawarat, “An efficient modification of the Hestenes–Stiefel nonlinear conjugate gradient method with restart property,” Journal of Inequalities and Applications, vol. 2016, no. 1, Article ID 110, 2016.
[16] A. Alhawarat, Z. Salleh, M. Mamat, and M. Rivaie, “An efficient modified Polak–Ribière–Polyak conjugate gradient method with global convergence properties,” Optimization Methods and Software, vol. 32, no. 6, pp. 1299–1312, 2017.
[17] A. Alhawarat, M. Mamat, M. Rivaie, and Z. Salleh, “An efficient hybrid conjugate gradient method with the strong Wolfe–Powell line search,” Mathematical Problems in Engineering, vol. 2015, Article ID 103517, 7 pages, 2015.
[18] E. M. L. Beale, “A derivative of conjugate gradients,” in Numerical Methods for Nonlinear Optimization, F. A. Lootsma, Ed., pp. 39–43, Academic Press, London, UK, 1972.
[19] L. Nazareth, “A conjugate direction algorithm without line searches,” Journal of Optimization Theory and Applications, vol. 23, no. 3, pp. 373–387, 1977.
[20] Y. H. Dai and Y. Yuan, Nonlinear Conjugate Gradient Methods, Shanghai Science and Technology Publisher, Shanghai, China, 2000.
[21] W. W. Hager and H. Zhang, “A survey of nonlinear conjugate gradient methods,” Pacific Journal of Optimization, vol. 2, no. 1, pp. 35–58, 2006.
[22] M. F. McGuire and P. Wolfe, “Evaluating a restart procedure for conjugate gradients,” Report RC4382, IBM Research Center, Yorktown Heights, NY, USA, 1973.
[23] N. Y. Deng and Z. Li, “Global convergence of three terms conjugate gradient methods,” Optimization Methods and Software, vol. 4, pp. 273–282, 1995.
[24] L. Zhang, W. Zhou, and D. H. Li, “A descent modified Polak–Ribière–Polyak conjugate gradient method and its global convergence,” IMA Journal of Numerical Analysis, vol. 26, no. 4, pp. 629–640, 2006.
[25] L. Zhang, W. Zhou, and D. Li, “Some descent three-term conjugate gradient methods and their global convergence,” Optimization Methods and Software, vol. 22, no. 4, pp. 697–711, 2007.
[26] W. Cheng, “A two-term PRP-based descent method,” Numerical Functional Analysis and Optimization, vol. 28, no. 11, pp. 1217–1230, 2007.
[27] A. Y. Al-Bayati and W. H. Sharif, “A new three-term conjugate gradient method for unconstrained optimization,” Canadian Journal on Science and Engineering Mathematics, vol. 1, no. 5, pp. 108–124, 2010.
[28] J. Zhang, Y. Xiao, and Z. Wei, “Nonlinear conjugate gradient methods with sufficient descent condition for large-scale unconstrained optimization,” Mathematical Problems in Engineering, vol. 2009, Article ID 243290, 16 pages, 2009.
[29] N. Andrei, “A modified Polak–Ribière–Polyak conjugate gradient algorithm for unconstrained optimization,” Optimization, vol. 60, no. 12, pp. 1457–1471, 2011.
[30] N. Andrei, “On three-term conjugate gradient algorithms for unconstrained optimization,” Applied Mathematics and Computation, vol. 219, no. 11, pp. 6316–6327, 2013.
[31] N. Andrei, “A simple three-term conjugate gradient algorithm for unconstrained optimization,” Journal of Computational and Applied Mathematics, vol. 241, pp. 19–29, 2013.
[32] K. Sugiki, Y. Narushima, and H. Yabe, “Globally convergent three-term conjugate gradient methods that use secant conditions and generate descent search directions for unconstrained optimization,” Journal of Optimization Theory and Applications, vol. 153, no. 3, pp. 733–757, 2012.
[33] Y. Narushima, H. Yabe, and J. A. Ford, “A three-term conjugate gradient method with sufficient descent property for unconstrained optimization,” SIAM Journal on Optimization, vol. 21, no. 1, pp. 212–230, 2011.
[34] S. Babaie-Kafaki and R. Ghanbari, “Two modified three-term conjugate gradient methods with sufficient descent property,” Optimization Letters, vol. 8, no. 8, pp. 2285–2297, 2014.
[35] M. Al-Baali, Y. Narushima, and H. Yabe, “A family of three-term conjugate gradient methods with sufficient descent property for unconstrained optimization,” Computational Optimization and Applications, vol. 60, no. 1, pp. 89–110, 2015.
[36] M. Sun and J. Liu, “Three modified Polak–Ribière–Polyak conjugate gradient methods with sufficient descent property,” Journal of Inequalities and Applications, vol. 2015, no. 1, 2015.
[37] B. Baluch, Z. Salleh, A. Alhawarat, and U. A. M. Roslan, “A new modified three-term conjugate gradient method with sufficient descent property and its global convergence,” Journal of Mathematics, Article ID 2715854, 12 pages, 2017.
[38] Z.-F. Dai, “Two modified HS type conjugate gradient methods for unconstrained optimization problems,” Nonlinear Analysis: Theory, Methods & Applications, vol. 74, no. 3, pp. 927–936, 2011.
[39] Z. Wei, G. Li, and L. Qi, “New nonlinear conjugate gradient formulas for large-scale unconstrained optimization problems,” Applied Mathematics and Computation, vol. 179, no. 2, pp. 407–430, 2006.
[40] Z. Dai and F. Wen, “Another improved Wei–Yao–Liu nonlinear conjugate gradient method with sufficient descent property,” Applied Mathematics and Computation, vol. 218, no. 14, pp. 7421–7430, 2012.
[41] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[42] Y.-H. Dai and L.-Z. Liao, “New conjugacy conditions and related nonlinear conjugate gradient methods,” Applied Mathematics & Optimization, vol. 43, no. 1, pp. 87–101, 2001.
[43] G. Zoutendijk, “Nonlinear programming, computational methods,” in Integer and Nonlinear Programming, J. Abadie, Ed., pp. 37–86, North-Holland, Amsterdam, The Netherlands, 1970.
[44] Z.-F. Dai and B.-S. Tian, “Global convergence of some modified PRP nonlinear conjugate gradient methods,” Optimization Letters, vol. 5, no. 4, pp. 615–630, 2011.
[45] W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
[46] D. C. Liu and J. Nocedal, “On the limited memory BFGS method for large scale optimization,” Mathematical Programming, vol. 45, no. 1–3, pp. 503–528, 1989.
[47] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software, vol. 7, no. 1, pp. 17–41, 1981.
[48] N. Andrei, “An unconstrained optimization test functions collection,” Advanced Modeling and Optimization, vol. 10, no. 1, pp. 147–161, 2008.
[49] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.
Copyright
Copyright © 2018 Bakhtawar Baluch et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.