Nonlinear Conjugate Gradient Coefficients with Exact and Strong Wolfe Line Search Techniques
Nonlinear conjugate gradient (CG) methods are widely used for solving unconstrained optimization problems, and they have been the subject of extensive research aimed at improving them. Exact and strong Wolfe line search techniques are commonly used in the analysis and implementation of CG methods. To obtain better results, several studies have modified the classical CG methods. The method of Fletcher and Reeves (FR) is one of the best-known CG methods. It has strong convergence properties, but it gives poor numerical results in practice. The main goal of this paper is to improve the numerical performance of this method via a convexity-type modification of its coefficient. We show that, with this modification, the method still satisfies the sufficient descent condition and is globally convergent under both exact and strong Wolfe line searches. The numerical results show that the modified FR method is more robust and effective.
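The nonlinear conjugate gradient method is used to solve an unconstrained optimization problem, that is, to find the minimum value of a given function. The problem is usually expressed as follows:

$\min_{x \in \mathbb{R}^n} f(x)$, (1)

where $f: \mathbb{R}^n \to \mathbb{R}$ is a given function and $g(x) = \nabla f(x)$ denotes its gradient. To apply CG methods to (1), we start with an initial point $x_0$ and use the iterative form

$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots$, (2)

where $x_k$ is the current iterate, $\alpha_k > 0$ is a step-size determined by some line search, and $d_k$ is the search direction defined by

$d_k = -g_k$ if $k = 0$, and $d_k = -g_k + \beta_k d_{k-1}$ if $k \ge 1$, (3)

where $g_k = g(x_k)$ and $\beta_k$ is a scalar.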
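Different choices of the scalar $\beta_k$ correspond to different conjugate gradient algorithms. Over the years, several versions of this approach have been presented, some of which are now used extensively. There are at least six well-known formulas for $\beta_k$, which are listed below (Dai and Yuan, conjugate descent, Fletcher and Reeves, Liu and Storey, Hestenes and Stiefel, and Polak-Ribiere-Polyak), with $y_{k-1} = g_k - g_{k-1}$:

$\beta_k^{DY} = \|g_k\|^2 / (d_{k-1}^T y_{k-1})$, (4)

$\beta_k^{CD} = -\|g_k\|^2 / (d_{k-1}^T g_{k-1})$, (5)

$\beta_k^{FR} = \|g_k\|^2 / \|g_{k-1}\|^2$, (6)

$\beta_k^{LS} = -g_k^T y_{k-1} / (d_{k-1}^T g_{k-1})$, (7)

$\beta_k^{HS} = g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})$, (8)

$\beta_k^{PRP} = g_k^T y_{k-1} / \|g_{k-1}\|^2$. (9)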
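As an illustration, the six classical coefficients can be evaluated side by side. The following is a minimal sketch in plain Python that assumes the standard textbook definitions with $y_{k-1} = g_k - g_{k-1}$; the helper names `dot` and `cg_betas` and the sample vectors are hypothetical and chosen only for this example.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cg_betas(g_new, g_old, d_old):
    """Return the six classical CG coefficients for one iteration.

    g_new is g_k, g_old is g_{k-1}, and d_old is d_{k-1}.
    The formulas are the standard textbook ones (an assumption here).
    """
    y = [a - b for a, b in zip(g_new, g_old)]  # y_{k-1} = g_k - g_{k-1}
    gn2 = dot(g_new, g_new)
    go2 = dot(g_old, g_old)
    return {
        "DY": gn2 / dot(d_old, y),
        "CD": -gn2 / dot(d_old, g_old),
        "FR": gn2 / go2,
        "LS": -dot(g_new, y) / dot(d_old, g_old),
        "HS": dot(g_new, y) / dot(d_old, y),
        "PRP": dot(g_new, y) / go2,
    }


# Made-up data: first iteration (d_old = -g_old) with g_new orthogonal to g_old,
# which is the situation produced by an exact line search.
betas = cg_betas(g_new=[0.0, 2.0], g_old=[1.0, 0.0], d_old=[-1.0, 0.0])
for name, value in betas.items():
    print(name, value)
```

In this sample the new gradient is orthogonal to both the old gradient and the old direction, so all six formulas return the same value; they differ once that orthogonality is lost.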
Many authors have examined the global convergence of the $\beta_k$ formulas under various line searches over the years (see, for example, [1–4, 7–18]). When the objective function is a strongly convex quadratic and the line search is exact, these methods are identical: the gradients are mutually orthogonal, so the scalars $\beta_k$ in these methods are equal. When applied to general nonlinear functions with inexact line searches, the methods behave quite differently (see [11, 17, 19–21]). One of the most important properties of CG methods is global convergence. Zoutendijk proved the global convergence of the FR method under the exact line search. Although the FR, DY, and CD methods have strong convergence properties, they may not perform well in practice. The CD and DY methods were proved to be globally convergent under the strong Wolfe line search [3, 15]. Moreover, to our knowledge, the global convergence and sufficient descent property of some CG methods, such as PRP and HS, have not been established under the exact and strong Wolfe line searches [2, 14]. Andrei classified CG methods into three groups: classical CG methods, scaled CG methods, and hybrid and parameterized CG methods.
Formulas (4) to (9) belong to the classical group. One of the most important among them is the FR method. Its known defect is that when a bad direction and a small step from $x_{k-1}$ to $x_k$ are generated, the next search direction and the next step are likely to be poor as well unless a restart along the gradient direction is performed. Despite this defect, it is well known that the FR method is globally convergent for general nonlinear functions with exact or inexact line search.
Al-Baali proved the global convergence of the FR method when the strong Wolfe line search is used and the parameter $\sigma$ is restricted to $\sigma < 1/2$. Liu and Li extended Al-Baali’s result to the case $\sigma = 1/2$. Moreover, Gilbert and Nocedal investigated the global convergence properties of the dependent FR conjugate gradient method with $\beta_k$ satisfying $|\beta_k| \le \beta_k^{FR}$, provided that the line search satisfies the strong Wolfe conditions. For $\beta_k$ satisfying $|\beta_k| \le c\,\beta_k^{FR}$ (where $c > 1$), they gave an example showing that even an exact line search cannot guarantee global convergence of the dependent FR conjugate gradient method. Nosratipour and Keyvan proved that the dependent FR conjugate gradient method is globally convergent with the strong Wolfe line search if satisfies where , , , and .
Zhang and Donghui  proposed a modified FR method (called MFR) in which the direction is defined by which is a descent direction independent of the line search.
Competitive numerical and global convergence results are obtained by newly introduced or modified techniques; see for instance  and the references therein.
This paper is organized as follows: Section 2 is devoted to deriving the modification and introducing the algorithm for the modified FR method. In Section 3, we establish the sufficient descent and global convergence properties under exact and strong Wolfe line searches. Section 4 provides preliminary numerical results and discussion. Section 5 presents the conclusions.
2. Motivation and Properties
Several authors have attempted to modify classical CG methods such as FR, PRP, HS, and LS in order to produce new variants with sufficient descent and global convergence properties. In addition, the new variants are expected to perform efficiently in numerical tests. It is well known that the FR method has strong convergence properties but poor numerical performance. The main aim of this paper is to overcome this flaw by using the following modification of the FR formula: where is a scalar parameter to be determined later. Note that if , then and if , then . On the other hand, if , then is the modification of , where means the norm in . Formula (12) satisfies the following inequalities:
Algorithm 1. Step 1. Initialization. Given , , is tolerance, , when stop.
Step 2. Evaluate $\beta_k$ according to (12).
Step 3. Evaluate $d_k$ according to (3); if , then stop; otherwise, go to the next step.
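Step 4. Compute a step-size $\alpha_k$ using either the exact line search, i.e., $\alpha_k$ that satisfies $f(x_k + \alpha_k d_k) = \min_{\alpha \ge 0} f(x_k + \alpha d_k)$ (14), or an inexact (strong Wolfe) line search, i.e., $\alpha_k$ that satisfies $f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k$ (15) and $|g(x_k + \alpha_k d_k)^T d_k| \le \sigma |g_k^T d_k|$ (16), where $0 < \delta < \sigma < 1$.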
Step 5. Update the point $x_{k+1}$ according to (2); if , then stop.
Step 6. Set $k = k + 1$, and go to Step 2.
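The overall loop of Algorithm 1 can be sketched on a problem where the exact line search has a closed form. The example below is only an illustration under stated assumptions: it minimizes a strongly convex quadratic $f(x) = \tfrac{1}{2} x^T A x - b^T x$, it uses the classical FR coefficient as a stand-in because formula (12) is not reproduced here, and the function name `cg_fr_quadratic`, the tolerance, and the test matrix are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Product of a dense matrix (list of rows) with a vector."""
    return [dot(row, v) for row in A]


def cg_fr_quadratic(A, b, x0, eps=1e-10, max_iter=100):
    """Minimise 0.5*x^T A x - b^T x with CG, FR coefficient, exact line search.

    Assumption: A is symmetric positive definite, so the exact step along d
    is alpha = -(g^T d) / (d^T A d). FR is a stand-in for formula (12).
    """
    x = list(x0)
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient A x - b
    d = [-gi for gi in g]                              # d_0 = -g_0
    for k in range(max_iter):
        if dot(g, g) ** 0.5 <= eps:                    # stopping test on ||g_k||
            return x, k
        Ad = matvec(A, d)
        alpha = -dot(g, d) / dot(d, Ad)                # exact line search
        x = [xi + alpha * di for xi, di in zip(x, d)]  # update (2)
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        beta = dot(g_new, g_new) / dot(g, g)           # FR coefficient
        d = [-gn + beta * di for gn, di in zip(g_new, d)]  # direction (3)
        g = g_new
    return x, max_iter


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_star, iters = cg_fr_quadratic(A, b, [0.0, 0.0])
print(x_star, iters)
```

For a general nonlinear function the closed-form step is not available, and the exact step would be replaced by a numerical line search such as one enforcing the strong Wolfe conditions.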
The ideal way to choose the step length is via an exact line search. Since an exact line search is usually too expensive in practice, inexact line searches such as the strong Wolfe line search are used to determine a step length that gives a suitable reduction in the objective function at minimal cost. Nevertheless, the convergence properties of some CG methods such as FR, RMIL, and RMIL+ have been established under the exact line search (see [3, 6, 25]).
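To make the acceptance test of an inexact line search concrete, the sketch below checks whether a trial step satisfies the strong Wolfe conditions in their usual form: sufficient decrease, $f(x + \alpha d) \le f(x) + \delta \alpha g^T d$, and the curvature bound $|g(x + \alpha d)^T d| \le \sigma |g^T d|$. The parameter values $\delta = 10^{-4}$ and $\sigma = 0.1$, the function name `satisfies_strong_wolfe`, and the test function are assumptions for illustration, not values taken from this paper.

```python
def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_strong_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Check the strong Wolfe conditions for a trial step alpha along d.

    delta and sigma are illustrative defaults with 0 < delta < sigma < 1.
    """
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    slope0 = dot(grad(x), d)  # directional derivative at alpha = 0
    sufficient_decrease = f(x_new) <= f(x) + delta * alpha * slope0
    curvature = abs(dot(grad(x_new), d)) <= sigma * abs(slope0)
    return sufficient_decrease and curvature


def f(x):
    return x[0] ** 2 + x[1] ** 2


def grad(x):
    return [2.0 * x[0], 2.0 * x[1]]


x0 = [1.0, 0.0]
d0 = [-1.0, 0.0]  # steepest descent direction scaled by 1/2
for alpha in (0.5, 1.0, 2.0):
    print(alpha, satisfies_strong_wolfe(f, grad, x0, d0, alpha))
```

Here the step $\alpha = 1$ is the exact minimizer along the direction and is accepted, $\alpha = 0.5$ decreases the function but fails the curvature bound, and $\alpha = 2$ fails the sufficient decrease test.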
3. Convergence Analysis
In this section, we examine the convergence properties of . The main feature of Algorithm 1 is that it achieves the sufficient descent condition and global convergence under exact and strong Wolfe line searches.
3.1. Convergence Analysis via the Exact Line Search
In this subsection, we show that our modification (12) satisfies the sufficient descent condition and is globally convergent under the exact line search.
Theorem 1. Assume that the sequences in (2) and (3) are generated by Algorithm 1 and that is determined by the exact line search (14), where is given by (12). Then there exists , such that Proof. The proof of Theorem 1 is straightforward: from (3), multiply by ; then, When , (17) holds for all and becomes
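For readers who want the step spelled out, the usual argument behind a result of this kind can be written as follows. This is a sketch that assumes the direction has the standard form $d_k = -g_k + \beta_k d_{k-1}$ and that the sufficient descent condition reads $g_k^T d_k \le -c\|g_k\|^2$ for some $c > 0$; it does not depend on the particular formula used for $\beta_k$.

```latex
\begin{align*}
g_k^T d_k &= -\|g_k\|^2 + \beta_k\, g_k^T d_{k-1}
  && \text{(multiply the direction update by } g_k^T\text{)} \\
g_k^T d_{k-1} &= 0
  && \text{(first-order condition of the exact line search)} \\
\Longrightarrow\quad g_k^T d_k &= -\|g_k\|^2
  && \text{(sufficient descent with } c = 1\text{)}
\end{align*}
```

The second line holds because the exact step minimizes $f$ along $d_{k-1}$, so the derivative of $f(x_{k-1} + \alpha d_{k-1})$ with respect to $\alpha$ vanishes at the chosen step.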
3.1.1. Global Convergence Properties
In this subsection, we show that our modified coefficient (12) yields global convergence under the exact line search.
Assumption 2. (i) There exists some positive constant such that for all and for some neighborhood of . Also, assume that there exists a positive constant such that (ii) There exists a positive constant From the above assumption, we can easily see that
The following lemma will be used in our analysis (see Zoutendijk ).
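For reference, the Zoutendijk condition is usually stated as below. This is the standard textbook form, given here as an assumption about what the lemma contains, since its statement is not reproduced in this text. It holds for any iteration of the form (2) with a descent direction and a suitable line search, under boundedness and Lipschitz-gradient assumptions of the kind listed in Assumption 2.

```latex
\sum_{k=0}^{\infty} \frac{\left(g_k^T d_k\right)^2}{\|d_k\|^2} < \infty
```

Combined with a sufficient descent bound $g_k^T d_k \le -c\|g_k\|^2$, this gives $\sum_k \|g_k\|^4 / \|d_k\|^2 < \infty$, which is the inequality that contradiction arguments of this type typically aim to violate.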
Theorem 4. Assume that Assumption 2 holds for the conjugate gradient method (2) and (3), where is obtained via the exact line search. Furthermore, assume that the sufficient descent condition is satisfied. Then,
Proof. The proof is by contradiction. Assume that the statement of Theorem 4 is false. Thus, there exists some positive constant , where We rewrite (3) as Squaring both sides of the above equation, we get Dividing both sides of (27) by , we obtain From (19) and (13), we obtain Using (29) recursively and noting that , we get Hence, As a result of (32) and (25), it is clear that which contradicts the Zoutendijk condition in Lemma 3; hence, the proof is complete.
3.2. Convergence Analysis under the Strong Wolfe Line Search
3.2.1. Sufficient Descent Condition
Proof. Multiplying (3) by and using (13), and taking the absolute value of the second term in (36), we obtain From (16) and the Cauchy-Schwarz inequality, we get Dividing both sides of the above inequality by , we obtain By repeating this process and using the fact that , we get Therefore, from (39), we can deduce that (17) holds for . The proof is complete.
3.2.2. Global Convergence
This subsection is devoted to global convergence under an inexact line search. The following lemma is taken from .
Proof. The proof is by contradiction. Assume that the statement of Theorem 7 is not true; then there exists a constant , such that
Equation (3) can be written as
and multiplying (45) on both sides by , we obtain
Dividing both sides of (46) by with the help of (13), we get
From the Cauchy-Schwarz inequality, we get
So we come to
Referring to (39) and using the Cauchy-Schwarz inequality, we get
Substituting (51) into (49), we get
Using (52) recursively and noting that , we have As a result of (53) and (44), it can be concluded that which contradicts (43); hence, the proof is complete.
4. Numerical Results and Discussion
Most of the test problems used in this study are taken from Andrei , and they are used to compare the efficiency of the NMFR method with that of FR and PRP under the exact line search and with that of FR, PRP, and CD under the strong Wolfe line search. The step-size is computed using the exact and strong Wolfe line search techniques, and the numerical results are compared based on the number of iterations and CPU time. For all test problems, the stopping criterion is set to be , where ; for each test problem, various starting points are used, as suggested by Hillstrom . All runs are performed on an Acer PC (Intel® Core™ i3-3217U CPU @ 1.8 GHz, with 4.00 GB RAM, Windows 7 Ultimate). Every problem listed in Table 1 is solved using Matlab10 subroutine programming. The performance results are shown in Figures 1–4, using the performance profile introduced by Dolan and Moré .
In Table 1, IN denotes the number of iterations and CPU denotes the CPU time. F indicates that the method failed on the test problem; in some cases, this means that the computation halted because the line search failed to find a positive step-size.
The performance profile provides a way to evaluate and compare the effectiveness of a set of solvers on a test set. Assume there are $n_s$ solvers and $n_p$ problems; Dolan and Moré define $t_{p,s}$ as the cost (computational time, CPU time, or other factors) required to solve problem $p$ by solver $s$. They use the performance ratio $r_{p,s} = t_{p,s} / \min\{t_{p,s} : s \in S\}$ to compare the performance of solver $s$ on problem $p$ with the best performance by any solver on this problem.
We let $r_M \ge r_{p,s}$ for all $p$ and $s$, with $r_{p,s} = r_M$ whenever solver $s$ does not solve problem $p$. To obtain an overall assessment of solver $s$, we define $\rho_s(\tau) = \frac{1}{n_p}\,\mathrm{size}\{p \in P : r_{p,s} \le \tau\}$, where $\rho_s(\tau)$ is the probability that the performance ratio $r_{p,s}$ of solver $s$ is within a factor $\tau$ of the best possible ratio; $\rho_s$ is the cumulative distribution function of the performance ratio. The value $\rho_s(1)$ is the probability that solver $s$ wins over the rest of the solvers. Overall, a solver with high values of $\rho_s(\tau)$, that is, one whose curve lies toward the top right of the figures, is preferable and is regarded as more robust.
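The construction of a performance profile can be sketched in a few lines. The example below assumes the usual Dolan and Moré definitions: the performance ratio is the cost of a solver on a problem divided by the best cost of any solver on that problem, a failure is assigned an infinite ratio, and the profile value at $\tau$ is the fraction of problems whose ratio is at most $\tau$. The function name `performance_profile` and the timings are made up for illustration and are not the results reported in this paper.

```python
def performance_profile(times, tau_grid):
    """Compute performance profiles from per-problem costs.

    times maps a solver name to a list of positive costs, one per problem;
    None marks a failure. Returns, for each solver, the fraction of problems
    whose performance ratio is <= tau for each tau in tau_grid.
    """
    solvers = list(times)
    n_p = len(times[solvers[0]])
    # Best cost per problem among the solvers that succeeded.
    best = []
    for p in range(n_p):
        solved = [times[s][p] for s in solvers if times[s][p] is not None]
        best.append(min(solved) if solved else None)
    profile = {}
    for s in solvers:
        ratios = []
        for p in range(n_p):
            t = times[s][p]
            if t is None or best[p] is None:
                ratios.append(float("inf"))  # failure: ratio above any tau
            else:
                ratios.append(t / best[p])
        profile[s] = [sum(r <= tau for r in ratios) / n_p for tau in tau_grid]
    return profile


# Hypothetical CPU times for two solvers on four problems.
times = {
    "A": [1.0, 2.0, 4.0, 3.0],
    "B": [2.0, 1.0, None, 3.0],
}
profile = performance_profile(times, tau_grid=[1.0, 2.0, 4.0])
print(profile)
```

In this made-up data, solver A is the fastest (or tied) on three of four problems and solves all of them, so its curve starts at 0.75 and reaches 1.0, while solver B fails on one problem and its curve can never exceed 0.75.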
Figures 1 and 2 clearly show that NMFR has the best overall performance, since it solved all the test problems and reached 100%. The CG coefficients can be divided into three categories: the first consists of NMFR, the second of FR, and the third of PRP. Although the performance of the third category seems to be much better than that of NMFR, it could solve only 86% of the problems, whereas the second category reached only 89%. Hence, we consider NMFR an efficient and robust method compared with the others because it solves all the problems.
Also, Figures 3 and 4 show that the curve of NMFR is higher than that of PRP, CD, and FR. This implies that the NMFR approach outperforms the other three methods significantly. Furthermore, the NMFR approach solves all problems; meanwhile, the PRP method solves about 86 percent of problems, and the CD and FR methods solve about 89 percent. As a result, we can infer that NMFR is the preferred approach because it has the highest curve and solves all problems.
5. Conclusions
In this article, we have proposed a new and simple modification of $\beta_k^{FR}$ that is easy to implement, denoted $\beta_k^{NMFR}$. Numerical results show that $\beta_k^{NMFR}$ performs efficiently compared with other standard CG methods. In contrast to $\beta_k^{FR}$, $\beta_k^{NMFR}$ shows good numerical performance at each step. We have also proved that $\beta_k^{NMFR}$ converges globally under the exact and strong Wolfe line searches.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
References
R. Fletcher, Practical Methods of Optimization, Vol. 1: Unconstrained Optimization, John Wiley & Sons, New York, 1987.
N. Andrei, “An unconstrained optimization test functions collection,” Advanced Modeling and Optimization, vol. 10, no. 1, pp. 147–161, 2008.
N. Andrei, “Open problems in nonlinear conjugate gradient algorithms for unconstrained optimization,” Bulletin of the Malaysian Mathematical Sciences Society, vol. 34, no. 2, 2011.
A. Abdelrahman, O. O. O. Yousif, M. Mhammed, and M. K. Elbashir, “Global convergence of nonlinear conjugate gradient coefficients with inexact line search,” The Scientific Journal of King Faisal University: Basic and Applied Sciences, vol. 22, no. 2, pp. 86–89, 2021.
Y. H. Dai and Y. Yuan, Nonlinear Conjugate Gradient Methods, Shanghai Science and Technology Publisher, Shanghai, 2000.
Y. Dai, “Analyses of conjugate gradient methods (Ph.D. thesis),” Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, 1997.
G. Zoutendijk, “Nonlinear programming computational methods,” Integer and Nonlinear Programming, North Holland, Amsterdam, pp. 37–86, 1970.
H. Nosratipour and A. Keyvan, “A descent PRP conjugate gradient method for unconstrained optimization,” TWMS Journal of Applied and Engineering Mathematics, vol. 9, no. 3, pp. 535–548, 2019.