Modification of Nonlinear Conjugate Gradient Method with Weak Wolfe-Powell Line Search

Alhawarat, Ahmad; Salleh, Zabidin

doi:https://doi.org/10.1155/2017/7238134

Abstract and Applied Analysis

Research Article | Open Access

Volume 2017 | Article ID 7238134 | https://doi.org/10.1155/2017/7238134

Ahmad Alhawarat¹ and Zabidin Salleh²

Academic Editor: Patricia J. Y. Wong

Received 21 Nov 2016

Revised 05 Feb 2017

Accepted 12 Feb 2017

Published 05 Mar 2017

Abstract

The conjugate gradient (CG) method is used to find the optimum solution of large-scale unconstrained optimization problems. Owing to its simple algorithm, low memory requirement, and the speed with which it obtains a solution, this method is widely used in many fields, such as engineering, computer science, and medical science. In this paper, we modify the CG method to achieve global convergence with various line searches. In addition, the method satisfies the sufficient descent condition without any line search. Numerical computations under the weak Wolfe-Powell line search show that the new method is more efficient than other conventional methods.

1. Introduction

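The nonlinear CG method is a useful tool for finding the minimum value of a function in unconstrained optimization problems. Let us consider the following form:

$$\min \{ f(x) : x \in \mathbb{R}^n \}, \tag{1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and its gradient is denoted by $g(x) = \nabla f(x)$. The method generates a sequence of points $\{x_k\}$, starting from an initial point, by the iterative formula

$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$

where $x_k$ is the current iteration point and $\alpha_k > 0$ is the step size obtained by some line search. The search direction $d_k$ is defined by

$$d_k = \begin{cases} -g_k, & \text{at the first iteration}, \\ -g_k + \beta_k d_{k-1}, & \text{otherwise}, \end{cases} \tag{3}$$

where $g_k = g(x_k)$ and $\beta_k$ is known as the conjugate gradient coefficient.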

The strong Wolfe-Powell (SWP) line search is the most popular inexact line search. It requires a sufficient reduction in the function value and narrows the search interval to find the step length. In addition, it forces the step length to be close to a stationary point or local minimum of the function along the search direction, so it is a useful method for finding the step size:

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{4}$$

$$|g(x_k + \alpha_k d_k)^T d_k| \le \sigma |g_k^T d_k|, \tag{5}$$

where $0 < \delta < \sigma < 1$. In fact, the SWP line search is a modification of the weak Wolfe-Powell (WWP) line search, in which the step length satisfies (4) and

$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k. \tag{6}$$

However, the WWP line search may accept a step length far from a stationary point or local minimum of the function. Dai [1] proposed two Armijo-type line searches. With the first, global convergence has been established for the FR, nonnegative PRP, and CD methods of the form (2) and (3). To establish global convergence of the original PRP method, he designed a second line search, stated as follows.

Given a constant , and determine the smallest integer , if it defines , then the vectors and given by (2) and (3) satisfy (4) withwhere and are two constants.
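The difference between the weak and strong Wolfe-Powell conditions can be checked numerically. The following is a minimal sketch, assuming the usual textbook forms of conditions (4), (5), and (6); the function name `wolfe_conditions`, the test function, and the parameter values `delta=1e-4` and `sigma=0.9` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def wolfe_conditions(f, grad, x, d, alpha, delta=1e-4, sigma=0.9):
    """Return (armijo, weak_curvature, strong_curvature) for a trial step alpha.

    armijo           : f(x + a d) <= f(x) + delta * a * g^T d      (condition (4))
    strong_curvature : |g(x + a d)^T d| <= sigma * |g^T d|         (condition (5))
    weak_curvature   : g(x + a d)^T d >= sigma * g^T d             (condition (6))
    """
    g0_d = float(np.dot(grad(x), d))        # directional derivative at x
    x_new = x + alpha * d
    g1_d = float(np.dot(grad(x_new), d))    # directional derivative at the trial point
    armijo = f(x_new) <= f(x) + delta * alpha * g0_d
    weak = g1_d >= sigma * g0_d
    strong = abs(g1_d) <= sigma * abs(g0_d)
    return armijo, weak, strong

# Hypothetical test problem: f(x) = 0.5 * ||x||^2 with the steepest descent direction.
f = lambda x: 0.5 * float(np.dot(x, x))
grad = lambda x: x
x0 = np.array([1.0, -2.0])
d0 = -grad(x0)

exact = wolfe_conditions(f, grad, x0, d0, alpha=1.0)       # exact minimizer along d0
long_step = wolfe_conditions(f, grad, x0, d0, alpha=1.95)  # overshoots the minimizer
print(exact, long_step)
```

In this example the step `alpha=1.95` overshoots the one-dimensional minimizer, yet it still passes (4) and (6); only the strong condition (5) rejects it, which illustrates why WWP may accept steps far from a stationary point.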

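The most popular formulas for $\beta_k$ are as follows: Hestenes-Stiefel (HS) [2], Fletcher-Reeves (FR) [3], Polak-Ribière-Polyak (PRP) [4], Conjugate Descent (CD) [5], Liu-Storey (LS) [6], Dai-Yuan (DY) [7], Wei et al. (WYL) [8], and Hager and Zhang (HZ) [9]:

$$\begin{aligned}
\beta_k^{HS} &= \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, &
\beta_k^{FR} &= \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, &
\beta_k^{PRP} &= \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \\
\beta_k^{CD} &= -\frac{\|g_k\|^2}{d_{k-1}^T g_{k-1}}, &
\beta_k^{LS} &= -\frac{g_k^T y_{k-1}}{d_{k-1}^T g_{k-1}}, &
\beta_k^{DY} &= \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \\
\beta_k^{WYL} &= \frac{g_k^T \left( g_k - \frac{\|g_k\|}{\|g_{k-1}\|} g_{k-1} \right)}{\|g_{k-1}\|^2}, &
\beta_k^{HZ} &= \max\{\beta_k^{N}, \eta_k\}, &
\beta_k^{N} &= \frac{1}{d_{k-1}^T y_{k-1}} \left( y_{k-1} - 2 d_{k-1} \frac{\|y_{k-1}\|^2}{d_{k-1}^T y_{k-1}} \right)^T g_k,
\end{aligned} \tag{8}$$

where $y_{k-1} = g_k - g_{k-1}$, with $\eta_k = -1 / \left( \|d_{k-1}\| \min\{\eta, \|g_{k-1}\|\} \right)$ and $\eta > 0$ being a constant.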
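For readers who want to experiment, the classical coefficients can be written as a small helper. This is a sketch using the standard textbook definitions with $y_{k-1} = g_k - g_{k-1}$; the function name `cg_betas` and the sample vectors are hypothetical, and the HZ coefficient is omitted for brevity.

```python
import numpy as np

def cg_betas(g_new, g_old, d_old):
    """Classical CG coefficients from g_k (g_new), g_{k-1} (g_old), d_{k-1} (d_old)."""
    y = g_new - g_old                      # y_{k-1} = g_k - g_{k-1}
    gg_new = float(np.dot(g_new, g_new))   # ||g_k||^2
    gg_old = float(np.dot(g_old, g_old))   # ||g_{k-1}||^2
    gy = float(np.dot(g_new, y))           # g_k^T y_{k-1}
    dy = float(np.dot(d_old, y))           # d_{k-1}^T y_{k-1}
    dg = float(np.dot(d_old, g_old))       # d_{k-1}^T g_{k-1}
    wyl_num = gg_new - np.sqrt(gg_new / gg_old) * float(np.dot(g_new, g_old))
    return {
        "HS": gy / dy,
        "FR": gg_new / gg_old,
        "PRP": gy / gg_old,
        "CD": -gg_new / dg,
        "LS": -gy / dg,
        "DY": gg_new / dy,
        "WYL": wyl_num / gg_old,
    }

# Hypothetical data mimicking exact line searches:
# g_k^T d_{k-1} = 0 and d_{k-1}^T g_{k-1} = -||g_{k-1}||^2.
g_old = np.array([1.0, 0.0])
d_old = np.array([-1.0, 1.0])
g_new = np.array([2.0, 2.0])
betas = cg_betas(g_new, g_old, d_old)
print(betas)
```

With these vectors the HS, PRP, and LS values coincide, as do the FR, DY, and CD values, which is the grouping that appears under exact line search.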

The global convergence of the FR method with exact line search was established by Zoutendijk [10]. Al-Baali [11] proved that the FR method is globally convergent under the strong Wolfe conditions when $\sigma < 1/2$, and later Liu et al. [12] extended the result to $\sigma = 1/2$. The numerical behavior of the FR method is unpredictable: in a few cases it is as efficient as the PRP method, but in general it is very slow. In addition, the DY and CD methods have the same performance as the FR method under exact line search, with strong global convergence. Global convergence of the PRP method for convex objective functions under exact line search was proved by Polak and Ribière in 1969 [4]. Later, Powell gave a counterexample showing that there exists a nonconvex function on which the PRP method does not converge globally, even though the exact line search is used. Powell suggested that, to achieve global convergence of the PRP method, $\beta_k$ should not be negative. Gilbert and Nocedal [13] proved that the nonnegative PRP method is globally convergent with the Wolfe-Powell line search. The HS and LS methods have the same performance as the PRP method with exact line search. Overall, the PRP method is the most efficient when compared with the other conjugate gradient methods. For more, the reader can see [14–19].

In 2006, Wei et al. [8] proposed a new positive CG method that resembles the original PRP method. It has been studied under both exact and inexact line searches, and many modifications have appeared, such as those in [20–23].

A little modification from , Zhang [21] presented the following CG method: In the same manner, construct the following CG by using the denominator of :In addition, is constructed by using the numerator of :where and .

The descent condition plays an important role in the CG method and is given by

$$g_k^T d_k < 0. \tag{12}$$

If we extend (12) to the form

$$g_k^T d_k \le -c \|g_k\|^2, \quad c > 0, \tag{13}$$

then the search direction satisfies the sufficient descent condition.

In this paper, we will present the new formula and the algorithm in Section 2. Furthermore, we will establish the global convergence of our method with several line searches in Section 3. Numerical results with conclusion will be presented in Sections 4 and 5, respectively.

2. The Modified Formula

In this section, is presented which is extended to and method; that is,where means the Euclidean norm, and .

Algorithm 1.
Step 1 (initialization). Given , set .
Step 2. Compute $\beta_k$ based on (14).
Step 3. Compute $d_k$ based on (3). If $\|g_k\| = 0$, then stop.
Step 4. Compute based on some line search; we use in numerical section WWP line search with and .
Step 5. Update the new point $x_{k+1}$ based on (2).
Step 6. Convergent test and stopping criteria: if and then stop; otherwise, go to Step 1 with .
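
The loop structure of Algorithm 1 can be sketched in Python as follows. This sketch does not implement the coefficient (14): it plugs in the nonnegative PRP coefficient as a stand-in. The function names (`weak_wolfe`, `cg_minimize`, `beta_prp_plus`), the bisection-style line search, the parameter values `delta=1e-4` and `sigma=0.1`, the restart safeguard, the gradient-norm stopping test, and the quadratic test problem are all assumptions made for illustration.

```python
import numpy as np

def weak_wolfe(f, grad, x, d, delta=1e-4, sigma=0.1, max_iter=50):
    """Bisection/expansion search for a step satisfying the weak Wolfe conditions."""
    lo, hi = 0.0, np.inf
    alpha = 1.0
    fx = f(x)
    gd = float(np.dot(grad(x), d))          # must be negative (descent direction)
    for _ in range(max_iter):
        if f(x + alpha * d) > fx + delta * alpha * gd:
            hi = alpha                       # sufficient decrease fails: shrink
            alpha = 0.5 * (lo + hi)
        elif float(np.dot(grad(x + alpha * d), d)) < sigma * gd:
            lo = alpha                       # curvature condition fails: enlarge
            alpha = 2.0 * lo if np.isinf(hi) else 0.5 * (lo + hi)
        else:
            break                            # both conditions hold
    return alpha

def beta_prp_plus(g_new, g_old, d_old):
    """Nonnegative PRP coefficient, used here only as a stand-in for (14)."""
    return max(0.0, float(np.dot(g_new, g_new - g_old)) / float(np.dot(g_old, g_old)))

def cg_minimize(f, grad, x0, beta_fn, tol=1e-6, max_iter=1000):
    """Generic nonlinear CG loop following the step order of Algorithm 1."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                   # first direction: steepest descent
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:         # assumed stopping test
            return x, k
        alpha = weak_wolfe(f, grad, x, d)    # step size (Step 4)
        x = x + alpha * d                    # new point (Step 5)
        g_new = grad(x)
        d = -g_new + beta_fn(g_new, g, d) * d
        if float(np.dot(g_new, d)) >= 0.0:   # safeguard: restart if not a descent direction
            d = -g_new
        g = g_new
    return x, max_iter

# Hypothetical test problem: f(x) = 0.5 x^T A x - b^T x with A = diag(1, 10).
A = np.diag([1.0, 10.0])
b = np.array([1.0, 1.0])
f = lambda x: 0.5 * float(x @ A @ x) - float(b @ x)
grad = lambda x: A @ x - b

x_star, iters = cg_minimize(f, grad, np.zeros(2), beta_prp_plus)
print(x_star, iters)
```

Passing the coefficient as `beta_fn` keeps the loop independent of the particular formula, so a different $\beta_k$ can be tried by supplying another function with the same three arguments.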

3. The Global Convergence Analysis for Method

The following assumption is needed in the theorems below.

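Assumption 2. (I) $f$ is bounded from below on the level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$, where $x_0$ is the starting point.
(II) In some neighborhood $N$ of $\Omega$, $f$ is continuously differentiable, and its gradient is Lipschitz continuous; that is, there exists a constant $L > 0$ such that $\|g(x) - g(y)\| \le L \|x - y\|$ for all $x, y \in N$.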

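Lemma 3. Let Assumption 2 hold. Consider any method of the form (2) and (3) in which the step size $\alpha_k$ satisfies the WWP line search (4) and (6) and the search direction $d_k$ is a descent direction. Then the following condition holds:

$$\sum_{k=0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty. \tag{15}$$

Substituting (13) into (15), it follows that

$$\sum_{k=0}^{\infty} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty. \tag{16}$$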
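
The substitution step in Lemma 3 can be spelled out as follows. This is a short sketch, assuming that (13) has the usual form $g_k^T d_k \le -c\|g_k\|^2$ with a constant $c > 0$ and that (15) is the Zoutendijk sum over $(g_k^T d_k)^2 / \|d_k\|^2$; the summation index is left generic.

```latex
% Sufficient descent makes g_k^T d_k negative, so squaring reverses the inequality:
g_k^T d_k \le -c\,\|g_k\|^2 < 0
\quad\Longrightarrow\quad
(g_k^T d_k)^2 \ge c^2\,\|g_k\|^4 .

% Dividing by \|d_k\|^2 and summing, the Zoutendijk condition bounds the new series:
\sum_{k} \frac{\|g_k\|^4}{\|d_k\|^2}
\;\le\; \frac{1}{c^2} \sum_{k} \frac{(g_k^T d_k)^2}{\|d_k\|^2}
\;<\; \infty .
```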

3.1. The Sufficient Descent Condition with Convergence Properties for SWP Line Search

Theorem 4. Let sequences and be generated by methods (2), (3), and (14); then (13) holds, where .

Proof. We use proof by induction. From (3), we know that for it is hold. Suppose that it is true until ; that is, thenNow multiply (3) by :where . Take and complete the proof.

3.2. Global Convergence under WWP Line Search

Gilbert and Nocedal [13] presented an important theorem for establishing the global convergence of the nonnegative part of the PRP method; it is summarized in Theorem 5. In addition, [13] introduced a useful property, called Property (*), which plays an important role in the study of CG methods.

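Property (*). Consider a method of the form (2) and (3), and suppose that $0 < \gamma \le \|g_k\| \le \bar{\gamma}$ for all $k$. We say that the method possesses Property (*) if there exist constants $b > 1$ and $\lambda > 0$ such that, for all $k$, $|\beta_k| \le b$, and if $\|x_k - x_{k-1}\| \le \lambda$, then $|\beta_k| \le \frac{1}{2b}$.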

Theorem 5 (see [13]). Consider any CG method of the form (2) and (3) for which the following conditions hold:
(I) $\beta_k \ge 0$
(II) The sufficient descent condition (13)
(III) The Zoutendijk condition
(IV) Property (*)
(V) Assumption 2
Then the iterates are globally convergent.

Lemma 6. Suppose that Assumption 2 holds with Algorithm 1; then satisfy Property .

Proof. Since and since satisfies Property , also achieves Property ; for more we suggest that the reader reads Lemma 3.6 [24]. The proof is completed.

The following corollary is a result from Theorem 5 and Lemma 3.

Corollary 7. Let the sequence $\{x_k\}$ be generated by Algorithm 1. If Assumption 2 holds, then for any line search that satisfies the Zoutendijk condition, we have $\liminf_{k \to \infty} \|g_k\| = 0$.

3.3. Global Convergence Properties for Armijo Type Line Search

Theorem 8. Suppose Assumption 2 is true. Consider the methods of form (2) and (3) with , and is obtained by (4) and (7). Then we have .

Proof. By using Lemma 2.8 in [1], we achieve Using (2) and (7), then From (2), (4), (7), and (20), we have From Assumption 2 and (21), we obtain From (3), Using (23), (13), (14), and (24), thenwhere . Take the limit and use (22), and then we have . The proof is completed.

4. Numerical Results and Discussions

To analyze the efficiency of the new method, we selected some of the test functions in Table 1 from CUTEr [25], Andrei [26], and Adorio and Diliman [24]. We performed a comparison with other CG methods, including NPRP and DPRP methods using weak Wolfe-Powell line search with . The tolerance is selected to for all algorithms to investigate the rapidity of the iteration methods towards the optimal. The gradient value is taken as the stopping criteria. Here, the stopping criteria considered . Since the parameters NPRP and DPRP are tested based on weak Wolfe-Powell line search, the modified parameters are tested based on weak Wolfe line search with values of and . In addition, the values of and are for and DPRP parameters, respectively.

We used a Matlab 7.9 subroutine program on a machine with an Intel(R) Core(TM) i3 CPU and 2 GB of DDR2 RAM, under strong Wolfe line search. The performance results are shown in Figures 1 and 2, respectively, using the performance profile introduced by Dolan and Moré [27]. This performance measure was introduced to compare a set of solvers $S$ on a set of problems $P$. Assuming $n_s$ solvers and $n_p$ problems in $S$ and $P$, respectively, the measure $t_{p,s}$ is defined as the computation time (e.g., the number of iterations or the CPU time) required for solver $s$ to solve problem $p$.

To create a baseline for comparison, the performance of solver $s$ on problem $p$ is scaled by the best performance of any solver in $S$ on that problem, using the ratio

$$r_{p,s} = \frac{t_{p,s}}{\min\{t_{p,s} : s \in S\}}.$$

Let a parameter $r_M \ge r_{p,s}$ for all $p, s$ be selected, and further assume that $r_{p,s} = r_M$ if and only if solver $s$ does not solve problem $p$. As we would like to obtain an overall assessment of the performance of a solver, we define the measure

$$\rho_s(\tau) = \frac{1}{n_p} \, \mathrm{size}\{p \in P : r_{p,s} \le \tau\}.$$

Thus, $\rho_s(\tau)$ is the probability for solver $s \in S$ that the performance ratio $r_{p,s}$ is within a factor $\tau \in \mathbb{R}$ of the best possible ratio. If we regard $\rho_s$ as the cumulative distribution function of the performance ratio, then the performance measure $\rho_s : \mathbb{R} \to [0, 1]$ for a solver is nondecreasing, piecewise constant, and continuous from the right. The value of $\rho_s(1)$ is the probability that the solver achieves the best performance of all of the solvers. In general, a solver with high values of $\rho_s(\tau)$, which would appear in the upper right corner of the figure, is preferable.
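
The performance profile described above is straightforward to compute from a table of costs. The sketch below is illustrative only: the function name `performance_profile` and the cost table are hypothetical, a failed run is encoded as `np.inf` rather than by a finite cap on the ratio, and every problem is assumed to be solved by at least one solver.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile.

    T[p, s] is the cost (iterations, CPU time, ...) of solver s on problem p,
    with np.inf marking a failure. Returns rho[s, i], the fraction of problems
    whose performance ratio for solver s is at most taus[i].
    """
    T = np.asarray(T, dtype=float)
    best = T.min(axis=1, keepdims=True)   # best cost on each problem
    ratios = T / best                     # performance ratios; failures stay inf
    return np.array([[np.mean(ratios[:, s] <= tau) for tau in taus]
                     for s in range(T.shape[1])])

# Hypothetical costs: 4 problems (rows), 2 solvers (columns).
T = [[10.0, 20.0],
     [30.0, 15.0],
     [5.0, np.inf],    # solver 1 fails on this problem
     [8.0, 8.0]]
rho = performance_profile(T, taus=[1.0, 2.0, 4.0])
print(rho)
```

Here the first column of `rho` is the share of problems on which each solver is (jointly) the best, and a row that never reaches 1 indicates a solver that failed on some problem.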

It is clear that the new parameter is strongly competitive with the NPRP parameter, and slightly better in some cases, across all graphs in Figures 1, 2, 3, and 4, which cover the number of iterations, CPU time, gradient evaluations, and function evaluations. On the other hand, the new parameter outperforms the DPRP parameter in all performance profiles.

5. Conclusion

In this paper, we proposed a new modification of the conjugate gradient method extended from the NPRP method. Our numerical results show that the new coefficient is comparable to other conventional CG methods. The method converges globally with several line searches and yields descent directions. In future work, we will focus on speed using hybrid methods. Additionally, we will try to compare several line searches with modern CG methods.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] Y.-H. Dai, “Conjugate gradient methods with Armijo-type line searches,” Acta Mathematicae Applicatae Sinica, vol. 18, no. 1, pp. 123–130, 2002.
[2] M. R. Hestenes and E. Stiefel, Methods of Conjugate Gradients for Solving Linear Systems, vol. 49, National Bureau of Standards, Washington, DC, USA, 1952.
[3] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” The Computer Journal, vol. 7, no. 2, pp. 149–154, 1964.
[4] E. Polak and G. Ribière, “Note sur la convergence de méthodes de directions conjuguées,” Revue Française d'Automatique, Informatique, Recherche Opérationnelle, vol. 3, no. 16, pp. 35–43, 1969.
[5] R. Fletcher, Practical Methods of Optimization, Wiley-Interscience, John Wiley & Sons, New York, NY, USA, 2nd edition, 2001.
[6] Y. Liu and C. Storey, “Efficient generalized conjugate gradient algorithms. I. Theory,” Journal of Optimization Theory and Applications, vol. 69, no. 1, pp. 129–137, 1991.
[7] Y. H. Dai and Y. Yuan, Nonlinear Conjugate Gradient Methods, Shanghai Science and Technology Publisher, Shanghai, China, 2000.
[8] Z. Wei, S. Yao, and L. Liu, “The convergence properties of some new conjugate gradient methods,” Applied Mathematics and Computation, vol. 183, no. 2, pp. 1341–1350, 2006.
[9] W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
[10] G. Zoutendijk, “Nonlinear programming, computational methods,” Integer and Nonlinear Programming, pp. 37–86, 1970.
[11] M. Al-Baali, “Descent property and global convergence of the Fletcher-Reeves method with inexact line search,” IMA Journal of Numerical Analysis, vol. 5, no. 1, pp. 121–124, 1985.
[12] G. H. Liu, J. Y. Han, and H. X. Yin, “Global convergence of the Fletcher-Reeves algorithm with inexact linesearch,” Applied Mathematics, vol. 10, no. 1, pp. 75–82, 1995.
[13] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[14] A. Alhawarat, M. Mamat, M. Rivaie, and I. Mohd, “A new modification of nonlinear conjugate gradient coefficients with global convergence properties,” World Academy of Science, Engineering and Technology, International Science Index 85, International Journal of Mathematical, Computational, Physical and Quantum Engineering, vol. 8, no. 1, pp. 54–60, 2014.
[15] A. Alhawarat, M. Mamat, M. Rivaie, and Z. Salleh, “An efficient hybrid conjugate gradient method with the strong Wolfe-Powell line search,” Mathematical Problems in Engineering, vol. 2015, Article ID 103517, 7 pages, 2015.
[16] Z. Salleh and A. Alhawarat, “An efficient modification of the Hestenes-Stiefel nonlinear conjugate gradient method with restart property,” Journal of Inequalities and Applications, vol. 2016, no. 1, article no. 110, 2016.
[17] A. Alhawarat, Z. Salleh, M. Mamat, and M. Rivaie, “An efficient modified Polak–Ribière–Polyak conjugate gradient method with global convergence properties,” Optimization Methods and Software, pp. 1–14, 2016.
[18] M. Al-Baali, Y. Narushima, and H. Yabe, “A family of three-term conjugate gradient methods with sufficient descent property for unconstrained optimization,” Computational Optimization and Applications, vol. 60, no. 1, pp. 89–110, 2015.
[19] W. W. Hager and H. Zhang, “The limited memory conjugate gradient method,” SIAM Journal on Optimization, vol. 23, no. 4, pp. 2150–2168, 2013.
[20] Y. Shengwei, Z. Wei, and H. Huang, “A note about WYL's conjugate gradient method and its applications,” Applied Mathematics and Computation, vol. 191, no. 2, pp. 381–388, 2007.
[21] L. Zhang, “An improved Wei-Yao-Liu nonlinear conjugate gradient method for optimization computation,” Applied Mathematics and Computation, vol. 215, no. 6, pp. 2269–2274, 2009.
[22] Z. Dai and F. Wen, “Another improved Wei-Yao-Liu nonlinear conjugate gradient method with sufficient descent property,” Applied Mathematics and Computation, vol. 218, no. 14, pp. 7421–7430, 2012.
[23] H. Huang and S. Lin, “A modified Wei–Yao–Liu conjugate gradient method for unconstrained optimization,” Applied Mathematics and Computation, vol. 231, pp. 179–186, 2014.
[24] E. P. Adorio and U. P. Diliman, “Mvf-multivariate test functions library in C for unconstrained global optimization,” 2005.
[25] I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environment,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
[26] N. Andrei, “An unconstrained optimization test functions collection,” Advanced Modeling and Optimization, vol. 10, no. 1, pp. 147–161, 2008.
[27] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.

Copyright

Copyright © 2017 Ahmad Alhawarat and Zabidin Salleh. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
