An Efficient Hybrid Conjugate Gradient Method with the Strong Wolfe-Powell Line Search

Alhawarat, Ahmad; Mamat, Mustafa; Rivaie, Mohd; Salleh, Zabidin

doi:https://doi.org/10.1155/2015/103517

Mathematical Problems in Engineering

Research Article | Open Access

Volume 2015 | Article ID 103517 | https://doi.org/10.1155/2015/103517

Ahmad Alhawarat,¹ Mustafa Mamat,² Mohd Rivaie,³ and Zabidin Salleh¹

Academic Editor: Haipeng Peng

Received 11 Mar 2015

Revised 06 Jul 2015

Accepted 07 Jul 2015

Published 22 Jul 2015

Abstract

The conjugate gradient (CG) method is a widely used tool for solving optimization problems in many fields, such as design, economics, physics, and engineering. In this paper, we present a new hybrid CG method related to the well-known Polak-Ribière-Polyak (PRP) formula. It offers a remedy for the cases in which the PRP method is not globally convergent under the strong Wolfe-Powell (SWP) line search. The new formula satisfies the sufficient descent condition and possesses the global convergence property. In addition, we explain the cases in which the PRP method fails with the SWP line search. Furthermore, we report numerical results on a set of standard test functions, in which the new hybrid CG method outperforms other related PRP formulas on most problems in both the number of iterations and the CPU time.

1. Introduction

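The nonlinear conjugate gradient (CG) method is a useful tool for finding the minimum of unconstrained optimization problems. Consider the following form:

$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function and its gradient is denoted by $g(x) = \nabla f(x)$. The CG method generates a sequence of points $\{x_k\}$, $k = 0, 1, 2, \ldots$, starting from an initial point $x_0$, by the iterative formula

$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$

where $x_k$ is the current iterate and $\alpha_k$ is the step length obtained by some line search. The search direction $d_k$ is defined by

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{3}$$

where $g_k = g(x_k)$ and $\beta_k$ is known as the CG method, formula, or coefficient.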

To find the step length $\alpha_k$ we can use the exact line search, which is given by

$$f(x_k + \alpha_k d_k) = \min_{\alpha \ge 0} f(x_k + \alpha d_k). \tag{4}$$

In fact, (4) is not an efficient line search in practice, since it requires many function and gradient evaluations. Therefore, we prefer an inexpensive inexact line search. The strong Wolfe-Powell (SWP) line search [1, 2] is given by

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{5}$$

$$\left| g(x_k + \alpha_k d_k)^T d_k \right| \le \sigma \left| g_k^T d_k \right|, \tag{6}$$

where $0 < \delta < \sigma < 1$. It seeks an approximation of the exact step length for which the descent property (see (14)) is satisfied, without searching too long along the direction $d_k$ when $x_k$ is far from the solution. Thus, by using the SWP line search we inherit the advantages of the exact line search at a low computational cost. However, different choices of $\alpha_k$ and $\beta_k$ yield different CG methods. The SWP line search is a stronger version of the weak Wolfe-Powell (WWP) line search, which is given by (5) and

$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k. \tag{7}$$

The CG method has attracted renewed attention because of its simplicity, numerical efficiency, and low memory requirements. Thus, it is widely used in engineering, medical science, and other fields. As an application in engineering, the CG method can be used to solve real-life problems similar to the one in [3]. The CG method is limited to functions whose gradient is available. For general functions, a heuristic algorithm [4] can be used as an alternative. A heuristic algorithm finds an approximate solution for the objective function within an acceptable time. In addition, heuristic algorithms can be applied without using computers. We refer the reader to [5–7] for some applications of such algorithms.
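The two SWP conditions can be checked numerically for a candidate step length. The following is a minimal sketch, assuming the standard form of the strong Wolfe conditions (sufficient decrease with parameter `delta`, curvature bound with parameter `sigma`); the helper names `dot` and `satisfies_swp`, the default parameter values, and the small quadratic test function are illustrative choices, not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_swp(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Return True if alpha satisfies both strong Wolfe-Powell conditions."""
    slope0 = dot(grad(x), d)  # g_k^T d_k, negative for a descent direction
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    # Sufficient decrease: f(x + alpha d) <= f(x) + delta * alpha * g^T d
    decrease_ok = f(x_new) <= f(x) + delta * alpha * slope0
    # Curvature: |g(x + alpha d)^T d| <= sigma * |g^T d|
    curvature_ok = abs(dot(grad(x_new), d)) <= sigma * abs(slope0)
    return decrease_ok and curvature_ok


# Hypothetical test problem: f(x) = x1^2 + 5 * x2^2
def f(x):
    return x[0] ** 2 + 5.0 * x[1] ** 2


def grad(x):
    return [2.0 * x[0], 10.0 * x[1]]


x0 = [1.0, 1.0]
d0 = [-gi for gi in grad(x0)]  # steepest descent direction (-2, -10)
alpha_exact = 104.0 / 1008.0   # exact minimizer along d0 for this quadratic

print(satisfies_swp(f, grad, x0, d0, alpha_exact))  # exact step passes
print(satisfies_swp(f, grad, x0, d0, 1e-3))         # too short: curvature fails
print(satisfies_swp(f, grad, x0, d0, 1.0))          # too long: decrease fails
```

On this example the exact step is accepted, while a very short step violates the curvature condition and a very long step violates the sufficient decrease condition, which is the behavior the two inequalities are meant to enforce.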

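The most popular formulas for $\beta_k$ are Hestenes-Stiefel (HS) [8], Fletcher-Reeves (FR) [9], Polak-Ribière-Polyak (PRP) [10], and Wei et al. (WYL) [11], given respectively by

$$\beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \tag{8}$$

$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \tag{9}$$

$$\beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \tag{10}$$

$$\beta_k^{WYL} = \frac{g_k^T \left( g_k - \frac{\|g_k\|}{\|g_{k-1}\|} g_{k-1} \right)}{\|g_{k-1}\|^2}, \tag{11}$$

where $y_{k-1} = g_k - g_{k-1}$.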
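For concreteness, the four classical coefficients can be evaluated directly from the current gradient, the previous gradient, and the previous direction. The sketch below assumes the formulas as they are usually stated in the literature (HS, FR, PRP, and WYL, with $y_{k-1} = g_k - g_{k-1}$); the function names and the sample vectors are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm."""
    return math.sqrt(dot(u, u))


def beta_hs(g, g_prev, d_prev):
    """Hestenes-Stiefel: g^T y / (d_prev^T y), with y = g - g_prev."""
    y = [a - b for a, b in zip(g, g_prev)]
    return dot(g, y) / dot(d_prev, y)


def beta_fr(g, g_prev):
    """Fletcher-Reeves: ||g||^2 / ||g_prev||^2."""
    return dot(g, g) / dot(g_prev, g_prev)


def beta_prp(g, g_prev):
    """Polak-Ribiere-Polyak: g^T y / ||g_prev||^2."""
    y = [a - b for a, b in zip(g, g_prev)]
    return dot(g, y) / dot(g_prev, g_prev)


def beta_wyl(g, g_prev):
    """Wei-Yao-Liu: g^T (g - (||g|| / ||g_prev||) g_prev) / ||g_prev||^2."""
    scale = norm(g) / norm(g_prev)
    z = [a - scale * b for a, b in zip(g, g_prev)]
    return dot(g, z) / dot(g_prev, g_prev)


# Hypothetical sample data: previous direction is steepest descent.
g = [3.0, 4.0]
g_prev = [0.0, 10.0]
d_prev = [0.0, -10.0]

print(beta_hs(g, g_prev, d_prev))  # -0.25
print(beta_fr(g, g_prev))          # 0.25
print(beta_prp(g, g_prev))         # negative on this data (about -0.15)
print(beta_wyl(g, g_prev))         # nonnegative on this data (about 0.05)
```

The sample data are chosen so that PRP becomes negative while WYL stays nonnegative; by the Cauchy-Schwarz inequality the WYL numerator is never negative, which is the sense in which it is a positive variant of PRP.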

Hestenes and Stiefel [8] proposed the first CG formula, for solving quadratic functions, in 1952. Fletcher and Reeves [9] presented the first formula (9) for nonlinear functions in 1964. The convergence properties of the FR method with the exact line search were obtained by Zoutendijk [12]. Al-Baali [13] proved that the FR method is globally convergent with the SWP line search when $\sigma < 1/2$. Later on, Guanghui et al. [14] extended the result to $\sigma \le 1/2$. The global convergence of the PRP method (10) with the exact line search was proved by Polak and Ribière in [10]. Powell [15] gave a counterexample showing that there exists a nonconvex function for which the PRP method does not converge globally, even when the exact line search is used. Powell suggested using the nonnegative PRP method to resolve this problem. Gilbert and Nocedal [16] proved that the nonnegative PRP (PRP+) method, that is, $\beta_k = \max\{0, \beta_k^{PRP}\}$, is globally convergent under complicated line searches. However, there is no guarantee that PRP+ is convergent with the SWP line search for general nonlinear functions. Touati-Ahmed and Storey [17] suggested the following hybrid method:

$$\beta_k^{TS} = \begin{cases} \beta_k^{PRP}, & 0 \le \beta_k^{PRP} \le \beta_k^{FR}, \\ \beta_k^{FR}, & \text{otherwise}. \end{cases} \tag{12}$$

In 2006 Wei et al. [11] presented a new positive CG method (11), which is quite similar to the original PRP method and has been studied under both exact and inexact line searches. Many modifications of it have appeared, such as those in [18–20]. Recently, many CG formulas have been constructed to improve efficiency and robustness. For more on the latest CG methods we refer the reader to [21, 22].
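The two safeguards mentioned above are simple selection rules on already computed coefficients. A minimal sketch, assuming the usual statements of the rules: PRP+ truncates the PRP value at zero, and the Touati-Ahmed and Storey hybrid keeps the PRP value only when it lies between zero and the FR value, falling back to FR otherwise. The function names and sample numbers are illustrative.

```python
def beta_prp_plus(b_prp):
    """Nonnegative PRP (PRP+): truncate the PRP coefficient at zero."""
    return max(0.0, b_prp)


def beta_ts(b_prp, b_fr):
    """Touati-Ahmed and Storey hybrid: PRP if 0 <= PRP <= FR, else FR."""
    return b_prp if 0.0 <= b_prp <= b_fr else b_fr


# Hypothetical coefficient values
print(beta_prp_plus(-0.15))  # 0.0 (a zero coefficient gives a steepest descent step)
print(beta_ts(0.04, 1.0))    # 0.04 (PRP accepted)
print(beta_ts(-0.15, 0.25))  # 0.25 (negative PRP, fall back to FR)
print(beta_ts(1.5, 1.0))     # 1.0 (PRP above FR, fall back to FR)
```

Both rules keep the coefficient nonnegative, which is the property the convergence results of Gilbert and Nocedal rely on.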

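One of the important requirements in CG methods is the descent condition; that is, if one can prove

$$g_k^T d_k < 0, \tag{14}$$

then $d_k$ is guaranteed to be a descent direction for $f$ at $x_k$. If we strengthen (14) to the form

$$g_k^T d_k \le -c \|g_k\|^2, \quad c > 0, \tag{15}$$

then (15) is called the sufficient descent condition.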
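As a brief worked step (a sketch that uses only the direction update (3) and the exact line search (4), with $g_k$ the gradient, $d_k$ the search direction, $\beta_k$ the CG coefficient, and $c$ the sufficient descent constant), the link between the search direction and the descent conditions can be made explicit by multiplying (3) by $g_k^T$:

```latex
\begin{aligned}
g_k^T d_k &= -\|g_k\|^2 + \beta_k\, g_k^T d_{k-1}, \qquad k \ge 1, \\
\text{exact line search: } g_k^T d_{k-1} = 0
  \;&\Longrightarrow\; g_k^T d_k = -\|g_k\|^2 ,
\end{aligned}
```

so with the exact line search the sufficient descent condition holds with $c = 1$ for any $\beta_k$. With an inexact line search the term $\beta_k\, g_k^T d_{k-1}$ does not vanish; the strong Wolfe-Powell curvature condition only bounds it through $|g_k^T d_{k-1}| \le \sigma |g_{k-1}^T d_{k-1}|$, which is why the size of $\beta_k$ matters for descent.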

This paper is organized as follows. In Section 2 we present the current problem of the PRP and nonnegative PRP methods with the SWP line search, and then propose the new hybrid CG formula and its simplifications. In Section 3 we establish the global convergence properties with the SWP line search. Numerical results and the conclusion are presented in Sections 4 and 5, respectively.

2. Motivation and the Hybrid Formula

The PRP formula is regarded as one of the most efficient CG methods in practice. However, as mentioned before, this method fails to solve some standard test problems for nonconvex functions, even when the exact line search is used. Thus, the main contribution of this paper is to extend the use of the PRP formula to several cases with the SWP line search under a mild condition, and to restart the CG algorithm with the NPRP formula when PRP fails to satisfy that condition.

The following discussion illustrates the cases in which the PRP method fails or succeeds in obtaining the convergence properties with the SWP line search. Since $y_{k-1} = g_k - g_{k-1}$, the PRP formula (10) can be written as

$$\beta_k^{PRP} = \frac{\|g_k\|^2 - g_k^T g_{k-1}}{\|g_{k-1}\|^2}. \tag{16}$$

Therefore we have the following cases.

Case A. If , then we have the following two possibilities.
Case A1. If , then we have . In this case, PRP method is efficient and has global convergence properties.
Case A2. If , then we have . In this case based on [16] we fail to obtain the global convergence properties for nonconvex functions.

Case B. If , then

In Case B there is no guarantee that this method will satisfy the sufficient descent condition.

For the next discussion, we will discuss the nonnegative PRP method which is given as follows:Therefore, we have a problem in Case B. To solve this problem Gilbert and Nocedal [16] used another line search to satisfy the convergence properties. In addition, if , the CG method returns to the steepest descent method which is sometimes a weak tool to find the optimum point for functions. Furthermore we can notice that So from PRP method we can use only Case A1.

To improve the above ideas, we suggest the following hybrid method: If , then under the condition , we obtainAnd if we obtainOne of the advantages for is that we can use Case A1 and Case B under the condition .

Choosing as a restart CG formula in (20), if the condition is not satisfied that means Thus, NPRP method is a suitable nonnegative value to use.

The following algorithm describes the CG method with the new coefficient.

Algorithm 1. Consider the following.
Step 1. Initialization: given $x_0$, set $k = 0$.
Step 2. Compute $\beta_k$ based on (20).
Step 3. Compute $d_k$ based on (3).
Step 4. Compute $\alpha_k$ based on (5) and (6).
Step 5. Update the new point $x_{k+1}$ based on (2).
Step 6. Convergence test and stopping criterion: if $\|g_k\| \le \varepsilon$, then stop; otherwise go to Step 2 with $k = k + 1$.
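The loop structure of the algorithm can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the hybrid coefficient (20) is not reproduced here, so the coefficient rule is a pluggable argument and PRP+ is used as a stand-in; likewise the SWP line search is replaced by the exact step for a convex quadratic, which is available in closed form. The names `cg_minimize`, `exact_step`, `prp_plus`, the tolerance `eps`, and the test matrix are all hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cg_minimize(grad, step_rule, beta_rule, x0, eps=1e-6, max_iter=1000):
    """Generic nonlinear CG loop; returns (final point, iterations used)."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]  # first direction: d_0 = -g_0
    k = 0
    while math.sqrt(dot(g, g)) > eps and k < max_iter:  # stopping test
        alpha = step_rule(x, d, g)                      # step length
        x = [xi + alpha * di for xi, di in zip(x, d)]   # new point
        g_new = grad(x)
        beta = beta_rule(g_new, g)                      # CG coefficient
        d = [-gn + beta * di for gn, di in zip(g_new, d)]  # new direction
        g = g_new
        k += 1
    return x, k


# Hypothetical test problem: f(x) = 0.5 x^T A x - b^T x, A symmetric positive definite
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def matvec(M, v):
    return [dot(row, v) for row in M]


def grad(x):
    return [ax - bi for ax, bi in zip(matvec(A, x), b)]


def exact_step(x, d, g):
    """Stand-in for the SWP search: exact minimizer along d for the quadratic."""
    return -dot(g, d) / dot(d, matvec(A, d))


def prp_plus(g, g_prev):
    """Stand-in coefficient rule (PRP+), not the paper's hybrid formula."""
    y = [a - c for a, c in zip(g, g_prev)]
    return max(0.0, dot(g, y) / dot(g_prev, g_prev))


x_star, iters = cg_minimize(grad, exact_step, prp_plus, [2.0, 1.0])
print(x_star, iters)  # close to [1/11, 7/11] after 2 iterations
```

For a quadratic, the exact step makes the new gradient orthogonal to the search direction and satisfies the sufficient decrease condition for any parameter not exceeding 1/2, so it is a valid special case of a strong Wolfe-Powell step; for general functions `step_rule` would be replaced by an actual SWP line search and `beta_rule` by the coefficient under study.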

3. The Global Convergence Properties of the New Formula with SWP Line Search

The following standard assumptions are necessary for this work.

Assumption 1. The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$, with $x_0$ the starting point of the iterative method (2), is bounded.

Assumption 2. In some open convex neighborhood $N$ of $\Omega$, $f$ is continuous and differentiable, and its gradient is Lipschitz continuous; that is, there exists a constant $L > 0$ such that $\|g(x) - g(y)\| \le L \|x - y\|$ for any $x, y \in N$.

The following lemma is one of the most important tools for proving global convergence properties.

Lemma 2 (see [12]). Suppose Assumptions 1 and 2 hold. Consider any method of the form (2) and (3), with $\alpha_k$ computed by the WWP line search, in which the search direction $d_k$ is a descent direction for all $k$; then

$$\sum_{k=0}^{\infty} \cos^2\theta_k \, \|g_k\|^2 < \infty, \tag{25}$$

where

$$\cos\theta_k = -\frac{g_k^T d_k}{\|g_k\| \, \|d_k\|}.$$

Equation (25) is known as the Zoutendijk condition.

The following discussion establishes the global convergence properties of the new coefficient with the SWP line search.

Case 1 (). Since the proof of the global convergence properties is similar to . We refer the reader to see [13, 14].

Case 2 (, where ). In this case we haveThe following theorem demonstrates in Case 2 satisfies the sufficient descent condition with SWP line search.

Theorem 3. Consider the sequences and are constructed by Algorithm 1 and is computed by (5) and (6) if . Then (15) holds.

Proof. From (3), for , it is obvious. Suppose it is true until ; that is, . Multiply (3) by ; we obtainDividing both sides by , and by using Case 2 and (6), then we obtainFrom (29) we obtainSincewe haveIf we getLetthenThuswhere . The proof is complete.

Gilbert and Nocedal [16] presented an important theorem (Theorem 4) that gives the global convergence of the nonnegative PRP method provided the descent condition is satisfied. Furthermore, [16] introduced a useful property, called Property (*), as follows.

Property (*). Consider a CG method of forms (2) and (3), and suppose $0 < \gamma \le \|g_k\| \le \bar{\gamma}$; we say that the CG method possesses Property (*) if there exist constants $b > 1$ and $\lambda > 0$ such that, for all $k$, we get $|\beta_k| \le b$, and if $\|x_k - x_{k-1}\| \le \lambda$, we obtain $|\beta_k| \le \frac{1}{2b}$.

Theorem 4. Suppose that a CG method of forms (2) and (3) has the following properties:
(I) $\beta_k \ge 0$.
(II) The sufficient descent condition (15) holds.
(III) The Zoutendijk condition (25) is satisfied by the line search.
(IV) Property (*) holds.
(V) Assumptions 1 and 2 hold.
Then the iterates are globally convergent.

The next lemma shows that if the gradients are bounded away from zero and Property (*) holds, then a certain fraction of steps cannot be too small. The proof is given in [16]. However, we state it for readability.

Lemma 5. Consider a CG algorithm as defined in (2) and (3) with the parameter . If Assumptions 1 and 2 are satisfied, then holds.

Proof. Let and :Using , we haveBy Assumption 2,The proof is complete.

Lemma 6. The CG formula presented in Case 2 has the following properties:
(1), since the condition forces the CG formula in (20) to be nonnegative.(2) satisfies , based on Lemma 5.(3) satisfies the sufficient descent condition based on Theorem 3 and .

By using Theorems 3 and 4 and Lemma 6 we have the following convergence result. The proof is similar to Theorem 4.3 which is presented in [16].

Theorem 7. Suppose that Assumption 1 holds. Consider the CG method of forms (2) and (3) and as in Case 2, where is computed by (5) and (6) with ; then .

4. Numerical Results and Discussions

To evaluate the efficiency of the new method, we selected some of the test functions in Table 1 from CUTEr [24], Andrei [23], and Adorio and Diliman [25]. We compared it with other CG formulas, including the VHS, NPRP, PRP+, and FR formulas, among others. The same tolerance $\varepsilon$ is used for all algorithms in order to compare how quickly the iterative methods approach the optimal solution. The gradient norm is used as the stopping criterion; that is, the iteration stops when $\|g_k\| \le \varepsilon$. A run is considered a failure if the number of iterations exceeds 1000.

In Table 1 we selected different initial points for every function, which suggests that the method can be applied to a variety of real-life functions from fields such as engineering and medical science, as mentioned before. In addition, dimensions ranging from 500 to 10000 are used, and the functions are chosen from different groups. We used a Matlab 7.9 subroutine program on an Intel(R) Core(TM) i3 CPU with 2 GB DDR2 RAM, under the SWP line search, to find the optimum point. The efficiency comparison results are shown in Figures 1 and 2, respectively, using the performance profile introduced by Dolan and Moré [26].

This performance measure was introduced to compare a set of solvers $S$ on a set of problems $P$. Assuming $n_s$ solvers and $n_p$ problems in $S$ and $P$, respectively, the measure $t_{p,s}$ is defined as the computational cost (e.g., the number of iterations or the CPU time) required to solve problem $p$ by solver $s$.

To create a baseline for comparison, the performance of solver $s$ on problem $p$ is scaled by the best performance of any solver in $S$ on that problem, using the ratio

$$r_{p,s} = \frac{t_{p,s}}{\min\{t_{p,s} : s \in S\}}.$$

Suppose that a parameter $r_M \ge r_{p,s}$ for all $p$, $s$ is chosen, and $r_{p,s} = r_M$ if and only if solver $s$ does not solve problem $p$. Because we would like to obtain an overall assessment of the performance of a solver, we define the measure

$$\rho_s(\tau) = \frac{1}{n_p} \, \mathrm{size}\{p \in P : r_{p,s} \le \tau\}.$$

Thus, $\rho_s(\tau)$ is the probability for solver $s \in S$ that the performance ratio $r_{p,s}$ is within a factor $\tau$ of the best possible ratio. If we regard $\rho_s$ as the cumulative distribution function of the performance ratio, then the performance measure $\rho_s : \mathbb{R} \to [0, 1]$ for a solver is nondecreasing, piecewise constant, and continuous from the right. The value of $\rho_s(1)$ is the probability that the solver has the best performance of all of the solvers. In general, a solver with high values of $\rho_s(\tau)$, whose curve would appear in the upper right corner of the figure, is preferable.
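The performance profile described above is straightforward to compute from a table of costs. The following is a minimal sketch, assuming the Dolan and Moré definition (ratio of each solver's cost to the best cost on the same problem, then the fraction of problems whose ratio is at most a given factor); failures are represented by `None` instead of a large sentinel ratio, and the solver names and cost values are made up for illustration.

```python
def performance_profile(costs, tau):
    """Return {solver: fraction of problems with performance ratio <= tau}.

    costs maps each solver name to a list of costs, one per problem;
    None marks a problem the solver failed to solve.
    """
    solvers = list(costs)
    n_p = len(costs[solvers[0]])
    # Best (smallest) cost achieved by any solver on each problem
    best = []
    for p in range(n_p):
        solved = [costs[s][p] for s in solvers if costs[s][p] is not None]
        best.append(min(solved) if solved else None)
    profile = {}
    for s in solvers:
        count = 0
        for p in range(n_p):
            t = costs[s][p]
            if t is not None and best[p] is not None and t / best[p] <= tau:
                count += 1
        profile[s] = count / n_p
    return profile


# Hypothetical iteration counts for two solvers on four problems
costs = {
    "A": [10, 20, None, 5],
    "B": [12, 10, 30, 5],
}

print(performance_profile(costs, 1.0))  # {'A': 0.5, 'B': 0.75}
print(performance_profile(costs, 2.0))  # {'A': 0.75, 'B': 1.0}
```

The value at a factor of 1 is the fraction of problems on which a solver is (jointly) the fastest, and the limit for large factors is the fraction of problems it solves at all, so the left end of the curve reflects efficiency and the right end reflects robustness.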

On the left side of Figures 1 and 2, the curve of the new formula lies above the other curves. Therefore, it is the best among the related PRP methods in terms of efficiency and robustness. In Figure 2 the curve of the new formula is still the best, but its advantage in CPU time is smaller than its advantage in the number of iterations, since the more complicated hybrid algorithm incurs a higher CPU cost. Thus, on faster processors the method is expected to be more efficient, since the number of iterations decreases markedly under the new method.

5. Conclusion

In this paper, we proposed a hybrid conjugate gradient method that combines the nonnegative PRP and NPRP formulas with the SWP line search, which extends the cases in which the PRP method can be used under a mild condition. The global convergence property is established with a simple proof. Our numerical results show that the hybrid method performs best among the related PRP-type CG methods compared.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors are grateful to the editor and the anonymous reviewers for their comments and suggestions which improved this paper substantially. They would also like to thank The Ministry of Education Malaysia (MOE) for funding this research under The Fundamental Research Grant Scheme (Grant no. 59256).

References

[1] P. Wolfe, “Convergence conditions for ascent methods,” SIAM Review, vol. 11, no. 2, pp. 226–235, 1969.
[2] P. Wolfe, “Convergence conditions for ascent methods. II: some corrections,” SIAM Review, vol. 13, no. 2, pp. 185–188, 1971.
[3] J. Cai, Q. Li, L. Li, H. Peng, and Y. Yang, “A fuzzy adaptive chaotic ant swarm optimization for economic dispatch,” International Journal of Electrical Power and Energy Systems, vol. 34, no. 1, pp. 154–160, 2012.
[4] N. Kokash, An Introduction to Heuristic Algorithms, Department of Informatics and Telecommunications, 2005.
[5] L. Li, H. Peng, J. Kurths, Y. Yang, and H. J. Schellnhuber, “Chaos-order transition in foraging behavior of ants,” Proceedings of the National Academy of Sciences of the United States of America, vol. 111, no. 23, pp. 8392–8397, 2014.
[6] M. Wan, L. Li, J. Xiao, C. Wang, and Y. Yang, “Data clustering using bacterial foraging optimization,” Journal of Intelligent Information Systems, vol. 38, no. 2, pp. 321–341, 2012.
[7] M. Wan, C. Wang, L. Li, and Y. Yang, “Chaotic ant swarm approach for data clustering,” Applied Soft Computing Journal, vol. 12, no. 8, pp. 2387–2393, 2012.
[8] M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, no. 6, pp. 409–436, 1952.
[9] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” The Computer Journal, vol. 7, no. 2, pp. 149–154, 1964.
[10] E. Polak and G. Ribière, “Note sur la convergence de méthodes de directions conjuguées,” Revue française d'informatique et de recherche opérationnelle, vol. 3, no. 1, pp. 35–43, 1969.
[11] Z. Wei, S. Yao, and L. Liu, “The convergence properties of some new conjugate gradient methods,” Applied Mathematics and Computation, vol. 183, no. 2, pp. 1341–1350, 2006.
[12] G. Zoutendijk, “Nonlinear programming, computational methods,” Integer and Nonlinear Programming, vol. 143, no. 1, pp. 37–86, 1970.
[13] M. Al-Baali, “Descent property and global convergence of the Fletcher-Reeves method with inexact line search,” IMA Journal of Numerical Analysis, vol. 5, no. 1, pp. 121–124, 1985.
[14] L. Guanghui, H. Jiye, and Y. Hongxia, “Global convergence of the Fletcher-Reeves algorithm with inexact linesearch,” Applied Mathematics-A Journal of Chinese Universities, vol. 10, no. 1, pp. 75–82, 1995.
[15] M. J. D. Powell, “Nonconvex minimization calculations and the conjugate gradient method,” in Numerical Analysis, vol. 1066 of Lecture Notes in Mathematics, pp. 122–141, Springer, Berlin, Germany, 1984.
[16] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[17] D. Touati-Ahmed and C. Storey, “Efficient hybrid conjugate gradient techniques,” Journal of Optimization Theory and Applications, vol. 64, no. 2, pp. 379–397, 1990.
[18] Y. Shengwei, Z. Wei, and H. Huang, “A note about WYL's conjugate gradient method and its applications,” Applied Mathematics and Computation, vol. 191, no. 2, pp. 381–388, 2007.
[19] L. Zhang, “An improved Wei-Yao-Liu nonlinear conjugate gradient method for optimization computation,” Applied Mathematics and Computation, vol. 215, no. 6, pp. 2269–2274, 2009.
[20] Z. Dai and F. Wen, “Another improved Wei–Yao–Liu nonlinear conjugate gradient method with sufficient descent property,” Applied Mathematics and Computation, vol. 218, no. 14, pp. 7421–7430, 2012.
[21] M. Rivaie, M. Mamat, L. W. June, and I. Mohd, “A new class of nonlinear conjugate gradient coefficients with global convergence properties,” Applied Mathematics and Computation, vol. 218, no. 22, pp. 11323–11332, 2012.
[22] A. Alhawarat, M. Mamat, M. Rivaie, and I. Mohd, “A new modification of nonlinear conjugate gradient coefficients with global convergence properties,” International Journal of Mathematical, Computational, Statistical, Natural and Physical Engineering, vol. 8, no. 1, pp. 54–60, 2014.
[23] N. Andrei, “An unconstrained optimization test functions collection,” Advanced Modeling and Optimization, vol. 10, no. 1, pp. 147–161, 2008.
[24] I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, Constrained and Unconstrained Testing Environment, Département de Mathématique, 1993.
[25] E. P. Adorio and U. Diliman, “MVF-multivariate test functions library in C for unconstrained global optimization,” 2005.
[26] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.

Copyright

Copyright © 2015 Ahmad Alhawarat et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
