Abstract

The quasi-Newton method is known as one of the most efficient methods for solving large scale unconstrained optimization problems. Hence, a new hybrid method, known as the BFGS-CG method, has been created based on these properties, combining the search directions of the conjugate gradient and quasi-Newton methods. In comparison with the standard BFGS method and conjugate gradient methods, the BFGS-CG method shows significant improvement in the total number of iterations and the CPU time required to solve large scale unconstrained optimization problems. We also prove that the hybrid method is globally convergent.

1. Introduction

The unconstrained optimization problem requires only the objective function and is stated as

$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$

where $\mathbb{R}^n$ is an $n$-dimensional Euclidean space and $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. Iterative methods are used to solve (1). On the $k$th iteration, an approximation point $x_k$ is available and the $(k+1)$th iterate $x_{k+1}$ is given by

$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots, \tag{2}$$

where $d_k$ denotes the search direction and $\alpha_k$ denotes the step size. The search direction must satisfy the relation $g_k^T d_k < 0$, where $g_k = \nabla f(x_k)$, which guarantees that $d_k$ is a descent direction of $f$ at $x_k$. Different choices of $d_k$ and $\alpha_k$ yield different convergence properties. Generally, the first-order condition $\nabla f(x^*) = 0$ is used to check for local convergence to a stationary point $x^*$. There are many ways to calculate the search direction depending on the method used, such as the steepest descent method, the conjugate gradient (CG) method, the Newton-Raphson method, and the quasi-Newton method.

The different choices of the step size $\alpha_k$ ensure that the sequence of iterates defined by (2) is globally convergent with some rate of convergence. There are two ways to determine the value of the step size: the exact line search and the inexact line search. For the exact line search, $\alpha_k$ is calculated by using the formula

$$f(x_k + \alpha_k d_k) = \min_{\alpha \ge 0} f(x_k + \alpha d_k).$$

However, it is difficult and often impossible to find the value of the step size by the exact line search in practical computation. Hence, inexact line searches were proposed by previous researchers like Armijo [1], Wolfe [2, 3], and Goldstein [4] to overcome this problem. Recently, Shi proposed a new inexact line search rule similar to the Armijo line search and analysed its global convergence [5]. Shi also claimed that, among the several well-known inexact line search procedures published by previous researchers, the Armijo line search rule is one of the most useful and the easiest to implement in computational calculations. The Armijo line search rule can be described as follows: given $s > 0$, $\beta \in (0, 1)$, and $\sigma \in (0, 1)$, the step size is $\alpha_k = \max\{s, s\beta, s\beta^2, \ldots\}$ such that

$$f(x_k + \alpha_k d_k) - f(x_k) \le \sigma \alpha_k g_k^T d_k. \tag{3}$$

Then, the sequence $\{x_k\}$ converges to the optimal point $x^*$ which minimises $f(x)$ [6]. Hence, we will use the Armijo line search in this research, associated with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method and the new hybrid method.
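To make the backtracking procedure concrete, the following is a minimal Python sketch of the Armijo rule (3). It assumes the objective `f` and the current gradient `g_k` are supplied by the caller; the default values of `s`, `beta`, and `sigma` and the backtracking cap are illustrative choices, not the settings used in the experiments of this paper.

```python
import numpy as np

def armijo_step(f, g_k, x_k, d_k, s=1.0, beta=0.5, sigma=0.1, max_backtracks=50):
    """Armijo rule (3): try alpha = s, s*beta, s*beta^2, ... until
    f(x_k + alpha*d_k) - f(x_k) <= sigma * alpha * g_k^T d_k."""
    f_k = f(x_k)
    slope = g_k @ d_k              # g_k^T d_k; negative for a descent direction
    alpha = s
    for _ in range(max_backtracks):
        if f(x_k + alpha * d_k) - f_k <= sigma * alpha * slope:
            break                  # sufficient decrease achieved
        alpha *= beta              # shrink the trial step
    return alpha
```

The cap on the number of backtracks is a practical safeguard against a non-descent direction, for which the loop would otherwise never terminate.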

This paper is organised as follows. In Section 2, we elaborate on the step size and search directions used in this research; the BFGS method and CG method are also presented there. Then, the new hybrid method and its convergence analysis are discussed in Section 3. The numerical results are explained in Section 4, and the paper ends with a short conclusion in Section 5.

2. The Search Direction

The different methods for solving unconstrained optimization problems depend on the calculation of the search direction $d_k$ in (2). In this paper, we focus on the CG method and quasi-Newton methods. The CG method, introduced by [7], is useful for finding the minimum value of functions in unconstrained optimization problems. The search direction of the CG method is

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{4}$$

where $\beta_k$ is known as the CG coefficient. There are many ways to calculate $\beta_k$, and some well-known formulas are

$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \qquad \beta_k^{PR} = \frac{g_k^T (g_k - g_{k-1})}{\|g_{k-1}\|^2}, \qquad \beta_k^{HS} = \frac{g_k^T (g_k - g_{k-1})}{d_{k-1}^T (g_k - g_{k-1})}, \tag{5}$$

where $g_k$ and $g_{k-1}$ are the gradients of $f(x)$ at the points $x_k$ and $x_{k-1}$, respectively, $\|\cdot\|$ is the Euclidean norm of vectors, and $d_{k-1}$ is the search direction of the previous iteration. The corresponding coefficients above are known as Fletcher-Reeves (CG-FR) [7], Polak-Ribière (CG-PR) [8-11], and Hestenes-Stiefel (CG-HS) [12].
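The piecewise definition (4) together with the three coefficients in (5) translates directly into code. Below is a minimal Python sketch; the function name and the `variant` switch are our own illustrative choices.

```python
import numpy as np

def cg_direction(g_k, g_prev=None, d_prev=None, variant="FR"):
    """Search direction (4): d_0 = -g_0 and d_k = -g_k + beta_k * d_{k-1},
    with beta_k chosen among the formulas in (5)."""
    if g_prev is None:                       # k = 0: steepest descent start
        return -g_k
    y = g_k - g_prev                         # gradient difference g_k - g_{k-1}
    if variant == "FR":                      # Fletcher-Reeves
        beta = (g_k @ g_k) / (g_prev @ g_prev)
    elif variant == "PR":                    # Polak-Ribiere
        beta = (g_k @ y) / (g_prev @ g_prev)
    elif variant == "HS":                    # Hestenes-Stiefel
        beta = (g_k @ y) / (d_prev @ y)
    else:
        raise ValueError("variant must be 'FR', 'PR', or 'HS'")
    return -g_k + beta * d_prev
```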

In quasi-Newton methods, the search direction $d_k$ is the solution of the linear system

$$B_k d_k = -g_k, \tag{6}$$

where $B_k$ is an approximation of the Hessian. The initial matrix $B_0$ is chosen to be the identity matrix and is subsequently updated by an update formula. There are a few update formulas that are widely used, such as the Davidon-Fletcher-Powell (DFP), BFGS, and Broyden family formulas. This research uses the BFGS formula in the classical algorithm and in the new hybrid method. The update formula for BFGS is

$$B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}, \tag{7}$$

with $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$. The approximation of the Hessian must fulfil the secant equation

$$B_{k+1} s_k = y_k. \tag{8}$$

This condition is required to hold for the updated matrix $B_{k+1}$. Note that it is only possible to fulfil the secant equation if

$$s_k^T y_k > 0, \tag{9}$$

which is known as the curvature condition.
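A direct transcription of the BFGS update (7) follows, as a minimal Python sketch; skipping the update when the curvature condition (9) fails is a common practical safeguard of ours, not a step prescribed by this paper.

```python
import numpy as np

def bfgs_update(B, s, y):
    """BFGS formula (7):
    B_{k+1} = B_k - (B_k s s^T B_k)/(s^T B_k s) + (y y^T)/(y^T s)."""
    sy = s @ y                    # curvature s_k^T y_k, see (9)
    if sy <= 0:                   # curvature condition violated: keep B_k
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / sy
```

One can check the secant equation (8) directly: for any positive definite `B` and vectors with `s @ y > 0`, `bfgs_update(B, s, y) @ s` reproduces `y` up to rounding.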

3. The New Hybrid Method

The modification of the quasi-Newton method based on a hybrid method has already been introduced by previous researchers. One such study is a hybridization of the quasi-Newton and Gauss-Seidel methods, aimed at solving systems of linear equations [13]. Luo et al. [14] suggested a new hybrid method, which can solve systems of nonlinear equations by combining the quasi-Newton method with chaos optimization. Han and Neumann [6] combined the quasi-Newton and Cauchy descent methods to solve unconstrained optimization problems, which is recognised as the quasi-Newton-SD method.

The modifications of the quasi-Newton method by previous researchers spawned the idea of hybridizing the classical methods to yield a new hybrid method. Hence, this study proposes a new hybrid search direction that combines the search directions of the quasi-Newton and CG methods. It yields the search direction of the new hybrid method, which is known as the BFGS-CG method. The search direction for the BFGS-CG method is

$$d_k = \begin{cases} -B_k^{-1} g_k, & k = 0, \\ -B_k^{-1} g_k + \eta_k d_{k-1}, & k \ge 1, \end{cases} \tag{10}$$

where $\eta_k > 0$ is the hybrid coefficient and $d_{k-1}$ is the search direction of the previous iteration.
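The following is a minimal Python sketch mirroring the structure of (10), under the assumption that the hybrid coefficient `eta` is supplied by the caller; the particular formula for `eta` is not fixed by this sketch.

```python
import numpy as np

def bfgs_cg_direction(B, g_k, d_prev=None, eta=None):
    """Hybrid direction (10): the quasi-Newton step -B^{-1} g_k plus,
    for k >= 1, eta times the previous direction d_{k-1}."""
    qn_step = np.linalg.solve(B, -g_k)   # solve B d = -g_k as in (6)
    if d_prev is None or eta is None:    # k = 0: pure quasi-Newton step
        return qn_step
    return qn_step + eta * d_prev        # CG-style correction term
```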

Hence, the complete algorithms for the BFGS method; the CG-HS, CG-PR, and CG-FR methods; and the BFGS-CG method are presented as Algorithms 1, 2, and 3, respectively.

Algorithm 1 (BFGS method). States the following.

Step 0. Given a starting point $x_0$ and $B_0 = I$, choose values for $s$, $\beta$, and $\sigma$, and set $k = 0$.

Step 1. Terminate if $\|g_k\| \le \epsilon$ or the number of iterations exceeds its limit.

Step 2. Calculate the search direction by (6).

Step 3. Calculate the step size by (3).

Step 4. Compute the differences $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$.

Step 5. Update $B_k$ by (7) to obtain $B_{k+1}$.

Step 6. Set $k = k + 1$ and go to Step 1.
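Putting the pieces together, the following is a runnable Python sketch of Algorithm 1, reusing `armijo_step` and `bfgs_update` from the earlier sketches; `eps` and `max_iter` stand in for the stopping tolerances of Step 1.

```python
import numpy as np

def bfgs_method(f, grad, x0, eps=1e-6, max_iter=10_000):
    """Algorithm 1 (BFGS method)."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                       # Step 0: B_0 = I
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:         # Step 1: terminate
            break
        d = np.linalg.solve(B, -g)           # Step 2: direction by (6)
        alpha = armijo_step(f, g, x, d)      # Step 3: step size by (3)
        x_new = x + alpha * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g          # Step 4: differences s_k, y_k
        B = bfgs_update(B, s, y)             # Step 5: update by (7)
        x, g = x_new, g_new                  # Step 6: k = k + 1
    return x
```

For example, `bfgs_method(lambda x: float((x**2).sum()), lambda x: 2*x, [3.0, -4.0])` reaches the origin within a couple of iterations.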

Algorithm 2 (CG-HS, CG-PR, and CG-FR). States the following.

Step 0. Given a starting point $x_0$, choose values for $s$, $\beta$, and $\sigma$, and set $k = 0$.

Step 1. Terminate if $\|g_k\| \le \epsilon$ or the number of iterations exceeds its limit.

Step 2. Calculate the search direction by (4) with the CG coefficient chosen from (5).

Step 3. Calculate the step size by (3).

Step 4. Compute the differences $x_{k+1} - x_k$ and $g_{k+1} - g_k$.

Step 5. Set $k = k + 1$ and go to Step 1.
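Analogously, a compact Python sketch of Algorithm 2, reusing `cg_direction` and `armijo_step` from the sketches above:

```python
import numpy as np

def cg_method(f, grad, x0, variant="FR", eps=1e-6, max_iter=10_000):
    """Algorithm 2 (CG-HS, CG-PR, and CG-FR, selected by `variant`)."""
    x = np.asarray(x0, dtype=float)                   # Step 0
    g, g_prev, d_prev = grad(x), None, None
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:                  # Step 1: terminate
            break
        d = cg_direction(g, g_prev, d_prev, variant)  # Step 2: (4)-(5)
        alpha = armijo_step(f, g, x, d)               # Step 3: step size by (3)
        x = x + alpha * d                             # Step 4: new iterate
        g_prev, d_prev = g, d                         # Step 5: k = k + 1
        g = grad(x)
    return x
```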

Algorithm 3 (BFGS-CG method). States the following.

Step 0. Given a starting point $x_0$ and $B_0 = I$, choose values for $s$, $\beta$, and $\sigma$, and set $k = 0$.

Step 1. Terminate if $\|g_k\| \le \epsilon$ or the number of iterations exceeds its limit.

Step 2. Calculate the search direction by (10).

Step 3. Calculate the step size by (3).

Step 4. Compute the differences $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$.

Step 5. Update $B_k$ by (7) to obtain $B_{k+1}$.

Step 6. Set $k = k + 1$ and go to Step 1.
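Finally, a Python sketch of Algorithm 3. It differs from the BFGS driver only in Step 2, where the hybrid direction (10) is used; the constant `eta` passed to `bfgs_cg_direction` is purely illustrative and stands in for the paper's hybrid coefficient.

```python
import numpy as np

def bfgs_cg_method(f, grad, x0, eta=0.1, eps=1e-6, max_iter=10_000):
    """Algorithm 3 (BFGS-CG method), with an illustrative constant eta."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                               # Step 0: B_0 = I
    g, d_prev = grad(x), None
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:                 # Step 1: terminate
            break
        d = bfgs_cg_direction(B, g, d_prev, eta)     # Step 2: direction by (10)
        alpha = armijo_step(f, g, x, d)              # Step 3: step size by (3)
        x_new = x + alpha * d
        g_new = grad(x_new)
        B = bfgs_update(B, x_new - x, g_new - g)     # Steps 4-5: update by (7)
        x, g, d_prev = x_new, g_new, d               # Step 6: k = k + 1
    return x
```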

Based on Algorithms 1, 2, and 3, we assume that every search direction $d_k$ satisfies the descent condition

$$g_k^T d_k < 0 \tag{11}$$

for all $k \ge 0$. If there exists a constant $c > 0$ such that

$$g_k^T d_k \le -c \|g_k\|^2 \tag{12}$$

for all $k \ge 0$, then the search directions satisfy the sufficient descent condition, which is proved in Theorem 6. Hence, we first need to make a few assumptions on the objective function.

Assumption 4. Consider the following.

H1: The objective function $f$ is twice continuously differentiable.

H2: The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is convex. Moreover, positive constants $m$ and $M$ exist, satisfying

$$m \|z\|^2 \le z^T G(x) z \le M \|z\|^2 \tag{13}$$

for all $z \in \mathbb{R}^n$ and $x \in \Omega$, where $G(x)$ is the Hessian matrix of $f$.

H3: The Hessian matrix is Lipschitz continuous at the point $x^*$; that is, there exists a positive constant $L$ satisfying

$$\|G(x) - G(x^*)\| \le L \|x - x^*\| \tag{14}$$

for all $x$ in a neighbourhood of $x^*$.

Theorem 5 (see [15, 16]). Let $\{B_k\}$ be generated by the BFGS formula (7), where $B_0$ is symmetric and positive definite and $y_k^T s_k > 0$ for all $k$. Furthermore, assume that $\{s_k\}$ and $\{y_k\}$ are such that

$$\frac{\|y_k - G_* s_k\|}{\|s_k\|} \le \epsilon_k \tag{15}$$

for some symmetric and positive definite matrix $G_*$ and for some sequence $\{\epsilon_k\}$ with the property $\sum_{k=1}^{\infty} \epsilon_k < \infty$. Then

$$\lim_{k \to \infty} \frac{\|(B_k - G_*) s_k\|}{\|s_k\|} = 0, \tag{16}$$

and the sequences $\{\|B_k\|\}$ and $\{\|B_k^{-1}\|\}$ are bounded.

Theorem 6. Suppose that Assumption 4 and Theorem 5 hold. Then condition (12) holds for all $k \ge 0$.

Proof. To verify (12), we see from (10) that

$$g_k^T d_k = -g_k^T B_k^{-1} g_k + \eta_k g_k^T d_{k-1}. \tag{17}$$

Based on Powell [17] and the boundedness of $\{\|B_k^{-1}\|\}$ in Theorem 5, the smallest eigenvalue of $B_k^{-1}$ admits a lower bound $\delta > 0$, which is bounded away from zero; hence $g_k^T B_k^{-1} g_k \ge \delta \|g_k\|^2$. Since the hybrid coefficient $\eta_k$ keeps the correction term $\eta_k g_k^T d_{k-1}$ dominated by this quantity, it follows from (17) that

$$g_k^T d_k \le -c \|g_k\|^2$$

holds with some $c \in (0, \delta]$. The proof is completed.

Lemma 7. Under Assumption 4, positive constants $c_1$ and $c_2$ exist such that, for any $x_k$ and any $d_k$ with $g_k^T d_k < 0$, the step size $\alpha_k$ produced by the Armijo rule (3) will satisfy either

$$f(x_{k+1}) - f(x_k) \le c_1 g_k^T d_k \tag{18}$$

or

$$f(x_{k+1}) - f(x_k) \le -c_2 \frac{(g_k^T d_k)^2}{\|d_k\|^2}. \tag{19}$$

Proof. If $\alpha_k = s$, then (3) immediately gives (18) with $c_1 = \sigma s$. Suppose instead that $\alpha_k < s$, which means that (3) failed for the step size $\alpha' = \alpha_k / \beta$:

$$f(x_k + \alpha' d_k) - f(x_k) > \sigma \alpha' g_k^T d_k. \tag{20}$$

Then, by using the mean value theorem, we obtain

$$f(x_k + \alpha' d_k) - f(x_k) = \alpha' g(x_k + \tau \alpha' d_k)^T d_k, \tag{21}$$

where $\tau \in (0, 1)$. Now, by the Cauchy-Schwarz inequality, we get

$$\left(g(x_k + \tau \alpha' d_k) - g_k\right)^T d_k \le \|g(x_k + \tau \alpha' d_k) - g_k\| \, \|d_k\|. \tag{22}$$

Thus, from H3,

$$\|g(x_k + \tau \alpha' d_k) - g_k\| \le L \alpha' \|d_k\|,$$

so that (20)-(22) give $g_k^T d_k + L \alpha' \|d_k\|^2 > \sigma g_k^T d_k$, which implies that

$$\alpha_k = \beta \alpha' > -\frac{\beta (1 - \sigma) g_k^T d_k}{L \|d_k\|^2}. \tag{23}$$

Substituting this into (3), we have

$$f(x_{k+1}) - f(x_k) \le \sigma \alpha_k g_k^T d_k \le -c_2 \frac{(g_k^T d_k)^2}{\|d_k\|^2},$$

where $c_2 = \sigma \beta (1 - \sigma) / L$, which gives (19).

Theorem 8 (global convergence). Suppose that Assumption 4 and Theorem 5 hold. Then

$$\lim_{k \to \infty} \|g_k\| = 0. \tag{24}$$

Proof. Combining the descent property (12) and Lemma 7 gives

$$f(x_{k+1}) - f(x_k) \le -\bar{c} \|g_k\|^2 \tag{25}$$

for some constant $\bar{c} > 0$: in the case (18), this follows directly from (12) with $\bar{c} = c_1 c$, while in the case (19), it follows from (12) together with the boundedness of $\{\|B_k\|\}$ and $\{\|B_k^{-1}\|\}$ in Theorem 5, which bounds $\|d_k\|$ in terms of $\|g_k\|$. Hence, from Theorem 6, $\bar{c}$ can be defined independently of $k$. Since $f$ is bounded below on the level set by Assumption 4, summing (25) over $k$ yields $\sum_{k \ge 0} \|g_k\|^2 < \infty$. Then, (25) simplifies to $\|g_k\| \to 0$, which is (24). Therefore, the proof is completed.

4. Numerical Result

In this section, we use the test problems considered by Andrei [18], Michalewicz [19], and Moré et al. [20], listed in Table 1, to analyse the improvement of the BFGS-CG method over the BFGS and CG methods. Each of the test problems is tested with dimensions varying from 2 to 1,000 variables, giving a total of 159 test problems. As suggested by [20], for each of the test problems the initial point is moved progressively farther from the minimum point. In doing so, we test the global convergence properties and the robustness of our method. For the Armijo line search, we use fixed values of the parameters $s$, $\beta$, and $\sigma$ in (3). The stopping criteria we use are $\|g_k\| \le \epsilon$, for a small tolerance $\epsilon$, and the number of iterations exceeding its limit, which is set to 10,000. In our implementation, the numerical tests were performed on an Acer Aspire with the Windows 7 operating system, using the Matlab 2012 language.

The performance results are shown in Figures 1 and 2, respectively, using the performance profile introduced by Dolan and Moré [21]. The performance profile seeks to determine how well each solver performs relative to the other solvers on a set of problems. For a problem $p$ and a solver $s$, the performance ratio $r_{p,s}$ is the cost (number of iterations or CPU time) required by solver $s$ on problem $p$ divided by the lowest cost attained by any solver on that problem. In general, $\rho_s(\tau)$ is the fraction of problems with performance ratio $r_{p,s} \le \tau$; thus, a solver with high values of $\rho_s(\tau)$, or one that is located at the top right of the figure, is preferable.
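For completeness, a small Python sketch of the Dolan-Moré profile used for figures of this kind; the layout of the input matrix `T` of solver costs (iterations or CPU time, with `np.inf` marking failures) is an assumption of this sketch.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile. T has shape (n_problems, n_solvers);
    returns rho with rho[i, j] = fraction of problems that solver j solves
    within a factor taus[i] of the best solver on each problem."""
    ratios = T / T.min(axis=1, keepdims=True)        # performance ratios r_{p,s}
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])
```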

Figures 1 and 2 show that the BFGS-CG method has the best performance, since it can solve 99% of the test problems, compared with the BFGS (84%), CG-HS (65%), CG-PR (80%), and CG-FR (75%) methods. Moreover, the BFGS-CG method is the fastest solver on approximately 68% of the test problems in terms of the number of iterations and on 52% in terms of CPU time.

5. Conclusion

We have presented a new hybrid method for solving unconstrained optimization problems. The numerical results for a broad class of test problems show that the BFGS-CG method is efficient and robust in solving unconstrained optimization problems. We also note that, as the size and complexity of the problem increase, greater improvements can be realised by our BFGS-CG method. Our future research will investigate the BFGS-CG method with CG coefficients such as Fletcher-Reeves, Hestenes-Stiefel, and Polak-Ribière.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This research was supported by Fundamental Research Grant Scheme (FRGS Vote no. 59256).