Special Issue: Machine Learning and its Applications in Image Restoration
Research Article | Open Access
Pengyuan Li, Junyu Lu, Haishan Feng, "The Global Convergence of a Modified BFGS Method under Inexact Line Search for Nonconvex Functions", Mathematical Problems in Engineering, vol. 2021, Article ID 8342536, 9 pages, 2021. https://doi.org/10.1155/2021/8342536
The Global Convergence of a Modified BFGS Method under Inexact Line Search for Nonconvex Functions
Among the quasi-Newton algorithms, the BFGS method is the one most often discussed by related scholars. However, under an inexact Wolfe line search, or even an exact line search, the global convergence of the BFGS method for nonconvex functions has still not been established. Motivated by this issue, we propose a new quasi-Newton algorithm with better convergence properties, designed according to the following essentials: (1) a modified BFGS formula is designed to guarantee that $B_{k+1}$ inherits the positive definiteness of $B_k$; (2) a modified weak Wolfe–Powell line search is recommended; (3) a parabola, which is regarded as the projection surface to avoid using an invalid direction, is proposed, and the next point is generated by a projection technique; (4) to obtain the global convergence of the proposed algorithm more easily, the projection point is used at all subsequent iteration points instead of the current modified BFGS update formula; and (5) the global convergence of the given algorithm is established under suitable conditions. Numerical results show that the proposed algorithm is efficient.
1. Introduction

Consider the unconstrained optimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$
where $x \in \mathbb{R}^n$ and $f : \mathbb{R}^n \to \mathbb{R}$. The multitudinous algorithms for (1) often use the following iterative formula:
$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$
where $x_k$ is the current point, $\alpha_k > 0$ is a step size, and $d_k$ is a search direction at $x_k$. There exist many algorithms for (1) [1–9]. Davidon [10] pointed out that the quasi-Newton method is one of the most effective methods for solving nonlinear optimization problems. The idea of the quasi-Newton method is to use first derivatives to build an approximate Hessian matrix over many iterations, the approximation being updated by a low-rank matrix at each iteration. The primary quasi-Newton equation is as follows:
$$B_{k+1} s_k = y_k, \tag{3}$$
where $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$.
The search direction $d_k$ of the quasi-Newton method is generated by the following equation:
$$B_k d_k = -g_k, \tag{4}$$
where $B_0$ is any given symmetric positive-definite matrix, the Hessian approximation matrix $B_k$ is the quasi-Newton update matrix, and $g_k$ is the gradient of $f$ at $x_k$. The BFGS (Broyden [11], Fletcher [12], Goldfarb [13], and Shanno [14]) method is one of the quasi-Newton line search methods and has great numerical stability. The famous BFGS update formula is
$$B_{k+1} = B_k - \frac{B_k s_k s_k^{T} B_k}{s_k^{T} B_k s_k} + \frac{y_k y_k^{T}}{y_k^{T} s_k}, \tag{5}$$
which is effective for solving (1) [15–18]. Powell [19] first proved that the BFGS method possesses global convergence for convex functions under the Wolfe line search. Some global convergence results for the BFGS method for convex minimization problems can be found in [19–26]. However, Dai [16] proposed a counterexample to illustrate that the standard BFGS method may fail for nonconvex functions with the Wolfe line search, and Mascarenhas [27] demonstrated the nonconvergence of the standard BFGS method even with an exact line search. To obtain global convergence of the BFGS method for general functions, some modified BFGS methods [28–31] have also been presented for nonconvex minimization problems. Aiming at a better approximation of the objective function's Hessian matrix, Wei et al. [32] proposed a new BFGS method, whose formula is
$$B_{k+1} = B_k - \frac{B_k s_k s_k^{T} B_k}{s_k^{T} B_k s_k} + \frac{y_k^{*} (y_k^{*})^{T}}{(y_k^{*})^{T} s_k}, \tag{6}$$
where $y_k^{*} = y_k + \frac{\vartheta_k}{\|s_k\|^2} s_k$ and $\vartheta_k = 2\left[f(x_k) - f(x_{k+1})\right] + \left[g(x_{k+1}) + g(x_k)\right]^{T} s_k$, and the corresponding quasi-Newton equation is as follows:
$$B_{k+1} s_k = y_k^{*}. \tag{7}$$
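To make the update concrete, here is a minimal Python sketch of the classical BFGS update of the Hessian approximation (our own illustration, not code from the paper; the function name is ours):

```python
import numpy as np

def bfgs_update(B, s, y):
    """One BFGS update of the Hessian approximation.

    B : current symmetric positive-definite approximation B_k
    s : step s_k = x_{k+1} - x_k
    y : gradient difference y_k = g_{k+1} - g_k
    Returns B_{k+1}; positive definiteness is preserved when s^T y > 0.
    """
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
```

The returned matrix satisfies the secant (quasi-Newton) equation $B_{k+1} s_k = y_k$ by construction, which is the defining property used throughout the paper.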
For convex functions, convergence analysis of the new BFGS algorithm was given under the weak Wolfe–Powell line search:
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^{T} d_k, \tag{8}$$
$$g(x_k + \alpha_k d_k)^{T} d_k \ge \sigma g_k^{T} d_k, \tag{9}$$
where $\delta \in (0, 1/2)$ and $\sigma \in (\delta, 1)$.
Motivated by the above formula and other observations, Yuan and Wei [33] defined a modified quasi-Newton equation as follows:
$$B_{k+1} s_k = y_k^{\dagger}, \tag{10}$$
where $y_k^{\dagger} = y_k + \frac{\max\{\vartheta_k, 0\}}{\|s_k\|^2} s_k$ and
$$\vartheta_k = 2\left[f(x_k) - f(x_{k+1})\right] + \left[g(x_{k+1}) + g(x_k)\right]^{T} s_k.$$
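The corrected difference vector of Yuan and Wei can be sketched as follows (a hedged illustration in Python, assuming the form $y_k + \frac{\max\{\vartheta_k, 0\}}{\|s_k\|^2} s_k$ with $\vartheta_k = 2(f_k - f_{k+1}) + (g_{k+1} + g_k)^T s_k$; the function name is ours):

```python
import numpy as np

def modified_y(f_k, f_k1, g_k, g_k1, s):
    """Yuan-Wei corrected gradient difference.

    theta = 2(f_k - f_{k+1}) + (g_{k+1} + g_k)^T s_k; only its positive
    part enters, so the vector reduces to the standard y_k when theta <= 0.
    """
    y = g_k1 - g_k
    theta = 2.0 * (f_k - f_k1) + (g_k1 + g_k) @ s
    return y + (max(theta, 0.0) / (s @ s)) * s
```

For any quadratic objective, $\vartheta_k = 0$ exactly, so the correction vanishes and the update coincides with standard BFGS, which is one way to see that the modification only acts where the quadratic model is inaccurate.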
It is obvious that if $\vartheta_k \ge 0$ holds, then the quasi-Newton method is the method (7); otherwise, it is the standard BFGS method. Therefore, when $\vartheta_k \ge 0$ holds, the modified quasi-Newton method (10) and the quasi-Newton method (6) have the same approximation of the Hessian matrix. Inspired by their views, we will demonstrate the global convergence of the modified BFGS (MBFGS) method (10) for nonconvex functions with the modified weak Wolfe–Powell (MWWP) line search [34], whose form is as follows:
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^{T} d_k + \alpha_k \min\left\{-\delta_1 g_k^{T} d_k, \frac{\delta \alpha_k \|d_k\|^2}{2}\right\}, \tag{11}$$
$$g(x_k + \alpha_k d_k)^{T} d_k \ge \sigma g_k^{T} d_k + \min\left\{-\delta_1 g_k^{T} d_k, \delta \alpha_k \|d_k\|^2\right\}, \tag{12}$$
where $\delta \in (0, 1/2)$, $\delta_1 \in (0, \delta)$, and $\sigma \in (\delta, 1)$. The parameter $\delta_1$ is different from that in paper [34].
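A small Python sketch of a step-size acceptance test, assuming the MWWP conditions take the form used by Yuan, Wei, and Lu (the usual weak Wolfe–Powell conditions relaxed by an extra $\min\{-\delta_1 g_k^T d_k, \delta \alpha_k \|d_k\|^2\}$ term); the function name and default parameter values are our own placeholders:

```python
import numpy as np

def mwwp_satisfied(f, grad, x, d, alpha, delta=0.1, delta1=0.05, sigma=0.9):
    """Check the modified weak Wolfe-Powell (MWWP) conditions at step alpha.

    Requires 0 < delta1 < delta < sigma < 1. Returns True when both the
    relaxed sufficient-decrease and relaxed curvature conditions hold.
    """
    g0d = grad(x) @ d
    extra1 = min(-delta1 * g0d, delta * alpha * (d @ d) / 2.0)
    extra2 = min(-delta1 * g0d, delta * alpha * (d @ d))
    x_new = x + alpha * d
    decrease = f(x_new) <= f(x) + delta * alpha * g0d + alpha * extra1
    curvature = grad(x_new) @ d >= sigma * g0d + extra2
    return decrease and curvature
```

On $f(x) = \frac{1}{2}\|x\|^2$ with $d = -g_k$, the exact step $\alpha = 1$ passes the test, while a tiny step fails the curvature condition, as one would expect of a Wolfe-type rule.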
This article is organized as follows: Section 2 introduces the motivation and states the given technique and algorithm. In Section 3, we prove the global convergence of the modified BFGS method with the MWWP line search under some reasonable conditions. Section 4 reports the results of numerical experiments that show the performance of the algorithms. The last section presents the conclusion. Throughout the article, $f(x_k)$ and $f(x_{k+1})$ are abbreviated as $f_k$ and $f_{k+1}$, and $g(x_k)$ and $g(x_{k+1})$ are abbreviated as $g_k$ and $g_{k+1}$. $\|\cdot\|$ denotes the Euclidean norm.
2. Motivation and Algorithm
The global convergence of the BFGS algorithm has been established for uniformly convex functions, which have many advantageous properties. It is therefore worth asking whether these properties of uniformly convex functions can be exploited in the BFGS algorithm to obtain global convergence. This idea motivates us to propose a projection technique to acquire better convergence properties for the BFGS algorithm. Given a trial point for (1):
$$\bar{x}_{k+1} = x_k + \alpha_k d_k, \tag{13}$$
where $\bar{x}_{k+1}$ is the next point generated by the classical BFGS formula. Moreover, a parabolic form is given as follows:
$$\varphi(x) = f_k + g_k^{T}(x - x_k) + \frac{\eta}{2}\|x - x_k\|^2, \tag{14}$$
where $\eta > 0$ is a constant. It is not difficult to see that $\varphi$ consists of the first two terms of the expansion of $f$ at $x_k$ plus a quadratic term whose Hessian matrix is a diagonal matrix with eigenvalue $\eta$; such a function is uniformly convex, and for it the BFGS method is globally convergent. By projecting $\bar{x}_{k+1}$ onto the parabola (14), we obtain the next iterate $x_{k+1}$:
$$x_{k+1} = P_{\Omega}\left[\bar{x}_{k+1}\right], \qquad \Omega = \{x \in \mathbb{R}^n : \varphi(x) \le f_k\}, \tag{15}$$
where $P_{\Omega}[\cdot]$ denotes the projection operator.
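The paper's projection acts in $\mathbb{R}^n$; as a toy two-dimensional illustration of what projecting a point onto a parabola involves (our own example, not the paper's formula), consider projecting $(a, b)$ onto the curve $y = x^2$:

```python
import numpy as np

def project_onto_parabola(a, b):
    """Project the point (a, b) onto the parabola y = x^2 in the plane.

    A stationary point of the squared distance (x-a)^2 + (x^2-b)^2
    satisfies the cubic 2x^3 + (1-2b)x - a = 0; among its real roots
    we take the one with the smallest distance to (a, b).
    """
    roots = np.roots([2.0, 0.0, 1.0 - 2.0 * b, -a])
    real = roots[np.abs(roots.imag) < 1e-10].real
    x = min(real, key=lambda t: (t - a) ** 2 + (t * t - b) ** 2)
    return x, x * x
```

Even in this toy setting the projection is the solution of a small nonlinear equation rather than a linear map, which is why the paper handles it as a separate algorithmic step.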
Remark 1. (i) The point obtained in Step 6 is the defined projection point, and the vector used there is the same as the vector in Step 5; the projection point does not enter the update (10) at the current iteration but is used in the next iteration. (ii) If the condition in Step 5 holds, then the global convergence of the algorithm can be obtained by the modified weak Wolfe–Powell line search, (11) and (12). If not, we can ensure the global convergence of the algorithm using the projection method (15).
3. Convergence Analysis
In this section, we concentrate on the global convergence of the modified projection BFGS algorithm. The following assumptions are required.
Assumption 1. (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded. (ii) The function $f$ is twice continuously differentiable and bounded from below, and its gradient function is Lipschitz continuous; that is,
$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in \mathbb{R}^n, \tag{16}$$
holds, where $L > 0$ is a constant. Assumption 1(ii) indicates that the relation
$$\|y_k\| \le L \|s_k\| \tag{17}$$
holds.
Proof. The detailed proof of the rationality of the line search is given in paper [34].
Using the definition of , we obtain
Therefore, . The proof is complete.
Proof. According to (18), the relation is valid. Thus, the proof is complete.
Lemma 4. Let Assumption 1 and the inequality hold. If there exist constants , then the following hold:for at least values of for any positive integer .
Proof. The proof will be completed using the following two cases.

Case 1. If is true, then and . By (16) and Assumption 1,

Combining the above inequality with (19), we obtain

Assumption 1(ii) and the definition of imply that

By (31), (19), and the definition of , we have

Relations (22), (30), and (32) indicate that

Case 2. holds. From Step 6 of Algorithm 1, we obtain

Together with (32), we have

Combining (35) with (20), we obtain

Using the definition of , we have
All in all, the following formula always holds:
Theorem 2. If the conditions of Lemma 4 hold, then we obtain
$$\liminf_{k \to \infty} \|g_k\| = 0. \tag{40}$$
Proof. By (23), we can obtain
Then, using Algorithm 1, we have
which means that holds for . Combining Lemma 4 and , we obtain
This implies that (40) holds. The proof is complete.
4. Numerical Results
In this section, we perform some numerical experiments to test Algorithm 1 with the modified weak Wolfe–Powell line search and compare its performance with that of the normal BFGS method. We refer to Algorithm 1 as MBFGS.
4.1. General Unconstrained Optimization Problems
Tested problems: the problems are taken from [37, 38]; there are 74 test problems in total, listed in Table 1.

Dimensionality: problem instances with 300, 900, and 2700 variables are considered.

Himmelblau stop rule [39]: if $|f(x_k)| > e_1$, then set $\mathrm{stop1} = \frac{|f(x_k) - f(x_{k+1})|}{|f(x_k)|}$; otherwise, let $\mathrm{stop1} = |f(x_k) - f(x_{k+1})|$. If $\|g(x_k)\| < \epsilon$ or $\mathrm{stop1} < e_2$ holds, then the program is stopped, where $e_1$, $e_2$, and $\epsilon$ are small positive tolerances.

Parameters: in Algorithm 1, the constants $\delta$, $\delta_1$, $\sigma$, and $\eta$ are fixed, and $B_0 = I$, the unit matrix.

Experiment environment: all programs are written in MATLAB R2014a and run on a PC with an Intel(R) Core(TM) i5-4210U CPU @ 1.70 GHz, 8.00 GB of RAM, and the Windows 10 operating system.

Symbol representation: the notation used in Tables 1 and 2 is as follows. No: the test problem number. CPUTime: the CPU time in seconds. NI: the number of iterations. NFG: the total number of function and gradient evaluations.

Image description: Figures 1–3 show the performance profiles of CPUTime, NI, and NFG. These figures show that the MBFGS method performs best, since its performance curves for CPUTime, NI, and NFG dominate those of the BFGS method. In addition, the totals of CPUTime, NI, and NFG for the modified BFGS method are lower than those of the BFGS method.
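The Himmelblau-type stop rule described above can be sketched as follows (our own illustration; the tolerance defaults are placeholders, since the values used in the experiments are not reproduced here):

```python
def should_stop(f_k, f_k1, g_norm, e1=1e-5, e2=1e-5, eps=1e-6):
    """Himmelblau-type stopping test.

    Uses the relative decrease |f_k - f_{k+1}| / |f_k| when |f_k| > e1
    and the absolute decrease otherwise; the iteration stops when the
    gradient norm or the decrease measure falls below its tolerance.
    """
    if abs(f_k) > e1:
        stop1 = abs(f_k - f_k1) / abs(f_k)
    else:
        stop1 = abs(f_k - f_k1)
    return g_norm < eps or stop1 < e2
```

Switching between relative and absolute decrease avoids dividing by a near-zero function value when the optimum of the test problem is close to zero.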
4.2. The Muskingum Model in Engineering Problems
This subsection presents the Muskingum model, defined below. The key task is to estimate the model parameters numerically using Algorithm 1.
Muskingum Model [40]: its symbolic representation is as follows: $x_1$ is the storage time constant, $x_2$ is the weight coefficient, $x_3$ is an extra parameter, $I_i$ is the observed inflow discharge, $Q_i$ is the observed outflow discharge, $\Delta t$ is the time step at time $t_i$ ($i = 1, 2, \ldots, n$), and $n$ is the total time.
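The exact objective used in the paper is omitted here; as a hedged sketch only, one common least-squares formulation for the nonlinear Muskingum model fits $(x_1, x_2, x_3)$ through the storage $S_i = x_1 \left[x_2 I_i + (1 - x_2) Q_i\right]^{x_3}$ and a trapezoidal discretization of the water balance $\mathrm{d}S/\mathrm{d}t = I - Q$; all names below are our own, and this is not necessarily the paper's objective:

```python
import numpy as np

def muskingum_objective(x, I, Q, dt):
    """Least-squares residual of a nonlinear Muskingum storage equation.

    x = (x1, x2, x3): storage time constant, weight coefficient, and
    extra parameter. The residual compares the storage change
    S_{i+1} - S_i with the trapezoidal inflow-outflow balance
    dt * ((I_i + I_{i+1})/2 - (Q_i + Q_{i+1})/2).
    """
    x1, x2, x3 = x
    I, Q = np.asarray(I, float), np.asarray(Q, float)
    S = x1 * (x2 * I + (1.0 - x2) * Q) ** x3
    balance = dt * ((I[:-1] + I[1:]) / 2.0 - (Q[:-1] + Q[1:]) / 2.0)
    r = S[1:] - S[:-1] - balance
    return 0.5 * float(r @ r)
```

An objective of this shape is smooth but generally nonconvex in $(x_1, x_2, x_3)$, which is why a globally convergent method for nonconvex functions is of interest here.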
The observed data for the experiment are derived from the flood runoff process between Chenggouwan and Linqing on the Nanyunhe River in the Haihe Basin, Tianjin, China. To obtain better numerical results, a suitable initial point and time step $\Delta t$ are selected. The specific values of $I_i$ and $Q_i$ for the years 1960, 1961, and 1964 are stated in article [40]. The test results are listed in Table 3.
The following three conclusions are apparent from Figures 4–6 and Table 3: (1) Applied to the Muskingum model, the MBFGS method shows good numerical performance, similar to the BFGS method and the HIWO method, and the efficiency of all three algorithms is satisfactory. (2) The final points $(x_1, x_2, x_3)$ of the MBFGS method are competitive with those of the other similar methods. (3) Since the final points of these three methods are distinct, the Muskingum model may have several approximate optimal points.
5. Conclusion

This paper gives a modified BFGS method and studies its global convergence under an inexact line search for nonconvex functions. The proposed algorithm has the following properties: (i) The search direction and its associated step size are accepted if a positivity condition holds, and the next iterative point is then generated; otherwise, a parabola is introduced, regarded as the projection surface to avoid using the failed direction, and the next point is generated by a projection technique. (ii) To obtain the global convergence of the proposed algorithm more easily, the projection point is used at all subsequent iteration points instead of the current modified BFGS update formula. The global convergence analysis for nonconvex functions and the numerical results indicate that the given method is competitive with other similar methods. As future work, we plan to consider the following questions: (a) Is there a new projection technique suitable for the global convergence of the modified BFGS method? (b) Can the modified BFGS method (10) be combined with other line search techniques? (c) Does the combination of the projection technique with a conjugate gradient method, especially the PRP method, yield good numerical results?
Data Availability

All data supporting the findings are included in the paper.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 11661009), the High Level Innovation Teams and Excellent Scholars Program in Guangxi Institutions of Higher Education (Grant No. 52), the Guangxi Natural Science Key Fund (No. 2017GXNSFDA198046), and the Special Funds for Local Science and Technology Development Guided by the Central Government (No. ZY20198003).
References

[1] Z. Dai, H. Zhou, J. Kang, and F. Wen, “The skewness of oil price returns and equity premium predictability,” Energy Economics, vol. 94, p. 105069, 2021.
[2] Z. Dai and J. Kang, “Some new efficient mean-variance portfolio selection models,” International Journal of Finance and Economics, vol. 2021, pp. 1–13, 2021.
[3] Z. Dai and H. Zhu, “A modified Hestenes-Stiefel-type derivative-free method for large-scale nonlinear monotone equations,” Mathematics, vol. 8, no. 2, p. 168, 2020.
[4] G. Yuan, T. Li, and W. Hu, “A conjugate gradient algorithm for large-scale nonlinear equations and image restoration problems,” Applied Numerical Mathematics, vol. 147, pp. 129–141, 2020.
[5] G. Yuan, J. Lu, and Z. Wang, “The PRP conjugate gradient algorithm with a modified WWP line search and its application in the image restoration problems,” Applied Numerical Mathematics, vol. 152, pp. 19–11, 2020.
[6] G. Yuan, X. Wang, and Z. Sheng, “The projection technique for two open problems of unconstrained optimization problems,” Journal of Optimization Theory and Applications, vol. 186, no. 2, pp. 590–619, 2020.
[7] G. Yuan, X. Wang, and Z. Sheng, “Family weak conjugate gradient algorithms and their convergence analysis for nonconvex functions,” Numerical Algorithms, vol. 84, no. 3, pp. 935–956, 2020.
[8] G. Yuan, Z. Wei, and Y. Yang, “The global convergence of the Polak-Ribière-Polyak conjugate gradient algorithm under inexact line search for nonconvex functions,” Journal of Computational and Applied Mathematics, vol. 362, pp. 262–275, 2019.
[9] G. Yuan, Z. Wang, and P. Li, “A modified Broyden family algorithm with global convergence under a weak Wolfe-Powell line search for unconstrained nonconvex problems,” Calcolo, vol. 57, pp. 1–21, 2020.
[10] W. C. Davidon, “Variable metric method for minimization,” SIAM Journal on Optimization, vol. 1, no. 1, pp. 1–17, 1991.
[11] C. G. Broyden, “The convergence of a class of double-rank minimization algorithms,” IMA Journal of Applied Mathematics, vol. 6, no. 3, pp. 222–231, 1970.
[12] R. Fletcher, “A new approach to variable metric algorithms,” The Computer Journal, vol. 13, no. 3, pp. 317–322, 1970.
[13] D. Goldfarb, “A family of variable-metric methods derived by variational means,” Mathematics of Computation, vol. 24, no. 109, p. 23, 1970.
[14] D. F. Shanno, “Conditioning of quasi-Newton methods for function minimization,” Mathematics of Computation, vol. 24, no. 111, p. 647, 1970.
[15] J. E. Dennis and R. B. Schnabel, “Numerical methods for unconstrained optimization and nonlinear equations,” 1983.
[16] Y. Dai, “Convergence properties of the BFGS algorithm,” SIAM Journal on Optimization, vol. 13, pp. 693–701, 2006.
[17] R. Fletcher, “Practical methods of optimization,” 2013.
[18] J. Han and G. Liu, “Global convergence analysis of a new nonmonotone BFGS algorithm on convex objective functions,” Computational Optimization and Applications, vol. 7, no. 3, pp. 277–289, 1997.
[19] M. J. D. Powell, “Some global convergence properties of a variable metric algorithm for minimization without exact line searches,” 1976.
[20] C. G. Broyden, J. E. Dennis Jr., and J. J. Moré, “On the local and superlinear convergence of quasi-Newton methods,” IMA Journal of Applied Mathematics, vol. 12, no. 3, pp. 223–245, 1973.
[21] R. H. Byrd, J. Nocedal, and Y.-X. Yuan, “Global convergence of a class of quasi-Newton methods on convex problems,” SIAM Journal on Numerical Analysis, vol. 24, no. 5, pp. 1171–1190, 1987.
[22] L. C. W. Dixon, “Variable metric algorithms: necessary and sufficient conditions for identical behavior of nonquadratic functions,” Journal of Optimization Theory and Applications, vol. 10, no. 1, pp. 34–40, 1972.
[23] J. E. Dennis and J. J. Moré, “Quasi-Newton methods, motivation and theory,” SIAM Review, vol. 19, pp. 46–89, 1977.
[24] A. Griewank, “The “global” convergence of Broyden-like methods with suitable line search,” The Journal of the Australian Mathematical Society. Series B. Applied Mathematics, vol. 28, no. 1, pp. 75–92, 1986.
[25] A. Griewank and P. L. Toint, “Local convergence analysis for partitioned quasi-Newton updates,” Numerische Mathematik, vol. 39, no. 3, pp. 429–448, 1982.
[26] P. L. Toint, “Global convergence of the partitioned BFGS algorithm for convex partially separable optimization,” Mathematical Programming, vol. 36, no. 3, pp. 290–306, 1986.
[27] W. F. Mascarenhas, “The BFGS method with exact line searches fails for non-convex objective functions,” Mathematical Programming, vol. 99, no. 1, pp. 49–61, 2004.
[28] D.-H. Li and M. Fukushima, “A modified BFGS method and its global convergence in nonconvex minimization,” Journal of Computational and Applied Mathematics, vol. 129, no. 1-2, pp. 15–35, 2001.
[29] D.-H. Li and M. Fukushima, “On the global convergence of the BFGS method for nonconvex unconstrained optimization problems,” SIAM Journal on Optimization, vol. 11, no. 4, pp. 1054–1064, 2001.
[30] M. J. D. Powell, “A new algorithm for unconstrained optimization,” Nonlinear Programming, vol. 23, pp. 31–65, 1970.
[31] Z. Wei, L. Qi, and X. Chen, “An SQP-type method and its application in stochastic programs,” Journal of Optimization Theory and Applications, vol. 116, no. 1, pp. 205–228, 2003.
[32] Z. Wei, G. Yu, G. Yuan, and Z. Lian, “The superlinear convergence of a modified BFGS-type method for unconstrained optimization,” Computational Optimization and Applications, vol. 29, no. 3, pp. 315–332, 2004.
[33] G. Yuan and Z. Wei, “Convergence analysis of a modified BFGS method on convex minimizations,” Computational Optimization and Applications, vol. 47, no. 2, pp. 237–255, 2010.
[34] G. Yuan, Z. Wei, and X. Lu, “Global convergence of BFGS and PRP methods under a modified weak Wolfe-Powell line search,” Applied Mathematical Modelling, vol. 47, pp. 811–825, 2017.
[35] G. Yuan, Z. Sheng, B. Wang, W. Hu, and C. Li, “The global convergence of a modified BFGS method for nonconvex functions,” Journal of Computational and Applied Mathematics, vol. 327, pp. 274–294, 2018.
[36] R. H. Byrd and J. Nocedal, “A tool for the analysis of quasi-Newton methods with application to unconstrained minimization,” SIAM Journal on Numerical Analysis, vol. 26, no. 3, pp. 727–739, 1989.
[37] I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
[38] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software, vol. 7, no. 1, pp. 17–41, 1981.
[39] Y. Yuan and W. Sun, “Theory and methods of optimization,” 1999.
[40] A. Ouyang, L.-B. Liu, Z. Sheng, and F. Wu, “A class of parameter estimation methods for nonlinear Muskingum model using hybrid invasive weed optimization algorithm,” Mathematical Problems in Engineering, vol. 2015, pp. 1–15, 2015.
[41] A. Ouyang, Z. Tang, K. Li, A. Sallam, and E. Sha, “Estimating parameters of Muskingum model using an adaptive hybrid PSO algorithm,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 28, pp. 1–29, 2014.
[42] Z. W. Geem, “Parameter estimation for the nonlinear Muskingum model using the BFGS technique,” Journal of Irrigation and Drainage Engineering, vol. 132, no. 5, pp. 474–478, 2006.
Copyright © 2021 Pengyuan Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.