Pengyuan Li, Junyu Lu, Haishan Feng, "The Global Convergence of a Modified BFGS Method under Inexact Line Search for Nonconvex Functions", Mathematical Problems in Engineering, vol. 2021, Article ID 8342536, 9 pages, 2021. https://doi.org/10.1155/2021/8342536

# The Global Convergence of a Modified BFGS Method under Inexact Line Search for Nonconvex Functions

Accepted: 22 Apr 2021
Published: 06 May 2021

#### Abstract

Among the quasi-Newton algorithms, the BFGS method has been widely studied. However, under inexact Wolfe line searches, or even under exact line search, the global convergence of the BFGS method for nonconvex functions has still not been proven. Motivated by these issues, we propose a new quasi-Newton algorithm with better convergence properties, designed according to the following essentials: (1) a modified BFGS formula is designed to guarantee that $B_{k+1}$ inherits the positive definiteness of $B_k$; (2) a modified weak Wolfe–Powell line search is recommended; (3) a parabola, which serves as the projection surface to avoid using an invalid direction, is proposed, and the next point is generated by a projection technique; (4) to obtain the global convergence of the proposed algorithm more easily, the projection point is used at all subsequent iteration points instead of the current modified BFGS update formula; and (5) the global convergence of the given algorithm is established under suitable conditions. Numerical results show that the proposed algorithm is efficient.

#### 1. Introduction

Consider the unconstrained optimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$
where $f: \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. The multitudinous algorithms for (1) often use the iterative formula
$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$
where $x_k$ is the current point, $\alpha_k > 0$ is a step size, and $d_k$ is a search direction at $x_k$. There exist many algorithms for (1) [1–9]. Davidon [10] pointed out that the quasi-Newton method is one of the most effective methods for solving nonlinear optimization problems. The idea of the quasi-Newton method is to use first derivatives to establish an approximate Hessian matrix over many iterations, the approximation being updated by a low-rank matrix in each iteration. The primary quasi-Newton equation is as follows:
$$B_{k+1} s_k = y_k, \tag{3}$$
where $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$.

The search direction of the quasi-Newton method is generated by the following equation:
$$B_k d_k = -g_k, \tag{4}$$
where $B_0$ is any given symmetric positive-definite matrix, the Hessian approximation matrix $B_k$ is the quasi-Newton update matrix, and $g_k$ is the gradient of $f$ at $x_k$. The BFGS (Broyden [11], Fletcher [12], Goldfarb [13], and Shanno [14]) method is one of the quasi-Newton line search methods and has great numerical stability. The famous BFGS update formula is
$$B_{k+1} = B_k - \frac{B_k s_k s_k^{T} B_k}{s_k^{T} B_k s_k} + \frac{y_k y_k^{T}}{s_k^{T} y_k}, \tag{5}$$
which is effective for solving (1) [15–18]. Powell [19] first proved that the BFGS method possesses global convergence for convex functions under Wolfe line search. Some global convergence results for the BFGS method on convex minimization problems can be found in [19–26]. However, Dai [16] proposed a counterexample to illustrate that the standard BFGS method may fail for nonconvex functions with Wolfe line search, and Mascarenhas [27] demonstrated the nonconvergence of the standard BFGS method even with exact line search. To obtain global convergence of the BFGS method for general functions, some modified BFGS methods [28–31] have also been presented for nonconvex minimization problems. Aiming at a better approximation of the Hessian matrix of the objective function, Wei et al. [32] proposed a new BFGS method, whose formula is
$$B_{k+1} = B_k - \frac{B_k s_k s_k^{T} B_k}{s_k^{T} B_k s_k} + \frac{y_k^{*} (y_k^{*})^{T}}{s_k^{T} y_k^{*}}, \tag{6}$$
where $y_k^{*} = y_k + \frac{\theta_k}{\|s_k\|^2} s_k$ and $\theta_k = 2\left[f(x_k) - f(x_{k+1})\right] + (g_{k+1} + g_k)^{T} s_k$, and the corresponding quasi-Newton equation is as follows:
$$B_{k+1} s_k = y_k^{*}. \tag{7}$$
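To make the classical update concrete, the BFGS formula $B_{k+1} = B_k - \frac{B_k s_k s_k^{T} B_k}{s_k^{T} B_k s_k} + \frac{y_k y_k^{T}}{s_k^{T} y_k}$ can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code; the curvature guard and its tolerance `1e-12` are our own standard safeguard for preserving positive definiteness.

```python
import numpy as np

def bfgs_update(B, s, y):
    """Classical BFGS update of the Hessian approximation B.

    B: current symmetric positive-definite approximation (n x n)
    s: step x_{k+1} - x_k
    y: gradient difference g_{k+1} - g_k
    """
    sy = s @ y
    # Skip the update when the curvature condition s^T y > 0 fails,
    # so that B stays symmetric positive definite.
    if sy <= 1e-12 * np.linalg.norm(s) * np.linalg.norm(y):
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / sy
```

A quick sanity check: by construction the updated matrix satisfies the secant equation $B_{k+1} s_k = y_k$ and remains symmetric.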

For convex functions, convergence analysis of the new BFGS algorithm was given under the weak Wolfe–Powell line search:
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^{T} d_k, \tag{8}$$
$$g(x_k + \alpha_k d_k)^{T} d_k \ge \sigma g_k^{T} d_k, \tag{9}$$
where $\delta \in (0, 1/2)$ and $\sigma \in (\delta, 1)$.

Motivated by the above formula and other observations, Yuan and Wei [33] defined a modified quasi-Newton equation as follows:
$$B_{k+1} s_k = \bar{y}_k, \quad \bar{y}_k = y_k + \frac{\max\{\theta_k, 0\}}{\|s_k\|^2} s_k, \tag{10}$$
where $\theta_k = 2\left[f(x_k) - f(x_{k+1})\right] + (g_{k+1} + g_k)^{T} s_k$.

It is obvious that if $\theta_k > 0$ holds, then the quasi-Newton equation (10) coincides with (7); otherwise, it is the standard BFGS method. Therefore, when $\theta_k > 0$ holds, the modified quasi-Newton method (10) and the quasi-Newton method (6) have the same approximation of the Hessian matrix. Inspired by their views, we will demonstrate the global convergence of the modified BFGS (MBFGS) method (10) for nonconvex functions with the modified weak Wolfe–Powell (MWWP) line search [34], whose form is as follows:
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^{T} d_k + \alpha_k \min\left[-\delta_1 g_k^{T} d_k, \frac{\delta \alpha_k \|d_k\|^2}{2}\right], \tag{11}$$
$$g(x_k + \alpha_k d_k)^{T} d_k \ge \sigma g_k^{T} d_k + \min\left[-\delta_1 g_k^{T} d_k, \delta \alpha_k \|d_k\|^2\right], \tag{12}$$
where $\delta \in (0, 1/2)$, $\delta_1 \in (0, \delta)$, and $\sigma \in (\delta, 1)$. The parameter $\delta_1$ is different from that in paper [34].
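The modified vector $\bar{y}_k = y_k + \max\{\theta_k, 0\}\, s_k / \|s_k\|^2$ used in (10) costs only a few extra operations beyond quantities the method already computes. A minimal sketch (illustrative only; the function and variable names are ours):

```python
import numpy as np

def modified_y(f_k, f_k1, g_k, g_k1, s):
    """Modified gradient difference of equation (10):
    y_bar = y + max(theta, 0) * s / ||s||^2, with
    theta = 2*(f_k - f_{k+1}) + (g_{k+1} + g_k)^T s."""
    y = g_k1 - g_k
    theta = 2.0 * (f_k - f_k1) + (g_k1 + g_k) @ s
    return y + max(theta, 0.0) * s / (s @ s)
```

For a quadratic objective, $\theta_k = 0$ identically, so $\bar{y}_k$ coincides with the ordinary $y_k$; the correction only activates when the function deviates from its local quadratic model.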

This article is organized as follows: Section 2 introduces the motivation and states the given technique and algorithm. In Section 3, we prove the global convergence of the modified BFGS method with the MWWP line search under some reasonable conditions. Section 4 reports the results of numerical experiments that show the performance of the algorithms. The last section presents the conclusion. Throughout the article, $f(x_k)$ and $f(x_{k+1})$ are abbreviated as $f_k$ and $f_{k+1}$, and $g(x_k)$ and $g(x_{k+1})$ as $g_k$ and $g_{k+1}$; $\|\cdot\|$ denotes the Euclidean norm.

#### 2. Motivation and Algorithm

The global convergence of the BFGS algorithm has been established for uniformly convex functions, which enjoy many useful properties. It is worth asking whether these properties of uniformly convex functions can be exploited in the BFGS algorithm to obtain global convergence. This idea motivates us to propose a projection technique that acquires better convergence properties for the BFGS algorithm. A new numerical condition (13) is given for (1), where $z_{k+1}$ denotes the next point generated by the classical BFGS formula. Moreover, a parabolic form is given as follows:
$$h_k(x) = f_k + g_k^{T}(x - x_k) + \frac{r}{2}\|x - x_k\|^2, \tag{14}$$
where $r > 0$ is a constant. It is not difficult to see that (14) can be considered as the leading terms of the expansion of a quadratic function at $x_k$ whose Hessian matrix is a diagonal matrix with eigenvalue $r$; for such a uniformly convex function, the BFGS method is globally convergent. By projecting $z_{k+1}$ onto (14), we obtain the next point $x_{k+1}$ via (15).

The idea of the projection can also be found in [6, 8, 35]. Based on the above discussions, the modified algorithm is given in Algorithm 1.

Step 0: Given a point $x_0 \in \mathbb{R}^n$, constants $\delta \in (0, 1/2)$, $\delta_1 \in (0, \delta)$, $\sigma \in (\delta, 1)$, $r > 0$, $\epsilon \in (0, 1)$, and an $n \times n$ symmetric positive-definite matrix $B_0$, set $k := 0$.
Step 1: Stop if $\|g_k\| \le \epsilon$.
Step 2: Obtain a search direction $d_k$ by solving $B_k d_k = -g_k$.
Step 3: Calculate the step size $\alpha_k$ using the inequalities (11) and (12).
Step 4: Set $z_{k+1} = x_k + \alpha_k d_k$.
Step 5: If the condition (13) holds, then let $x_{k+1} = z_{k+1}$, $s_k = x_{k+1} - x_k$, and go to Step 7; otherwise, go to Step 6.
Step 6: Let $x_{k+1}$ be defined by (15) and set $s_k = x_{k+1} - x_k$.
Step 7: Update $B_{k+1}$ using formula (10).
Step 8: Let $k := k + 1$ and go to Step 1.
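Putting the pieces together, the overall flow of Algorithm 1 can be sketched as follows. This is a simplified illustration under stated substitutions: a plain Armijo backtracking stands in for the MWWP conditions (11) and (12), and the projection step (Step 6) is omitted, so every trial point is accepted; only the modified update (10) is reproduced.

```python
import numpy as np

def mbfgs_sketch(f, grad, x0, tol=1e-6, max_iter=500):
    """Simplified skeleton of Algorithm 1 (MBFGS): Armijo backtracking
    replaces the MWWP line search, and the projection step is omitted."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                                 # Step 0: B_0 = I
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:                   # Step 1: stopping test
            break
        d = np.linalg.solve(B, -g)                     # Step 2: B_k d_k = -g_k
        fx, gtd, alpha = f(x), g @ d, 1.0
        while f(x + alpha * d) > fx + 1e-4 * alpha * gtd:  # Step 3 (simplified)
            alpha *= 0.5
        x_new = x + alpha * d                          # Steps 4-5: accept trial point
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        theta = 2.0 * (fx - f(x_new)) + (g_new + g) @ s
        y_bar = y + max(theta, 0.0) * s / (s @ s)      # modified vector of (10)
        if s @ y_bar > 1e-12:                          # Step 7: BFGS-type update
            Bs = B @ s
            B += -np.outer(Bs, Bs) / (s @ Bs) + np.outer(y_bar, y_bar) / (s @ y_bar)
        x = x_new                                      # Step 8
    return x
```

On a strictly convex quadratic, the iterates converge to the unique minimizer; for nonconvex $f$, the full algorithm additionally relies on the projection safeguard of Steps 5 and 6.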

Remark 1. (i) In Step 6, $x_{k+1}$ is the defined projection point, and the vector $s_k$ is the same as the vector $s_k$ in Step 5; the projection point does not enter the update (10) directly but is used at the next iteration.
(ii) If the condition in Step 5 holds, then the global convergence of the algorithm can be obtained by the modified weak Wolfe–Powell line search, (11) and (12). If not, global convergence is ensured by the projection method (15).

#### 3. Convergence Analysis

In this section, we concentrate on the global convergence of the modified projection BFGS algorithm. The following assumptions are required.

Assumption 1.
(i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded.
(ii) The function $f$ is twice continuously differentiable and bounded from below, and its gradient function is Lipschitz continuous, that is,
$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in \mathbb{R}^n, \tag{16}$$
holds, where $L > 0$ is a constant.
Assumption 1(ii) indicates that the relation
$$\|y_k\| \le L \|s_k\| \tag{17}$$
holds.

Theorem 1. Suppose that Assumption 1 holds and $g_k^{T} d_k < 0$. Then, there exists a step size $\alpha_k > 0$ satisfying (11) and (12), where $\delta \in (0, 1/2)$, $\delta_1 \in (0, \delta)$, and $\sigma \in (\delta, 1)$ are constants.

Proof. A detailed proof that the line search is well defined is given in paper [35].

Lemma 1. Let Assumption 1 hold and $g_k^{T} d_k < 0$. If the sequence $\{x_k\}$ is generated by Algorithm 1, then we have
$$\bar{y}_k^{T} s_k \ge m \|s_k\|^2, \tag{18}$$
where $m > 0$ is a constant.

Proof. According to Lemma 1 of paper [35], the following relations hold:

where the parameter is specified in [35]. Combining (19) and (20), we obtain the following formula:

Using the definition of $\bar{y}_k$, we obtain

Therefore, (18) holds. The proof is complete.

Lemma 2. If the sequence $\{x_k\}$ is generated by Algorithm 1 and Assumption 1 holds, then the matrix $B_k$ is positive definite for all $k$.

Proof. According to (18), the relation $\bar{y}_k^{T} s_k > 0$ is valid. Thus, the proof is complete.

Lemma 3. If the sequence $\{x_k\}$ is generated by Algorithm 1 and Assumption 1 holds, then the relation (23) holds.

Proof. According to (11) and Assumption 1(ii), the following formula holds:

Combining (12) with (16), we obtain

Thus,

Substituting the above inequality into (24), we have (23). The proof is complete.

Lemma 4. Let Assumption 1 and the inequality $g_k^{T} d_k < 0$ hold. If there exist suitable positive constants bounding $\bar{y}_k^{T} s_k / \|s_k\|^2$ from below and $\|\bar{y}_k\|^2 / \bar{y}_k^{T} s_k$ from above, then (27) and (28) hold for at least $\lceil t/2 \rceil$ values of $k \in \{1, \ldots, t\}$, for any positive integer $t$.

Proof. The proof is completed by considering the following two cases.

Case 1. The condition in Step 5 is true; then $x_{k+1} = z_{k+1}$ and $s_k = \alpha_k d_k$. By (16), Assumption 1, and (19), we obtain (30). Assumption 1(ii) and the definition of $\bar{y}_k$ imply (31). By (31), (19), and the definition of $\bar{y}_k$, we have (32). Relations (22), (30), and (32) then give the desired bounds in this case.

Case 2. The condition in Step 5 fails. From Step 6 of Algorithm 1 together with (32), we have (35). From (35) and the definition of $\bar{y}_k$, we have

where the second inequality follows from the relations above, and the last inequality follows from (16). Combining the above formula with (20), we obtain

In both cases, the following formula always holds:

where the constants are as defined above. Similar to the proof of Theorem 1 in [36], we obtain (27) and (28). The proof is complete.

Theorem 2. If the conditions of Lemma 4 hold, then we obtain
$$\liminf_{k \to \infty} \|g_k\| = 0. \tag{40}$$

Proof. By (23), we can obtain

Then, using Algorithm 1, we have

The relationship between (27) and (28) indicates that

which means that the required inequality holds for these indices $k$. Combining this with Lemma 4, we obtain

This implies that (40) holds. The proof is complete.

#### 4. Numerical Results

In this section, we perform numerical experiments to test Algorithm 1 with the modified weak Wolfe–Powell line search and compare its performance with that of the standard BFGS method. We refer to Algorithm 1 as MBFGS.

##### 4.1. General Unconstrained Optimization Problems

- Tested problems: the problems are taken from [37, 38]. There are 74 test problems in total, listed in Table 1.
- Dimensionality: problem instances with 300, 900, and 2700 variables are considered.
- Himmelblau stop rule [39]: if $|f(x_k)| > \varepsilon_1$, then set $stop_1 = \frac{|f(x_k) - f(x_{k+1})|}{|f(x_k)|}$; otherwise, let $stop_1 = |f(x_k) - f(x_{k+1})|$. The program is stopped if $\|g(x_k)\| < \varepsilon$ or $stop_1 < \varepsilon_2$ holds, where $\varepsilon$, $\varepsilon_1$, and $\varepsilon_2$ are small positive tolerances.
- Parameters: in Algorithm 1, the constants $\delta$, $\delta_1$, $\sigma$, $r$, and $\epsilon$ are fixed, and $B_0$ is the unit matrix.
- Experiment environment: all programs are written in MATLAB R2014a and run on a PC with an Intel(R) Core(TM) i5-4210U CPU @ 1.70 GHz, 8.00 GB of RAM, and the Windows 10 operating system.
- Symbol representation: the notation used in Tables 1 and 2 is as follows. No: the test problem number. CPUTime: the CPU time in seconds. NI: the number of iterations. NFG: the total number of function and gradient evaluations.
- Results: Figures 1–3 show the performance profiles of CPUTime, NI, and NFG. These figures show that the MBFGS method performs best, since its performance curves for CPUTime, NI, and NFG dominate those of the BFGS method. In addition, the totals of CPUTime, NI, and NFG for the modified BFGS method are lower than those for the BFGS method.
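The Himmelblau stop rule above can be expressed as a small helper; the default tolerance values here are illustrative placeholders, not necessarily the paper's exact settings.

```python
def himmelblau_stop(f_k, f_k1, g_norm, eps=1e-6, eps1=1e-5, eps2=1e-5):
    """Himmelblau stop rule: use the relative decrease of f when |f_k| is
    large enough, otherwise the absolute decrease; stop when either the
    gradient norm or the decrease measure is small."""
    stop1 = abs(f_k - f_k1) / abs(f_k) if abs(f_k) > eps1 else abs(f_k - f_k1)
    return g_norm < eps or stop1 < eps2
```

The relative test guards against a premature stop when function values are large, while the absolute test takes over near a zero-valued minimum.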

Table 1: Test problems.

1. Extended Freudenstein and Roth function
2. Extended trigonometric function
3. Extended Rosenbrock function
4. Extended White and Holst function
5. Extended Beale function
6. Extended penalty function
7. Perturbed quadratic function
8. Raydan 1 function
9. Raydan 2 function
10. Diagonal 1 function
11. Diagonal 2 function
12. Diagonal 3 function
13. Hager function
14. Generalized tridiagonal-1 function
15. Extended tridiagonal-1 function
16. Extended three exponential terms function
17. Generalized tridiagonal-2 function
18. Diagonal 4 function
19. Diagonal 5 function
20. Extended Himmelblau function
21. Generalized PSC1 function
22. Extended PSC1 function
23. Extended Powell function
24. Extended block diagonal BD1 function
25. Extended Maratos function
26. Extended Cliff function
27. Quadratic diagonal perturbed function
28. Extended wood function
29. Extended Hiebert function
30. Quadratic function QF1 function
31. Extended quadratic penalty QP1 function
32. Extended quadratic penalty QP2 function
33. A quadratic function QF2 function
34. Extended EP1 function
35. Extended tridiagonal-2 function
36. BDQRTIC function (CUTE)
37. TRIDIA function (CUTE)
38. ARWHEAD function (CUTE)
39. NONDIA function (CUTE)
40. NONDQUAR function (CUTE)
41. DQDRTIC function (CUTE)
42. EG2 function (CUTE)
43. DIXMAANA function (CUTE)
44. DIXMAANB function (CUTE)
45. DIXMAANC function (CUTE)
46. DIXMAANE function (CUTE)
47. Partial perturbed quadratic function
48. Broyden tridiagonal function
49. Almost perturbed quadratic function
50. Tridiagonal perturbed quadratic function
51. EDENSCH function (CUTE)
52. VARDIM function (CUTE)
53. STAIRCASE S1 function
54. LIARWHD function (CUTE)
55. DIAGONAL 6 function
56. DIXON3DQ function (CUTE)
57. DIXMAANF function (CUTE)
58. DIXMAANG function (CUTE)
59. DIXMAANH function (CUTE)
60. DIXMAANI function (CUTE)
61. DIXMAANJ function (CUTE)
62. DIXMAANK function (CUTE)
63. DIXMAANL function (CUTE)
64. DIXMAAND function (CUTE)
65. ENGVAL1 function (CUTE)
66. FLETCHCR function (CUTE)
67. COSINE function (CUTE)
68. Extended DENSCHNB function (CUTE)
69. Extended DENSCHNF function (CUTE)
70. SINQUAD function (CUTE)
71. BIGGSB1 function (CUTE)
72. Partial perturbed quadratic PPQ2 function
73. Scaled quadratic SQ1 function
74. Scaled quadratic SQ2 function
Table 2: Total numerical results.

| Algorithm | CPUTime | NI | NFG |
| --- | --- | --- | --- |
| BFGS | 62299.85938 | 23833 | 57444 |
| MBFGS | 60990.64063 | 22357 | 51370 |

##### 4.2. The Muskingum Model in Engineering Problems

The Muskingum model is presented in this subsection. The key task is to estimate its parameters numerically using Algorithm 1.

Muskingum model [40]: the estimation problem is a nonlinear least-squares fit of the routing equations to the observed discharges, whose symbolic representation is as follows: $x_1$ is the storage time constant, $x_2$ is the weight coefficient, $x_3$ is an extra parameter, $I_i$ is the observed inflow discharge, $Q_i$ is the observed outflow discharge, $\Delta t$ is the time step at time $t_i$ ($i = 1, 2, \ldots, n$), and $n$ is the total time.

The observed data of the experiment are derived from the process of flood runoff from Chenggouwan and Linqing of Nanyunhe in the Haihe Basin, Tianjin, China. To obtain better numerical results, a suitable initial point and time step $\Delta t$ are selected. The specific values of $I_i$ and $Q_i$ for the years 1960, 1961, and 1964 are stated in article [41]. The test results are listed in Table 3.

Table 3: Estimated parameters for the Muskingum model.

| Algorithm | $x_1$ | $x_2$ | $x_3$ |
| --- | --- | --- | --- |
| BFGS [42] | 10.8156 | 0.9826 | 1.0219 |
| HIWO [40] | 13.2813 | 0.8001 | 0.9933 |
| MBFGS | 11.1832 | 1.0000 | 0.9996 |

The following three conclusions are apparent from Figures 4–6 and Table 3: (1) On the Muskingum model, the MBFGS method shows strong numerical performance, similar to the BFGS method and the HIWO method, and all three algorithms are efficient. (2) The final points ($x_1$, $x_2$, and $x_3$) of the MBFGS method are competitive with those of the other methods. (3) Since the final points of these three methods are distinct, the Muskingum model may have multiple approximate optimal points.

#### 5. Conclusion

This paper gives a modified BFGS method and studies its global convergence under an inexact line search for nonconvex functions. The proposed algorithm has the following properties: (i) the search direction and its associated step size are accepted if a positive condition holds, and the next iteration point is set accordingly; otherwise, a parabola is introduced, regarded as the projection surface, to avoid using the failed direction, and the next point is generated by a projection technique; (ii) to obtain global convergence of the proposed algorithm more easily, the projection point is used at all subsequent iteration points instead of the current modified BFGS update formula. The global convergence analysis for nonconvex functions and the numerical results indicate that the given method is competitive with other similar methods. As for future work, the following points deserve consideration: (a) is there a new projection technique suitable for the global convergence of the modified BFGS method? (b) the application of the modified BFGS method (10) to other line search techniques should be discussed; and (c) whether the combination of the proposed projection technique and a conjugate gradient method, especially the PRP method, yields good numerical results is worth investigating.

#### Data Availability

All data supporting the findings are included in the paper.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 11661009), the High Level Innovation Teams and Excellent Scholars Program in Guangxi Institutions of Higher Education (Grant No. [2019]52), the Guangxi Natural Science Key Fund (No. 2017GXNSFDA198046), and the Special Funds for Local Science and Technology Development Guided by the Central Government (No. ZY20198003).

#### References

1. Z. Dai, H. Zhou, J. Kang, and F. Wen, “The skewness of oil price returns and equity premium predictability,” Energy Economics, vol. 94, p. 105069, 2021. View at: Publisher Site | Google Scholar
2. Z. Dai and J. Kang, “Some new efficient mean-variance portfolio selection models,” International Journal of Finance and Economics, vol. 2021, pp. 1–13, 2021. View at: Google Scholar
3. Z. Dai and H. Zhu, “A modified Hestenes-Stiefel-type derivative-free method for large-scale nonlinear monotone equations,” Mathematics, vol. 8, no. 2, p. 168, 2020. View at: Publisher Site | Google Scholar
4. G. Yuan, T. Li, and W. Hu, “A conjugate gradient algorithm for large-scale nonlinear equations and image restoration problems,” Applied Numerical Mathematics, vol. 147, pp. 129–141, 2020. View at: Publisher Site | Google Scholar
5. G. Yuan, J. Lu, and Z. Wang, “The PRP conjugate gradient algorithm with a modified WWP line search and its application in the image restoration problems,” Applied Numerical Mathematics, vol. 152, pp. 19–11, 2020. View at: Publisher Site | Google Scholar
6. G. Yuan, X. Wang, and Z. Sheng, “The projection technique for two open problems of unconstrained optimization problems,” Journal of Optimization Theory and Applications, vol. 186, no. 2, pp. 590–619, 2020. View at: Publisher Site | Google Scholar
7. G. Yuan, X. Wang, and Z. Sheng, “Family weak conjugate gradient algorithms and their convergence analysis for nonconvex functions,” Numerical Algorithms, vol. 84, no. 3, pp. 935–956, 2020. View at: Publisher Site | Google Scholar
8. G. Yuan, Z. Wei, and Y. Yang, “The global convergence of the Polak-Ribière-Polyak conjugate gradient algorithm under inexact line search for nonconvex functions,” Journal of Computational and Applied Mathematics, vol. 362, pp. 262–275, 2019. View at: Publisher Site | Google Scholar
9. G. Yuan, Z. Wang, and P. Li, “A modified Broyden family algorithm with global convergence under a weak Wolfe-Powell line search for unconstrained nonconvex problems,” Calcolo, vol. 57, pp. 1–21, 2020. View at: Publisher Site | Google Scholar
10. W. C. Davidon, “Variable metric method for minimization,” SIAM Journal on Optimization, vol. 1, no. 1, pp. 1–17, 1991. View at: Publisher Site | Google Scholar
11. C. G. Broyden, “The convergence of a class of double-rank minimization algorithms,” IMA Journal of Applied Mathematics, vol. 6, no. 3, pp. 222–231, 1970. View at: Publisher Site | Google Scholar
12. R. Fletcher, “A new approach to variable metric algorithms,” The Computer Journal, vol. 13, no. 3, pp. 317–322, 1970. View at: Publisher Site | Google Scholar
13. D. Goldfarb, “A family of variable-metric methods derived by variational means,” Mathematics of Computation, vol. 24, no. 109, p. 23, 1970. View at: Publisher Site | Google Scholar
14. D. F. Shanno, “Conditioning of quasi-Newton methods for function minimization,” Mathematics of Computation, vol. 24, no. 111, p. 647, 1970. View at: Publisher Site | Google Scholar
15. J. E. Dennis and R. B. Schnabel, “Numerical methods for unconstrained optimization and nonlinear equations,” 1983. View at: Google Scholar
16. Y. Dai, “Convergence properties of the BFGS algorithm,” SIAM Journal on Optimization, vol. 13, no. 3, pp. 693–701, 2002. View at: Google Scholar
17. R. Fletcher, “Practical methods of optimization,” 2013. View at: Google Scholar
18. J. Han and G. Liu, “Global convergence analysis of a new nonmonotone BFGS algorithm on convex objective functions,” Computational Optimization and Applications, vol. 7, no. 3, pp. 277–289, 1997. View at: Publisher Site | Google Scholar
19. M. J. D. Powell, “Some global convergence properties of a variable metric algorithm for minimization without exact line searches,” 1976. View at: Google Scholar
20. C. G. Broyden, J. E. Dennis Jr., and J. J. Moré, “On the local and superlinear convergence of quasi-Newton methods,” IMA Journal of Applied Mathematics, vol. 12, no. 3, pp. 223–245, 1973. View at: Publisher Site | Google Scholar
21. R. H. Byrd, J. Nocedal, and Y.-X. Yuan, “Global convergence of a class of quasi-Newton methods on convex problems,” SIAM Journal on Numerical Analysis, vol. 24, no. 5, pp. 1171–1190, 1987. View at: Publisher Site | Google Scholar
22. L. C. W. Dixon, “Variable metric algorithms: necessary and sufficient conditions for identical behavior of nonquadratic functions,” Journal of Optimization Theory and Applications, vol. 10, no. 1, pp. 34–40, 1972. View at: Publisher Site | Google Scholar
23. J. E. Dennis and J. J. Moré, “Quasi-Newton methods, motivation and theory,” SIAM Review, vol. 19, pp. 46–89, 1977. View at: Google Scholar
24. A. Griewank, “The “global” convergence of Broyden-like methods with suitable line search,” The Journal of the Australian Mathematical Society. Series B. Applied Mathematics, vol. 28, no. 1, pp. 75–92, 1986. View at: Publisher Site | Google Scholar
25. A. Griewank and P. L. Toint, “Local convergence analysis for partitioned quasi-Newton updates,” Numerische Mathematik, vol. 39, no. 3, pp. 429–448, 1982. View at: Publisher Site | Google Scholar
26. P. L. Toint, “Global convergence of the partitioned BFGS algorithm for convex partially separable optimization,” Mathematical Programming, vol. 36, no. 3, pp. 290–306, 1986. View at: Publisher Site | Google Scholar
27. W. F. Mascarenhas, “The BFGS method with exact line searches fails for non-convex objective functions,” Mathematical Programming, vol. 99, no. 1, pp. 49–61, 2004. View at: Publisher Site | Google Scholar
28. D.-H. Li and M. Fukushima, “A modified BFGS method and its global convergence in nonconvex minimization,” Journal of Computational and Applied Mathematics, vol. 129, no. 1-2, pp. 15–35, 2001. View at: Publisher Site | Google Scholar
29. D.-H. Li and M. Fukushima, “On the global convergence of the BFGS method for nonconvex unconstrained optimization problems,” SIAM Journal on Optimization, vol. 11, no. 4, pp. 1054–1064, 2001. View at: Publisher Site | Google Scholar
30. M. J. D. Powell, “A new algorithm for unconstrained optimization,” Nonlinear Programming, vol. 23, pp. 31–65, 1970. View at: Publisher Site | Google Scholar
31. Z. Wei, L. Qi, and X. Chen, “An SQP-type method and its application in stochastic programs,” Journal of Optimization Theory and Applications, vol. 116, no. 1, pp. 205–228, 2003. View at: Publisher Site | Google Scholar
32. Z. Wei, G. Yu, G. Yuan, and Z. Lian, “The superlinear convergence of a modified BFGS-type method for unconstrained optimization,” Computational Optimization and Applications, vol. 29, no. 3, pp. 315–332, 2004. View at: Publisher Site | Google Scholar
33. G. Yuan and Z. Wei, “Convergence analysis of a modified BFGS method on convex minimizations,” Computational Optimization and Applications, vol. 47, no. 2, pp. 237–255, 2010. View at: Publisher Site | Google Scholar
34. G. Yuan, Z. Wei, and X. Lu, “Global convergence of BFGS and PRP methods under a modified weak Wolfe-Powell line search,” Applied Mathematical Modelling, vol. 47, pp. 811–825, 2017. View at: Publisher Site | Google Scholar
35. G. Yuan, Z. Sheng, B. Wang, W. Hu, and C. Li, “The global convergence of a modified BFGS method for nonconvex functions,” Journal of Computational and Applied Mathematics, vol. 327, pp. 274–294, 2018. View at: Publisher Site | Google Scholar
36. R. H. Byrd and J. Nocedal, “A tool for the analysis of quasi-Newton methods with application to unconstrained minimization,” SIAM Journal on Numerical Analysis, vol. 26, no. 3, pp. 727–739, 1989. View at: Publisher Site | Google Scholar
37. I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environment,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995. View at: Publisher Site | Google Scholar
38. J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software, vol. 7, no. 1, pp. 17–41, 1981. View at: Publisher Site | Google Scholar
39. Y. Yuan and W. Sun, “Theory and methods of optimization,” 1999. View at: Google Scholar
40. A. Ouyang, L.-B. Liu, Z. Sheng, and F. Wu, “A class of parameter estimation methods for nonlinear Muskingum model using hybrid invasive weed optimization algorithm,” Mathematical Problems in Engineering, vol. 2015, pp. 1–15, 2015. View at: Publisher Site | Google Scholar
41. A. Ouyang, Z. Tang, K. Li, A. Sallam, and E. Sha, “Estimating parameters of Muskingum model using an adaptive hybrid PSO algorithm,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 28, pp. 1–29, 2014. View at: Publisher Site | Google Scholar
42. Z. W. Geem, “Parameter estimation for the nonlinear Muskingum model using the BFGS technique,” Journal of Irrigation and Drainage Engineering, vol. 132, no. 5, pp. 474–478, 2006. View at: Publisher Site | Google Scholar

Copyright © 2021 Pengyuan Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.