Special Issue: Machine Learning and its Applications in Image Restoration
Global Convergence of a Modified Two-Parameter Scaled BFGS Method with Yuan-Wei-Lu Line Search for Unconstrained Optimization
The BFGS method is one of the most efficient quasi-Newton methods for solving small- and medium-size unconstrained optimization problems. To explore more of its properties, a modified two-parameter scaled BFGS method is presented in this paper. The aim of the modified scaled BFGS method is to improve the eigenvalue structure of the BFGS update. In this method, the first two terms and the last term of the standard BFGS update formula are scaled with two different positive parameters, and a new value of $y_k$ is given. The Yuan-Wei-Lu line search is also used. Under this line search, the modified two-parameter scaled BFGS method is globally convergent for nonconvex functions. Extensive numerical experiments show that this form of the scaled BFGS method outperforms the standard BFGS method and some similar scaled methods.
Consider

$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function bounded from below. Quasi-Newton methods are widely used in optimization software for solving unconstrained optimization problems [1–8]. The BFGS method, one of the most efficient quasi-Newton methods for solving (1), is an iterative method of the following form:

$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$

where $\alpha_k$, obtained by some line search rule, is a step size, and $d_k$ is the BFGS search direction computed by the following equation:

$$B_k d_k = -g_k, \tag{3}$$

where $g_k$ is the gradient of $f$ at $x_k$, and the matrix $B_k$ is the BFGS approximation to the Hessian $\nabla^2 f(x_k)$, which has the following update formula:

$$B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}, \tag{4}$$

where $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$. The problems related to the BFGS method have been analyzed and studied by many scholars, and satisfactory conclusions have been drawn [9–16]. Earlier, Powell first proved the global convergence of the standard BFGS method with inexact Wolfe line search for convex functions. Under the exact line search or some specific inexact line searches, the BFGS method is convergent for convex minimization problems [18–21]. By contrast, for nonconvex problems, Mascarenhas presented an example showing that the BFGS method and some Broyden-type methods may fail to converge under the exact line search. Likewise, Dai proved that the BFGS method may fail to converge with the Wolfe line searches. To establish the global convergence of the BFGS method for general functions and to obtain a better Hessian approximation matrix of the objective function, Yuan and Wei presented a modified quasi-Newton equation as follows: where
In practice, the standard BFGS method has many qualities worth exploring and can effectively solve a class of unconstrained optimization problems.
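As a rough illustration of the iteration described above (step along a direction solving $B_k d_k = -g_k$, then update $B_k$ from $s_k$ and $y_k$), the following is a minimal sketch in plain Python. It is not the algorithm studied in this paper: the test function is a hypothetical two-dimensional convex quadratic, the step size comes from a simple backtracking (Armijo) rule swapped in for the Wolfe-type searches discussed here, and all function names and constants are assumptions made for the example.

```python
# Minimal sketch of the standard BFGS iteration on a toy 2-D convex quadratic.
# Assumptions (not from the paper): the matrix A, the vector b, the backtracking
# constant 1e-4, and the curvature safeguard 1e-12 are illustrative choices.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def solve_2x2(M, rhs):
    # Cramer's rule; enough for this 2-D illustration.
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - M[1][0] * rhs[0]) / det]

def bfgs_update(B, s, y):
    # B_{k+1} = B_k - (B_k s s^T B_k) / (s^T B_k s) + (y y^T) / (y^T s)
    Bs = mat_vec(B, s)
    sBs = dot(s, Bs)
    ys = dot(y, s)
    n = len(s)
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys
             for j in range(n)] for i in range(n)]

A = [[3.0, 1.0], [1.0, 2.0]]   # symmetric positive definite (hypothetical)
b = [1.0, -1.0]

def f(x):
    return 0.5 * dot(x, mat_vec(A, x)) - dot(b, x)

def grad(x):
    return [gi - bi for gi, bi in zip(mat_vec(A, x), b)]

def bfgs(x, tol=1e-8, max_iter=100):
    B = [[1.0, 0.0], [0.0, 1.0]]            # initial Hessian approximation
    g = grad(x)
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        d = solve_2x2(B, [-gi for gi in g])  # solve B_k d_k = -g_k
        alpha, slope = 1.0, dot(g, d)
        for _ in range(60):                  # backtracking (Armijo) step size
            trial = [xi + alpha * di for xi, di in zip(x, d)]
            if f(trial) <= f(x) + 1e-4 * alpha * slope:
                break
            alpha *= 0.5
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        s = [p - q for p, q in zip(x_new, x)]
        y = [p - q for p, q in zip(g_new, g)]
        if dot(y, s) > 1e-12:                # keep B positive definite
            B = bfgs_update(B, s, y)
        x, g = x_new, g_new
    return x

x_star = bfgs([0.0, 0.0])
```

For this quadratic the minimizer solves $Ax = b$, so `x_star` should be close to $(0.6, -0.8)$. The curvature check before the update is a common safeguard, added here because backtracking alone does not guarantee $y_k^T s_k > 0$ for general functions.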
Here, two useful properties of the BFGS method are introduced. One is the self-correcting property: if the current inverse Hessian approximation estimates the curvature of the function incorrectly, then the Hessian approximation matrix will correct itself within a few steps. The other property is that small eigenvalues are better corrected than large ones. Hence, the efficiency of the BFGS algorithm depends strongly on the eigenvalue structure of the Hessian approximation matrix. To improve the performance of the BFGS method, Oren and Luenberger scaled the Hessian approximation matrix $B_k$, that is, they replaced $B_k$ by a multiple of itself, where the multiplier is a self-scaling factor. Nocedal and Yuan further studied the self-scaling BFGS method for a particular choice of this factor. Based on that value, Al-Baali introduced a simple modification of the factor. The numerical experiments showed that the modified self-scaling BFGS method outperforms the unscaled BFGS method. Several other scaled BFGS methods with improved properties are listed below.
Formula 1. The general one-parameter scaled BFGS updating formula is where is a positive parameter, and several selections of the scaled factor are available, listed as follows. Choice A: where the value of is given by Yuan, and with inexact line search, the global convergence of the scaled BFGS method with given by (9) is established for convex functions by Powell. Furthermore, for general nonlinear functions, Yuan limited the value range of to [0.01, 100] to ensure the positivity of under the inexact line search and proved the global convergence of the scaled BFGS method in this form. Choice B: which is obtained as a solution of the problem: . The scaled BFGS method based on this value of was introduced by Barzilai and Borwein and is called the spectral scaled BFGS method. Cheng and Li proved that the spectral scaled BFGS method is globally convergent under the Wolfe line search, assuming that the objective function is convex. Choice C: where for . Under the Wolfe line search (20) and (21), holds for , which implies that computed by (11) is bounded away from zero, that is to say, . Therefore, in this instance, the large eigenvalues of given by (8) are shifted to the left.
Formula 2. Proposed by Oren and Luenberger, this scaled BFGS method scales the first two terms of the BFGS update with a single parameter and is defined as where is a positive parameter and is calculated as follows:
The parameter assigned by (13) makes the eigenvalue structure of the inverse Hessian approximation easier to analyze. Consequently, it is regarded as one of the best factors.
Formula 3. In this method, the scaling parameters are selected to cluster the eigenvalues of the iteration matrix and shift the large eigenvalues to the left. The update formula of the Hessian approximation matrix is computed as

$$B_{k+1} = \delta_k \left[ B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} \right] + \gamma_k \frac{y_k y_k^T}{y_k^T s_k}, \tag{14}$$

where both $\delta_k$ and $\gamma_k$ are positive parameters, and Andrei preset them as the following values:
If the scaling parameters are bounded and the line search is inexact, then this scaled BFGS algorithm is globally convergent for general functions. A large number of numerical experiments show that the two-parameter scaled BFGS method with and given by (15) and (16) is more competitive than the standard BFGS method. In this paper, combining (7) and (14), we propose a new update formula of listed as follows: where is determined by formula (6),
Some interesting properties of BFGS-type methods are closely tied to the weak Wolfe–Powell (WWP) line search:

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{20}$$

$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{21}$$

where $\delta$ and $\sigma$ are constants with $0 < \delta < \sigma < 1$. There are many research studies based on this line search [35–43]. To further develop the inexact line search, Yuan et al. presented a new line search, called the Yuan-Wei-Lu (YWL) line search, which has the following form: where , , and . The main work of this paper is to establish the global convergence of the modified scaled BFGS update (17) with and given by (18) and (19), respectively, under this line search. Numerical results show that such a combination is appropriate for nonconvex functions.
Our paper is organized as follows. The motivation and algorithm are introduced in the next section. In Section 3, the convergence analysis of the modified two-parameter scaled BFGS method under the Yuan-Wei-Lu line search is established. Section 4 is devoted to the results of numerical experiments. Some conclusions are stated in the last section.
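To make the two weak Wolfe–Powell conditions concrete, the sketch below checks them for a given trial step. It covers only the standard WWP test (sufficient decrease plus the curvature condition), not the Yuan-Wei-Lu conditions; the function name `satisfies_wwp`, the toy objective, and the parameter values `delta = 0.1` and `sigma = 0.9` are assumptions chosen for illustration, not values taken from this paper.

```python
# Sketch: test whether a trial step size alpha satisfies the weak Wolfe-Powell
# conditions along a descent direction d. Names and constants are illustrative.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def satisfies_wwp(f, grad, x, d, alpha, delta, sigma):
    slope0 = dot(grad(x), d)                       # g_k^T d_k (negative for descent)
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    sufficient_decrease = f(x_new) <= f(x) + delta * alpha * slope0
    curvature = dot(grad(x_new), d) >= sigma * slope0
    return sufficient_decrease and curvature

# Toy objective f(x) = 0.5 * ||x||^2 with gradient x (hypothetical example).
def f(x):
    return 0.5 * dot(x, x)

def grad(x):
    return list(x)

x0, d0 = [1.0, 0.0], [-1.0, 0.0]
ok_full_step = satisfies_wwp(f, grad, x0, d0, 1.0, 0.1, 0.9)    # lands on the minimizer
ok_tiny_step = satisfies_wwp(f, grad, x0, d0, 0.01, 0.1, 0.9)   # too short: curvature fails
ok_long_step = satisfies_wwp(f, grad, x0, d0, 2.0, 0.1, 0.9)    # too long: no decrease
```

The three calls show the role of each condition: the curvature condition rejects steps that are too short, and the sufficient-decrease condition rejects steps that overshoot.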
2. Motivation and Algorithm
Two crucial tools for analyzing the properties of the BFGS method are the trace and the determinant of the matrix $B_{k+1}$ given by (4). The corresponding relations are as follows:
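Applying the following relation from the study of Sun and Yuan,

$$\det\left(I + u_1 u_2^T + u_3 u_4^T\right) = \left(1 + u_1^T u_2\right)\left(1 + u_3^T u_4\right) - \left(u_1^T u_4\right)\left(u_2^T u_3\right), \tag{26}$$

which holds for any vectors $u_1$, $u_2$, $u_3$, and $u_4$ in $\mathbb{R}^n$, we obtain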
The efficiency of the BFGS method depends on the eigenvalue structure of the Hessian approximation matrix, and the BFGS method is more affected by large eigenvalues than by small eigenvalues [25, 45, 46]. The second term on the right-hand side of formula (25) is negative. Therefore, it shifts the eigenvalues of $B_{k+1}$ to the left, which is how the BFGS method corrects large eigenvalues. Moreover, the third term on the right-hand side of (25) is positive and shifts the eigenvalues of $B_{k+1}$ to the right. If this term is large, $B_{k+1}$ may have large eigenvalues too. Therefore, the eigenvalues of $B_{k+1}$ can be corrected by scaling the corresponding terms in (25), which is the main motivation for using the scaled BFGS method. In this paper, we scale the first two terms and the last term of the standard BFGS update formula with two different positive parameters and propose a new value of $y_k$. In the subsequent proofs, we establish some lemmas based on these two tools to analyze the convergence of the modified scaled BFGS method. An algorithm framework for solving problem (1) is then given as Algorithm 1.
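Separately from Algorithm 1, the role of the trace and the determinant can be checked numerically. The sketch below implements a generic two-parameter scaled update in which one positive parameter multiplies the first two terms of the BFGS update and another multiplies the last term. With both parameters equal to 1 it reduces to the standard update, for which the well-known relations $\mathrm{tr}(B_{k+1}) = \mathrm{tr}(B_k) - \|B_k s_k\|^2 / (s_k^T B_k s_k) + \|y_k\|^2 / (y_k^T s_k)$ and $\det(B_{k+1}) = \det(B_k)\, y_k^T s_k / (s_k^T B_k s_k)$ hold. The parameter names `delta` and `gamma`, the sample matrix, and the sample vectors are assumptions for illustration; the specific parameter choices and the modified $y_k$ proposed in this paper are not reproduced here.

```python
# Sketch: generic two-parameter scaled BFGS update and a numerical check of the
# trace and determinant relations for the unscaled case. All data are made up.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(M, v):
    return [dot(row, v) for row in M]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def det_2x2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def scaled_bfgs_update(B, s, y, delta=1.0, gamma=1.0):
    # delta scales B - (B s s^T B)/(s^T B s); gamma scales (y y^T)/(y^T s).
    Bs = mat_vec(B, s)
    sBs = dot(s, Bs)
    ys = dot(y, s)
    n = len(s)
    return [[delta * (B[i][j] - Bs[i] * Bs[j] / sBs) + gamma * y[i] * y[j] / ys
             for j in range(n)] for i in range(n)]

B = [[2.0, 0.5], [0.5, 1.0]]   # symmetric positive definite
s = [1.0, -0.5]
y = [0.8, 0.3]                 # satisfies y^T s = 0.65 > 0

# Standard BFGS (delta = gamma = 1): compare both sides of each relation.
B_std = scaled_bfgs_update(B, s, y)
Bs = mat_vec(B, s)
trace_rhs = trace(B) - dot(Bs, Bs) / dot(s, Bs) + dot(y, y) / dot(y, s)
det_rhs = det_2x2(B) * dot(y, s) / dot(s, Bs)
trace_gap = abs(trace(B_std) - trace_rhs)
det_gap = abs(det_2x2(B_std) - det_rhs)

# Scaled update: the first bracket annihilates s, so B_new s = gamma * y.
delta, gamma = 0.7, 1.3
B_new = scaled_bfgs_update(B, s, y, delta, gamma)
secant_gap = max(abs(p - gamma * q) for p, q in zip(mat_vec(B_new, s), y))
```

The negative term in the trace relation pulls the eigenvalues down and the positive term pushes them up, which is why scaling the two groups of terms separately gives control over the eigenvalue distribution.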
3. Convergence Analysis
Assumption 1. (i) The level set is bounded. (ii) The function is twice continuously differentiable and bounded from below.
Lemma 2. Let be generated by (16) for , then and tends to 1.
Proof. Observe the formula (19); after substituting , we can find that is close to 1. Owing to the symmetry, positive definiteness, and nonsingularity of , its eigenvalues are real and positive, and . Hence, for , and . Since , , and for sufficiently large , , and are roughly of the same order of magnitude, which shows that . To sum up, the relations and are valid, namely for , and tends to 1. The proof is completed.
Remark 1. Based on the conclusion of Lemma 2, we can infer that for any integer , there exist two positive constants satisfying .
Proof. Considering (25), we have In addition, Therefore, by Remark 1 and the above inequality, the formula (33) is transformed into which implies (31). From the positive definiteness of , (32) also holds. The proof is completed.
Lemma 4. Consider and for all , where and are constants. Then, there exists a positive constant such that for all sufficiently large.
Proof. Utilizing the identity (26) and taking the determinant on both sides of the formula (14) with and computed as in (18) and (16), we have where the penultimate inequality follows from , , , and for all . Furthermore, by and Lemma 4, we obtain Therefore, Suppose is sufficiently large; then (39) implies (36). The proof is completed.
Theorem 1. If the sequence is obtained by Algorithm 1, then
Proof. We prove (40) by contradiction. Suppose that . By the Yuan-Wei-Lu line search (22) and the boundedness of the function from below, we obtain Adding the abovementioned inequalities from to and utilizing Assumption 1 (ii), we have From Assumption 1 (ii) and (42), we have Based on this, given a constant , there is a positive integer satisfying where is any positive integer, and the first inequality follows from the geometric inequality. Moreover, by Lemma 4, we obtain Considering , the above formula contradicts formula (39). Thus, (40) is valid. The proof is completed.
4. Numerical Results
In this section, numerical results of Algorithm 1 are reported, and the following methods were compared: (i) MTPSBFGS method ( is updated by (17) with and given by (18) and (19)). (ii) SBFGS method ( is updated by (14) with and given by (11) and (16)).
4.1. General Unconstrained Optimisation Problems
Tested problems: a total of 74 test problems, listed in Table 1 and derived from the studies by Bongartz et al. and Moré et al. [47, 48].
Parameters: Algorithm 1 runs with , , , , and .
Dimensionality: the algorithm is tested in the following three dimensions: 300, 900, and 2700.
Himmelblau stop rule: if , then set or . The iterations are stopped if or holds, where and .
Experiment environment: all programs are written in MATLAB R2014a and run on a PC with an Intel(R) Core(TM) i5-4210U CPU at 1.70 GHz, 8.00 GB of RAM, and the Windows 10 operating system.
Symbol representation: No.: the test problem number. CPU time: the CPU time in seconds. NI: the number of iterations. NFG: the total number of function and gradient evaluations.
Image description: Figures 1–3 show the profiles for CPU time, NI, and NFG, and Tables 2–6 provide the detailed numerical results. These figures and tables show that the MTPSBFGS method has the better numerical performance of the two methods, that is, the proposed modified scaled BFGS method is reasonable and feasible. The reasons for the good performance are as follows. The parameter scaling the first two terms of the standard BFGS update is chosen to cluster the eigenvalues of this matrix, and the parameter scaling the third term is chosen to reduce its large eigenvalues, thus giving a better eigenvalue distribution.
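The Himmelblau stop rule mentioned in the experimental settings can be sketched as follows. The formulas and threshold values are not legible in the text above, so this sketch assumes the commonly used form of the rule: a relative change in the function value when the current value is not too small in magnitude, an absolute change otherwise, combined with a gradient-norm test. The function name and the arguments `e1`, `e2`, and `eps` are hypothetical, and no threshold values from the paper are implied.

```python
# Sketch of a Himmelblau-type stopping test (assumed common form, see lead-in).
# f_prev and f_curr are consecutive objective values; grad_norm is ||g(x)||.

def himmelblau_stop(f_prev, f_curr, grad_norm, e1, e2, eps):
    if abs(f_prev) > e1:
        stop1 = abs(f_prev - f_curr) / abs(f_prev)   # relative change
    else:
        stop1 = abs(f_prev - f_curr)                 # absolute change
    return grad_norm < eps or stop1 < e2

# Illustrative thresholds only (not taken from the paper).
stalled = himmelblau_stop(100.0, 99.9999999, 1.0, 1e-5, 1e-5, 1e-6)
progressing = himmelblau_stop(100.0, 50.0, 1.0, 1e-5, 1e-5, 1e-6)
flat_gradient = himmelblau_stop(100.0, 50.0, 1e-9, 1e-5, 1e-5, 1e-6)
```

Switching between relative and absolute change avoids dividing by a function value that is close to zero.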
4.2. Muskingum Model in Engineering Problems
In this subsection, we present the Muskingum model, and it has the following form:
Muskingum model: whose symbolic representation is as follows: is the storage time constant, is the weight coefficient, is an extra parameter, is the observed inflow discharge, is the observed outflow discharge, is the total time, and is the time step at time .
The observed data of the experiment are obtained from the process of flood runoff from Chenggouwan and Linqing of Nanyunhe in the Haihe Basin, Tianjin, China. Select the initial point and the time step . The concrete values of and for the years 1960, 1961, and 1964 are listed in . The test results are presented in Table 7.
Figures 4–6 and Table 7 support the following three conclusions: (i) on the Muskingum model, the MTPSBFGS method is efficient, and all three algorithms perform well numerically. (ii) Compared to other similar methods, the final points (, , and ) of the MTPSBFGS method are competitive. (iii) Because the final points of these three methods differ, the Muskingum model may have several approximate optimal points.
A modified two-parameter scaled BFGS method and the Yuan-Wei-Lu line search technique are introduced in this paper. By scaling the first two terms and the third term of the standard BFGS update with different positive parameters, a new two-parameter scaled BFGS method is proposed. In this method, a new value of $y_k$ is given to guarantee better properties of the new scaled BFGS method. With the Yuan-Wei-Lu line search, the proposed BFGS method is globally convergent. Numerical results indicate that the modified two-parameter scaled BFGS method outperforms the standard BFGS method and even other BFGS methods of the same type. As for longer-term work, there are several points to consider: (1) Are there new values of , , and that make the BFGS method based on the update formula (17) perform better? (2) Does the new scaled method combined with other line searches also have good theoretical properties? (3) Some new engineering problems based on BFGS-type methods are worth studying.
The data used to support this study are included within this article.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
This work was supported by the National Natural Science Foundation of China (Grant no. 11661009), the High Level Innovation Teams and Excellent Scholars Program in Guangxi Institutions of Higher Education (Grant no. (2019)52), the Guangxi Natural Science Key Fund (Grant no. 2017GXNSFDA198046), and the Guangxi Natural Science Foundation (Grant no. 2020GXNSFAA159069).
L. Zhang and H. Tang, “A hybrid MBFGS and CBFGS method for nonconvex minimization with a global complexity bound,” Pacific Journal of Optimization, vol. 14, no. 4, pp. 693–702, 2018.
M. J. D. Powell, “Some global convergence properties of a variable metric algorithm for minimization without exact line searches,” SIAM-AMS Proceedings, vol. 9, pp. 53–72, 1976.
M. Al-Baali, “Analysis of a family of self-scaling quasi-Newton methods,” Tech. Rep., Department of Mathematics and Computer Science, United Arab Emirates University, Al Ain, UAE, 1993.
W. Sun and Y. Yuan, Optimization Theory and Methods, Springer US, New York, NY, USA, 2006.
Y. Yuan and W. Sun, Theory and Methods of Optimization, Science Press of China, Beijing, China, 1999.