New Investigation for the Liu-Story Scaled Conjugate Gradient Method for Nonlinear Optimization

Hamed, Eman T.; Al-Kawaz, Rana Z.; Al-Bayati, Abbas Y.

doi:https://doi.org/10.1155/2020/3615208

Journal of Mathematics

Research Article | Open Access

Volume 2020 | Article ID 3615208 | https://doi.org/10.1155/2020/3615208

Eman T. Hamed,¹ Rana Z. Al-Kawaz,² and Abbas Y. Al-Bayati³

Academic Editor: Hang Xu

Received 30 Sept 2019

Revised 18 Nov 2019

Accepted 13 Dec 2019

Published 25 Jan 2020

Abstract

This article considers modified formulas for the standard conjugate gradient (CG) technique, based on the modified BFGS method proposed by Li and Fukushima. A new scalar parameter for this CG technique for unconstrained optimization is proposed. The descent condition and the global convergence property are established under the strong Wolfe conditions. Our numerical experiments show that the proposed algorithms are more stable and economical than some well-known standard CG methods.

1. Introduction

Conjugate gradient (CG) methods form a class of nonlinear optimization algorithms that require little memory and have strong local and global convergence properties [1,2]. Typically, a CG method is designed to solve the large-scale nonlinear optimization problem

$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$

where $f:\mathbb{R}^n \to \mathbb{R}$ is a smooth nonlinear function. The iterative formula has the form

$$x_{k+1} = x_k + \alpha_k d_k. \tag{2}$$

The most important component of this formula is the step size $\alpha_k$, and the search direction $d_k$ is given by

$$d_{k+1} = -g_{k+1} + \beta_k d_k, \tag{3}$$

where $g_k$ denotes $\nabla f(x_k)$ and $\beta_k$ denotes a scalar parameter. The step size is usually chosen to satisfy certain line search conditions [3]. Among these line search conditions, the strong Wolfe conditions are usually stated as follows:

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{4}$$

$$\left| g(x_k + \alpha_k d_k)^T d_k \right| \le -\sigma g_k^T d_k, \tag{5}$$

with $0 < \delta < \sigma < 1$. Many formulas for the conjugate coefficient exist, e.g., Hestenes and Stiefel (HS) [4]; Fletcher and Reeves (FR) [5]; Polak and Ribière (PR) [6]; Conjugate Descent (CD) [7]; Li and Fukushima (LF) [1]; and Liu and Storey (LS) [8]. They correspond to different choices of the scalar parameter $\beta_k$.
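As a small illustration of the strong Wolfe test and of the classical conjugate coefficients named above, the following sketch may help. It uses the textbook forms of the HS, FR, PR, CD, and LS coefficients; the function names, the parameter names `delta` and `sigma`, their default values, and the test function are assumptions made for this example only and are not taken from the article.

```python
# Sketch only: textbook formulas; all names and default values are assumptions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def strong_wolfe_ok(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Return True if alpha satisfies the strong Wolfe conditions along d."""
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    slope0 = dot(grad(x), d)  # g_k^T d_k, negative for a descent direction
    decrease = f(x_new) <= f(x) + delta * alpha * slope0
    curvature = abs(dot(grad(x_new), d)) <= -sigma * slope0
    return decrease and curvature

def classical_betas(g_old, g_new, d_old):
    """Textbook conjugate coefficients built from g_k, g_{k+1}, and d_k."""
    y = [a - b for a, b in zip(g_new, g_old)]  # y_k = g_{k+1} - g_k
    return {
        "HS": dot(g_new, y) / dot(d_old, y),
        "FR": dot(g_new, g_new) / dot(g_old, g_old),
        "PR": dot(g_new, y) / dot(g_old, g_old),
        "CD": -dot(g_new, g_new) / dot(d_old, g_old),
        "LS": -dot(g_new, y) / dot(d_old, g_old),
    }

# Hypothetical test function f(x) = x1^2 + 2*x2^2, started at (1, 1).
def f(x):
    return x[0] ** 2 + 2.0 * x[1] ** 2

def grad(x):
    return [2.0 * x[0], 4.0 * x[1]]

x0 = [1.0, 1.0]
g0 = grad(x0)
d0 = [-gi for gi in g0]          # steepest descent as the first direction
alpha = 0.3                      # trial step that passes both conditions here
x1 = [xi + alpha * di for xi, di in zip(x0, d0)]
betas = classical_betas(g0, grad(x1), d0)
```

Because the first direction is the steepest descent direction, the CD value coincides with FR and the LS value coincides with PR in this example; the formulas differ once the direction is no longer the negative gradient.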

2. A New Scalar Formula for the Parameter $\beta_k$

In this section, we propose a new version of the parameter $\beta_k$, relying on the modified BFGS method proposed by Li and Fukushima [1]. In the BFGS method, the matrix $B_k$ is updated by the following formula [9]:

$$B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{s_k^T y_k}, \tag{6}$$

where $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$. In addition, the standard secant relation is

$$B_{k+1} s_k = y_k. \tag{7}$$
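The relationship between the BFGS update and the secant relation can be checked numerically. The sketch below applies the standard BFGS formula to a small symmetric matrix and confirms that the updated matrix maps the step onto the gradient difference. The helper names and the sample vectors are hypothetical, and this is the standard update, not the modified Li and Fukushima variant.

```python
# Sketch only: standard BFGS update on plain Python lists; data are made up.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(B, v):
    return [dot(row, v) for row in B]

def bfgs_update(B, s, y):
    """Standard BFGS update of a symmetric Hessian approximation B."""
    Bs = matvec(B, s)            # B s; symmetry gives B s s^T B = (Bs)(Bs)^T
    sBs = dot(s, Bs)
    sy = dot(s, y)               # must be positive to keep B positive definite
    n = len(s)
    return [
        [B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / sy for j in range(n)]
        for i in range(n)
    ]

B = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 2.0]                   # step x_{k+1} - x_k
y = [3.0, 1.0]                   # gradient difference, here s^T y = 5 > 0
B_new = bfgs_update(B, s, y)

# Secant relation: B_new s should reproduce y up to rounding.
residual = [a - b for a, b in zip(matvec(B_new, s), y)]
```

The residual vanishes for any pair with a nonzero curvature term, but the updated matrix is only guaranteed to stay positive definite when the curvature term is positive, which is the situation the modified method is designed to handle.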

Li and Fukushima presented a modified BFGS technique that is globally and superlinearly convergent, even without requiring a convex objective function. The modified secant equation is stated as follows: where and > 0; > 0, is a parameter defined as

Specifically, the value c is taken to be a constant greater than zero.

There are three different cases for the term :

Case 1: if , then we have the problem of a matrix that is not positive definite, so Li and Fukushima proposed the formula in (9) and developed the corresponding BFGS formula as follows: Moreover, in the form of in (10), the max is used so that the value zero is not selected in this case. Through this formula, the researchers proved that the modified symmetric matrix is positive definite [10].

Case 2: if , then the BFGS update matrix is certainly symmetric and positive definite when applied within this formula (in other words, when applying the inequality in the formula , max = 0) [11].

If we use Liu and Story (LS), we use any scalar ; then, (3) becomeswhere

When any positive value to is greater than one, the new parallel search direction provided in equation (12) is the Newton direction. Hence, Newton’s direction is

Hence,

Using equation (8), the new scalar becomes

By substituting equations (9) and (10) and by taking (because we use the strong Wolfe line search in equation (10), yields . Therefore, the new scalar within the new search direction is

Hence, we conclude from equation (17) that the new parameter relies on an updated value of y, and that different forms are obtained when the value of c is changed, as the numerical results section shows.

3. New Theorem (Sufficient Descent Direction)

If we presume that the line search satisfies conditions (4) and (5), then the new search direction generated from equations (12) and (17) is a sufficient descent direction.

Proof. From equations (12) and (17), we obtain By using the Powell restart equation (i.e., ), If > 0, the next inequality is true: Using the strong Wolfe line search condition (5a) yields This latter equation implies that Thus, the proof is complete.

3.1. Outlines of the New CG-Algorithms

Step 1: select the initial point , and select some positive values for and . Then, set and set .
Step 2: test the stopping criterion. If it is satisfied, then stop; otherwise, continue.
Step 3: determine the step size by the Wolfe conditions, which are defined in equations (4) and (5).
Step 4: compute the next iterate from equation (2).
Step 5: calculate the scalar parameter from equation (17).
Step 6: calculate the new search direction, namely,
Step 7: test the Powell restart criterion, namely, if , then restart the new search direction with .
Step 8: set k = k + 1, and go to Step 2.
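The outline above can be sketched in code under several loudly stated assumptions. The scalar from equation (17) is not reproduced here, so the sketch substitutes the classical Liu and Storey coefficient. The Wolfe line search is replaced by the exact step for a convex quadratic, which satisfies the strong Wolfe conditions for a sufficient decrease parameter of at most one half. The Powell test uses the customary threshold 0.2 and restarts with the negative gradient. All names, the stopping tolerance, and the test problem are hypothetical.

```python
# Sketch only: LS coefficient stands in for equation (17); exact line search on
# a quadratic stands in for the Wolfe search. Names and data are assumptions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def cg_ls_quadratic(A, b, x0, tol=1e-8, max_iter=100):
    """CG with the Liu-Storey coefficient on f(x) = 0.5*x^T A x - b^T x."""
    x = list(x0)
    g = [ai - bi for ai, bi in zip(matvec(A, x), b)]    # gradient A x - b
    d = [-gi for gi in g]                                # Step 1: first direction
    k = 0
    while dot(g, g) ** 0.5 > tol and k < max_iter:       # Step 2: stopping test
        Ad = matvec(A, d)
        alpha = -dot(g, d) / dot(d, Ad)                  # Step 3: exact step
        x = [xi + alpha * di for xi, di in zip(x, d)]    # Step 4: new iterate
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        y = [a - c for a, c in zip(g_new, g)]
        beta = -dot(g_new, y) / dot(d, g)                # Step 5: LS coefficient
        d_new = [-gn + beta * di for gn, di in zip(g_new, d)]  # Step 6
        # Step 7: Powell restart when successive gradients are far from orthogonal.
        if abs(dot(g_new, g)) >= 0.2 * dot(g_new, g_new):
            d_new = [-gn for gn in g_new]
        g, d = g_new, d_new
        k += 1                                           # Step 8
    return x, k

# Hypothetical 2x2 symmetric positive definite test problem.
x_star, iters = cg_ls_quadratic([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [2.0, 1.0])
```

On a quadratic with exact steps, the LS coefficient reduces to the linear CG coefficient, so the two-variable example should finish in about two iterations. For a general nonlinear function, the exact step would be replaced by a line search that enforces conditions (4) and (5).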

4. Convergence Analysis for the New Proposed Algorithm

In the following parts, we discuss the convergence analysis of the new algorithm thoroughly. First, we state an assumption for the convergence analysis of the new algorithms. Then, we state a well-known lemma needed in the study of the convergence analysis. Finally, we set out new theorems, along with their proofs, which are related to the convergence analysis of the new algorithm.

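Assumption.
(i) The level set $S$ is bounded, that is, there exists a constant z > 0 such that $\|x\| \le z$ for all $x \in S$ [12].
(ii) In a neighbourhood N of S, f is continuously differentiable, and its gradient is Lipschitz continuous, that is, there exists a constant L > 0 such that $\|g(x) - g(y)\| \le L \|x - y\|$ for all $x, y \in N$.

From assumptions (i) and (ii) on f, we can deduce that the gradient is bounded on S, that is, there exists a constant $\gamma > 0$ such that $\|g(x)\| \le \gamma$ for all $x \in S$.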

Lemma. If we suppose that [3,13]
(1) Assumption holds.
(2) The search direction in the standard CG method is a descent direction.
(3) The optimal step is calculated by equations (4) and (5).
(4) The convergence condition is satisfied, i.e., if Then,

5. New Theorem (Uniformly Convex Function)

If we suppose that
(1) Assumption holds.
(2) The new search direction defined by equations (12) and (17) is a descent direction.
(3) The optimal step is calculated by equations (4) and (5).
(4) The objective function f is uniformly convex; then,

Proof. The new direction in equation (12) and the parameter of equation (17) satisfy the following absolute value condition: Well parameter Moreover, by combining the results, we obtain This gives the required proof. The next theorem uses hypotheses similar to the previous ones, but with some variations in the formulas.

6. New Theorem (General Function)

If we suppose that
(1) Assumption holds.
(2) The new search direction defined by equations (12) and (17) is a descent direction.
(3) The optimal step is calculated by equations (4) and (5).
(4) The objective function f is a general function; then,

Proof. We use the same style of proof as in the previous theorem, the difference being that the objective functions are general functions, Then, we obtain Therefore, the proof of the new theorems regarding the convergence analysis of the proposed algorithms is complete.

7. Numerical Experiments

In this section, we report numerical experiments performed on a set of 60 unconstrained optimization test problems to analyse the efficiency of . Details of these test problems, with their given initial points, can be found in [14,15]. Each of these 60 test functions was run with the dimension n increased in steps of 1000 up to a maximum of n = 10000. The termination criterion used in our experiments is .

In our comparisons below, we employ the following algorithms:
(i) LS: with the Wolfe line search
(ii) CD: with the Wolfe line search
(iii) HS: with the Wolfe line search
(iv) PR: with the Wolfe line search
(v) New Algorithm, using equation (17) and the scalar c = 0.1
(vi) New Algorithm, using equation (17) and the scalar c = 0.001

In Tables 1 and 2, we numerically compare the proposed new CG algorithms against other well-known CG algorithms to verify their performance, using the following standard comparison measures:
NOI = the total number of iterations
NOFG = the total number of function and gradient evaluations
TIME = the total CPU time required to execute the CG algorithm and reach the minimum value of the function being minimized
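One way to collect the three measures NOI, NOFG, and TIME is to wrap the objective in a counter and time the solver call. The sketch below is only an illustration: the wrapper class, the placeholder gradient descent solver, and the choice to count each function call and each gradient call separately are assumptions, not the instrumentation used in the article.

```python
# Sketch only: hypothetical instrumentation with a placeholder solver.
import time

class CountingObjective:
    """Wraps f and its gradient and counts every evaluation (NOFG)."""

    def __init__(self, f, grad):
        self._f, self._grad = f, grad
        self.nofg = 0

    def f(self, x):
        self.nofg += 1
        return self._f(x)

    def grad(self, x):
        self.nofg += 1
        return self._grad(x)

def gradient_descent(obj, x, step=0.1, tol=1e-6, max_iter=1000):
    """Placeholder solver; returns the final point and the iteration count (NOI)."""
    noi = 0
    g = obj.grad(x)
    while sum(gi * gi for gi in g) ** 0.5 > tol and noi < max_iter:
        x = [xi - step * gi for xi, gi in zip(x, g)]
        g = obj.grad(x)
        noi += 1
    return x, noi

obj = CountingObjective(lambda x: sum(xi * xi for xi in x),
                        lambda x: [2.0 * xi for xi in x])
start = time.process_time()                 # CPU time, not wall-clock time
x_final, noi = gradient_descent(obj, [1.0, -2.0])
cpu_time = time.process_time() - start
```

Keeping the counters outside the solver makes it possible to compare different CG variants on the same footing, since each variant is charged for exactly the evaluations it requests.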

Therefore, among these CG algorithms, the new algorithm appears to generate the best search direction. In Table 3, there is clear evidence that the new algorithm outperforms the standard LS and CD algorithms, as detailed below (when c = 0.1):
(a) Taking the LS algorithm as 100%, the new algorithm improves NOI by 71.5%, NOFG by 94.92%, and time by 64.2%
(b) Taking the CD algorithm as 100%, the new algorithm improves NOI by 72.6%, NOFG by 89.95%, and time by 66.7%

And (when c = 0.001):
(c) Taking the LS algorithm as 100%, the new algorithm improves NOI by 64.34%, NOFG by 16.99%, and time by 16.08%
(d) Taking the CD algorithm as 100%, the new algorithm improves NOI by 52.60%, NOFG by 14.63%, and time by 12.75%

In Table 4, there is clear evidence that the new algorithm outperforms the standard HS and PR algorithms, as detailed below (when c = 0.1):
(e) Taking the HS algorithm as 100%, the new algorithm improves NOI by 77.6%, NOFG by 98.9%, and time by 57%
(f) Taking the PR algorithm as 100%, the new algorithm improves NOI by 86.2%, NOFG by 98.3%, and time by 72%

And (when c = 0.001):
(g) Taking the HS algorithm as 100%, the new algorithm improves NOI by 77.7%, NOFG by 98.9%, and time by 49.8%
(h) Taking the PR algorithm as 100%, the new algorithm improves NOI by 86.3%, NOFG by 98.3%, and time by 67.3%
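The percentages above are stated relative to a baseline algorithm taken as 100%. A plausible reading, which is an assumption since the article does not spell out the formula, is that the improvement is the relative saving in the total count. The sketch below uses made-up totals that do not come from the tables of the article.

```python
# Sketch only: assumed formula and hypothetical totals, not the paper's data.

def improvement_percent(baseline_total, new_total):
    """Percentage saved by the new method when the baseline counts as 100%."""
    return 100.0 * (1.0 - new_total / baseline_total)

baseline_noi = 1000      # hypothetical total iterations of a baseline method
new_noi = 400            # hypothetical total iterations of a new method
saving = improvement_percent(baseline_noi, new_noi)
```

Under this reading, a value of 60% means the new method needed 40% of the baseline total, and a negative value would indicate that the new method was more expensive than the baseline.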

What can be deduced from the above tables and experiments is summarized as follows:
(i) Points (a) to (d) show that the proposed CG-type algorithms are economical and robust compared with the standard LS and CD algorithms
(ii) Points (e) to (h) show that the proposed CG-type algorithms are economical and robust compared with the standard HS and PR algorithms

All these comparisons were made using the performance profile of Dolan and Moré [16], and we can conclude that
(1) Figure 1 compares the new algorithms with LS, CD, HS, and PR in terms of the number of iterations
(2) Figure 2 compares the new algorithms with LS, CD, HS, and PR in terms of the number of function and gradient evaluations
(3) Figure 3 displays how long the algorithms take to reach the solution (i.e., the required CPU time)
(4) Figure 4 shows the functions on which the new algorithm, with two different constants, performs well compared to the basic algorithms (LS, CD, HS, and PR) based on the number of iterations
(5) Figure 5 shows the functions on which the new algorithm, with two different constants, performs particularly well compared to the basic algorithms (LS, CD, HS, and PR) based on the number of function and gradient evaluations
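For readers unfamiliar with the Dolan and Moré performance profile, the following sketch computes it for a small set of solvers. For each problem, the cost of a solver is divided by the best cost achieved by any solver, and the profile value at a threshold is the fraction of problems whose ratio does not exceed that threshold. The function name, the use of `None` to mark a failed run, and the cost data are hypothetical.

```python
# Sketch only: hypothetical data; costs are assumed to be positive numbers.

def performance_profile(costs, taus):
    """costs[solver] is a list of per-problem costs; None marks a failure."""
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    # Best (smallest) cost achieved on each problem by any solver.
    best = []
    for p in range(n_problems):
        finite = [costs[s][p] for s in solvers if costs[s][p] is not None]
        best.append(min(finite) if finite else None)
    profile = {}
    for s in solvers:
        ratios = []
        for p in range(n_problems):
            c = costs[s][p]
            if c is None or best[p] is None:
                ratios.append(float("inf"))   # a failure never meets any threshold
            else:
                ratios.append(c / best[p])
        # Fraction of problems solved within a factor tau of the best solver.
        profile[s] = [sum(r <= t for r in ratios) / n_problems for t in taus]
    return profile

# Hypothetical iteration counts for two solvers on four problems.
costs = {"new": [10, 20, 30, None], "LS": [20, 20, 60, 50]}
profile = performance_profile(costs, taus=[1.0, 2.0])
```

The value at threshold 1 is the share of problems on which a solver is the best or tied for the best, and the limit for large thresholds is the share of problems it solves at all, which is why the curves are read both at the left edge and at the right edge of the plot.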

8. Conclusions

In this study, we have proposed two new CG methods (obtained by changing the value of c). A crucial property of the proposed CG methods is that they guarantee sufficient descent directions. Under mild conditions, we have shown that the new algorithms are globally convergent for both uniformly convex and general functions under the strong Wolfe line search conditions. The preliminary numerical results show that, with a good choice of the parameter c, the new algorithms perform very well. However, an optimal value of the parameter c could be studied theoretically in future research to achieve better numerical results.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research was supported by College of Computer Sciences and Mathematics, University of Mosul, Republic of Iraq, under Project no. 3615208.

References

[1] D.-H. Li and M. Fukushima, “A modified BFGS method and its global convergence in nonconvex minimization,” Journal of Computational and Applied Mathematics, vol. 129, no. 1-2, pp. 15–35, 2001.
[2] Y. H. Dai and Y. X. Yuan, “Convergent properties of nonlinear conjugate gradient methods,” SIAM Journal on Optimization, vol. 10, no. 1, pp. 348–358, 1999.
[3] Y.-H. Dai, “New conjugacy conditions and related nonlinear conjugate gradient methods,” Applied Mathematics and Optimization, vol. 43, no. 1, pp. 87–101, 2001.
[4] M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, no. 6, pp. 409–436, 1952.
[5] R. Fletcher, “Function Minimization by conjugate gradients,” The Computer Journal, vol. 7, no. 2, pp. 149–154, 1964.
[6] J. Nocedal and S. J. Wright, Numerical Optimization, Springer, Cham, Switzerland, 2nd edition, 2006.
[7] R. Fletcher, Practical Methods of Optimization, Unconstrained Optimization, Wiley, New York, NY, USA, 1987.
[8] Y. Liu and C. Storey, “Efficient generalized conjugate gradients algorithms, part 1: theory,” Journal of Optimization Theory Applications, vol. 69, pp. 129–137, 1999.
[9] Q. Guo, J.-G. Liu, D. H. Wang, and D.-H. Wang, “A modified BFGS method and its superlinear convergence in nonconvex minimization with general line search rule,” Journal of Applied Mathematics and Computing, vol. 28, no. 1-2, pp. 435–446, 2008.
[10] Z. Liu and H. Liu, “An efficient gradient method with approximately optimal stepsize based on tensor model for unconstrained optimization,” Journal of Optimization Theory and Applications, vol. 181, no. 2, pp. 608–633, 2019.
[11] X. Li, B. Wang, and W. Hu, “A modified nonmonotone BFGS algorithm for unconstrained optimization,” Journal of Inequalities and Applications, vol. 183, pp. 1–18, 2017.
[12] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[13] X. Chen and J. Sun, “Global convergence of a two-parameter family of conjugate gradient methods without line search,” Journal of Computational and Applied Mathematics, vol. 146, no. 1, pp. 37–45, 2002.
[14] N. Andrei, “An unconstrained optimization test functions collection,” Advanced Model Optimization, vol. 10, pp. 147–161, 2011.
[15] N. Andrei, “40 Conjugate gradients algorithms for unconstrained optimization,” Bulletin of the Malaysian Mathematical Sciences Society, vol. 34, pp. 319–330, 2011.
[16] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.

Copyright

Copyright © 2020 Eman T. Hamed et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
