
An Efficient Modified AZPRP Conjugate Gradient Method for Large-Scale Unconstrained Optimization Problem

Ahmad Alhawarat, Thoi Trung Nguyen, Ramadan Sabra, and Zabidin Salleh

Journal of Mathematics, vol. 2021, Article ID 6692024, 9 pages, 2021. https://doi.org/10.1155/2021/6692024

Academic Editor: Qingli Zhao
Received: 05 Dec 2020
Revised: 30 Jan 2021
Accepted: 19 Mar 2021
Published: 26 Apr 2021

Abstract

To find a solution of an unconstrained optimization problem, we often use a conjugate gradient (CG) method, since it does not require the storage of second-derivative information, unlike Newton's method or the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method. Recently, a modification of the Polak–Ribière method with a new restart condition was proposed, giving the so-called AZPRP method. In this paper, we propose a new modification of the AZPRP CG method for solving large-scale unconstrained optimization problems, based on a modification of the restart condition. The new parameter satisfies the descent property, and global convergence is established with the strong Wolfe–Powell line search. The numerical results show that the new CG method is strongly competitive with the CG_Descent method. The comparisons are made on a set of more than 140 standard test functions from the CUTEst library and include the number of iterations and CPU time.

1. Introduction

The conjugate gradient (CG) method aims to find a solution of unconstrained optimization problems. Consider the problem

$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1)$$

where $f:\mathbb{R}^n \to \mathbb{R}$ is continuous and differentiable and its gradient $g(x) = \nabla f(x)$ is available. The iterative method is given by the sequence

$$x_{k+1} = x_k + \alpha_k d_k, \qquad k = 0, 1, 2, \ldots, \qquad (2)$$

where $x_0$ is the starting point and $\alpha_k > 0$ is a step length. The search direction $d_k$ of the CG method is defined by

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \qquad (3)$$

where $g_k = g(x_k) = \nabla f(x_k)$ and $\beta_k$ is a parameter.

To obtain the step length, we normally use an inexact line search, since the exact line search, defined by

$$f(x_k + \alpha_k d_k) = \min_{\alpha \ge 0} f(x_k + \alpha d_k), \qquad (4)$$

requires many iterations to obtain the step length. Normally, we use the strong version of the Wolfe-Powell (SWP) line search [1, 2], which is given by

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k \qquad (5)$$

and

$$|g(x_k + \alpha_k d_k)^T d_k| \le \sigma |g_k^T d_k|, \qquad (6)$$

where $0 < \delta < \sigma < 1$.

The weak Wolfe-Powell (WWP) line search is defined by (5) and

$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \qquad (7)$$

where $0 < \delta < \sigma < 1$. The most famous choices of the parameter $\beta_k$ are the Hestenes-Stiefel (HS) [3], Fletcher-Reeves (FR) [4], and Polak-Ribière-Polyak (PRP) [5] formulas, which are given by

$$\beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \qquad \beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \qquad \beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \qquad (8)$$

where $y_{k-1} = g_k - g_{k-1}$ and $\|\cdot\|$ denotes the Euclidean norm.

Powell [6] showed that there exists a nonconvex function on which the PRP method does not converge globally. Gilbert and Nocedal [7] showed that if the nonnegative restriction $\beta_k^{PRP+} = \max\{\beta_k^{PRP}, 0\}$ is used with the WWP line search and the descent property is satisfied, then the method is globally convergent.

Al-Baali [8] proved that the CG method with the FR coefficient converges globally with the SWP line search when $\sigma < 1/2$. Hager and Zhang [9, 10] presented a new CG parameter that satisfies the descent property $g_k^T d_k \le -\frac{7}{8}\|g_k\|^2$. This formula is given as follows:

$$\beta_k^{HZ} = \frac{1}{d_{k-1}^T y_{k-1}}\left(y_{k-1} - 2 d_{k-1}\frac{\|y_{k-1}\|^2}{d_{k-1}^T y_{k-1}}\right)^T g_k, \qquad (9)$$

truncated as $\bar{\beta}_k = \max\{\beta_k^{HZ}, \eta_k\}$, where $\eta_k = \dfrac{-1}{\|d_{k-1}\|\min\{\eta, \|g_{k-1}\|\}}$ and $\eta > 0$ is a constant. In their numerical experiments, they set $\eta = 0.01$ in (9). Al-Baali et al. [11] compared $\beta_k^{HZ}$ with a new three-term CG method (G3TCG).

Considerations such as speed, memory requirements, number of iterations, function and gradient evaluations, and robustness in solving unconstrained optimization problems have driven the continued development of CG methods; the reader is referred to [10–15] for more information on these newer formulas.

2. The New Formula and the Algorithm

Alhawarat et al. [15] presented the following simple formula:

$$\beta_k^{AZPRP} = \begin{cases} \dfrac{\|g_k\|^2 - \mu_k |g_k^T g_{k-1}|}{\|g_{k-1}\|^2}, & \text{if } \|g_k\|^2 > \mu_k |g_k^T g_{k-1}|, \\[2mm] 0, & \text{otherwise}, \end{cases} \qquad (10)$$

where $\mu_k = \dfrac{\|x_k - x_{k-1}\|}{\|y_{k-1}\|}$.

Dai and Liao [12] presented the following formula:

$$\beta_k^{DL} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}} - t\,\frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}, \qquad (11)$$

where $s_{k-1} = x_k - x_{k-1}$ and $t \ge 0$ is a constant.

The new formula, denoted $\beta_k^{A}$ in (12), is a modification of $\beta_k^{AZPRP}$ in which the restart condition is based on the value of the Lipschitz constant of the gradient.

From (12), we obtain the relation $0 \le \beta_k^{A} \le \dfrac{\|g_k\|^2}{\|g_{k-1}\|^2}$, which is used in the analysis below. The resulting method is summarized in Algorithm 1; a sketch of the overall procedure is given below.

3. Convergence Analysis of the Coefficient $\beta_k^{A}$ with the CG Method

Assumption 1.
(A) The level set $\Omega = \{x : f(x) \le f(x_0)\}$ is bounded; that is, a positive constant $\zeta$ exists such that $\|x\| \le \zeta$ for all $x \in \Omega$.
(B) In some neighbourhood $N$ of $\Omega$, $f$ is continuously differentiable and its gradient is Lipschitz continuous; that is, there exists a constant $L > 0$ such that $\|g(x) - g(u)\| \le L\|x - u\|$ for all $x, u \in N$.

This assumption implies that there exists a positive constant $\bar{\gamma}$ such that $\|g(x)\| \le \bar{\gamma}$ for all $x \in \Omega$.

The descent condition

$$g_k^T d_k < 0 \qquad (17)$$

plays an important role in the CG method. The sufficient descent condition proposed by Al-Baali [8] is a modification of (17) as follows:

$$g_k^T d_k \le -c\,\|g_k\|^2, \qquad (18)$$

where $c \in (0, 1)$. Note that the general form of the sufficient descent condition is (18) with $c > 0$.

3.1. Global Convergence of $\beta_k^{A}$ with the SWP Line Search

The following theorem shows that $\beta_k^{A}$ ensures that the sufficient descent condition (18) is satisfied with the SWP line search. The proof is similar to that presented in [8].

Theorem 1. Let $\{x_k\}$ and $\{d_k\}$ be generated using (2), (3), and $\beta_k^{A}$, where the step length $\alpha_k$ is computed by the SWP line search (5) and (6). If $\sigma < 1/2$, then the sufficient descent condition (18) holds.

Algorithm 1 shows the steps used to obtain the solution of the optimization problem with the strong Wolfe-Powell line search.

Proof. Multiplying (3) by $g_k^T$, we obtain

$$g_k^T d_k = -\|g_k\|^2 + \beta_k^{A}\, g_k^T d_{k-1}. \qquad (19)$$

Dividing (19) by $\|g_k\|^2$, using the SWP condition (6) and the bound $\beta_k^{A} \le \|g_k\|^2 / \|g_{k-1}\|^2$ implied by (12), we obtain

$$-1 - \sigma\,\frac{|g_{k-1}^T d_{k-1}|}{\|g_{k-1}\|^2} \le \frac{g_k^T d_k}{\|g_k\|^2} \le -1 + \sigma\,\frac{|g_{k-1}^T d_{k-1}|}{\|g_{k-1}\|^2}. \qquad (21)$$

From (3), we obtain $g_0^T d_0 = -\|g_0\|^2$. Assume that the descent property holds up to iteration $k-1$, i.e., $g_i^T d_i < 0$ for $i = 0, 1, \ldots, k-1$. Repeating the process for (21), we obtain

$$-\sum_{j=0}^{k} \sigma^j \le \frac{g_k^T d_k}{\|g_k\|^2} \le -2 + \sum_{j=0}^{k} \sigma^j.$$

As

$$\sum_{j=0}^{k} \sigma^j < \sum_{j=0}^{\infty} \sigma^j = \frac{1}{1-\sigma},$$

hence

$$-\frac{1}{1-\sigma} \le \frac{g_k^T d_k}{\|g_k\|^2} \le -2 + \frac{1}{1-\sigma},$$

and when $\sigma < 1/2$, we obtain $-2 + \dfrac{1}{1-\sigma} < 0$. Let $c = 2 - \dfrac{1}{1-\sigma}$; then the sufficient descent condition (18) holds. The proof is complete.

Theorem 2. Let $\{x_k\}$ and $\{d_k\}$ be obtained by using (2), (3), and $\beta_k^{A} = 0$, where $\alpha_k$ is computed by the SWP line search (5) and (6). Then the descent condition (17) holds.

Proof. Multiplying (3) by $g_k^T$ and substituting $\beta_k^{A} = 0$, we obtain

$$g_k^T d_k = -\|g_k\|^2 < 0,$$

which completes the proof.
Zoutendijk [16] presented a useful lemma for the global convergence of CG methods. The condition is given as follows.

Lemma 1. Let Assumption 1 hold and consider any method of the form (2) and (3), where $\alpha_k$ is obtained by the WWP line search (5) and (7) and the search direction $d_k$ is a descent direction. Then, the following condition holds:

$$\sum_{k=0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty.$$

Theorem 3. Suppose Assumption 1 holds. Consider the method defined by (2) and (3) with the new formula (12), in which $\alpha_k$ is obtained from the SWP line search (5) and (6) with $\sigma < 1/2$. Then,

$$\liminf_{k \to \infty} \|g_k\| = 0.$$

The proof is similar to that presented in [8].

Proof. We will prove the theorem by contradiction. Assume that the conclusion is not true; then a constant $\varepsilon > 0$ exists such that

$$\|g_k\| \ge \varepsilon \quad \text{for all } k \ge 0. \qquad (30)$$

Rewriting (3) as $d_k + g_k = \beta_k^{A} d_{k-1}$ and squaring both sides, we obtain

$$\|d_k\|^2 = (\beta_k^{A})^2 \|d_{k-1}\|^2 - 2 g_k^T d_k - \|g_k\|^2. \qquad (31)$$

Dividing (31) by $\|g_k\|^4$, we get

$$\frac{\|d_k\|^2}{\|g_k\|^4} = \frac{(\beta_k^{A})^2 \|d_{k-1}\|^2}{\|g_k\|^4} - \frac{2 g_k^T d_k}{\|g_k\|^4} - \frac{1}{\|g_k\|^2}. \qquad (32)$$

Using (6), (12), and (32), together with the bounds from Theorem 1, we obtain

$$\frac{\|d_k\|^2}{\|g_k\|^4} \le \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} + \frac{1+\sigma}{1-\sigma}\,\frac{1}{\|g_k\|^2}. \qquad (33)$$

Repeating the process for (33) and using the relationship $d_0 = -g_0$ yields

$$\frac{\|d_k\|^2}{\|g_k\|^4} \le \frac{1+\sigma}{1-\sigma} \sum_{j=0}^{k} \frac{1}{\|g_j\|^2}.$$

From (33) and (30), we obtain

$$\frac{\|d_k\|^2}{\|g_k\|^4} \le \frac{1+\sigma}{1-\sigma}\,\frac{k+1}{\varepsilon^2}.$$

Therefore,

$$\sum_{k \ge 0} \frac{\|g_k\|^4}{\|d_k\|^2} \ge \frac{1-\sigma}{1+\sigma}\,\varepsilon^2 \sum_{k \ge 0} \frac{1}{k+1} = \infty.$$

Together with the sufficient descent condition (18), this result contradicts Lemma 1; thus $\liminf_{k \to \infty} \|g_k\| = 0$. The proof is complete.

4. Numerical Results

To investigate the effectiveness of the new parameter, several test problems from the CUTEst collection [17], listed in Table 1, are chosen. We performed a comparison with CG_Descent 5.3 based on CPU time and the number of iterations. We employed the SWP line search as described in [1, 2]. The modified CG_Descent 6.8 code with the memory parameter (mem) set to zero is used to obtain all results. The code can be downloaded from Hager's web page: http://users.clas.ufl.edu/hager/papers/Software/.


Table 1: Number of iterations and CPU time (in seconds) for the new method and CG_Descent 5.3 on the CUTEst test problems.

Function | Dimension | Number of iterations | CPU time | Number of iterations | CPU time

AKIVA2100.0280.02
ALLINITU4120.0290.02
ARGLINA20010.0210.02
ARGLINB20050.0260.11
ARWHEAD500070.0260.03
BARD3160.02120.02
BDQRTIC50001360.581610.75
BEALE2150.02110.02
BIGGS66270.02240.02
BOX33110.02100.02
BOX100080.0870.08
BRKMCC250.0250.02
BROWNAL20090.0290.02
BROWNBS2130.02100.02
BROWNDEN4160.02160.02
BROYDN7D500014115.47640.37
BRYBND5000850.38390.22
CHAINWOO40003180.8663791.08
CHNROSNB502870.023400.02
CLIFF2180.02100.02
COSINE10000110.19140.26
CRAGGLVY50001030.451040.48
CUBE2320.02170.02
CURLY101000047808173.742454145.16
CURLY201000066587383.9467279366.03
CURLY301000079030639.6374375509.59
DECONVU634002.00E − 022270.02
DENSCHNA290.0260.02
DENSCHNB270.0260.02
DENSCHNC2120.02110.02
DENSCHND3470.02140.02
DENSCHNE3180.02120.02
DENSCHNF280.0290.02
DIXMAANA300070.0260.02
DIXMAANB300060.0260.02
DIXMAANC300060.0260.02
DIXMAAND300070.0280.02
DIXMAANE30002220.332180.33
DIXMAANF30001610.131160.09
DIXMAANG30001570.121730.14
DIXMAANH30001730.221900.2
DIXMAANI300038564.2531603.34
DIXMAANJ30003270.363600.39
DIXMAANK30002830.284160.36
DIXMAANL30002370.23990.36
DIXON3DQ100001000019.121000019.12
DJTL2820.02750.02
DQDRTIC500050.0250.02
DQRTIC5000170.03150.03
EDENSCH2000260.03320.05
EG2100050.0230.02
EIGENALS255010083178.367247133.4
EIGENBLS25501530123718846290.3
EIGENCLS265210136174.1911152186.86
ENGVAL15000270.06230.12
ENGVAL23260.02260.02
ERRINROS503800.02955042.36
EXPFIT2130.0290.02
EXTROSNB100038081.2523700.87
FLETCBV2500010.0210.02
FLETCHCR10001520.05840.05
FMINSRF256253461.09E + 004851.4
FMINSURF56254731.515421.64
FREUROTH5000250.11290.19
GENROSE50010780.1720980.45
GROWTHLS31560.021090.02
GULF3370.02330.02
HAIRY2360.02170.02
HATFLDD3200.02170.02
HATFLDE3300.02130.02
HATFLDFL3390.02210.02
HEART6LS66840.023750.02
HEART8LS82490.022530.02
HELIX3230.02230.02
HIELOW3140.02130.05
HILBERTA220.0220.02
HILBERTB1040.0240.02
HIMMELBB2100.0240.02
HIMMELBF4260.02230.02
HIMMELBG280.0270.02
HIMMELBH270.0250.02
HUMPS2520.02450.02
JENSMP2150.02120.02
JIMACK3544983141182.2572971030.3
KOWOSB4170.02160
LIARWHD5000210.03150.05
LOGHAIRY2270.02260.02
MANCINO100110.08110.08
MARATOSB211450.025890.02
MEXHAT2200.02140.02
MOREBV50001610.411610.38
MSQRTALS102429058.6427889.08
MSQRTBLS102422806.9121816.84
NCB20B500203546.36418170.16
NCB20501087911.8395913
NONCVXU25000661015.89637915.92
NONDIA500070.0370.03
NONDQUAR500019422.4530583.88
OSBORNEA5940.02820.02
OSBORNEB11620.02570.02
PALMER1C8110.02120.02
PALMER1D7110.02100.02
PALMER2C8110.02110.02
PALMER3C8110.02110.02
PALMER4C8110.02110.02
PALMER5C660.0260.02
PALMER6C8110.02110.02
PALMER7C8110.02110.02
PALMER8C8110.02110.02
PARKCH1567229.4582339.39
PENALTY11000280.02410.02
PENALTY22001910.052000.03
PENALTY3200991.78881.98
POWELLSG5000260.02270.05
POWER100003720.765431.2
QUARTC5000170.03150.02
ROSENBR2340.02280.02
S308280.0270.02
SCHMVETT5000430.23400.27
SENSORS100210.25500.8
SINEVAL2640.02460.02
SINQUAD5000140.09150.08
SISSER260.0250.02
SNAIL21000.02610.02
SPARSINE500018358732132883
SPARSQUR10000280.31350.98
SPMSRTLS49992030.592190.61
SROSENBR5000110.0290.03
STRATEC1046219.981706.23
TESTQUAD500015771.52E + 0015731.42
TOINTGOR501350.021200.02
TOINTGSS500040.0250.02
TOINTPSP501430.021570.02
TOINTQOR50290.02290.02
TQUARTIC5000140.03110.03
TRIDIA50007820.847831.11
VAREIGVL230.02240.02
VIBRBEAM501380.02980.02
WATSON8490.02610.02
WOODS12220.06220.03
YFITU4000840.02680.02
ZANGWIL2310.0210.02

The CG_Descent 5.3 results are obtained by running CG_Descent 6.8 with the memory parameter set to zero. The host computer is an AMD A4-7210 with 4 GB of RAM. The results are shown in Figures 1 and 2, in which the performance profile introduced by Dolan and Moré [18] is employed. As shown in Figure 1, formula A strongly outperforms CG_Descent in the number of iterations. In Figure 2, we notice that the new CG formula A is strongly competitive with CG_Descent.

4.1. Multimodal Function with Its Graph

In this section, we present the six-hump camel back function, a multimodal function used to test the efficiency of optimization algorithms. The function is defined as follows:

$$f(x_1, x_2) = \left(4 - 2.1 x_1^2 + \frac{x_1^4}{3}\right) x_1^2 + x_1 x_2 + \left(-4 + 4 x_2^2\right) x_2^2.$$

The number of variables (n) equals 2. This function has six local minima, two of which are global; thus, it is a multimodal function commonly used to test for global minima. The global minima are $x^* = (0.0898, -0.7126)$ and $x^* = (-0.0898, 0.7126)$, with function value $f(x^*) \approx -1.0316$. As its name describes, this function looks like the back of an upside-down camel with six humps (see Figure 3 for a three-dimensional graph); for more information about two-dimensional functions, the reader can refer to [19].

Finally, note that the CG method can also be applied to image restoration problems, neural network training, and other applications. For more information, the reader can refer to [20, 21].

5. Conclusions

In this study, a modified version of the CG algorithm (formula A) is suggested and its performance is investigated. The modified formula is restarted based on the value of the Lipschitz constant. Global convergence is established using the SWP line search. Our numerical results show that the new coefficient produces efficient and competitive results compared with other methods, such as CG_Descent 5.3. In the future, the new version of the CG method will be combined with a feed-forward neural network (back-propagation (BP) algorithm) to improve the training process and produce a fast multilayer training algorithm. This will help reduce the time needed to train neural networks when the number of training samples is massive.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to thank Universiti Malaysia Terengganu for supporting this work.

References

  1. P. Wolfe, “Convergence conditions for ascent methods,” SIAM Review, vol. 11, no. 2, pp. 226–235, 1969.
  2. P. Wolfe, “Convergence conditions for ascent methods. II: some corrections,” SIAM Review, vol. 13, no. 2, pp. 185–188, 1971.
  3. M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, pp. 409–435, 1952.
  4. R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” The Computer Journal, vol. 7, no. 2, pp. 149–154, 1964.
  5. E. Polak and G. Ribière, “Note sur la convergence de méthodes de directions conjuguées,” ESAIM: Mathematical Modelling and Numerical Analysis - Modélisation Mathématique et Analyse Numérique, vol. 3, no. R1, pp. 35–43, 1969.
  6. M. J. D. Powell, “Nonconvex minimization calculations and the conjugate gradient method,” in Numerical Analysis, pp. 122–141, Springer, Berlin, Heidelberg, 1984.
  7. J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
  8. M. Al-Baali, “Descent property and global convergence of the Fletcher-Reeves method with inexact line search,” IMA Journal of Numerical Analysis, vol. 5, no. 1, pp. 121–124, 1985.
  9. W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
  10. W. W. Hager and H. Zhang, “The limited memory conjugate gradient method,” SIAM Journal on Optimization, vol. 23, no. 4, pp. 2150–2168, 2013.
  11. M. Al-Baali, Y. Narushima, and H. Yabe, “A family of three-term conjugate gradient methods with sufficient descent property for unconstrained optimization,” Computational Optimization and Applications, vol. 60, no. 1, pp. 89–110, 2015.
  12. Y.-H. Dai and L.-Z. Liao, “New conjugacy conditions and related nonlinear conjugate gradient methods,” Applied Mathematics and Optimization, vol. 43, no. 1, pp. 87–101, 2001.
  13. M. Al-Baali, E. Spedicato, and F. Maggioni, “Broyden’s quasi-Newton methods for a nonlinear system of equations and unconstrained optimization: a review and open problems,” Optimization Methods and Software, vol. 29, no. 5, pp. 937–954, 2014.
  14. S. Babaie-Kafaki and R. Ghanbari, “A descent family of Dai-Liao conjugate gradient methods,” Optimization Methods and Software, vol. 29, no. 3, pp. 583–591, 2014.
  15. A. Alhawarat, Z. Salleh, M. Mamat, and M. Rivaie, “An efficient modified Polak-Ribière-Polyak conjugate gradient method with global convergence properties,” Optimization Methods and Software, vol. 32, no. 6, pp. 1299–1312, 2017.
  16. G. Zoutendijk, “Nonlinear programming, computational methods,” in Integer and Nonlinear Programming, North-Holland, Amsterdam, The Netherlands, 1970.
  17. I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environment,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
  18. E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.
  19. A. Alhawarat and Z. Salleh, “Modification of nonlinear conjugate gradient method with weak Wolfe-Powell line search,” Abstract and Applied Analysis, vol. 2017, Article ID 7238134, 6 pages, 2017.
  20. G. Yuan, T. Li, and W. Hu, “A conjugate gradient algorithm and its application in large-scale optimization problems and image restoration,” Journal of Inequalities and Applications, vol. 2019, no. 1, pp. 1–25, 2019.
  21. G. Yuan, J. Lu, and Z. Wang, “The modified PRP conjugate gradient algorithm under a non-descent line search and its application in the Muskingum model and image restoration problems,” Soft Computing, vol. 25, no. 8, pp. 5867–5879, 2021.

Copyright © 2021 Ahmad Alhawarat et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
