A Modified Nonlinear Conjugate Gradient Method with the Armijo Line Search and Its Application

Zhang, Mengxiang; Zhou, Yingjie; Wang, Songhua

doi:https://doi.org/10.1155/2020/6210965

Mathematical Problems in Engineering

Special Issue

Machine Learning and its Applications in Image Restoration

Research Article | Open Access

Volume 2020 | Article ID 6210965 | https://doi.org/10.1155/2020/6210965

Guest Editor: Maojun Zhang

Received 19 Jul 2020

Accepted 10 Aug 2020

Published 28 Aug 2020

Abstract

In this article, a modified Polak-Ribière-Polyak (PRP) conjugate gradient method is proposed for image restoration. The presented method can generate sufficient descent directions without any line search conditions. Under some mild conditions, this method is globally convergent with the Armijo line search. Moreover, the linear convergence rate of the modified PRP method is established. The experimental results of unconstrained optimization, image restoration, and compressive sensing show that the proposed method is promising and competitive with other conjugate gradient methods.

1. Introduction

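Consider the following unconstrained optimization problem:
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function. The iteration formula is
$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots, \tag{2}$$
where $\alpha_k$ is a stepsize obtained by some kind of line search and $d_k$ is the search direction.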

It is well known that there are many methods ([1–4] etc.) for solving optimization problem (1). Among them, the conjugate gradient method is attractive because of its simplicity: it avoids computing and storing matrices associated with the Hessian of the objective function, so its memory requirements are very low. Image processing usually involves large-scale computation, so the conjugate gradient algorithm is well suited to this field. The search direction of the conjugate gradient method is given by
$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{3}$$
where $\beta_k$ is a scalar and $g_k = \nabla f(x_k)$ is the gradient of the objective function at the point $x_k$. The choice of $\beta_k$ determines different conjugate gradient methods. Classical conjugate gradient methods include the PRP method [5, 6], the HS method [7], the LS method [8], the DY method [9], the FR method [10], and the CD method [11]. The parameters of these methods are specified as follows:
$$\beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{LS} = -\frac{g_k^T y_{k-1}}{d_{k-1}^T g_{k-1}}, \quad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{CD} = -\frac{\|g_k\|^2}{d_{k-1}^T g_{k-1}}, \tag{4}$$
where $y_{k-1} = g_k - g_{k-1}$ and $\|\cdot\|$ stands for the Euclidean norm. There has been much research on the convergence properties of these methods ([12–15] etc.) and on their applications ([16–23] etc.).

In this paper, the focus is mainly on the PRP method. The PRP method is generally believed to have the best numerical performance among the classical conjugate gradient methods, but its convergence theory leaves much room for discussion. With the exact line search, the global convergence of the PRP conjugate gradient method was proved by Polak and Ribière [5] for convex objective functions. However, Powell [24] gave a counterexample showing that the PRP conjugate gradient method may fail for nonconvex functions even if the exact line search is used. Gilbert and Nocedal [14] showed that a modified PRP method is globally convergent if $\beta_k^{PRP}$ is restricted to be nonnegative and $\alpha_k$ is determined by a line search satisfying the sufficient descent condition
$$g_k^T d_k \le -c\|g_k\|^2 \tag{5}$$
with a constant $c > 0$, in addition to the following weak Wolfe-Powell conditions:
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \qquad g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{6}$$
where $0 < \delta < \sigma < 1$. Meanwhile, Gilbert and Nocedal [14] pointed out that $\beta_k^{PRP}$ may be negative even if the objective function is uniformly convex. With the strong Wolfe-Powell line search, Dai [25] gave an example showing that, even though the objective function is uniformly convex, the PRP method cannot guarantee that the search direction is always a descent direction.
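The two conditions discussed above can be checked numerically. The sketch below assumes the usual textbook forms of the sufficient descent condition (5) and the weak Wolfe-Powell conditions (6); the function names and the one-dimensional test function are hypothetical and serve only as an illustration.

```python
def dot(u, v):
    """Euclidean inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def sufficient_descent(g, d, c):
    """Condition (5): g^T d <= -c * ||g||^2 for a constant c > 0."""
    return dot(g, d) <= -c * dot(g, g)


def weak_wolfe_powell(f, grad, x, d, alpha, delta, sigma):
    """Conditions (6) with 0 < delta < sigma < 1 (assumed standard form)."""
    g_d = dot(grad(x), d)
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    decrease = f(x_new) <= f(x) + delta * alpha * g_d   # first inequality
    curvature = dot(grad(x_new), d) >= sigma * g_d      # second inequality
    return decrease and curvature


# Illustration on f(x) = x^2 / 2 from x = 1 along d = -1.
f = lambda x: 0.5 * x[0] ** 2
grad = lambda x: [x[0]]
ok_full_step = weak_wolfe_powell(f, grad, [1.0], [-1.0], 1.0, 0.1, 0.9)
ok_tiny_step = weak_wolfe_powell(f, grad, [1.0], [-1.0], 0.01, 0.1, 0.9)
```

The tiny step passes the decrease test but fails the curvature test, which is the role of the second inequality in (6): it rules out steps that are too short. The Armijo rule used later in the paper keeps only a decrease test and controls the step length by backtracking instead.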

From the above observations and [14, 24, 26–28], the sufficient descent condition (5) and the condition $\beta_k \ge 0$ play key roles in establishing the global convergence of conjugate gradient methods. However, when the Armijo line search or the Wolfe line search is used, the descent property of $d_k$ determined by (3) is in general not guaranteed.

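To obtain sufficient descent directions, Hager and Zhang [29] proposed a new conjugate gradient method (CG-C), obtained by modifying the HS method, which generates sufficient descent directions at each step without relying on any line search. The scalar in the CG-C method is determined by
$$\bar{\beta}_k^{N} = \max\{\beta_k^{N}, \eta_k\}, \tag{7}$$
where
$$\beta_k^{N} = \frac{1}{d_{k-1}^T y_{k-1}}\left(y_{k-1} - 2 d_{k-1}\frac{\|y_{k-1}\|^2}{d_{k-1}^T y_{k-1}}\right)^T g_k, \qquad \eta_k = \frac{-1}{\|d_{k-1}\|\min\{\eta, \|g_{k-1}\|\}}, \tag{8}$$
and $\eta > 0$ is a constant. With the weak Wolfe-Powell line search, Hager and Zhang [29] established a global convergence result for (7) when the objective function is a general nonlinear function.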

Along this line, Yu and Guan [30, 31] extended the CG-C method to the PRP, LS, DY, FR, and CD methods. The parameter in the modified PRP method (DPRP) is
$$\beta_k^{DPRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2} - C\frac{\|y_{k-1}\|^2}{\|g_{k-1}\|^4}\, g_k^T d_{k-1}, \tag{9}$$
where $C > 1/4$ is a constant. In practice, these methods perform better than other conjugate gradient methods. In addition, because the Armijo line search is much simpler than the Wolfe line search, it is more widely used in image restoration and compressed sensing problems. It is therefore necessary to study the performance of the above methods under the Armijo line search. However, the global convergence of the above methods cannot be guaranteed with the Armijo line search, i.e., the rule that finds a stepsize $\alpha_k = \max\{\rho^j : j = 0, 1, 2, \ldots\}$ satisfying
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{10}$$
where $\delta \in (0, 1)$ and $\rho \in (0, 1)$ are given constants. Scholars have done much related work to overcome this defect ([32–34] etc.). To address the fact that the PRP, HS, and LS conjugate gradient methods are not known to converge globally with the Armijo line search, Li and Qu [33] proposed a series of modified conjugate gradient methods based on (3) and (4); their search direction is given by (11) and involves a scalar to be specified. The modified PRP method in this family (APRP) uses the parameter given by (12). Global convergence of these methods is established when the Armijo line search is used.
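The Armijo rule is simple to implement, which is part of its appeal here. The following sketch assumes the standard backtracking form: start from a unit trial step and multiply by $\rho$ until the decrease test holds. The function name `armijo_step`, the default values of `delta` and `rho`, and the cap on the number of backtracks are illustrative assumptions, not values taken from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def armijo_step(f, x, d, g, delta=0.1, rho=0.5, max_backtracks=60):
    """Return the largest alpha in {1, rho, rho^2, ...} such that
    f(x + alpha*d) <= f(x) + delta * alpha * g^T d.

    g is the gradient at x and d must be a descent direction (g^T d < 0).
    """
    fx = f(x)
    g_d = dot(g, d)
    alpha = 1.0
    for _ in range(max_backtracks):
        x_trial = [xi + alpha * di for xi, di in zip(x, d)]
        if f(x_trial) <= fx + delta * alpha * g_d:
            return alpha
        alpha *= rho  # shrink the trial step
    raise RuntimeError("no acceptable step found; is d a descent direction?")


# Illustration on f(x) = x^2 / 2 from x = 1 along the long direction d = -4.
f = lambda x: 0.5 * x[0] ** 2
alpha = armijo_step(f, [1.0], [-4.0], [1.0])
```

Only function values are needed during backtracking, whereas a Wolfe-type search also evaluates the gradient at each trial point to test the curvature condition.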

Motivated by the idea of [31, 33] and taking into account the excellent numerical performance, we propose a new modified PRP method. The parameter is computed aswhere , and are two constants. And the search direction is determined by the classical two-term conjugate gradient method (3).

In the next section, the proposed algorithm is stated. In Section 3, the properties and the convergence results of the new method are given. Section 4 shows that the method converges linearly under suitable conditions. Numerical results and the conclusion are presented in Sections 5 and 6, respectively.

2. Algorithm

This section states the new modified PRP method (NPRP), combined with the Armijo line search technique (10), for solving (1). The procedure is given as Algorithm 1.
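The loop structure of Algorithm 1 (Steps 0 to 5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the parameter formula (13) is not reproduced here, so the sketch plugs in a descent-type PRP parameter in the style of [30, 31] as a stand-in, and the names `cg_armijo`, `descent_prp_beta`, `eps`, and the default constants are hypothetical. Any other rule for $\beta_k$ can be passed through the `beta` argument.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def descent_prp_beta(g, g_prev, d_prev, C=1.0):
    # Stand-in for (13): PRP term minus a correction term, with C > 1/4.
    # It guarantees g^T d <= -(1 - 1/(4C)) * ||g||^2 for the new direction.
    y = [a - b for a, b in zip(g, g_prev)]
    gp2 = dot(g_prev, g_prev)
    return dot(g, y) / gp2 - C * dot(y, y) / gp2 ** 2 * dot(g, d_prev)


def armijo(f, x, d, g_d, delta, rho):
    # Backtrack from alpha = 1 until the Armijo decrease test holds.
    alpha, fx = 1.0, f(x)
    while alpha > 1e-16:
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        if f(trial) <= fx + delta * alpha * g_d:
            break
        alpha *= rho
    return alpha


def cg_armijo(f, grad, x0, beta=descent_prp_beta, eps=1e-6,
              delta=0.1, rho=0.5, max_iter=1000):
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                       # Step 0: d_0 = -g_0
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) <= eps:         # Step 1: stopping test
            return x, k
        alpha = armijo(f, x, d, dot(g, d), delta, rho)    # Step 3
        x = [xi + alpha * di for xi, di in zip(x, d)]     # Step 4
        g_prev, g = g, grad(x)
        b = beta(g, g_prev, d)                  # Step 2 for the next iterate
        d = [-gi + b * di for gi, di in zip(g, d)]
    return x, max_iter


# A well-conditioned quadratic is solved by the first steepest descent step.
f1 = lambda x: 0.5 * (x[0] ** 2 + x[1] ** 2)
g1 = lambda x: [x[0], x[1]]
x_star, iters = cg_armijo(f1, g1, [3.0, -4.0])
```

The direction update is written after the step so that the previous gradient and direction are still available; this is the same order of operations as Steps 2 to 5, only unrolled inside one loop body.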

Remark 1. The NPRP method generates sufficient descent directions independently of the line search used.

Lemma 1. Let and be generated by the NPRP method. If , for all , then

Proof. Case (i) . From (3), we get Hence (14) is satisfied. Case (ii) . From (3) and (13), we haveDefineBy the above equation, we getwhere . Thus, (14) is satisfied.

3. Properties and Convergence Analysis

The following assumptions are commonly used in the global convergence analysis of conjugate gradient algorithms.

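Assumption 1.
(i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded.
(ii) The function $f$ is continuously differentiable and its gradient is Lipschitz continuous on an open convex set $N$ containing $\Omega$, i.e., there exists a constant $L > 0$ such that
$$\|g(x) - g(y)\| \le L\|x - y\|, \quad \forall x, y \in N. \tag{19}$$
It follows directly from Assumption 1 that there is a positive constant $\gamma$ such that
$$\|g(x)\| \le \gamma, \quad \forall x \in \Omega. \tag{20}$$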

Lemma 2. If Assumptions 1 holds, , , , and be generated by Algorithm 1. There exists a constant such that the following inequality holds for all , then

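Algorithm 1: NPRP method with the Armijo line search.
	Step 0: Choose an initial point $x_0 \in \mathbb{R}^n$, a tolerance $\varepsilon \ge 0$, and the constants required by (10) and (13). Compute $d_0 = -g_0$. Set $k := 0$.
	Step 1: If $\|g_k\| \le \varepsilon$, then stop the iterations. Otherwise, continue with Step 2.
	Step 2: Compute the search direction $d_k$ by (3), where $\beta_k$ is obtained by (13).
	Step 3: Determine a step size $\alpha_k$ by the Armijo line search (10).
	Step 4: Compute $x_{k+1} = x_k + \alpha_k d_k$ and $g_{k+1} = g(x_{k+1})$.
	Step 5: Let $k := k + 1$ and go to Step 1.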

Proof. When , from (3), we can obtainthen (21) holds.
When , from (3) and (20), we haveTogether with (3) and the above formula implieslettingThus (21) is satisfied.

Lemma 3. If the stepsize is obtained by Armijo line search, there exists a constant such that the following inequality holds for all , then

Proof. Case (i) . From (14) and (26), we get Then we need to satisfy From Lemma 2, we have Case (ii) . By the Armijo line search condition, does not satisfy inequality (10). This meansBy the mean-value theorem and inequality (19), there is a such as andObserving the last inequality and (30), we havelettingThus (26) is satisfied.

Lemma 4. Let the sequence , and be generated by Algorithm 1, and suppose that Assumption 1 holds, then,

Proof. We have from (10) and Assumption 1 thatSubstituting (26) into the above formula, we havewhere and are two constants. Therefore, (34) holds.
Equation (34) is usually called the Zoutendijk condition [35], and it is very important for establishing global convergence.

Theorem 1. Suppose the conditions in Assumption 1 hold. Let be generated by the NPRP method with Armijo line search, then either for some or .

Proof. By Lemma 2 and Lemma 4, we haveTogether with (21), the above formula implieswhere .
From the above inequality, we can obtain.

Remark 2. The proof of Theorem 1 shows that the NPRP method is also globally convergent with the Wolfe line search, without requiring truncation or convexity of the objective function.

4. Linear Convergence Rate

When discussing the linear convergence rate for conjugate gradient methods, the following assumptions are often established on the basis of Assumption 1.

Assumption 2. (i)Objective function is twice continuously differentiable in (ii) is a symmetric positive definite matrix in , i.e., there exist positive constants such that

Lemma 5. Suppose that Assumption 2 holds and the sequence generated by the NPRP method converges with the Armijo line search. Then the sequence converges to the unique minimal point .

Proof. If Assumption 2 holds, then Assumption 1 holds, and objective function is uniformly convex. By the Armijo line search (10) and Lemma 1, it can be deduced that is a decreasing sequence.
Therefore, by the conclusion of Theorem 1, the sequence converges to the unique minimal point .

Lemma 6. Suppose that Assumption 2 holds. The problem (1) has a unique solution and

Theorem 2. Suppose Assumption 2 holds. Let be the unique solution of problem (1) and the sequence be generated by the NPRP method with the Armijo line search. Then there are constants and such that for all , thenthat is, converges to at least R-linearly.

Proof. If Assumption 2 holds, then Assumption 1 holds. By (14), (21), and (26), it is easy to prove thatWe get from (14), (40), (41), (45), and the Armijo condition (10)Letting , we haveCombing (40), we obtainHence (44) holds, where . The proof is completed.

Remark 3. From the proof of Theorem 2, with the Wolfe line search, the NPRP method can also establish that converges to at least R-linearly.
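The last step of an R-linear convergence argument of this kind can be written out explicitly. As a sketch under assumed notation (the symbols $m$, $q$, $a$, and $r$ below are illustrative and need not coincide with the constants used in the paper), suppose $f$ is uniformly convex with modulus $m > 0$, $x^*$ is its minimizer, and the function values decrease geometrically with ratio $q \in (0, 1)$. Then

```latex
\frac{m}{2}\,\|x_k - x^*\|^2
  \;\le\; f(x_k) - f(x^*)
  \;\le\; q^{k}\,\bigl(f(x_0) - f(x^*)\bigr)
\quad\Longrightarrow\quad
\|x_k - x^*\| \;\le\; a\, r^{k},
\qquad
a = \sqrt{\tfrac{2}{m}\bigl(f(x_0) - f(x^*)\bigr)},
\quad
r = \sqrt{q} \in (0, 1).
```

The first inequality uses uniform convexity together with $\nabla f(x^*) = 0$. The bound controls the distance to the solution by a geometric sequence without requiring the errors themselves to decrease monotonically, which is what distinguishes R-linear from Q-linear convergence.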

5. Numerical Experiments

In this section, we report numerical results for Algorithm 1 and compare its performance with that of methods described earlier in this paper. The following methods are compared:
NPRP: $\beta_k$ is computed by (13), $d_k$ is determined by (3), and the Armijo line search (10) is used.
DPRP [30, 31]: $\beta_k$ is computed by (9), $d_k$ is determined by (3), and the weak Wolfe-Powell line search (6) is used.
APRP [33]: $\beta_k$ is computed by (12), $d_k$ is determined by (11), and the Armijo line search (10) is used.

All programs are written in MATLAB R2019a and run on a PC with an AMD Ryzen 5 3550H with Radeon Vega Mobile Gfx CPU @ 2.10 GHz, 16 GB of RAM (14.9 GB available), and the Windows 10 operating system.

5.1. General Unconstrained Optimization Problems

5.1.1. Contrast Algorithm

This experiment tests three algorithms, NPRP, DPRP and APRP.

5.1.2. Tested Problems

A total of 74 unconstrained optimization test problems, taken from [36, 37], are listed in Table 1.

5.1.3. Dimensionality

Problem instances with 3000, 9000, and 27000 variables are considered.

5.1.4. Parameters

All the algorithms run with , , , , , and .

5.1.5. Termination Rule

If or if the number of iterations exceeds .

5.1.6. Symbol Representation

No: the test problem number, CPUTime: the CPU time in seconds, NI: the number of iterations, NFG: the total number of function and gradient evaluations.

5.1.7. Image Description

Figures 1–3 show the performance profiles of CPU Time, NI, and NFG of the three methods.

Table 2 and Figures 1–3 show that the NPRP and APRP methods perform significantly better than the DPRP method. The reason is that the stepsize $\alpha_k$ is more convenient to obtain with the Armijo line search than with the Wolfe line search. The NPRP method also performs slightly better than the APRP method.
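For readers who want to reproduce curves of this type, the sketch below computes performance profiles from a table of per-problem costs. It assumes the standard Dolan-Moré construction, which the text does not name explicitly: for each problem, a solver's cost is divided by the best cost among all solvers, and the profile value at $\tau$ is the fraction of problems whose ratio is at most $\tau$. The function name and the small cost table are hypothetical; costs are assumed to be positive, with `math.inf` marking a failure.

```python
import math


def performance_profile(costs, taus):
    """costs maps a solver name to a list of positive costs, one per problem
    (CPU time, NI, or NFG); use math.inf where a solver failed.
    Returns, for each solver, the fraction of problems solved within a
    factor tau of the best solver, for every tau in taus."""
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]
    profile = {}
    for s in solvers:
        ratios = [costs[s][p] / best[p] for p in range(n_problems)]
        profile[s] = [sum(r <= t for r in ratios) / n_problems for t in taus]
    return profile


# Made-up costs for two solvers on three problems (illustration only).
example = {"A": [1.0, 2.0, 4.0], "B": [2.0, 2.0, 1.0]}
prof = performance_profile(example, taus=[1.0, 2.0, 4.0])
```

The value at $\tau = 1$ is the share of problems on which a solver is the fastest (ties included), and the value for large $\tau$ approaches the share of problems it solves at all, so a curve that lies higher indicates a better solver on the test set.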

5.2. Image Restoration Problems

This subsection considers the recovery of an original image from an image corrupted by impulse noise.

5.2.1. Contrast Algorithm

This experiment tests two algorithms, NPRP and APRP.

5.2.2. Tested Problems

Restore the original image from an image corrupted by impulse noise. The test images are Boat (256 × 256), Lena (512 × 512), and Man (1024 × 1024).

5.2.3. Noise Level

The algorithms are tested at noise levels of 25%, 50%, and 75%.
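A test image at a given noise level can be generated as in the following sketch. It assumes the usual salt-and-pepper model, in which each pixel is independently replaced, with probability equal to the noise level, by the minimum or maximum gray value with equal chance. The function name, the list-of-rows image representation, and the fixed seed are illustrative assumptions and do not describe the authors' MATLAB code.

```python
import random


def add_salt_and_pepper(image, ratio, seed=0, low=0, high=255):
    """Return a noisy copy of a grayscale image given as a list of rows.

    Each pixel is corrupted with probability `ratio`; a corrupted pixel
    becomes `low` (pepper) or `high` (salt) with equal probability.
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in image]  # leave the input untouched
    for row in noisy:
        for j in range(len(row)):
            if rng.random() < ratio:
                row[j] = low if rng.random() < 0.5 else high
    return noisy


# A flat 4 x 4 gray image corrupted at the three noise levels used here.
clean = [[128] * 4 for _ in range(4)]
noisy_25 = add_salt_and_pepper(clean, 0.25)
noisy_50 = add_salt_and_pepper(clean, 0.50)
noisy_75 = add_salt_and_pepper(clean, 0.75)
```

Because corrupted pixels take only the extreme values, they can be detected before restoration, and the optimization is then carried out over the detected noisy pixels only; this is the common two-phase approach for this noise model.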

5.2.4. Parameters

All the algorithms run with , , , and .

5.2.5. Termination Rule

If or , the algorithm is terminated.

5.2.6. Symbol Representation

CPU Time: the CPU time in seconds.

5.2.7. Image Description

Figures 4 and 5 show the images recovered by the different algorithms at different noise levels. The Boat, Lena, and Man images are restored by the NPRP method and the APRP method. From left to right: the noisy image with a given percentage of salt-and-pepper noise, the restoration by the NPRP method, and the restoration by the APRP method.

The results in Table 3 and Figures 4–6 support at least two conclusions: (i) the NPRP and APRP methods successfully restore these images within a reasonable CPU time; (ii) the NPRP method is promising and competitive with the APRP method at the 25%, 50%, and 75% noise levels.

5.3. Compressive Sensing Problems

The purpose of this section is to accurately recover an image from a few random projections by compressive sensing (CS), using gradient-based recovery algorithms. The experimental method comes from the model proposed by Dai and Sha [38].

5.3.1. Contrast Algorithm

This experiment tests two algorithms, NPRP and APRP.

5.3.2. Tested Problems

Recover the image accurately from a few random projections. The test images are Phantom (256 × 256), Fruits (256 × 256), and Peppers (256 × 256).

5.3.3. Parameters

All the algorithms run with , , , and .

5.3.4. Termination Rule

If the square root of the sum of the diagonal elements of is less than or if the number of iterations exceeds 500.

5.3.5. Symbol Representation

PSNR: peak signal-to-noise ratio. A higher PSNR indicates a better restoration.
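The PSNR can be computed as in the sketch below, which assumes the common definition $10 \log_{10}(\text{peak}^2 / \text{MSE})$ with a peak value of 255 for 8-bit grayscale images. The function name and the list-of-rows image representation are illustrative assumptions.

```python
import math


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images,
    each given as a list of rows of gray values."""
    n_pixels = sum(len(row) for row in reference)
    squared_error = sum(
        (a - b) ** 2
        for ref_row, res_row in zip(reference, restored)
        for a, b in zip(ref_row, res_row)
    )
    mse = squared_error / n_pixels
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Every pixel off by one gray level gives MSE = 1.
ref = [[10, 20], [30, 40]]
off_by_one = [[11, 21], [31, 41]]
value = psnr(ref, off_by_one)
```

With a mean squared error of 1, the result equals $20 \log_{10} 255$, roughly 48 dB; larger errors lower the value, and identical images give an infinite PSNR.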

5.3.6. Image Description

Figure 7 shows the images recovered from random measurements in the Fourier domain. The Phantom, Fruits, and Peppers images are restored by the NPRP method and the APRP method. From left to right: the original image, the restoration by the NPRP method, and the restoration by the APRP method.

The results in Table 4 and Figure 7 support at least two conclusions: (i) the NPRP and APRP methods successfully restore these images with a satisfactory PSNR; (ii) the NPRP method is promising and competitive with the APRP method for compressive sensing problems.

The above numerical results show that the NPRP method is more competitive than the APRP method and the DPRP method. The reasons are as follows. (i) Hager and Zhang pointed out in the survey paper [39] that the PRP, HS, and LS methods, which share the numerator $g_k^T y_{k-1}$, possess a built-in restart feature that addresses the jamming problem: the factor $y_{k-1}$ in the numerator of $\beta_k$ tends to zero when the step is too small. As a result, $\beta_k$ becomes small, and the new search direction $d_k$ is close to the steepest descent direction $-g_k$. The NPRP method is similar in form to the CG-C and PRP methods and therefore inherits their good performance. (ii) The stepsize $\alpha_k$ is more convenient to obtain with the Armijo line search than with the Wolfe line search.

6. Conclusion

In this paper, a modified PRP conjugate gradient method, based on the well-known CG-C method [29], was proposed to solve image restoration and compressive sensing problems. With the Armijo line search, the global convergence and the linear convergence rate of the algorithm are established under suitable conditions. The sufficient descent property of the algorithm has been proved without the use of any line search. The numerical results indicate that the algorithm is effective and competitive for solving unconstrained optimization problems, image restoration problems, and compressive sensing problems.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

There are no potential conflicts of interest.

Acknowledgments

The authors acknowledge the financial support of the High Level Innovation Teams and Excellent Scholars Program in Guangxi Institutions of Higher Education (Grant no. [2019]52), the National Natural Science Foundation of China (Grant no. 11661009), the Guangxi Natural Science Key Foundation (no. 2017GXNSFDA198046), and the Guangxi Natural Science Foundation (no. 2020GXNSFAA159069).

References

[1] Y. H. Dai, “A nonmonotone conjugate gradient algorithm for unconstrained optimization,” Journal of System Science and Complexity, vol. 15, no. 2, pp. 139–145, 2002.
[2] J. Nocedal and Y. X. Yuan, “Analysis of a self-scaling quasi-Newton method,” Mathematical Programming, vol. 61, no. 1–3, pp. 19–37, 1993.
[3] Z. Wei, G. Li, and L. Qi, “New nonlinear conjugate gradient formulas for large-scale unconstrained optimization problems,” Applied Mathematics and Computation, vol. 179, no. 2, pp. 407–430, 2006.
[4] Z. Wei, S. Yao, and L. Liu, “The convergence properties of some new conjugate gradient methods,” Applied Mathematics and Computation, vol. 183, no. 2, pp. 1341–1350, 2006.
[5] E. Polak and G. Ribière, “Note sur la convergence de méthodes de directions conjuguées,” ESAIM: Mathematical Modelling and Numerical Analysis-Modélisation Mathématique et Analyse Numérique, vol. 3, no. 16, pp. 35–43, 1969.
[6] B. T. Polyak, “The conjugate gradient method in extremal problems,” USSR Computational Mathematics and Mathematical Physics, vol. 9, no. 4, pp. 94–112, 1969.
[7] M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, no. 6, pp. 409–436, 1952.
[8] Y. Liu and C. Storey, “Efficient generalized conjugate gradient algorithms, part 1: theory,” Journal of Optimization Theory and Applications, vol. 69, no. 1, pp. 129–137, 1991.
[9] Y. H. Dai and Y. Yuan, “A nonlinear conjugate gradient method with a strong global convergence property,” SIAM Journal on Optimization, vol. 10, no. 1, pp. 177–182, 1999.
[10] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” The Computer Journal, vol. 7, no. 2, pp. 149–154, 1964.
[11] R. Fletcher, “Practical methods of optimization,” in Unconstrained Optimization, vol. 1, John Wiley & Sons, Hoboken, NJ, USA, 1980.
[12] W. Y. Cheng, “A two-term PRP-based descent method,” Numerical Functional Analysis and Optimization, vol. 28, no. 11-12, pp. 1217–1230, 2007.
[13] Z.-F. Dai and B.-S. Tian, “Global convergence of some modified PRP nonlinear conjugate gradient methods,” Optimization Letters, vol. 5, no. 4, pp. 615–630, 2011.
[14] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[15] G. Yu, Y. Zhao, and Z. Wei, “A descent nonlinear conjugate gradient method for large-scale unconstrained optimization,” Applied Mathematics and Computation, vol. 187, no. 2, pp. 636–643, 2007.
[16] Z. Dai, X. Dong, J. Kang, and L. Hong, “Forecasting stock market returns: new technical indicators and two-step economic constraint method,” The North American Journal of Economics and Finance, vol. 53, Article ID 101216, 2020.
[17] Z. F. Dai and H. Zhu, “A modified Hestenes-Stiefel-type derivative-free method for large-scale nonlinear monotone equations,” Mathematics, vol. 8, no. 2, p. 168, 2020.
[18] F. Wen and X. Yang, “Skewness of return distribution and coefficient of risk premium,” Journal of Systems Science and Complexity, vol. 22, no. 3, pp. 360–371, 2009.
[19] G. Yu, J. Huang, and Y. Zhou, “A descent spectral conjugate gradient method for impulse noise removal,” Applied Mathematics Letters, vol. 23, no. 5, pp. 555–560, 2010.
[20] G. Yuan, T. Li, and W. Hu, “A conjugate gradient algorithm for large-scale nonlinear equations and image restoration problems,” Applied Numerical Mathematics, vol. 147, pp. 129–141, 2020.
[21] G. L. Yuan, J. Y. Lu, and Z. Wang, “The PRP conjugate gradient algorithm with a modified WWP line search and its application in the image restoration problems,” Applied Numerical Mathematics, vol. 152, pp. 1–11, 2020.
[22] G. Yuan, Z. Wei, and G. Li, “A modified Polak-Ribière-Polyak conjugate gradient algorithm for nonsmooth convex programs,” Journal of Computational and Applied Mathematics, vol. 255, pp. 86–96, 2014.
[23] G. Yuan, Z. Wei, and Y. Yang, “The global convergence of the Polak-Ribière-Polyak conjugate gradient algorithm under inexact line search for nonconvex functions,” Journal of Computational and Applied Mathematics, vol. 362, pp. 262–275, 2019.
[24] M. J. D. Powell, “Nonconvex minimization calculations and the conjugate gradient method,” in Numerical Analysis, pp. 122–141, Springer, Berlin, Germany, 1984.
[25] Y. H. Dai, “Analyses of conjugate gradient methods,” Ph.D. thesis, Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, Beijing, China, 1997.
[26] M. Al-Baali, “Descent property and global convergence of the Fletcher-Reeves method with inexact line search,” IMA Journal of Numerical Analysis, vol. 5, no. 1, pp. 121–124, 1985.
[27] Y. F. Hu and C. Storey, “Global convergence result for conjugate gradient methods,” Journal of Optimization Theory and Applications, vol. 71, no. 2, pp. 399–405, 1991.
[28] D. Touati-Ahmed and C. Storey, “Efficient hybrid conjugate gradient techniques,” Journal of Optimization Theory and Applications, vol. 64, no. 2, pp. 379–397, 1990.
[29] W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
[30] G. H. Yu and L. T. Guan, “New descent nonlinear conjugate gradient methods for large-scale optimization,” Department of Scientific Computation and Computer, 2005.
[31] G. Yu, L. Guan, and W. Chen, “Spectral conjugate gradient methods with sufficient descent property for large-scale unconstrained optimization,” Optimization Methods and Software, vol. 23, no. 2, pp. 275–293, 2008.
[32] Z. Dai and F. Wen, “Global convergence of a modified Hestenes-Stiefel nonlinear conjugate gradient method with Armijo line search,” Numerical Algorithms, vol. 59, no. 1, pp. 79–93, 2012.
[33] M. Li and A. Qu, “Some sufficient descent conjugate gradient methods and their global convergence,” Computational and Applied Mathematics, vol. 33, no. 2, pp. 333–347, 2014.
[34] L. Zhang, W. Zhou, and D. Li, “Global convergence of the DY conjugate gradient method with Armijo line search for unconstrained optimization problems,” Optimization Methods and Software, vol. 22, no. 3, pp. 511–517, 2007.
[35] G. Zoutendijk, “Nonlinear programming, computational methods,” Integer and Nonlinear Programming, pp. 37–86, 1970.
[36] I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environment,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
[37] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software (TOMS), vol. 7, no. 1, pp. 17–41, 1981.
[38] Q. Dai and W. Sha, “The physics of compressive sensing and the gradient-based recovery algorithms,” 2009, https://arxiv.org/abs/0906.1487.
[39] W. W. Hager and H. C. Zhang, “A survey of nonlinear conjugate gradient methods,” Pacific Journal of Optimization, vol. 2, no. 1, pp. 35–58, 2006.

Copyright

Copyright © 2020 Mengxiang Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
