Special Issue: Machine Learning and its Applications in Image Restoration
Research Article | Open Access
Mengxiang Zhang, Yingjie Zhou, Songhua Wang, "A Modified Nonlinear Conjugate Gradient Method with the Armijo Line Search and Its Application", Mathematical Problems in Engineering, vol. 2020, Article ID 6210965, 14 pages, 2020. https://doi.org/10.1155/2020/6210965
A Modified Nonlinear Conjugate Gradient Method with the Armijo Line Search and Its Application
In this article, a modified Polak-Ribière-Polyak (PRP) conjugate gradient method is proposed for image restoration. The presented method can generate sufficient descent directions without any line search conditions. Under some mild conditions, this method is globally convergent with the Armijo line search. Moreover, the linear convergence rate of the modified PRP method is established. The experimental results of unconstrained optimization, image restoration, and compressive sensing show that the proposed method is promising and competitive with other conjugate gradient methods.
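1. Introduction
Consider the following unconstrained optimization problem:
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$
where $f: \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function. The iteration formula is determined by
$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots, \tag{2}$$
where $\alpha_k$ is the stepsize obtained by some kind of line search and $d_k$ is the search direction.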
It is well known that there are many methods ([1–4] etc.) for solving optimization problem (1). Among them, the conjugate gradient method is powerful because of its simplicity: it avoids computing and storing matrices associated with the Hessian of the objective function, so its memory requirements are very low. Image processing usually requires large-scale computation; therefore, the conjugate gradient algorithm has excellent application prospects in this area. The search direction of the conjugate gradient methods is given by
$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{3}$$
where $\beta_k$ is a scalar and $g_k = \nabla f(x_k)$ is the gradient of the objective function at the point $x_k$. The choice of $\beta_k$ determines different conjugate gradient methods. Classical conjugate gradient methods include the PRP conjugate gradient method [5, 6], HS conjugate gradient method [7], LS conjugate gradient method [8], DY conjugate gradient method [9], FR conjugate gradient method [10], and CD conjugate gradient method [11]. The parameters $\beta_k$ of these methods are specified as follows:
$$\beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{LS} = \frac{g_k^T y_{k-1}}{-d_{k-1}^T g_{k-1}}, \quad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{CD} = \frac{\|g_k\|^2}{-d_{k-1}^T g_{k-1}}, \tag{4}$$
where $y_{k-1} = g_k - g_{k-1}$ and $\|\cdot\|$ stands for the Euclidean norm. There has been much research on the convergence properties of these methods ([12–15] etc.) and on their applications ([16–23] etc.).
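The six classical choices of the conjugate gradient parameter named above (PRP, HS, LS, DY, FR, CD) can be compared side by side. The following is a minimal Python sketch using their textbook definitions; the function name `classical_betas` and the sample vectors are assumptions made only for illustration.

```python
from typing import Dict, List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def classical_betas(g_new: Vector, g_old: Vector, d_old: Vector) -> Dict[str, float]:
    """Textbook conjugate gradient parameters.

    g_new is the current gradient g_k, g_old is g_{k-1}, and d_old is the
    previous search direction d_{k-1}. No safeguards against zero
    denominators are included in this sketch.
    """
    y = [a - b for a, b in zip(g_new, g_old)]  # y_{k-1} = g_k - g_{k-1}
    gy = dot(g_new, y)
    gg_new = dot(g_new, g_new)
    gg_old = dot(g_old, g_old)
    dy = dot(d_old, y)
    dg_old = dot(d_old, g_old)
    return {
        "PRP": gy / gg_old,
        "HS": gy / dy,
        "LS": gy / (-dg_old),
        "DY": gg_new / dy,
        "FR": gg_new / gg_old,
        "CD": gg_new / (-dg_old),
    }


# Hypothetical data: the previous direction is the steepest descent direction.
betas = classical_betas(g_new=[1.0, 2.0], g_old=[2.0, 0.0], d_old=[-2.0, 0.0])
for name, value in betas.items():
    print(f"{name}: {value:.4f}")
```

When the previous direction is the negative gradient, as in this sample, PRP coincides with LS and FR coincides with CD, because the two denominators are then equal.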
In this paper, the focus is mainly on the PRP method. The PRP method is generally believed to have the best numerical performance among the classical conjugate gradient methods, but its convergence properties leave much room for discussion. For the exact line search, the global convergence of the PRP conjugate gradient method was proved by Polak and Ribière [5] for convex objective functions. However, Powell [24] gave a counterexample showing that the PRP conjugate gradient method may fail for nonconvex functions even if the exact line search is used. Gilbert and Nocedal [14] showed that a modified PRP method is globally convergent if $\beta_k$ is restricted to be nonnegative and $\alpha_k$ is determined by a line search satisfying the sufficient descent condition
$$g_k^T d_k \le -c \|g_k\|^2 \tag{5}$$
in addition to the weak Wolfe-Powell conditions
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \qquad g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{6}$$
where $c > 0$ and $0 < \delta < \sigma < 1$. Meanwhile, Gilbert and Nocedal [14] pointed out that even if the objective function is uniformly convex, $\beta_k^{PRP}$ may be negative. With the strong Wolfe-Powell line search, Dai [25] gave an example showing that even when the objective function is uniformly convex, the PRP method cannot guarantee that the search direction is always a descent direction.
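As a concrete reading of the sufficient descent condition and the weak Wolfe-Powell conditions, the sketch below checks them numerically for a trial stepsize. It uses the textbook form of the conditions; the helper names (`weak_wolfe_powell`, `is_sufficient_descent`), the default constants, and the test function are assumptions introduced for illustration only.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def is_sufficient_descent(g: Vector, d: Vector, c: float) -> bool:
    """Check g^T d <= -c * ||g||^2 for a constant c > 0."""
    return dot(g, d) <= -c * dot(g, g)


def weak_wolfe_powell(
    f: Callable[[Vector], float],
    grad: Callable[[Vector], Vector],
    x: Vector,
    d: Vector,
    alpha: float,
    delta: float = 1e-4,
    sigma: float = 0.9,
) -> Tuple[bool, bool]:
    """Return (decrease_ok, curvature_ok) for the trial stepsize alpha.

    decrease_ok:  f(x + alpha d) <= f(x) + delta * alpha * g^T d
    curvature_ok: grad(x + alpha d)^T d >= sigma * g^T d
    with 0 < delta < sigma < 1.
    """
    gd = dot(grad(x), d)
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    decrease_ok = f(x_new) <= f(x) + delta * alpha * gd
    curvature_ok = dot(grad(x_new), d) >= sigma * gd
    return decrease_ok, curvature_ok


def half_norm_sq(x: Vector) -> float:
    """Illustrative test function f(x) = 0.5 * ||x||^2."""
    return 0.5 * dot(x, x)


def half_norm_sq_grad(x: Vector) -> Vector:
    return list(x)


x0, d0 = [1.0, 0.0], [-1.0, 0.0]
for alpha in (1e-6, 1.0, 2.5):
    print(alpha, weak_wolfe_powell(half_norm_sq, half_norm_sq_grad, x0, d0, alpha))
```

A very short step passes the decrease test but fails the curvature test, while an overly long step fails the decrease test; this is why the Wolfe-type search needs extra gradient evaluations compared with a pure Armijo rule.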
From the above observations and [14, 24, 26–28], the sufficient descent condition (5) and the condition $\beta_k \ge 0$ play key roles in establishing the global convergence of conjugate gradient methods. However, when the Armijo line search or the Wolfe line search is used, the descent property of $d_k$ determined by (3) is in general not guaranteed.
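The following minimal Python sketch illustrates this point on a two-variable convex quadratic. It is not the NPRP method of this article: it uses the classical PRP parameter truncated at zero, a textbook backtracking Armijo rule, and a restart to steepest descent as a safeguard. The function name `prp_plus_armijo`, its parameters (`delta`, `rho`, `c`, `tol`), and the test function are all assumptions introduced for illustration.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def prp_plus_armijo(
    f: Callable[[Vector], float],
    grad: Callable[[Vector], Vector],
    x0: Vector,
    delta: float = 1e-4,   # Armijo decrease constant (assumed value)
    rho: float = 0.5,      # backtracking factor (assumed value)
    c: float = 1e-8,       # sufficient descent threshold (assumed value)
    tol: float = 1e-6,
    max_iter: int = 1000,
) -> Tuple[Vector, int, int]:
    """Truncated PRP with Armijo backtracking; returns (x, iterations, restarts)."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]
    restarts = 0
    for k in range(max_iter):
        gg = dot(g, g)
        if math.sqrt(gg) <= tol:
            return x, k, restarts
        gd = dot(g, d)
        if gd > -c * gg:
            # d is not a sufficient descent direction: fall back to -g.
            d = [-gi for gi in g]
            gd = -gg
            restarts += 1
        # Backtracking Armijo: shrink alpha until the decrease test holds.
        alpha, fx = 1.0, f(x)
        while alpha > 1e-16 and f([xi + alpha * di for xi, di in zip(x, d)]) > fx + delta * alpha * gd:
            alpha *= rho
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x)
        y = [a - b for a, b in zip(g_new, g)]
        beta = max(dot(g_new, y) / gg, 0.0)  # classical PRP, truncated at zero
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x, max_iter, restarts


def quadratic(x: Vector) -> float:
    """Illustrative objective f(x) = 0.5 * (x1^2 + 2 * x2^2)."""
    return 0.5 * (x[0] ** 2 + 2.0 * x[1] ** 2)


def quadratic_grad(x: Vector) -> Vector:
    return [x[0], 2.0 * x[1]]


x_final, iterations, restarts = prp_plus_armijo(quadratic, quadratic_grad, [2.0, 1.0])
print(x_final, iterations, restarts)
```

Starting from (2, 1), the first Armijo step is accepted with stepsize 1, and the resulting PRP direction is orthogonal to the new gradient, so it is not a descent direction. The safeguard restarts once and the run stops after two iterations at the minimizer. Methods with built-in sufficient descent aim to make such a safeguard unnecessary.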
To obtain sufficient descent directions, Hager and Zhang [29] proposed a new conjugate gradient method (CG-C), obtained by modifying the HS method, which generates sufficient descent directions at each step without relying on any line search. The scalar $\beta_k$ in the CG-C method is determined by (7) and (8), which involve a constant parameter. With the weak Wolfe-Powell line search, Hager and Zhang [29] established a global convergence result for (7) when the objective function is a general nonlinear function.
Along this line, Yu and Guan [30, 31] extended the CG-C method to the PRP, LS, DY, FR, and CD methods. The parameter $\beta_k$ in the modified PRP method (DPRP) is given by (9), which involves a constant. In practice, these methods perform better than other conjugate gradient methods. In addition, because the Armijo line search is much simpler than the Wolfe line search, it is more widely used in image restoration and compressed sensing problems. It is therefore necessary to explore the performance of the above methods under the Armijo line search. However, the above methods cannot obtain global convergence with the Armijo line search, i.e., the line search that finds a stepsize $\alpha_k$ satisfying (10) for given constants. To overcome this defect, scholars have done a lot of related work ([32–34] etc.). In particular, because the PRP, HS, and LS conjugate gradient methods cannot be shown to converge globally with the Armijo line search, Li and Qu [33] proposed a series of modified conjugate gradient methods based on (3) and (4), with the search direction given by (11), in which a scalar remains to be specified. For the modified PRP method (APRP), the parameter is given by (12). Global convergence results were established for these methods when the Armijo line search is used.
Motivated by the ideas of [31, 33] and taking into account the excellent numerical performance of the PRP method, we propose a new modified PRP method (NPRP). The parameter $\beta_k$ is computed by (13), which involves two constants. The search direction is determined by the classical two-term conjugate gradient formula (3).
In the next section, the proposed algorithm is stated. In Section 3, the properties and the convergence results of the new method are given. Section 4 shows that the method converges linearly under suitable conditions. Numerical results and the conclusion are presented in Sections 5 and 6, respectively.
Remark 1. The NPRP method possesses the sufficient descent property independently of any line search.
Lemma 1. Let and be generated by the NPRP method. If , for all , then
3. Properties and Convergence Analysis
The following assumptions are commonly used in the global convergence analysis of conjugate gradient algorithms.
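Assumption 1. (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded. (ii) The function $f$ is continuously differentiable and its gradient is Lipschitz continuous on an open convex set $B$ containing $\Omega$, i.e., there exists a constant $L > 0$ such that
$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in B. \tag{19}$$
It follows directly from Assumption 1 that there is a positive constant $\gamma$ such that
$$\|g(x)\| \le \gamma, \quad \forall x \in \Omega.$$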
Lemma 3. If the stepsize $\alpha_k$ is obtained by the Armijo line search, then there exists a constant such that inequality (26) holds for all $k$.
Proof. Case (i) . From (14) and (26), we get Then we need to satisfy From Lemma 2, we have Case (ii) . By the Armijo line search condition, does not satisfy inequality (10). This meansBy the mean-value theorem and inequality (19), there is a such as andObserving the last inequality and (30), we havelettingThus (26) is satisfied.
Proof. We have from (10) and Assumption 1 thatSubstituting (26) into the above formula, we havewhere and are two constants. Therefore, (34) holds.
Equation (34) is usually called the Zoutendijk condition [35], and it is very important for establishing global convergence.
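For orientation, the classical form of the Zoutendijk condition and the way it combines with a sufficient descent property can be written out as follows. This is a sketch in standard textbook notation and is not necessarily the exact form of (34) in this article; the constant $c > 0$ denotes a generic sufficient descent constant and is an assumption of this illustration.

```latex
% Classical Zoutendijk condition:
\sum_{k=0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty .
% If the directions satisfy g_k^T d_k \le -c \|g_k\|^2 with c > 0, then
% |g_k^T d_k| \ge c \|g_k\|^2, hence (g_k^T d_k)^2 \ge c^2 \|g_k\|^4 and
\sum_{k=0}^{\infty} \frac{\|g_k\|^4}{\|d_k\|^2}
  \;\le\; \frac{1}{c^2} \sum_{k=0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2}
  \;<\; \infty .
```

If, in addition, $\|d_k\|$ stays bounded, the terms $\|g_k\|^4$ must tend to zero, which is the usual route from a Zoutendijk-type condition to a global convergence statement.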
Theorem 1. Suppose the conditions in Assumption 1 hold. Let be generated by the NPRP method with Armijo line search, then either for some or .
Remark 2. From the proof of Theorem 1, the NPRP method with the Wolfe line search is also globally convergent, without requiring truncation or convexity of the objective function.
4. Linear Convergence Rate
When discussing the linear convergence rate for conjugate gradient methods, the following assumptions are often established on the basis of Assumption 1.
Assumption 2. (i)Objective function is twice continuously differentiable in (ii) is a symmetric positive definite matrix in , i.e., there exist positive constants such that
Lemma 5. Suppose that Assumption 2 holds and the sequence $\{x_k\}$ generated by the NPRP method with the Armijo line search converges. Then $\{x_k\}$ converges to the unique minimal point $x^*$.
Proof. If Assumption 2 holds, then Assumption 1 holds, and the objective function is uniformly convex. By the Armijo line search (10) and Lemma 1, it can be deduced that $\{f(x_k)\}$ is a decreasing sequence.
Therefore, by the conclusion of Theorem 1, the sequence $\{x_k\}$ converges to the unique minimal point $x^*$.
Theorem 2. Suppose Assumption 2 holds. Let $x^*$ be the unique solution of problem (1) and let the sequence $\{x_k\}$ be generated by the NPRP method with the Armijo line search. Then there are constants such that inequality (44) holds for all $k$; that is, $\{x_k\}$ converges to $x^*$ at least R-linearly.
Proof. If Assumption 2 holds, then Assumption 1 holds. By (14), (21), and (26), it is easy to prove inequality (45). Combining (45) with (14), (40), (41), and the Armijo condition (10), and then using (40) once more, we obtain (44) with suitable constants. The proof is completed.
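The last step of such an argument, passing from a linear decrease of the objective gap to R-linear convergence of the iterates, can be sketched as follows. The symbols $m$ (a lower bound on the eigenvalues of the Hessian), $\theta$ (a per-iteration contraction factor of the objective gap), $a$, and $r$ are assumed notation for this illustration and need not match the constants used in the article.

```latex
% Uniform convexity with \nabla f(x^*) = 0 gives the lower bound on the left;
% a per-iteration contraction f(x_{k+1}) - f(x^*) \le \theta (f(x_k) - f(x^*))
% with \theta \in (0,1) gives the upper bound on the right.
\frac{m}{2}\,\|x_k - x^*\|^2
  \;\le\; f(x_k) - f(x^*)
  \;\le\; \theta^{k}\,\bigl(f(x_0) - f(x^*)\bigr)
\quad\Longrightarrow\quad
\|x_k - x^*\| \;\le\; a\, r^{k},
\qquad
a = \sqrt{\frac{2\,\bigl(f(x_0) - f(x^*)\bigr)}{m}}, \quad
r = \sqrt{\theta} \in (0,1).
```

A bound of the form $\|x_k - x^*\| \le a r^k$ with $r \in (0,1)$ is exactly what "at least R-linear convergence" means.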
Remark 3. From the proof of Theorem 2, it also follows that, with the Wolfe line search, the sequence $\{x_k\}$ generated by the NPRP method converges to $x^*$ at least R-linearly.
5. Numerical Experiments
In this section, we report some numerical results of Algorithm 1 and compare its performance with that of methods detailed earlier in this paper. The following methods were compared:
NPRP: $\beta_k$ is computed by (13) and $d_k$ is determined by (3), with the Armijo line search (10).
DPRP [30, 31]: $\beta_k$ is computed by (9) and $d_k$ is determined by (3), with the weak Wolfe-Powell line search (6).
APRP [33]: $\beta_k$ is computed by (12) and $d_k$ is determined by (11), with the Armijo line search (10).
All programs are written in MATLAB R2019a and run on a PC with an AMD Ryzen 5 3550H with Radeon Vega Mobile Gfx CPU @ 2.10 GHz, 16 GB (14.9 GB available) of RAM, and the Windows 10 operating system.
5.1. General Unconstrained Optimization Problems
5.1.1. Contrast Algorithm
This experiment tests three algorithms, NPRP, DPRP and APRP.
5.1.2. Tested Problems
Problem instances with 3000, 9000, and 27000 variables are considered.
All the algorithms run with , , , , , and .
5.1.5. Termination Rule
If or if the number of iterations exceeds .
5.1.6. Symbol Representation
No: the test problem number, CPUTime: the CPU time in seconds, NI: the number of iterations, NFG: the total number of function and gradient evaluations.
5.1.7. Image Description
From Table 2 and Figures 1–3, we can see that the NPRP and APRP methods perform significantly better than the DPRP method. The reason is that the Armijo line search obtains the stepsize $\alpha_k$ more conveniently than the Wolfe line search. Meanwhile, the NPRP method performs slightly better than the APRP method.
5.2. Image Restoration Problems
This subsection aims to recover the original image from an image corrupted by impulse noise.
5.2.1. Contrast Algorithm
This experiment tests two algorithms, NPRP and APRP.
5.2.2. Tested Problems
Restore the original image from the image corrupted by impulse noise. Boat (256 × 256), Lena (512 × 512), and Man (1024 × 1024) were chosen as the test images.
5.2.3. Noise Level
The algorithms are tested at noise levels of 25%, 50%, and 75%.
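To make the noise levels concrete, the sketch below corrupts a given fraction of the pixels of an 8-bit grayscale image with salt-and-pepper (impulse) noise. It is only an illustration under assumed conventions: the function name `add_salt_and_pepper`, the exact-count sampling, and the equal chance of salt (255) and pepper (0) are assumptions, not the noise generator used in the experiments.

```python
import random
from typing import List

Image = List[List[int]]  # rows of 8-bit gray values in [0, 255]


def add_salt_and_pepper(image: Image, level: float, seed: int = 0) -> Image:
    """Return a copy of image with a fraction `level` of pixels set to 0 or 255."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must lie in [0, 1]")
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    noisy = [row[:] for row in image]  # leave the input untouched
    n_corrupt = round(level * rows * cols)
    # Pick distinct pixel positions, then assign pepper (0) or salt (255).
    for index in rng.sample(range(rows * cols), n_corrupt):
        r, c = divmod(index, cols)
        noisy[r][c] = rng.choice((0, 255))
    return noisy


# Hypothetical 8 x 8 mid-gray image corrupted at the 25% noise level.
clean = [[128] * 8 for _ in range(8)]
noisy = add_salt_and_pepper(clean, level=0.25, seed=1)
corrupted = sum(1 for r in range(8) for c in range(8) if noisy[r][c] != clean[r][c])
print(corrupted, "of 64 pixels corrupted")
```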
All the algorithms run with , , , and .
5.2.5. Termination Rule
If or , the algorithm is terminated.
5.2.6. Symbol Representation
CPU Time: the CPU time in seconds.
5.2.7. Image Description
Figures 4 and 5 show the images recovered by the different algorithms at different noise levels. The Boat, Lena, and Man images are restored by the NPRP and APRP methods. From left to right: the noisy image with a given percentage of salt-and-pepper noise, the restoration by the NPRP method, and the restoration by the APRP method.
The results in Table 3 and Figures 4–6 support at least two conclusions: (i) the NPRP and APRP methods both restore these images successfully within suitable CPU time; (ii) the NPRP method is promising and competitive with the APRP method at the 25%, 50%, and 75% noise levels.
5.3. Compressive Sensing Problems
The purpose of this section is to accurately recover an image from a few random projections by compressive sensing (CS), using gradient-based recovery algorithms. The experimental method comes from the model proposed by Dai and Sha [38].
5.3.1. Contrast Algorithm
This experiment tests two algorithms, NPRP and APRP.
5.3.2. Tested Problems
Recover the image accurately from a few random projections. Phantom (256 × 256), Fruits (256 × 256), and Peppers (256 × 256) were chosen as the test images.
All the algorithms run with , , , and .
5.3.4. Termination Rule
If the square root of the sum of the diagonal elements of is less than or if the number of iterations exceeds 500.
5.3.5. Symbol Representation
PSNR: peak signal-to-noise ratio. The greater the PSNR, the better the restoration quality.
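For reference, PSNR is commonly computed from the mean squared error between the original and the restored image. The sketch below uses this standard definition for 8-bit images (peak value 255); the function name `psnr` and the tiny sample images are assumptions for illustration, and the experiments may use a different implementation.

```python
import math
from typing import List

Image = List[List[float]]


def psnr(original: Image, restored: Image, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    n_pixels = len(original) * len(original[0])
    squared_error = sum(
        (a - b) ** 2
        for row_a, row_b in zip(original, restored)
        for a, b in zip(row_a, row_b)
    )
    mse = squared_error / n_pixels
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2 x 2 images that differ by one gray level at every pixel.
reference = [[10.0, 20.0], [30.0, 40.0]]
estimate = [[11.0, 21.0], [31.0, 41.0]]
print(f"PSNR = {psnr(reference, estimate):.2f} dB")
```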
5.3.6. Image Description
Figure 7 shows the recovery of images after random measurement in the Fourier domain. The Phantom, Fruits, and Peppers images are restored by the NPRP and APRP methods. From left to right: the original image, the restoration by the NPRP method, and the restoration by the APRP method.
The results in Table 4 and Figure 7 support at least two conclusions: (i) the NPRP and APRP methods both restore these images successfully with suitable PSNR; (ii) the NPRP method is promising and competitive with the APRP method for compressive sensing problems.
From the above numerical results, we can see that the NPRP method is more competitive than the APRP and DPRP methods. The reasons are as follows. (i) Hager and Zhang pointed out in the survey paper [39] that the common numerator of the PRP, HS, and LS methods provides a built-in restart feature that addresses the jamming problem: when the step $x_k - x_{k-1}$ is too small, the factor $y_{k-1}$ in the numerator of $\beta_k$ tends to zero. Hence $\beta_k$ becomes small, and the new search direction $d_k$ is close to the steepest descent direction $-g_k$. Meanwhile, the NPRP method is similar in form to the CG-C and PRP methods and therefore inherits their good performance. (ii) The Armijo line search is more convenient than the Wolfe line search for obtaining the stepsize $\alpha_k$.
6. Conclusion
In this paper, a modified PRP conjugate gradient method, based on the well-known CG-C method [29], was proposed to solve image restoration and compressive sensing problems. With the Armijo line search, the global convergence and the linear convergence rate of the algorithm are established under suitable conditions. The sufficient descent property of the algorithm has been proved without the use of any line search. The numerical results indicate that the algorithm is effective and competitive for solving unconstrained optimization problems, image restoration problems, and compressive sensing problems.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
There are no potential conflicts of interest.
Acknowledgments
The authors acknowledge the financial support of the High Level Innovation Teams and Excellent Scholars Program in Guangxi Institutions of Higher Education (Grant No. 52), the National Natural Science Foundation of China (Grant No. 11661009), the Guangxi Natural Science Key Foundation (No. 2017GXNSFDA198046), and the Guangxi Natural Science Foundation (No. 2020GXNSFAA159069).
References
[1] Y. H. Dai, “A nonmonotone conjugate gradient algorithm for unconstrained optimization,” Journal of System Science and Complexity, vol. 15, no. 2, pp. 139–145, 2002.
[2] J. Nocedal and Y. X. Yuan, “Analysis of a self-scaling quasi-Newton method,” Mathematical Programming, vol. 61, no. 1–3, pp. 19–37, 1993.
[3] Z. Wei, G. Li, and L. Qi, “New nonlinear conjugate gradient formulas for large-scale unconstrained optimization problems,” Applied Mathematics and Computation, vol. 179, no. 2, pp. 407–430, 2006.
[4] Z. Wei, S. Yao, and L. Liu, “The convergence properties of some new conjugate gradient methods,” Applied Mathematics and Computation, vol. 183, no. 2, pp. 1341–1350, 2006.
[5] E. Polak and G. Ribière, “Note sur la convergence de méthodes de directions conjuguées,” ESAIM: Mathematical Modelling and Numerical Analysis-Modélisation Mathématique et Analyse Numérique, vol. 3, no. 16, pp. 35–43, 1969.
[6] B. T. Polyak, “The conjugate gradient method in extremal problems,” USSR Computational Mathematics and Mathematical Physics, vol. 9, no. 4, pp. 94–112, 1969.
[7] M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” Journal of Research of the National Bureau of Standards, vol. 49, no. 6, pp. 409–436, 1952.
[8] Y. Liu and C. Storey, “Efficient generalized conjugate gradient algorithms, part 1: theory,” Journal of Optimization Theory and Applications, vol. 69, no. 1, pp. 129–137, 1991.
[9] Y. H. Dai and Y. Yuan, “A nonlinear conjugate gradient method with a strong global convergence property,” SIAM Journal on Optimization, vol. 10, no. 1, pp. 177–182, 1999.
[10] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients,” The Computer Journal, vol. 7, no. 2, pp. 149–154, 1964.
[11] R. Fletcher, “Practical methods of optimization,” in Unconstrained Optimization, vol. 1, John Wiley & Sons, Hoboken, NJ, USA, 1980.
[12] W. Y. Cheng, “A two-term PRP-based descent method,” Numerical Functional Analysis and Optimization, vol. 28, no. 11-12, pp. 1217–1230, 2007.
[13] Z.-F. Dai and B.-S. Tian, “Global convergence of some modified PRP nonlinear conjugate gradient methods,” Optimization Letters, vol. 5, no. 4, pp. 615–630, 2011.
[14] J. C. Gilbert and J. Nocedal, “Global convergence properties of conjugate gradient methods for optimization,” SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[15] G. Yu, Y. Zhao, and Z. Wei, “A descent nonlinear conjugate gradient method for large-scale unconstrained optimization,” Applied Mathematics and Computation, vol. 187, no. 2, pp. 636–643, 2007.
[16] Z. Dai, X. Dong, J. Kang, and L. Hong, “Forecasting stock market returns: new technical indicators and two-step economic constraint method,” The North American Journal of Economics and Finance, vol. 53, Article ID 101216, 2020.
[17] Z. F. Dai and H. Zhu, “A modified Hestenes-Stiefel-type derivative-free method for large-scale nonlinear monotone equations,” Mathematics, vol. 8, no. 2, p. 168, 2020.
[18] F. Wen and X. Yang, “Skewness of return distribution and coefficient of risk premium,” Journal of Systems Science and Complexity, vol. 22, no. 3, pp. 360–371, 2009.
[19] G. Yu, J. Huang, and Y. Zhou, “A descent spectral conjugate gradient method for impulse noise removal,” Applied Mathematics Letters, vol. 23, no. 5, pp. 555–560, 2010.
[20] G. Yuan, T. Li, and W. Hu, “A conjugate gradient algorithm for large-scale nonlinear equations and image restoration problems,” Applied Numerical Mathematics, vol. 147, pp. 129–141, 2020.
[21] G. L. Yuan, J. Y. Lu, and Z. Wang, “The PRP conjugate gradient algorithm with a modified WWP line search and its application in the image restoration problems,” Applied Numerical Mathematics, vol. 152, pp. 1–11, 2020.
[22] G. Yuan, Z. Wei, and G. Li, “A modified Polak-Ribière-Polyak conjugate gradient algorithm for nonsmooth convex programs,” Journal of Computational and Applied Mathematics, vol. 255, pp. 86–96, 2014.
[23] G. Yuan, Z. Wei, and Y. Yang, “The global convergence of the Polak-Ribière-Polyak conjugate gradient algorithm under inexact line search for nonconvex functions,” Journal of Computational and Applied Mathematics, vol. 362, pp. 262–275, 2019.
[24] M. J. D. Powell, “Nonconvex minimization calculations and the conjugate gradient method,” in Numerical Analysis, pp. 122–141, Springer, Berlin, Germany, 1984.
[25] Y. H. Dai, “Analyses of conjugate gradient methods,” Ph.D. thesis, Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, Beijing, China, 1997.
[26] M. Al-Baali, “Descent property and global convergence of the Fletcher-Reeves method with inexact line search,” IMA Journal of Numerical Analysis, vol. 5, no. 1, pp. 121–124, 1985.
[27] Y. F. Hu and C. Storey, “Global convergence result for conjugate gradient methods,” Journal of Optimization Theory and Applications, vol. 71, no. 2, pp. 399–405, 1991.
[28] D. Touati-Ahmed and C. Storey, “Efficient hybrid conjugate gradient techniques,” Journal of Optimization Theory and Applications, vol. 64, no. 2, pp. 379–397, 1990.
[29] W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
[30] G. H. Yu and L. T. Guan, “New descent nonlinear conjugate gradient methods for large-scale optimization,” Department of Scientific Computation and Computer, 2005.
[31] G. Yu, L. Guan, and W. Chen, “Spectral conjugate gradient methods with sufficient descent property for large-scale unconstrained optimization,” Optimization Methods and Software, vol. 23, no. 2, pp. 275–293, 2008.
[32] Z. Dai and F. Wen, “Global convergence of a modified Hestenes-Stiefel nonlinear conjugate gradient method with Armijo line search,” Numerical Algorithms, vol. 59, no. 1, pp. 79–93, 2012.
[33] M. Li and A. Qu, “Some sufficient descent conjugate gradient methods and their global convergence,” Computational and Applied Mathematics, vol. 33, no. 2, pp. 333–347, 2014.
[34] L. Zhang, W. Zhou, and D. Li, “Global convergence of the DY conjugate gradient method with Armijo line search for unconstrained optimization problems,” Optimization Methods and Software, vol. 22, no. 3, pp. 511–517, 2007.
[35] G. Zoutendijk, “Nonlinear programming, computational methods,” Integer and Nonlinear Programming, pp. 37–86, 1970.
[36] I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environment,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
[37] J. J. Moré, B. S. Garbow, and K. E. Hillstrom, “Testing unconstrained optimization software,” ACM Transactions on Mathematical Software (TOMS), vol. 7, no. 1, pp. 17–41, 1981.
[38] Q. Dai and W. Sha, “The physics of compressive sensing and the gradient-based recovery algorithms,” 2009, https://arxiv.org/abs/0906.1487.
[39] W. W. Hager and H. C. Zhang, “A survey of nonlinear conjugate gradient methods,” Pacific Journal of Optimization, vol. 2, no. 1, pp. 35–58, 2006.
Copyright © 2020 Mengxiang Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.