A New Conjugate Gradient Projection Method for Convex Constrained Nonlinear Equations
The conjugate gradient projection method is one of the most effective methods for solving large-scale monotone nonlinear equations with convex constraints. In this paper, a new conjugate parameter is designed to generate the search direction, and an adaptive line search strategy is improved to yield the step size; together, these give a new conjugate gradient projection method for large-scale monotone nonlinear equations with convex constraints. Under mild conditions, the proposed method is proved to be globally convergent. Extensive numerical experiments comparing the presented method with existing ones indicate that it is very promising. Finally, the proposed method is applied to the recovery of sparse signals.
Solving a system of nonlinear equations can be transformed into an optimization problem, which is widely applied in many fields of science and engineering, for instance, the economic equilibrium problem, the neural networks problem, the financial problem [3, 4], the chemical equilibrium system, and the compressed sensing problem [6, 7].
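In this paper, the following system of constrained monotone nonlinear equations is considered:
$$F(x) = 0, \quad x \in \Omega, \tag{1}$$
where $\Omega$ is a nonempty closed convex set of $\mathbb{R}^n$ and $F: \mathbb{R}^n \to \mathbb{R}^n$ is a monotone mapping, namely,
$$(F(x) - F(y))^T (x - y) \ge 0, \quad \forall x, y \in \mathbb{R}^n. \tag{2}$$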
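As a small illustration of the monotonicity requirement, a mapping $F$ is monotone when $(F(x) - F(y))^T (x - y) \ge 0$ for every pair of points. The Python sketch below spot-checks this inequality on random sample pairs. The helper name `monotonicity_violations` and the two test mappings are hypothetical choices made for this illustration; a sampled check can only expose violations, it cannot prove monotonicity.

```python
import numpy as np


def monotonicity_violations(F, points, tol=1e-12):
    """Count sample pairs (x, y) with (F(x) - F(y))^T (x - y) < -tol."""
    values = [F(p) for p in points]
    violations = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            gap = (values[i] - values[j]) @ (points[i] - points[j])
            if gap < -tol:
                violations += 1
    return violations


def F_monotone(x):
    # Componentwise exp(x) - 1 is increasing in each coordinate, hence monotone.
    return np.exp(x) - 1.0


def F_not_monotone(x):
    # F(x) = -x gives (F(x) - F(y))^T (x - y) = -||x - y||^2 < 0 for x != y.
    return -x


rng = np.random.default_rng(0)
samples = [rng.uniform(-2.0, 2.0, size=5) for _ in range(30)]
print(monotonicity_violations(F_monotone, samples))
print(monotonicity_violations(F_not_monotone, samples))
```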
Many algorithms have been proposed to deal with (1) during the past few decades (see, e.g., [9–16]), such as the projected Newton method, the projected quasi-Newton method [10–13], the Levenberg–Marquardt method, the trust region method, and the Lagrangian global method. As is well known, these methods converge rapidly if sufficiently good initial points are chosen. However, they are not well suited for solving large-scale constrained nonlinear equations because the Jacobian matrix or an approximation of it must be computed at each iteration. Therefore, in the past few years, projected derivative-free methods (PDFMs) have become more and more popular, e.g., the spectral gradient projection method [17–19], the multivariate spectral gradient-type projection method [20, 21], and the conjugate gradient projection method (CGPM) [22–31]; further PDFMs can be found in [32, 33].
In this work, we concentrate on studying the CGPM for large-scale nonlinear equations with convex constraints. We aim to establish a more efficient CGPM by improving the search direction and the line search rule and to use the proposed CGPM for the problem of sparse signal recovery.
The contributions of this article are listed as follows:
(i) To guarantee that the search direction satisfies the sufficient descent condition and the trust region property independent of any line search, a new conjugate parameter is proposed;
(ii) Based on classic line searches for nonlinear equations, an adaptive line search is improved to find a suitable step size easily;
(iii) Under general assumptions, the convergence of the proposed algorithm is proved;
(iv) The reported numerical experiments show that our method is promising for solving large-scale nonlinear constrained equations and handling the problem of recovering sparse signals.
The remainder of this paper is organized as follows. In Section 2, a new search direction and an adaptive line search are proposed, and the corresponding algorithm is given. The global convergence is studied in Section 3. In Section 4, the numerical experiments for the proposed algorithm and its comparisons are performed, and the corresponding results are reported. Application of the proposed algorithm in compressed sensing is introduced in Section 5. In Section 6, a conclusion for this work is given.
2. A New CGPM Algorithm
The CGPM has attracted extensive attention, since it not only inherits the advantages of the conjugate gradient method (CGM), namely a simple algorithmic structure, rapid convergence, and low storage requirements, but also requires no Jacobian information of the equations in practice. The computational cost of the CGPM lies mainly in generating the search direction and computing the step size. Therefore, in what follows, we design our search direction based on the CGM and give an improved inexact line search to yield the step size.
2.1. The New Search Direction Yielded by CGM
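It is well known that the search direction of the classical CGM is generated by
$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{3}$$
where $g_k$ denotes the gradient of the objective function at the iterate $x_k$. The direction is thus determined by the conjugate parameter $\beta_k$; usually, a different $\beta_k$ leads to a different search direction.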
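The classical update described above can be sketched in a few lines of Python. The function names are hypothetical, and the Fletcher–Reeves formula is used only as a familiar example of a conjugate parameter; it is not the parameter proposed in this paper.

```python
import numpy as np


def cg_direction(g, d_prev=None, beta=0.0):
    """Classical CG direction: -g at the first iteration, else -g + beta * d_prev."""
    g = np.asarray(g, dtype=float)
    if d_prev is None:
        return -g
    return -g + beta * np.asarray(d_prev, dtype=float)


def beta_fletcher_reeves(g, g_prev):
    """Fletcher-Reeves parameter ||g_k||^2 / ||g_{k-1}||^2 (illustrative choice only)."""
    g = np.asarray(g, dtype=float)
    g_prev = np.asarray(g_prev, dtype=float)
    return float(g @ g) / float(g_prev @ g_prev)


g0 = np.array([1.0, 0.0])
g1 = np.array([0.0, 2.0])
d0 = cg_direction(g0)                                   # first step: steepest descent
d1 = cg_direction(g1, d0, beta_fletcher_reeves(g1, g0))  # later steps reuse d0
print(d0, d1)
```

Swapping `beta_fletcher_reeves` for another formula changes the direction without touching the rest of the update, which is why the conjugate parameter is the natural place to modify the method.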
In recent years, many scholars have made efforts to extend the CGM to solve large-scale systems of nonlinear monotone equations. For example, based on the hybrid conjugate parameter in , Sun and Liu  proposed a modified conjugate parameter as follows:where and extended it to solve problem (1).
Recently, Tsegay et al.  gave a new CGM with sufficient descent property, that is,
Interestingly, the search direction generated with the conjugate parameter (6) has better theoretical properties: it automatically satisfies the sufficient descent condition and the trust region property.
2.2. An Improved Adaptive Line Search Strategy
For an efficient CGPM, choosing an inexpensive line search is a key technique. To this end, many researchers try to exploit an inexact line search strategy to obtain step size with minimal cost. Zhang and Zhou  adopt an Armijo-type line search procedure, that is, the step size , such thatwhere is an initial guess for the step size, , and is a positive constant. Li and Li  obtained the step size by the following line search:which was originally proposed by Solodov and Svaiter . Guo and Wan  proposed an adaptive line search, i.e., the step size satisfied the following inequality:
Obviously, when is far from the solutions of problem (1) and the is too large, it follows from (9) that the step size becomes small, which increases the computation cost. A similar case can appear for the line search (8) when is close to the solution set of problem (1) and is too large. However, it is worth noting that the line search (10) can overcome the previously mentioned weaknesses and take advantage of line searches (8) and (9).
Inspired by , we introduce another new adaptive line search strategy with a disturbance factor, that is, taking , such thatwhere , is an initial guess for the step size, , and . Here, is a disturbance factor, which can adjust the size of the right side of the line search (11) and further reduce the computation cost.
Remark 1. In fact, for a given , is too large if is far away from the solution set, namely, , and then, the new line search (11) is similar to (8) in performance. Otherwise, when is close to the solution set, approaches 0 and so approaches , and then, the new line search (11) comes back to (9).
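To make the interplay between line search and projection concrete, the following Python sketch implements a generic derivative-free projection iteration. It is a minimal illustration under stated assumptions, not Algorithm 1: the direction is the placeholder $d_k = -F(x_k)$ rather than the proposed conjugate gradient direction, and the line search uses two acceptance tests as they are commonly stated in the literature, the Armijo-type test $-F(z)^T d \ge \sigma\alpha\|d\|^2$ and the norm-scaled test $-F(z)^T d \ge \sigma\alpha\|F(z)\|\|d\|^2$ with $z = x + \alpha d$. The adaptive rule (11) with its disturbance factor is not reproduced. All function names and default parameter values are hypothetical.

```python
import numpy as np


def line_search(F, x, d, s=1.0, rho=0.5, sigma=1e-4, scale_by_norm=False, max_trials=60):
    """Backtrack over alpha = s * rho**i until -F(z)^T d >= sigma * alpha * w * ||d||^2.

    Here z = x + alpha * d, and w = 1 (Armijo-type test) or w = ||F(z)|| (norm-scaled test).
    """
    dd = d @ d
    alpha = s
    for _ in range(max_trials):
        z = x + alpha * d
        Fz = F(z)
        weight = np.linalg.norm(Fz) if scale_by_norm else 1.0
        if -(Fz @ d) >= sigma * alpha * weight * dd:
            return alpha, z, Fz
        alpha *= rho
    raise RuntimeError("line search failed to find an acceptable step size")


def projection_method(F, x0, project, tol=1e-6, max_iter=1000, **ls_options):
    """Generic derivative-free projection iteration for F(x) = 0 with x in a convex set."""
    x = project(np.asarray(x0, dtype=float))
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            return x, k
        d = -Fx  # placeholder direction (assumption), not the paper's CG direction
        _, z, Fz = line_search(F, x, d, **ls_options)
        denom = Fz @ Fz
        if denom == 0.0:
            # z is a zero of F; move to its projection and re-check in the next pass.
            x = project(z)
            continue
        # Project x onto the hyperplane {y : F(z)^T (y - z) = 0}, then onto the feasible set.
        xi = (Fz @ (x - z)) / denom
        x = project(x - xi * Fz)
    return x, max_iter


def F_example(x):
    # Hypothetical monotone test mapping with solution x = 0.
    return np.exp(x) - 1.0


def project_nonneg(x):
    # Projection onto the nonnegative orthant.
    return np.maximum(x, 0.0)


x_sol, iters = projection_method(F_example, np.array([1.0, 2.0, 0.5]), project_nonneg)
print(iters, np.linalg.norm(F_example(x_sol)))
```

Passing `scale_by_norm=True` switches to the norm-scaled test, which makes it easy to compare how the two acceptance rules affect the number of backtracking trials on a given problem.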
3. Convergence Property
In order to establish the convergence of Algorithm 1 and some related properties, the following basic assumptions are necessary.
Assumption H:
(H1) The solution set of system (1), denoted by $\Omega^*$, is nonempty, and the mapping $F$ is monotone on $\mathbb{R}^n$.
(H2) The mapping $F$ is $L$-Lipschitz continuous on $\mathbb{R}^n$, i.e., there exists a constant $L > 0$ such that
$$\|F(x) - F(y)\| \le L\|x - y\|, \quad \forall x, y \in \mathbb{R}^n. \tag{12}$$
The well-known nonexpansive property of the projection operator  is reviewed in the following lemma.
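Lemma 1 (see ). Let $\Omega \subseteq \mathbb{R}^n$ be a nonempty closed convex set. Then,
$$\|P_\Omega[x] - P_\Omega[y]\| \le \|x - y\|, \quad \forall x, y \in \mathbb{R}^n. \tag{13}$$
Therefore, the projection operator $P_\Omega$ is nonexpansive, i.e., Lipschitz continuous on $\mathbb{R}^n$ with constant 1.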
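The nonexpansive property can be checked numerically for a simple feasible set. The sketch below assumes a box constraint, for which the projection reduces to componentwise clipping; the box bounds and the helper names are hypothetical and serve only as an illustration of the lemma.

```python
import numpy as np


def project_box(x, lower, upper):
    """Euclidean projection onto the box {x : lower <= x <= upper} (componentwise clipping)."""
    return np.clip(x, lower, upper)


def is_nonexpansive_on_samples(project, pairs, tol=1e-12):
    """Check ||P(x) - P(y)|| <= ||x - y|| (up to tol) on the given sample pairs."""
    for x, y in pairs:
        if np.linalg.norm(project(x) - project(y)) > np.linalg.norm(x - y) + tol:
            return False
    return True


rng = np.random.default_rng(1)
pairs = [(rng.normal(size=4) * 3.0, rng.normal(size=4) * 3.0) for _ in range(200)]


def box(x):
    # Hypothetical feasible set: the box [-1, 2]^4.
    return project_box(x, -1.0, 2.0)


print(is_nonexpansive_on_samples(box, pairs))
```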
The following lemma shows that the search direction yielded by equation in step 2 in Algorithm 1 satisfies the sufficient descent condition and possesses some important properties.
Lemma 2. Suppose that Assumption H holds, then the search direction generated by equation in step 2 in Algorithm 1 satisfies the sufficient descent condition,and , for some positive constants and .
Proof. For , it is easy to know that and Lemma 2 holds. To proceed, we consider the case . If , it follows from equation in step 2 in Algorithm 1 that . Otherwise, multiplying both sides of equation in step 2 in Algorithm 1 by , from (6) and (7), we havewhich shows that the sufficient descent property (14) holds by taking . Again, according tothe following relation holds:and then, .
On the other hand, it follows from (7) and equation in step 2 in Algorithm 1 thatand the proof is completed.
The next lemma not only indicates that the line search strategy (11) is well-defined but also provides a lower bound for step size .
Lemma 3.
(i) Let the sequences and be generated by Algorithm 1; then, there exists a step size satisfying the line search (11).
(ii) Suppose that Assumption H holds; then, the step size yielded by Algorithm 1 satisfies
Proof.
(i) Suppose that for any nonnegative integer , (11) does not hold at the -th iterate, then From the continuity of and , let , and it is clear that which contradicts (14). The proof is completed.
(ii) For the second part, it is clear that if , then (19) holds. If , is computed by the backtracking process in Algorithm 1. Let , and then, does not satisfy (11), namely, where . It follows from (12), (14), and (22) that Then, which completes the proof.
The following lemma is necessary to analyze the global convergence of Algorithm 1.
Lemma 4. Suppose that Assumption H holds, let and be generated by Algorithm 1, and let be any given solution for system (1), i.e., . Then, the sequence is convergent, and sequences and are both bounded. Furthermore, it holds that
Proof. In view of the definition of and (11), we know thatOn the other hand, taking Assumption (H1) and into consideration, we haveAccording to equation in step 4 in Algorithm 1, Assumption (H1), Lemma 1, and (26) and (27), it follows thatwhich shows that the inequalities hold, that is, the sequence is monotone nonincreasing and bounded below. Hence, is convergent. Furthermore, the boundedness of is obtained.
By Lemma 2, it holds that is bounded and so is . Without the loss of generality, there exists a constant such that . If , then ; otherwise, . Hence, the following relation holds:This together with (28) implies thatThus, this further implies , and the proof is completed.
Next, based on Assumption H and Lemmas 1–4, the global convergence of the proposed algorithm is established.
Theorem 1. Suppose that Assumption H holds and the sequences are generated by Algorithm 1; then,
Furthermore, the whole sequence converges to a solution of system (1).
Proof. First, by contradiction, suppose that relation (31) is not true; then, there exists a constant such thatAgain, the following inequality comes directly from Lemma 2:This together with (25) shows thatIn addition, from Lemmas 2 and 3 (ii) and the boundedness of , the following relation holds:which contradicts (34). Therefore, (31) is true.
Second, (31) shows that there exists an infinite index set such that . Again, the is bounded and is a closed set, so without the loss of generality, suppose that . It follows from the continuity of thatwhich shows that .
Finally, noticing that from Lemma 4, it follows that the sequence is convergent, namely,which implies that the whole sequence converges to , and the proof is completed.
4. Numerical Experiments
In this section, the numerical performance of Algorithm 1 (LJJ CGPM) for solving convex constrained nonlinear equations is tested and reported in the following two subsections.
4.1. Experimental Setup
In order to illustrate the effectiveness of the LJJ CGPM, we compare it with two recent CGPMs. Specifically, eight large-scale examples are solved by the LJJ CGPM, the PDY method , and the ATTCGP method  in the same computing environment. All codes were written in Matlab R2014a and run on a DELL computer with 4 GB of RAM and the Windows 10 operating system.
For the LJJ CGPM, we use (6) and (11) as the conjugate parameter of search direction and the line search rule, respectively, and the parameters in the LJJ CGPM are chosen as . For the PDY  and the ATTCGP  methods, the search direction, line search rule, and selection of parameters are consistent with the original literature, respectively.
For all methods, the computation is terminated when one of the following criteria is satisfied:where “Itr” refers to the total number of iterations. Defineand the tested functions are listed as follows.
Problem 2. (see Yu et al. ). Setfor and .
Problem 4. (see Zhou and Li ). Set , for and .
Problem 5. (see Gao and He ). Setfor and .
Problem 6. (see Ou and Li ). Setfor and .
Problem 7. (see Gao and He ). Set , for and .
Problem 8. (see Gao and He ). Setfor and .
The new iterates are computed with the quadratic programming solver quadprog.m from the Matlab Optimization Toolbox. Problems 1–8 are tested with seven initial points . Here, the dimension of problems is chosen as , and , respectively.
The comparison of data is listed in Tables 1–8, where “Init” means the initial point, “n” is the dimension of the problem, “NF” denotes the number of function evaluations, “Tcpu” denotes the CPU time, and “” is the final value of when the program is stopped.
In addition, to show the numerical performance clearly, we adopt the performance profiles introduced by Dolan and Moré  to compare the methods in terms of Tcpu, NF, and Itr, respectively. A brief explanation of the performance figures is as follows. Denote the whole set of test problems by $P$ and the set of solvers by $S$. Let $t_{p,s}$ be the Tcpu (or the NF or the Itr) required to solve problem $p \in P$ by solver $s \in S$. The comparison between different solvers is based on the performance ratio
$$r_{p,s} = \frac{t_{p,s}}{\min\{t_{p,s} : s \in S\}},$$
and the performance profile of each solver is defined by
$$\rho_s(\tau) = \frac{\operatorname{size}\{p \in P : r_{p,s} \le \tau\}}{\operatorname{size} P},$$
where size A means the number of elements in the set A. Then, $\rho_s(\tau)$ is the probability for solver $s \in S$ that the performance ratio $r_{p,s}$ is within a factor $\tau$ of the best possible ratio; that is, $\rho_s$ is the (cumulative) distribution function of the performance ratio. Clearly, the solver whose curve lies on top is the winner. For details about the performance profile, see .
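The performance profile can be computed directly from a table of costs. The Python sketch below assumes a cost matrix with one row per problem and one column per solver, with every entry positive and finite (that is, every solver succeeded on every problem). The function name and the small cost table are hypothetical illustration data, not results from this paper.

```python
import numpy as np


def performance_profile(costs, taus):
    """Return rho[i, s]: the fraction of problems whose ratio for solver s is <= taus[i].

    costs has shape (n_problems, n_solvers); all entries must be positive and finite.
    """
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=1, keepdims=True)  # best cost per problem
    ratios = costs / best                    # performance ratios r_{p,s}
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])


# Hypothetical costs: 3 problems (rows) and 2 solvers (columns).
example_costs = [[1.0, 2.0],
                 [4.0, 2.0],
                 [3.0, 3.0]]
rho = performance_profile(example_costs, taus=[1.0, 2.0])
print(rho)
```

The first row of `rho` (at $\tau = 1$) gives the fraction of problems on which each solver is the best or tied for best, while larger values of $\tau$ show how often a solver stays within that factor of the best one.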
4.2. Numerical Testing Reports
The specific results of the numerical tests are displayed in Tables 1–8. Their corresponding performance profiles are plotted in Figures 1–3 in terms of Tcpu, NF, and Itr, respectively. From Tables 1–8 and Figures 1–3, the following results are obtained.
(i) The LJJ CGPM, PDY, and ATTCGP methods solve all test problems, which shows that the three methods are effective.
(ii) From Figures 1–3, the LJJ CGPM is the top performer among the three algorithms, which implies that the LJJ CGPM is superior overall to the PDY and ATTCGP methods, at least for this set of numerical experiments.
Reasonably, the advantages of the LJJ CGPM are attributed to the choice of techniques (6) and (11) for the conjugate parameter of the search direction and the adaptive line search strategy. At first glance, the parameter (6) and the line search strategy (11) are a bit complicated; however, the LJJ CGPM is very effective for solving constrained equations. Furthermore, the numerical results indicate that the proposed method is competitive with similar methods for large-scale problems.
5. Application in Compressed Sensing
In this section, the LJJ CGPM is applied to a typical compressed sensing scenario and compared with the PDY method  in terms of the mean squared error (MSE), the number of iterations, and the CPU time. The parameters for the LJJ CGPM are set as follows: . Also, the parameters of the PDY method are taken from Section 4 in .
5.1. Compressed Sensing
Compressed sensing, also called compressed sampling or sparse sampling, is a typical signal sampling technique. In electronic engineering, especially in signal processing, compressed sensing is often used to acquire and reconstruct sparse or compressible signals.
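In this section, attention is focused on recovering an unknown vector $x \in \mathbb{R}^n$ from an incomplete or disturbed observation
$$b = Ax + \omega, \tag{46}$$
where $b \in \mathbb{R}^k$ is the observation data, $A \in \mathbb{R}^{k \times n}$ is a linear mapping, and $\omega \in \mathbb{R}^k$ is an error term. In fact, a sparse solution $x$ can be obtained by solving the following convex unconstrained optimization problem:
$$\min_{x} \ \tau\|x\|_1 + \frac{1}{2}\|Ax - b\|_2^2, \tag{47}$$
where the parameter $\tau > 0$, and $\|\cdot\|_1$ and $\|\cdot\|_2$ denote the $\ell_1$ norm and the $\ell_2$ norm, respectively. There are many algorithms to solve problem (47). One of them is the gradient projection method for sparse reconstruction proposed by Figueiredo et al. . The first step of this method is to convert (47) into a convex quadratic programming problem.
Any vector $x \in \mathbb{R}^n$ can be decomposed into the following two parts:
$$x = u - v, \quad u \ge 0, \ v \ge 0, \tag{48}$$
where $u_i = (x_i)_+$ and $v_i = (-x_i)_+$ for all $i = 1, 2, \ldots, n$, with $(\cdot)_+ = \max\{0, \cdot\}$. In this way, the $\ell_1$ norm of a vector can be expressed as $\|x\|_1 = e_n^T u + e_n^T v$, where $e_n = (1, 1, \ldots, 1)^T \in \mathbb{R}^n$. Therefore, problem (47) can be expressed as the following convex quadratic programming problem with nonnegative constraints:
$$\min_{u, v} \ \frac{1}{2}\|b - A(u - v)\|_2^2 + \tau e_n^T u + \tau e_n^T v, \quad \text{s.t. } u \ge 0, \ v \ge 0. \tag{49}$$
Furthermore, it is easy to get the following standard form:
$$\min_{z} \ \frac{1}{2} z^T H z + c^T z, \quad \text{s.t. } z \ge 0, \tag{50}$$
where
$$z = \begin{pmatrix} u \\ v \end{pmatrix}, \quad c = \tau e_{2n} + \begin{pmatrix} -A^T b \\ A^T b \end{pmatrix}, \quad H = \begin{pmatrix} A^T A & -A^T A \\ -A^T A & A^T A \end{pmatrix}. \tag{51}$$
Obviously, $H$ is a positive semidefinite matrix. So, problem (50) is a convex quadratic programming problem. In , it has been proven to be equivalent to
$$F(z) = \min\{z, Hz + c\} = 0, \tag{52}$$
where the function $F$ is vector valued, and the “min” is interpreted as the componentwise minimum. From  Lemma 3 and  Lemma 2.2, we know that $F$ is Lipschitz continuous and monotone. So, equation (52) can be solved by the LJJ CGPM.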
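The reformulation can be verified numerically on a small instance. The Python sketch below writes the observation as `b`, the measurement matrix as `A`, and the regularization parameter as `tau`; these names, and all function names, are notational assumptions made for this illustration. It builds the matrix $H$ and the vector $c$ of the quadratic program, splits a signal into its positive and negative parts, and evaluates the componentwise-minimum mapping $F(z) = \min\{z, Hz + c\}$.

```python
import numpy as np


def build_qp(A, b, tau):
    """Return (H, c) of the nonnegative QP equivalent to the l1-regularized problem."""
    AtA = A.T @ A
    Atb = A.T @ b
    n = A.shape[1]
    H = np.block([[AtA, -AtA], [-AtA, AtA]])
    c = tau * np.ones(2 * n) + np.concatenate([-Atb, Atb])
    return H, c


def split_signal(x):
    """Stack the positive part u and negative part v of x, so that x = u - v."""
    return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)])


def residual_map(z, H, c):
    """Componentwise minimum F(z) = min{z, H z + c}; its zeros solve the QP."""
    return np.minimum(z, H @ z + c)


def l1_objective(x, A, b, tau):
    """tau * ||x||_1 + 0.5 * ||A x - b||_2^2."""
    r = A @ x - b
    return tau * np.abs(x).sum() + 0.5 * (r @ r)


def qp_objective(z, H, c):
    """0.5 * z^T H z + c^T z (omits the constant 0.5 * b^T b)."""
    return 0.5 * (z @ (H @ z)) + c @ z
```

For large $n$ one would normally avoid forming $H$ explicitly and instead evaluate $Hz$ through products with $A$ and $A^T$; the dense form is used here only to keep the sketch short.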
5.2. Numerical Results
In these experiments, the main goal is to recover a one-dimensional sparse signal from its limited measurements with Gaussian noise, that is, to reconstruct a length sparse signal from observations. The Gaussian noise distributed with is used in the test. and are taken for equation (46). The quality of restoration is measured by the MSE, namely,where is the restored signal and is the original signal. Besides, the original signal contains nonzero elements randomly. The random matrix A is given by the command rand in Matlab, and the observed data is obtained by , where is the Gaussian noise. denotes the merit function, and the value is obtained by the same continuation technique for the abovementioned two algorithms (LJJ CGPM and PDY method).
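A test problem of this kind can be generated along the following lines. This Python sketch is illustrative only: the sizes, sparsity level, and noise level are hypothetical placeholders rather than the values used in the experiments, the nonzero entries are drawn from a standard normal distribution as an assumption, and the MSE is taken as the usual average of squared componentwise errors, $\mathrm{MSE} = \frac{1}{n}\|\tilde{x} - x\|^2$.

```python
import numpy as np


def mse(x_restored, x_original):
    """Mean squared error (1/n) * ||x_restored - x_original||^2."""
    x_restored = np.asarray(x_restored, dtype=float)
    x_original = np.asarray(x_original, dtype=float)
    return float(np.sum((x_restored - x_original) ** 2) / x_original.size)


def make_test_problem(n, k, nonzeros, noise_std, seed=0):
    """Generate a sparse signal x, a random matrix A, and noisy observations b = A x + noise."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    support = rng.choice(n, size=nonzeros, replace=False)  # random positions of nonzeros
    x[support] = rng.standard_normal(nonzeros)              # assumed value distribution
    A = rng.random((k, n))                                   # uniform entries, like Matlab's rand
    b = A @ x + noise_std * rng.standard_normal(k)           # additive Gaussian noise
    return A, b, x


# Hypothetical small sizes for illustration only.
A, b, x_true = make_test_problem(n=64, k=32, nonzeros=4, noise_std=0.01)
print(A.shape, b.shape, np.count_nonzero(x_true))
```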
The iterative process starts at the measurement signal, that is, , and terminates when the relative change between successive iterative falls below , that is,
Figure 4 shows the original signal, the measurement, and the signals reconstructed by the LJJ CGPM and the PDY method. It can be seen from Figure 4 that both the LJJ CGPM and the PDY method almost completely recover the disturbed signal. Moreover, to visualize the performance of the two algorithms, Figures 5 and 6 plot the curves of the MSE and the objective function values over iterations and CPU time (seconds), respectively. As can be seen from Figures 5 and 6, the LJJ CGPM has advantages over the PDY method in both the MSE and the objective function values: the red curves (LJJ CGPM) drop faster than the blue curves (PDY method). Overall, these simple tests show that the LJJ CGPM can effectively decode sparse signals in compressed sensing.
In this work, based on existing research on CGMs and some common inexact line searches, a modified conjugate parameter and an adaptive line search strategy are proposed, and then, an effective CGPM is presented for monotone nonlinear equations. The search direction of the proposed algorithm possesses the sufficient descent and trust region properties independent of any line search technique. Furthermore, the presented method inherits the advantages of the CGM: a simple iterative form, rapid convergence, being derivative-free, and low memory requirements. Under some suitable conditions, the global convergence of the proposed method is established. The reported numerical experiments show that our method is very effective in dealing with large-scale monotone nonlinear equations with convex constraints and the $\ell_1$-norm regularization problem in compressive sensing.
Regrettably, the parameter in the new adaptive line search strategy is fixed. Building on the existing line search rule, our future work is to develop the line search (11) with a dynamic adjustment technique instead of a fixed constant. Moreover, since the search direction is yielded by the CGM, how to improve the line search rule by combining it with the inexact line searches frequently used for the CGM deserves to be studied.
No data were used to support this study.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
The authors would like to thank J.H. Yin and B. He, without whose valuable assistance this study would not have been successful. This work was supported in part by the National Natural Science Foundation of China under Grant 11771383, in part by the Natural Science Foundation of Guangxi Province under Grants 2016GXNSFAA380028 and 2018GXNSFFA281007, and in part by the Research Project of Guangxi University for Nationalities under Grant 2018KJQD02.
M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, pp. 686–707, 2019.
J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, NY, USA, 1970.
W. Zhou and D. Li, “Limited memory BFGS method for nonlinear monotone equations,” Journal of Computational Mathematics, vol. 25, no. 1, pp. 89–96, 2007.
Q. Xu, H. C. Lin, and Y. G. Ou, “A derivative-free memory method for systems of nonlinear equations with convex constraints,” Journal of Applied Mathematics, vol. 29, no. 3, pp. 686–696, 2016.
G. Tsegay, H. Zhang, X. Zhang, and F. Zhang, “A sufficient descent conjugate gradient method for nonlinear unconstrained optimization problems,” Transactions in Operational Research, vol. 22, no. 3, pp. 59–68, 2018.
M. V. Solodov and B. F. Svaiter, “A globally convergent inexact Newton method for systems of monotone equations,” in Reformulation: Piecewise Smooth, Semi-smooth and Smoothing Methods, pp. 355–369, Springer, Berlin, Germany, 1998.
E. H. Zarantonello, Projections on Convex Sets in Hilbert Space and Spectral Theory, Academic Press, New York, NY, USA, 1971.