A Modified Spectral PRP Conjugate Gradient Projection Method for Solving Large-Scale Monotone Equations and Its Application in Compressed Sensing
In this paper, we develop an algorithm to solve nonlinear system of monotone equations, which is a combination of a modified spectral PRP (Polak-Ribière-Polyak) conjugate gradient method and a projection method. The search direction in this algorithm is proved to be sufficiently descent for any line search rule. A line search strategy in the literature is modified such that a better step length is more easily obtained without the difficulty of choosing an appropriate weight in the original one. Global convergence of the algorithm is proved under mild assumptions. Numerical tests and preliminary application in recovering sparse signals indicate that the developed algorithm outperforms the state-of-the-art similar algorithms available in the literature, especially for solving large-scale problems and singular ones.
In many fields of sciences and engineering, solution of a nonlinear system of equations is a fundamental problem. For example, in [1, 2], both a Nash economic equilibrium problem and a signal processing problem were formulated into a nonlinear system of equations. Owing to complexity, in the past five decades, numerous algorithms and some software packages in virtue of those powerful algorithms have been developed for solving the nonlinear system of equations. See, for example, [1, 3–16] and the references therein. Nevertheless, in practice, no any algorithm can efficiently solve all the systems of equations arising from sciences and engineering. It is significant to develop a specific algorithm to solve the problems with different analytic and structural features [17, 18].
In this paper, we consider the following nonlinear system of monotone equations:where is a continuous and monotone function; that is to say, satisfiesIt has been shown that the solution set of problem (1) is convex if it is nonempty . In addition, throughout the paper, the space is equipped with the Euclidean norm and the inner product , for .
Aiming at solution of problem (1), many efficient methods were developed recently. Only by incomplete enumeration, we here mention the trust region method , the Newton and the quasi-Newton methods [4–6, 19], the Gauss-Newton methods [7, 8], the Levenberg-Marquardt methods [20–22], the derivative-free methods and its modified versions [9–16, 23–27], the derivative-free conjugate gradient projection method , the modified PRP (Polak-Ribière-Polyak) conjugate gradient method , the TPRP method , the PRP-type method , the projection method , the FR-type method , and the modified spectral conjugate gradient projection method . Summarily, the spectral gradient methods and the conjugate gradient methods are more popular in solving a large-scale nonlinear system of equations than the Newton and the quasi-Newton methods. One of the former’s advantages lies in that there is no requirement of computing and storing the Jacobian matrix or its approximation.
Specifically, Li introduced a class of methods for large-scale nonlinear monotone equations in , which include the SG-liked method, the MPRP method, and the TPRP method. Chen  proposed a PRP method for large-scale nonlinear monotone equations. A descent modified PRP method and FR-type methods were presented in [9, 11], respectively. Liu and Li proposed a projection method for convex constrained monotone nonlinear equations in . Two derivative-free conjugate gradient projection methods were presented for such a system in . Three extensions of conjugate gradient algorithms were developed in [24–26], respectively. Based on the projection technique in , Wan and Liu proposed a modified spectral conjugate gradient projection method (MSCGP) to solve a nonlinear monotone system of symmetric equations in . Then, in , MSCGP was successfully applied into recovering sparse signals and restoring blurred images.
It is noted that the main idea of [13, 23] is to construct a search direction by projection technique such that it is sufficiently descent. In virtue of derivative-free and low storage properties, numerical experiments indicated that the developed algorithm in  is more efficient to solve large-scale nonlinear monotone systems of equations than the similar ones available in the literature.
In , Yang et al. proposed a modified spectral PRP conjugate gradient method for solving unconstrained optimization problem. It was proved that the search direction at each iteration is a descent direction of objective function and global convergence was established under mild conditions. Our research interest in this paper is to study how to extend this method to solution of problem (1). Specifically, we should address the following issues:
(1) Without need of derivative information of the function , how to determine the spectral and conjugate parameters to get a sufficiently descent search direction at each iteration?
(2) To ensure global convergence of algorithm, how to choose an appropriate step length for the given search direction? Particularly, monotonicity of should be utilized to design a new iterate scheme?
(3) What about the numerical performance of new iteration scheme? Especially, whether it is more efficient or not than the similar algorithms available in the literature.
Note that a new line search rule was proposed in  for solving nonlinear monotone equations with convex constraints, and it was shown that, in virtue of this line search, the developed algorithm has good numerical performance. However, the presented line search is involved with choice of a weight. Since it may be difficult to choose an appropriate weight in the practical implementation of algorithm, we attempt to overcome this difficulty in this paper.
Summarily, we intend to propose a modified spectral PRP conjugate gradient derivative-free projection method for solving large-scale nonlinear equations. Global convergence of this method will be proved, and numerical tests will be conducted by implementing the developed algorithm to solve benchmark large-scale test problems and to reconstruct sparse signals in compressive sensing.
The rest of this paper is organized as follows. In Section 2, we first state the idea to propose a new spectral PRP conjugate gradient method. Then, a new algorithm is developed. Global convergence is established in Section 3. Section 4 is devoted to numerical experiments. Preliminary application of the algorithm is presented in Section 5. Some conclusions are drawn in Section 6.
2. Development of New Algorithm
In this section, we will state how to develop a new algorithm in detail.
2.1. Projection Method
Generally, to solve (1), we need to construct an iterative format as follows:where is called a step length and is a search direction. Let . If satisfiesthen a projection method can be obtained for solving Problem (1). Actually, by monotonicity of , it holds thatfor any solution of (1), . With such a , we define a hyperplane:From (4) and (5), it is clear that strictly separates the iterate point from the solution . Thus, the projection of onto is closer to than . Consequently, the iterative formatis referred to as the projection method proposed in . Both analytic properties and numerical results have shown efficiency and robustness of the projection-based algorithms for monotone system of equations [10, 13, 14]. In this paper, we intend to propose a new spectral conjugate gradient method also in virtue of the above projection technique.
2.2. A Modified Spectral PRP Conjugate Gradient Method
In the projection method (7), it is noted that must satisfy (4). That is to say, should be a search direction satisfyingVery recently, Wan et al.  proposed a modified spectral conjugate gradient projection method for solving nonlinear monotone symmetric equations, where was chosen byand and are computed byrespectively, , and . It was proved in  that given by (9) and (10) is sufficiently descent and satisfies .
Note that a modified spectral PRP conjugate gradient method was proposed for solving unconstrained optimization problems in . Similar to the idea in , we can extend the method in  to solution of problem (1). Specifically, we compute and in (9) byrespectively. Although (11) gives different choices of and from (10), we can also prove the following result.
Proof. For , we haveFor , we haveWe now prove that ifholds for (), then (12) also holds for .
Actually,where the forth equality follows condition (15). Consequently, by mathematical induction, (12) holds for any .
2.3. Modified Line Search Rule
Since it is critical to choose an appropriate step length to improve the performance of the iterate scheme (3), as well as determination of search directions, we now present an inexact line search rule to determine in (3).
Very recently, Ou and Li  presented a line search rule as follows: find a nonnegative step length such that the following inequality is as follows:where is a fixed search direction, is a given initial step length, and are two given constants, and is specified byIn , it was required that in (18) satisfies . Clearly, and are the weights for the values 1 and , respectively. In the practical implementation, it is may be difficult to choose an appropriate . To overcome this difficulty, we choose in (18) byIt is sure that, for the new method as a combination of (9), (11), (17), and (19), we need to establish its convergence theory and to further test its numerical performance.
Remark 2. In , to ensure that well-defined the line search (17) is well-defined, it is assumed that satisfieswhere is a given constant. By Proposition 1, it is clear that chosen by (9) and (11) satisfies (20) as .
2.4. Development of New Projection-Based Algorithm
With the above preparation, we are in a position to develop an algorithm to solve problem (1) by combining the projection technique and the new methods to determine a search direction and a step length.
We now present the computer procedure of Algorithm 1.
Remark 3. Since Algorithm 1 does not involve computing the Jacobian matrix of or its approximation, both information storage and computational cost of the algorithm are lower. In virtue of this advantage, Algorithm 1 is helpful to solution of large-scale problems. In next section, we will prove that Algorithm 1 is applicable even if is nonsmooth. Our numerical tests in Section 4 will further show that Algorithm 1 can find a singular solution of problem (1) (see problem 5 and Table 5).
In this section, we are going to study global convergence of Algorithm 1.
Apart from different choices of search direction and step length, Algorithm 1 can be treated as a variant of the projection algorithm in . So, similar to some critical points of establishing global convergence in , we attempt to prove that Algorithm 1 is globally convergent. Very recently, locally linear convergence was proved in  for some PRP-type projection methods.
We first state the following mild assumptions.
Assumption 5. The function is monotone on .
Assumption 6. The solution set of problem (1) is nonempty.
Assumption 7. The function is Lipschitz continuous on ; namely, there exists a positive constant such that for all ,
Under these assumptions, we can prove that Algorithm 1 has the following nice properties.
Lemma 8. Let be a sequence generated by Algorithm 1. If Assumptions 5, 6, and 7 hold, then
(1) for any , such that ,(2) The sequence is bounded.
(3) If is a finite sequence, then the last iterate point is a solution of problem (1); otherwise,and(4) The sequence is bounded. Hence, there exists a constant such that .
Proof. (1) Let be any point such that . Then, by monotonicity of , we haveFrom (7), it is also easy to verify that is the projection of onto the halfspace:Thus, it follows from (25) that belongs to this halfspace. From the basic properties of projection operator , we know thatConsequently,The desired result (22) is directly obtained from (28).
(2) From (28), it is clear that the sequence is nonnegative and decreasing. Thus, is a convergent sequence. It is concluded that is bounded.
(3) From (28), we knowThus,Since the sequence is bounded, the series is convergent. Consequently,The third result has been proved.
(4) For any , by Lipschitz continuity, we haveSince is convergent, we conclude that is bounded. Consequently, for all , there exists a constant such that .
Lemma 9. Suppose that Assumptions 5, 6, and 7 hold. Let be a sequence generated by Algorithm 1. If there exists a constant such that, for any positive integer ,Then, the sequence of directions is bounded; i.e., there exists a constant such that, for any positive integer ,
Proof. From (9), (11), (12), and the results of Lemma 8, it follows thatFrom (24), we know that there exist a positive integer and a positive number () such that, for all , Hence,Let . TakeThen, holds for any positive integer .
Proof. Our aim is to show that the line search rule (17) terminates finitely with a positive step length . By contradiction, suppose that, for some iterate indexes such as , condition (17) does not hold. As a result, for all ,From (18) and the termination condition of Algorithm 1, it follows that, for all ,By taking the limit as in both sides of (39), we haveEquations (41) contradicts the fact that for all . That is to say, the line search rule terminates finitely with a positive step length ; i.e., the line search step of Algorithm 1 is well-defined.
With the above preparation, we are now state the convergence result of Algorithm 1.
Proof. For the sake of contradiction, we suppose that the conclusion is not true. Then, by the definition of inferior limit, there exists a constant such that, for any ,Consequently, fromit follows that for any .
From (7), (17), and (40), we getCombining (24) and (45), we obtainSince for any , we haveClearly, does not satisfy in (17). It says thatBy Lemmas 8 and 9, we know that the two sequences and are bounded. Without loss of generality, we choose a subset such thatTaking the limit in the two sides of (48) as (), it holds thatOn the other hand, from (43), we know thatBy taking the limit in the two sides of (51) for , we getIt contradicts (50). Thus, the proof of Theorem 11 has been completed.
Since the proof of Theorem 11 does not involve differentiability of , let alone nonsingularity of its Jacobian matrix, we know that the following result holds.
4. Numerical Experiments
In this section, by numerical experiments, we are going to study the effectiveness and robustness of Algorithm 1 for solving large-scale system of equations.
We first collect some benchmark test problems available in the literature.
Problem 14 (see ). The elements of are given by
Problem 15 (see ). The elements of are given by
Problem 16 (see ). The elements of are given by
Problem 18 (see ). The elements of are given by
Problem 19 (see ). The elements of are given by
Clearly, apart from problem 4, all the others are nonlinear system of equations. Problem 15 is nonsmooth at point . The size of all the test problems is variable. If this size is larger than 1000, the problem can be regarded to be large-scale. We solve all the test problems with different sizes by Algorithm 1, especially in comparison with the existing seven similar algorithms, such as those developed very recently in [10, 13–15, 28].
All the algorithms are coded in MATLAB R2010b and run on a personal computer with a 2.2GHZ CPU processor, 8GB memory, and Windows 10 operation system. The relevant parameters of algorithms are specified by where . The termination condition is .
Numerical performance of all the algorithms is reported in Tables 1, 2, and 3. Table 1 shows the numerical performance of all the eight algorithms with the fixed initial points: for problems 1, 5, and 6; for problems 2 and 4; for problem 3. Table 2 demonstrates the numerical performance of all the eight algorithms with initial points randomly generated by Matlab’s Code “rand(n,1)”. Table 3 shows the numerical performance of our algorithm with dimension of 1000000.
For simplification of statement, we use the following notations:
Dim: the dimension of test problems.
Ni: the number of iterations.
Nf: the number of function evaluations.
MPPRP-M: the developed algorithm with determined by (19) in this paper.
MPPRP-W: the developed algorithm with generated by (18) in this paper.
MSDFPB: the modified spectral derivative-free projection-based algorithm in .
PRP: the PRP conjugate gradient derivative-free projection-based methods in .
MPRP: the modified PRP conjugate gradient derivative-free projection-based methods in .
TPRP: the two-term PRP conjugate gradient derivative-free projection-based methods in .
DFPB1: the steepest descent derivative-free projection-based methods in  with search direction as follows:where , , , and .
MHS: the MHS-PRP conjugate gradient derivative-free projection-based methods in .
From the results in Tables 1, 2, and 3, it follows that our algorithm (MPPRP) outperforms the other seven algorithms, no matter how to choose the initial points (see the italicized results). Especially, it seems to more efficiently solve large-scale test problems. Actually, Table 3 shows that MPPRP-M can solve the first five Problems with dimension of 1000000 in less time, compared with the other algorithms.
In order to further measure the efficiency difference of all the eight algorithms, we calculate the average number of iteration, the average consumed CPU time, and their standard deviations, respectively. In Table 4, A-Ni and Std-Ni stand for the average number of iteration and its standard deviation, respectively. A-CT and Std-CT represent the average consumed CPU time and its standard deviation, respectively. The average number of function evaluation and its standard deviation are denoted by A-Nf and Std-Nf, respectively. Clearly, Std-Ni, Std-Nf, and Std-CT can show robustness of all the algorithms.
The results in Table 4 demonstrate that both of MPPRP-M and MPPRP-W outperform the other seven algorithms.
In the end of this section, we use our algorithm to solve a singular system of equations. The next test problem is a modified version of problem 1.
Problem 20. The elements of are given by The initial point is fixed as .
Note that problem 7 is singular since zero is its solution and the Jacobian matrix is singular. We implement Algorithm 1 to solve problem 7 with different dimensions to test whether it can find the singular solution or not.
5. Preliminary Application in Compressed Sensing
In this section, we will apply our algorithm to solve an engineering problem originated from compressed sensing of sparse signals.
Let be a linear operator, let be a sparse or a nearly sparse original signal, and let be an observed value which satisfies the following linear equations:The original signal is desirable to be reconstructed from the linear equations . Unfortunately, it is often that this linear equation is underdetermined or ill-conditioned in practice, and has infinitely many solutions. From the fundamental principles of compressed sensing, it is a reasonable approach to seek the sparsest one among all the solutions, which contains the least nonzero components.
In , it was shown that the compressed sensing of sparse signals can be formulated the following nonlinear system of equations:where where , , and , for all with , and is an -dimensional vector with all elements being one. Clearly, (66) is nonsmooth.
Implement Algorithm 1 to solve the compressed sensing problem of sparse signals with different compressed ratios. In this experiment, we consider a typical compressive sensing scenario, where the goal is to reconstruct an -length sparse signal from observations. We test the three algorithms (two popular algorithms: CGD in  and SGCS in , and MPPRP-M) under three CS ratios (CS-R): , 0.25, 0.5 corresponding to different numbers of measurements with , respectively. The original sparse signal contains 32(64) randomly nonzero elements. The measurement is disturbed by noise, i.e., , where is the Gaussian noise distributed as with . is the Gaussian matrix generated by commend randn(, ) in Matlab. The quality of restoration is measured by the mean of squared error (MSE) to the original signal ; that is,where is the restored signal. The iterative process starts at the measurement image, i.e., , and terminates when the relative change between successive iterates falls below . It says thatwhere and is chosen as suggested in . The other parameters in this experiment are same as those in the numerical experiments conducted in Section 4. Numerical efficiency is shown in Table 6.
Clearly, from the results in Table 6, it follows that, for any CS ratio, MPPRP-M can recover the sparse signals more efficiently without reduction of recovery quality (see the italicized results in Table 6). If the sparsity level 32 is replaced by 64 in the 2048-length original signal, the corresponding results are also presented in Table 6, denoted by .
In this paper, we have presented a modified spectral PRP conjugate gradient derivative-free projection-based method for solving the large-scale nonlinear monotonic equations, where the search direction is proved to be sufficiently descent for any line search rule, and the step length is chosen by a line search which can overcome the difficulty of choosing an appropriate weight.
Under mild assumptions, global convergence theory has been established for the developed algorithm. Since our algorithm does not involve computing the Jacobian matrix or its approximation, both information storage and computational cost of the algorithm are lower. That is to say, our algorithm is helpful to solution of large-scale problems. In addition, it has been shown that our algorithm is also applicable to solve a nonsmooth system of equations or a singular one.
Numerical tests have demonstrated that our algorithm outperforms the others by costing less number of function evaluations, less number of iterations, or less CPU time to find a solution with the same tolerance, especially in comparison with some similar algorithms available in the literature. Efficiency of our algorithm has also been shown by reconstructing sparse signals in compressed sensing.
In future research, due to its satisfactory numerical efficiency, the proposed method in this paper can be extended into solving more large-scale nonlinear system of equations from many fields of sciences and engineering.
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
We declare that all the authors have no any conflicts of interest about submission and publication of this paper.
Zhong Wan conceived and designed the research plan and wrote the paper. Jie Guo performed the mathematical modelling and numerical analysis and wrote the paper.
This research is supported by the National Natural Science Foundation of China (Grant no. 71671190) and Opening Foundation from State Key Laboratory of Developmental Biology of Freshwater Fish, Hunan Normal University.
J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Siam, New York, NY, USA, 1970.View at: MathSciNet
J. E. Dennis and B. R. Schnabel, “Numerical methods for nonlinear equations and unconstrained optimization,” Classics in Applied Mathematics, p. 16, 1983.View at: Google Scholar
C. Kanzow, N. Yamashita, and M. Fukushima, “Levenberg-Marquardt methods with strong local convergence properties for solving nonlinear equations with convex constraints,” Journal of Computational and Applied Mathematics, vol. 173, no. 2, pp. 321–343, 2005.View at: Publisher Site | Google Scholar | MathSciNet
W. Zhou and D. Li, “On the Q-linear convergence rate of a class of methods for monotone nonlinear equations,” Pacific Journal of Optimization, vol. 14, pp. 723–737, 2018.View at: Google Scholar
B. T. Polyak, Introduction to Optimization, Optimization Software, New York, NY, USA, 1987.View at: MathSciNet
S. Kim, K. Koh, M. Lustig et al., “A method for large-scale 1-regularized least squares problems with applications in signal processing and statistics,” Technical Report, Dept. of Electrical Engineering, Stanford University, Stanford, Calif, USA, 2007.View at: Google Scholar