Abstract

Two unified frameworks of some sufficient descent conjugate gradient methods are considered. Combined with the hyperplane projection method of Solodov and Svaiter, they are extended to solve convex constrained nonlinear monotone equations. Their global convergence is proven under some mild conditions. Numerical results illustrate that these methods are efficient and can be applied to solve large-scale nonsmooth equations.

1. Introduction

Consider the constrained monotone equations $F(x) = 0$, $x \in \Omega$ (1), where $F:\mathbb{R}^n \to \mathbb{R}^n$ is continuous and satisfies the following monotonicity: $(F(x) - F(y))^T (x - y) \ge 0$ for all $x, y \in \mathbb{R}^n$ (2), and $\Omega \subseteq \mathbb{R}^n$ is a nonempty closed convex set. Under these conditions, the solution set of problem (1) is convex [1]. This problem has many applications, such as the power flow equation [2, 3] and some variational inequality problems, which can be converted into (1) by means of fixed point maps or normal maps if the underlying function satisfies some coercive conditions [4].

In recent years, the study of iterative methods for solving problem (1) with $\Omega = \mathbb{R}^n$ has received much attention. The pioneering work was introduced by Solodov and Svaiter in [5], where the proposed method, called an inexact Newton method, combines elements of the Newton method, the proximal point method, and a projection strategy, and requires that $F$ be differentiable. Its convergence was proven without any regularity assumptions. A further study of its convergence properties was given by Zhou and Toh [6]. Then, utilizing the projection strategy in [5], Zhou and Li extended the BFGS methods [7] and the limited memory BFGS methods [8] to solve problem (1) with $\Omega = \mathbb{R}^n$. A significant improvement is that these methods converge globally without requiring the differentiability of $F$.

Conjugate gradient methods are another class of numerical methods [9–15], after spectral gradient methods [16–18], that have been extended to solve problem (1), and the study of this aspect is just catching up. As is well known, conjugate gradient methods are very efficient for solving the large-scale unconstrained nonlinear optimization problem $\min f(x)$, $x \in \mathbb{R}^n$ (3), where $f:\mathbb{R}^n \to \mathbb{R}$ is smooth, due to their simple iterations and their low memory requirements. In [19], they were divided into three categories, that is, early conjugate gradient methods, descent conjugate gradient methods, and sufficient descent conjugate gradient methods. Early conjugate gradient methods rarely ensure a (sufficient) descent condition $g_k^T d_k \le -c\|g_k\|^2$, $c > 0$ (4), where $g_k$ is the gradient of $f$ at $x_k$ (the $k$th iterate) and $d_k$ is a search direction, while the latter two categories always satisfy the descent property. One well-known sufficient descent conjugate gradient method, namely, CG_DESCENT, was presented by Hager and Zhang [20, 21] and satisfies the sufficient descent condition (4) with $c = 7/8$. Inspired by Hager and Zhang’s work, a unified framework of some sufficient descent conjugate gradient methods was presented in [19, 22]. By the use of Gram-Schmidt orthogonalization, another unified framework of some sufficient descent conjugate gradient methods was presented in [23].

Although conjugate gradient methods have been investigated extensively for solving unconstrained optimization problems, their use for solving nonlinear monotone equations has been studied relatively rarely. For the unconstrained case of monotone equations, Cheng [10] first introduced a PRP type method which is a combination of the well-known PRP conjugate gradient method [24, 25] and the hyperplane projection method [5]. Then some derivative-free methods were presented [11–13] which also belong to the conjugate gradient scheme. More recently, Xiao and Zhu [9] presented a modified version of the CG_DESCENT method to solve the constrained nonlinear monotone equations. Under some mild conditions, they proved that their proposed method is globally convergent. We have mentioned that there are two unified frameworks of some sufficient descent conjugate gradient methods, and the CG_DESCENT method belongs to one of them. Since the CG_DESCENT method can be used to solve the constrained monotone equations, it is natural to consider the two unified frameworks. So, in this paper, we extend the conjugate gradient methods that belong to the two unified frameworks to solve the constrained monotone equations and carry out numerical experiments to test their efficiency.

The rest of this paper is organized as follows. In Section 2, the motivation to investigate two unified frameworks of some sufficient descent conjugate gradient methods is given. Then these methods are developed to solve problem (1) and are described by a model algorithm. In Section 3, we prove the global convergence of the model algorithm under some mild conditions. In Section 4, we give several specific versions of the model algorithm, test them over some test problems, and compare their numerical performance with that of the conjugate gradient method proposed in [9]. Finally, some conclusions are given in Section 5.

2. Motivation and Algorithms

In this section, we simply describe the hyperplane projection method of Solodov and Svaiter and introduce two classes of sufficient descent conjugate gradient methods for solving large-scale unconstrained optimization problems. Combined with the hyperplane projection method, we extend sufficient descent conjugate gradient methods to solve large-scale constrained nonlinear equations (1).

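For convenience, we first give the definition of the projection operator $P_\Omega$, which is defined as a mapping from $\mathbb{R}^n$ to a nonempty closed convex subset $\Omega$: $P_\Omega(x) = \arg\min\{\|x - y\| : y \in \Omega\}$ for all $x \in \mathbb{R}^n$. And its two fundamental properties are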
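For simple sets the projection has a closed form. The following minimal sketch assumes, purely for illustration, that the feasible set is a box (a hypothetical choice, not a set taken from this paper) and checks numerically the standard nonexpansiveness of the Euclidean projection onto a closed convex set. The helper names `project_box`, `norm`, and `sub` are assumptions of this sketch.

```python
import math

def project_box(x, lower, upper):
    """Euclidean projection of x onto the box {y : lower <= y <= upper}."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]

def norm(v):
    """Euclidean norm of a vector stored as a list."""
    return math.sqrt(sum(vi * vi for vi in v))

def sub(a, b):
    """Componentwise difference a - b."""
    return [ai - bi for ai, bi in zip(a, b)]

# Hypothetical data: the unit box in R^3 and two arbitrary points.
lower, upper = [0.0] * 3, [1.0] * 3
x = [2.0, -3.0, 0.5]
y = [-1.0, 4.0, 0.25]

px = project_box(x, lower, upper)
py = project_box(y, lower, upper)

# Nonexpansiveness: projecting cannot increase the distance between points.
nonexpansive = norm(sub(px, py)) <= norm(sub(x, y))
```

For the nonnegative orthant the same idea reduces to clipping negative components to zero.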

Now, we recall the hyperplane projection method in [5] for the unconstrained case of problem (1). Let $x_k$ be the current iterate and $z_k = x_k + \alpha_k d_k$, where $\alpha_k$ is a step length obtained by means of a one-dimensional line search and $d_k$ is a search direction. If $z_k$ is not a solution and satisfies $F(z_k)^T (x_k - z_k) > 0$, then the hyperplane $H_k = \{x \in \mathbb{R}^n : F(z_k)^T (x - z_k) = 0\}$ strictly separates the current iterate $x_k$ from the solution set of problem (1). By the property (7) of the projection operator, it is not difficult to verify that the projection of $x_k$ onto $H_k$ is closer to the solution set than the iterate $x_k$. Then the next iterate is generated by $x_{k+1} = x_k - \dfrac{F(z_k)^T (x_k - z_k)}{\|F(z_k)\|^2} F(z_k)$.
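The separation argument can be illustrated numerically. The sketch below assumes the standard form of the Solodov-Svaiter step, namely the projection of the current point onto the hyperplane through the trial point with normal equal to the residual there. The test map (componentwise $x + x^3$, with solution $0$), the direction $d = -F(x)$, and the fixed step length are hypothetical choices made only for this example.

```python
def dot(a, b):
    """Inner product of two vectors stored as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def hyperplane_step(x, z, Fz):
    """Project x onto the hyperplane {y : Fz^T (y - z) = 0}."""
    t = dot(Fz, [xi - zi for xi, zi in zip(x, z)]) / dot(Fz, Fz)
    return [xi - t * fi for xi, fi in zip(x, Fz)]

def F(x):
    """Hypothetical monotone map F(x) = x + x^3 (componentwise); F(0) = 0."""
    return [xi + xi ** 3 for xi in x]

x = [1.0, -2.0]
d = [-fi for fi in F(x)]        # simple illustrative direction d = -F(x)
alpha = 0.1                     # fixed step length, chosen by hand
z = [xi + alpha * di for xi, di in zip(x, d)]
Fz = F(z)

# The hyperplane separates x from the solution when F(z)^T (x - z) > 0.
separates = dot(Fz, [xi - zi for xi, zi in zip(x, z)]) > 0
x_new = hyperplane_step(x, z, Fz)

# The projected point lies on the hyperplane and is closer to the solution 0.
on_hyperplane = abs(dot(Fz, [a - b for a, b in zip(x_new, z)])) < 1e-12
closer = dot(x_new, x_new) < dot(x, x)
```

In the constrained setting, the point produced by this step is additionally projected onto the feasible set.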

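We consider the iterative scheme of conjugate gradient methods for solving the unconstrained optimization problem (3). For any given starting point $x_0 \in \mathbb{R}^n$, a sequence $\{x_k\}$ is generated by the following recursive relation: $x_{k+1} = x_k + \alpha_k d_k$ (11), where $\alpha_k$ is a steplength and $d_k$ is a descent direction. One way to generate $d_k$ is $d_k = -g_k$ if $k = 0$ and $d_k = -g_k + \beta_k d_{k-1}$ if $k \ge 1$ (12), where $g_k = \nabla f(x_k)$ and $\beta_k$ is a scalar. The formula of $\beta_k$ in the CG_DESCENT method is defined as $\beta_k^{HZ} = \dfrac{1}{d_{k-1}^T y_{k-1}} \left( y_{k-1} - 2 d_{k-1} \dfrac{\|y_{k-1}\|^2}{d_{k-1}^T y_{k-1}} \right)^T g_k$ (13), where $y_{k-1} = g_k - g_{k-1}$. Then the direction $d_k$ from (12) satisfies the sufficient descent condition (4) with $c = 7/8$. For more efficient versions of the CG_DESCENT method, please refer to [26, 27].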

In [19, 22], a generalization of (13) was given by where , , and . Obviously, is a special case of (14) with , , and . More recently, Xiao and Zhu [9] presented a modified version of the CG_DESCENT method to solve the constrained problem (1). This work inspires us to extend the general case (14) to solve problem (1). So, we define where and the scalar is defined as with (), , and . Moreover, the formula of proposed by Xiao and Zhu [9] corresponds to (16) with , , and , where , , and .

The other general way of producing sufficient descent conjugate gradient methods for solving the unconstrained optimization (3) was provided in [23]. By using the Gram-Schmidt orthogonalization, the search direction is generated by where is a scalar, and its definition could be the same as that in (12). Obviously, it always satisfies . In this paper, we will prove that the class of sufficient descent conjugate gradient methods can also be extended to solve problem (1) with the corresponding search direction defined as where the formula of could be (16).
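The independence of the descent property from the scalar can be checked numerically. The sketch below assumes the usual Gram-Schmidt form of the direction, $d_k = -F_k + \beta_k \left( d_{k-1} - \frac{F_k^T d_{k-1}}{\|F_k\|^2} F_k \right)$; this form is an assumption of the example, since the displayed formula is not reproduced in the text above. The vectors and the values of the scalar are arbitrary.

```python
def dot(a, b):
    """Inner product of two vectors stored as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def gs_direction(Fk, d_prev, beta):
    """Assumed Gram-Schmidt form: d = -Fk + beta * (d_prev - coef * Fk),
    where coef = Fk^T d_prev / ||Fk||^2 removes the component of d_prev
    along Fk."""
    coef = dot(Fk, d_prev) / dot(Fk, Fk)
    return [-f + beta * (dp - coef * f) for f, dp in zip(Fk, d_prev)]

# Arbitrary illustrative data.
Fk = [3.0, -1.0, 2.0]
d_prev = [0.5, 4.0, -1.0]

# For every beta, Fk^T d should equal -||Fk||^2 up to rounding.
checks = []
for beta in (-2.0, 0.0, 0.7, 10.0):
    d = gs_direction(Fk, d_prev, beta)
    checks.append(abs(dot(Fk, d) + dot(Fk, Fk)) < 1e-9)
```

Because the correction term is orthogonal to the current residual, the choice of the scalar affects only the numerical behavior, not the descent property.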

Now we introduce the two unified frameworks of some sufficient descent conjugate gradient methods to solve problem (1) by adopting the projection strategy in [5]. We state the steps of the model algorithm as follows.

Algorithm 1.

Step 0. Choose an initial point , , , and . Set .

Step 1. If , stop. Otherwise, generate by certain iteration formula which satisfies sufficient descent condition.

Step 2. Let be the largest such that and then compute .

Step 3. Compute the new iterate by Set . Go to Step 1.

If the search direction in Step 1 is generated by formula (15), we call the algorithm Algorithm 1(a). If the search direction in Step 1 is generated by formula (18), we call it Algorithm 1(b).
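The steps of the model algorithm can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the line search condition is taken to be $-F(x_k + \alpha d_k)^T d_k \ge \sigma \alpha \|d_k\|^2$ with $\alpha = \kappa \rho^m$, the direction uses the Gram-Schmidt form with a PRP-type scalar (not one of Methods 1–6), and the parameter values, the feasible set (nonnegative orthant), and the test map are all hypothetical.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def project_nonneg(x):
    """Projection onto the nonnegative orthant (hypothetical feasible set)."""
    return [max(xi, 0.0) for xi in x]

def solve_monotone(F, project, x0, sigma=1e-4, rho=0.5, kappa=1.0,
                   tol=1e-6, max_iter=1000):
    """Projection-based CG sketch; returns (last iterate, iterations used)."""
    x = project(x0)
    Fx = F(x)
    d = [-f for f in Fx]                     # first direction: d_0 = -F(x_0)
    for k in range(max_iter):
        if math.sqrt(dot(Fx, Fx)) <= tol:    # Step 1: stopping test
            return x, k
        # Step 2: backtrack alpha = kappa * rho^m until the assumed condition
        # -F(z)^T d >= sigma * alpha * ||d||^2 holds.
        alpha = kappa
        z = [xi + alpha * di for xi, di in zip(x, d)]
        Fz = F(z)
        while -dot(Fz, d) < sigma * alpha * dot(d, d):
            alpha *= rho
            z = [xi + alpha * di for xi, di in zip(x, d)]
            Fz = F(z)
        # Step 3: hyperplane step, then projection onto the feasible set.
        step = dot(Fz, [a - b for a, b in zip(x, z)]) / dot(Fz, Fz)
        x_new = project([a - step * f for a, f in zip(x, Fz)])
        F_new = F(x_new)
        # Direction update: Gram-Schmidt form with a PRP-type beta (assumed).
        if dot(F_new, F_new) > 0.0:
            beta = dot(F_new, [a - b for a, b in zip(F_new, Fx)]) / dot(Fx, Fx)
            coef = dot(F_new, d) / dot(F_new, F_new)
            d = [-f + beta * (dp - coef * f) for f, dp in zip(F_new, d)]
        x, Fx = x_new, F_new
    return x, max_iter

def F_demo(x):
    """Hypothetical nonsmooth monotone map with unique zero at the origin."""
    return [2.0 * xi - math.sin(abs(xi)) for xi in x]

x_start = [1.0, 0.5, 2.0]
x_sol, iters = solve_monotone(F_demo, project_nonneg, x_start)
```

Only residual evaluations are used, so no Jacobian is required. The iterates stay feasible because every update ends with a projection, and their distance to the solution does not increase.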

3. Convergence Analysis

In this section, we analyze the convergence properties of Algorithm 1. We first make the following assumptions.

Assumption 2. The mapping $F$ is $L$-Lipschitz continuous on the nonempty closed convex set $\Omega$; that is, there exists a constant $L > 0$ such that $\|F(x) - F(y)\| \le L \|x - y\|$ for all $x, y \in \Omega$.

Assumption 3. The solution set of problem (1) is nonempty.

Assumption 4. The parameter satisfies inequality where is a positive number.

Assumption 4 is not difficult to satisfy. Taking the parameter in (15) as an example, if there exists a large number such that for all , then it satisfies the inequality (22). In fact,

The following two lemmas show that the search direction , no matter from (15) or (18), satisfies the sufficient descent condition.

Lemma 5. If , , and is generated by (15), then, for every ,

Proof. Since , then which satisfies (24). For every , multiplying (15) by , we have Denote and . By applying the inequality to the second term in (25), we obtain (24).

The lemma above is similar to Theorem 1.1 in [20]. From this lemma, we can see that the descent property of $d_k$ from (15) is independent of any line search and choices of the parameters and . However, different choices of the parameters , , and may yield very different numerical behaviors.

Lemma 6. Let be the sequence generated by (18), and then, for all , it holds that where .

Proof. The desired result is very easy to obtain. In fact, if , it is clear that . If , we have

The lemma above indicates that the descent property of from (18) is independent of the choices of .

Lemma 7. Suppose Assumptions 2 and 3 hold. Let be the steplength involved in Algorithm 1, and let sequences and be generated by Algorithm 1. Then steplength is well defined and satisfies the following inequality:

Proof. Suppose that, at th iteration, is not a solution, that is, , and, for all , inequality (19) fails to hold, and then Since is continuous, taking the limits with respect to on the both sides of (29) yields which contradicts Lemmas 5 and 6. So, the step length is well defined and can be determined within a finite number of trials.
Now, we prove inequality (28). If , then by using the selection of , we have Combining it with (24), (26), and the Lipschitz continuity of yields From Lemmas 5 and 6, we have that Since , then (33) indicates . So, it follows from inequality (32) that Then inequality (28) is obtained.

Lemma 8. Let and let sequences and be generated by Algorithm 1. Then one has Furthermore, And there exists a positive number such that and for all .

Proof. Since , then and . Since the mapping is monotone, then ; further, By using (19) and , we have which implies that Obviously, . By using the property (7) of the projection operator and (37), we have Substituting the second term in (40) by (39), inequality (35) follows.
Inequality (35) shows that the sequence $\{\|x_k - x^*\|\}$ is convergent, and then taking the limit as $k \to \infty$ on both sides of (35) yields (36).
Since is Lipschitz continuous, then from (35), we have And from (36), we know that there exists a positive number such that , and then Denote ; we have that and .

Theorem 9. Let Assumption 4 hold and let be a sequence generated by Algorithm 1. Then And one has

Proof. If is generated by (15), we have which satisfies (43). If is generated by (18), then The inequality (43) is obtained easily. From Lemma 8, we know that there exists such that , and then
Suppose (44) does not hold, then there exists such that From (33), we have that , which implies By inequalities (28), (47), (48), and (49), we have This contradicts (36). So the conclusion (44) holds.

4. Numerical Experiments

In this section, we give some specific versions of Algorithm 1 and investigate their numerical behaviors. Let us review the HS conjugate gradient method [28]. It generates search direction by (12) and parameter by Among early conjugate gradient methods, the HS method is a relatively efficient one. And many conjugate gradient methods are its improved versions, such as the well-known CG_DESCENT method. Now based on the HS method and Assumption 4, we give several specific versions of Algorithm 1(a) as follows.

Method 1. Consider Algorithm 1(a) with

Method 2. Consider Algorithm 1(a) with

Method 3. Consider Algorithm 1(a) with where .

Since the definition of parameter in Algorithm 1(b) could be the same as that in Algorithm 1(a), and the descent property of in Algorithm 1(b) is independent of the choices of parameter , we can give several specific versions of Algorithm 1(b) as follows.

Method 4. Consider Algorithm 1(b) with (52).

Method 5. Consider Algorithm 1(b) with

Method 6. Consider Algorithm 1(b) with

From Lemma 8, we know that a sequence generated by Algorithm 1 is norm bounded. Then it is easy to verify that the parameters in Methods 1–6 satisfy Assumption 4. So, from the convergence analysis in Section 3, we know that Methods 1–6 are convergent in the sense that $\liminf_{k \to \infty} \|F(x_k)\| = 0$.

Next, we test the performance of Methods 1–6 via the following four constrained monotone problems and compare them with the method (abbreviated as CGD_XZ) in [9].

Problem 10 (see [29]). The mapping is taken as , where , , and .

Problem 11 (see [17]). The mapping is taken as , where and .

Problem 12 (see [30]). The mapping is taken as , where and .

Problem 13 (see [31]). The mapping is given by and .

For Methods 1–6 and the CGD_XZ method, we set , , and , where and which is obtained by the monotonicity and the $L$-Lipschitz continuity of $F$. The stopping criterion is .

Our computations were carried out using MATLAB R2011b on a desktop computer with an Intel(R) Xeon(R) 2.40 GHz CPU, 6.00 GB of RAM, and the Windows operating system. The numerical results are reported in Tables 1, 2, 3, and 4, where the initial points , , , , , and and Dim, Iter, Nf, and CPU stand for the dimension of the problem, the number of iterations, the number of function evaluations, and the CPU time elapsed in seconds, respectively. Table 5 shows the number of test problems that each method solved with the fewest iterations, the fewest function evaluations, and the best time, respectively.

The performance of the seven methods was evaluated using the performance profiles of Dolan and Moré [32]. That is, we plotted the fraction of the test problems for which each of the methods was within a factor $\tau$ of the best time. Figures 1–3 show the performance profiles with respect to the number of iterations, the number of function evaluations, and CPU time, respectively. Figure 1 indicates that, relative to the number of iterations, Methods 1 and 2 performed best for near . When , CGD_XZ was comparable with Methods 1 and 2 and had a higher number of wins than Methods 3–6. Figure 2 reveals that, relative to the number of function evaluations, Method 2 performed best for near . When , Method 5 was the most robust, followed by Methods 3 and 4. Figure 3 reveals that, relative to the CPU time metric, Method 2 performed best for near . Method 5 was more robust when , and Methods 2–4 were competitive. CGD_XZ performed worst in this metric: it had a lower number of wins than the rest of the methods. From the analysis above, we conclude that all these methods solved the test problems efficiently. In terms of the number of wins, Method 2 performed best, which is also shown by Table 5, while in terms of robustness, Method 5 performed best.
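The construction of such a profile can be sketched in a few lines. The example below is an illustration under stated assumptions: the function name `performance_profile`, the solver labels, and the cost values are made up for this sketch and are not the measurements reported in the tables. A failed run is encoded as `None`, and each problem is assumed to be solved by at least one method.

```python
def performance_profile(costs, taus):
    """Dolan-More style profile.

    costs[s][p] is the cost (e.g. CPU time) of solver s on problem p,
    or None if solver s failed on problem p.
    Returns, for each solver, the fraction of problems whose ratio to the
    best cost is at most tau, for every tau in taus.
    """
    n_prob = len(next(iter(costs.values())))
    # Best cost per problem over all solvers that succeeded.
    best = [min(c[p] for c in costs.values() if c[p] is not None)
            for p in range(n_prob)]
    profile = {}
    for s, c in costs.items():
        ratios = [c[p] / best[p] if c[p] is not None else float("inf")
                  for p in range(n_prob)]
        profile[s] = [sum(r <= tau for r in ratios) / n_prob for tau in taus]
    return profile

# Made-up costs for two hypothetical solvers on four problems.
costs = {
    "A": [1.0, 4.0, 2.0, None],   # solver A fails on the last problem
    "B": [2.0, 2.0, 2.0, 8.0],
}
profile = performance_profile(costs, taus=[1.0, 2.0, 4.0])
```

The value of a profile at a factor of 1 is the fraction of wins (ties included), while its value for large factors reflects robustness, which is how the two notions are distinguished in the discussion above.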

5. Conclusions

In this paper, we discussed two unified frameworks of some sufficient descent conjugate gradient methods and combined them with the hyperplane projection method of Solodov and Svaiter to solve convex constrained nonlinear monotone equations. The two unified frameworks inherit the advantages of some usual conjugate gradient methods for solving large-scale unconstrained minimization problems. That is, they satisfy the sufficient descent condition ($F_k^T d_k \le -c\|F_k\|^2$ for some $c > 0$) independently of any line search, and they do not require the Jacobian of $F$, so they are suitable for solving large-scale nonsmooth monotone equations. In Section 4, we gave several specific versions of the two unified frameworks and investigated their numerical behaviors over some test problems. From the numerical results, we concluded that these specific versions are efficient.

Let us review problem (1) and introduce a monotone inclusion problem where the set-valued mapping is maximal monotone. Obviously, the latter is more general than the former, so a direction for further investigation is to extend these sufficient descent conjugate gradient methods to solve problem (60).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the National Science Foundation of China, no. 61373174.