Abstract

We propose and generalize a new nonlinear conjugate gradient method for unconstrained optimization. Global convergence is proved under the Wolfe line search. Numerical experiments are reported that support the theoretical analysis and show that the presented methods outperform the CGDESCENT method.

1. Introduction

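This paper is concerned with conjugate gradient methods for the unconstrained optimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1.1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and bounded from below. Starting from an initial point $x_1$, a nonlinear conjugate gradient method generates sequences $\{x_k\}$ and $\{d_k\}$ by the iteration
$$x_{k+1} = x_k + \alpha_k d_k, \tag{1.2}$$
where $\alpha_k$ is a step length determined by a line search, and the direction $d_k$ is generated as
$$d_k = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k d_{k-1}, & k \ge 2, \end{cases} \tag{1.3}$$
where $g_k$ is the gradient of $f$ at $x_k$ and $\beta_k$ is a scalar.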

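Different conjugate gradient algorithms correspond to different choices of the scalar parameter $\beta_k$. The well-known formulae for $\beta_k$ are
$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{PRP} = \frac{g_k^T (g_k - g_{k-1})}{\|g_{k-1}\|^2}, \quad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T (g_k - g_{k-1})}, \quad \beta_k^{HS} = \frac{g_k^T (g_k - g_{k-1})}{d_{k-1}^T (g_k - g_{k-1})}, \tag{1.4}$$
which are called Fletcher-Reeves [1] (FR), Polak-Ribière-Polyak [2] (PRP), Dai-Yuan [3] (DY), and Hestenes-Stiefel [4] (HS), respectively. FR and DY have strong convergence properties, but their practical performance may be modest. PRP and HS often perform better computationally, but they are not convergent in general.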
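To make the four classical choices concrete, the following minimal sketch evaluates them for one iteration. It assumes the standard textbook forms $\beta^{FR} = \|g_k\|^2/\|g_{k-1}\|^2$, $\beta^{PRP} = g_k^T y_{k-1}/\|g_{k-1}\|^2$, $\beta^{DY} = \|g_k\|^2/(d_{k-1}^T y_{k-1})$, and $\beta^{HS} = g_k^T y_{k-1}/(d_{k-1}^T y_{k-1})$ with $y_{k-1} = g_k - g_{k-1}$. The helper names `dot` and `cg_betas` and the sample vectors are illustrative and do not come from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cg_betas(g_new, g_old, d_old):
    """Return the FR, PRP, DY and HS parameters for one iteration.

    g_new, g_old: gradients at x_k and x_{k-1}; d_old: direction d_{k-1}.
    """
    y = [a - b for a, b in zip(g_new, g_old)]  # y_{k-1} = g_k - g_{k-1}
    return {
        "FR": dot(g_new, g_new) / dot(g_old, g_old),
        "PRP": dot(g_new, y) / dot(g_old, g_old),
        "DY": dot(g_new, g_new) / dot(d_old, y),
        "HS": dot(g_new, y) / dot(d_old, y),
    }


# Illustrative data (hypothetical): first CG step, so d_{k-1} = -g_{k-1}.
betas = cg_betas(g_new=[1.0, -2.0], g_old=[2.0, 0.0], d_old=[-2.0, 0.0])
print(betas)
```

FR and PRP share a denominator, as do DY and HS, which is why mixed methods can combine numerators and denominators from different formulas.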

These observations motivate us to derive more efficient algorithms. In this paper, we focus on mixed conjugate gradient methods, which combine different conjugate gradient methods. The aim is to propose new methods that possess both global convergence and good numerical performance.

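The line search in conjugate gradient algorithms is often based on the Wolfe inexact line search
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{1.5}$$
$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{1.6}$$
where $0 < \delta < \sigma < 1$.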
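As an illustration, the sketch below checks whether a trial step satisfies the two Wolfe conditions in their standard form: sufficient decrease, $f(x + \alpha d) \le f(x) + \delta \alpha g^T d$, and curvature, $g(x + \alpha d)^T d \ge \sigma g^T d$, with $0 < \delta < \sigma < 1$. The function name `satisfies_wolfe`, the default values of `delta` and `sigma`, and the quadratic test function are assumptions made for the example, not values taken from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.9):
    """Check the (weak) Wolfe conditions for the step alpha along d."""
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    slope0 = dot(grad(x), d)  # directional derivative at alpha = 0
    sufficient_decrease = f(x_new) <= f(x) + delta * alpha * slope0
    curvature = dot(grad(x_new), d) >= sigma * slope0
    return sufficient_decrease and curvature


# Hypothetical test problem: f(x) = 0.5 * ||x||^2, steepest descent direction.
def f(x):
    return 0.5 * dot(x, x)


def grad(x):
    return list(x)


x0, d0 = [1.0, 1.0], [-1.0, -1.0]
ok_exact = satisfies_wolfe(f, grad, x0, d0, alpha=1.0)   # exact minimizer along d0
ok_short = satisfies_wolfe(f, grad, x0, d0, alpha=1e-3)  # too short: curvature fails
ok_long = satisfies_wolfe(f, grad, x0, d0, alpha=2.0)    # too long: no decrease
```

The sufficient-decrease condition rejects steps that are too long, and the curvature condition rejects steps that are too short.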

Much research has been devoted to the parameter $\beta_k$ [5–7]. For example, Al-Baali [8] proved that FR is globally convergent with an inexact line search in which $\sigma < 1/2$. Liu et al. [9] extended the results of [8] to the case $\sigma = 1/2$. Dai and Yuan [10] gave an example showing that, when $\sigma > 1/2$, FR may produce an ascent direction.

PRP is known for having the best performance among conjugate gradient methods, since it is by nature a restart method: when the step is small, the factor $g_k - g_{k-1}$ in the numerator of $\beta_k^{PRP}$ tends to zero, and the search direction $d_k$ is close to $-g_k$. Gilbert and Nocedal [11] proposed PRP+, the most successful modification, that is, $\beta_k^{PRP+} = \max\{\beta_k^{PRP}, 0\}$. Dai and Yuan [12] presented the DY method and proved its global convergence when the line search satisfies the Wolfe conditions. Zheng et al. [13] derived new formulas and discussed their properties.

HS is similar to PRP and coincides with PRP under an exact line search. Unlike the other methods, HS satisfies the conjugacy condition $d_k^T (g_k - g_{k-1}) = 0$.

Mixed formulas for $\beta_k$ were given by Touati-Ahmed and Storey [14], Dai and Chen [15], and Dai and Ni [16]. Throughout the paper, $\|\cdot\|$ stands for the Euclidean norm.

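Hager and Zhang (CGDESCENT) [17] proposed a conjugate gradient method with guaranteed descent, which corresponds to the following choice of the update parameter:
$$\bar{\beta}_k^N = \max\{\beta_k^N, \eta_k\},$$
where, with $y_{k-1} = g_k - g_{k-1}$,
$$\beta_k^N = \frac{1}{d_{k-1}^T y_{k-1}} \left( y_{k-1} - 2 d_{k-1} \frac{\|y_{k-1}\|^2}{d_{k-1}^T y_{k-1}} \right)^T g_k, \qquad \eta_k = \frac{-1}{\|d_{k-1}\| \min\{\eta, \|g_{k-1}\|\}}.$$
Here, $\eta > 0$ is a constant. Extensive numerical tests and comparisons with other methods showed that this method has advantages in some respects.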
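The following sketch evaluates the Hager-Zhang parameter in the form commonly given in the literature, written in this paper's indexing: $\beta^N = (y - 2 d \|y\|^2/(d^T y))^T g_k / (d^T y)$ with $y = g_k - g_{k-1}$ and $d = d_{k-1}$, truncated from below by $\eta_k = -1/(\|d\| \min\{\eta, \|g_{k-1}\|\})$. The function name `hz_beta`, the default `eta=0.01`, and the sample vectors are assumptions for illustration only.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def hz_beta(g_new, g_old, d_old, eta=0.01):
    """Truncated Hager-Zhang parameter max{beta_N, eta_k} (assumed form)."""
    y = [a - b for a, b in zip(g_new, g_old)]  # y = g_k - g_{k-1}
    dy = dot(d_old, y)
    yy = dot(y, y)
    w = [yi - 2.0 * di * yy / dy for yi, di in zip(y, d_old)]
    beta_n = dot(w, g_new) / dy
    eta_k = -1.0 / (math.sqrt(dot(d_old, d_old))
                    * min(eta, math.sqrt(dot(g_old, g_old))))
    return max(beta_n, eta_k)


# Hypothetical data: check the descent property of the resulting direction.
g_new, g_old, d_old = [1.0, -2.0], [2.0, 0.0], [-2.0, 0.0]
beta = hz_beta(g_new, g_old, d_old)
d_new = [-gn + beta * do for gn, do in zip(g_new, d_old)]
slope = dot(g_new, d_new)  # negative for a descent direction
```

In this example the new direction satisfies $g_k^T d_k \le -\tfrac{7}{8}\|g_k\|^2$, the sufficient descent bound associated with this choice of parameter.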

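Zhang et al. (ZZL) [18] derived a descent modified PRP (MPRP) conjugate gradient method, in which the direction is generated by
$$d_k = -g_k + \beta_k^{PRP} d_{k-1} - \theta_k y_{k-1}, \qquad \theta_k = \frac{g_k^T d_{k-1}}{\|g_{k-1}\|^2}, \qquad y_{k-1} = g_k - g_{k-1}.$$
The numerical results suggested that the efficiency of the MPRP method is encouraging.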
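A short sketch of the MPRP direction, assuming the form usually quoted for it: $d_k = -g_k + \beta_k^{PRP} d_{k-1} - \theta_k y_{k-1}$ with $\theta_k = g_k^T d_{k-1}/\|g_{k-1}\|^2$ and $y_{k-1} = g_k - g_{k-1}$. With this choice the two correction terms cancel in $g_k^T d_k$, so $g_k^T d_k = -\|g_k\|^2$ regardless of the line search. The function name `mprp_direction` and the sample vectors are illustrative.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def mprp_direction(g_new, g_old, d_old):
    """Three-term MPRP direction (assumed form, see the lead-in)."""
    y = [a - b for a, b in zip(g_new, g_old)]   # y_{k-1}
    gg_old = dot(g_old, g_old)
    beta_prp = dot(g_new, y) / gg_old
    theta = dot(g_new, d_old) / gg_old
    return [-gn + beta_prp * do - theta * yi
            for gn, do, yi in zip(g_new, d_old, y)]


# Hypothetical data: d_old is deliberately not the steepest descent direction.
g_new, g_old, d_old = [1.0, -2.0], [2.0, 0.0], [-2.0, 1.0]
d_new = mprp_direction(g_new, g_old, d_old)
slope = dot(g_new, d_new)  # equals -||g_new||^2 by construction
```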

Considering the above mixed techniques and the properties of the classical conjugate gradient methods, we present new mixed methods. The main differences between the new methods and the existing ones are the choice of $\beta_k$ and the generalization of the new method. Moreover, the directions generated by the new methods are descent directions of the objective function under mild conditions. The overall performance of the methods is reported in the numerical results.

Firstly, we present a new formula

The rest of the paper is organized as follows. In Section 2, we give a new mixed conjugate gradient algorithm and its convergence analysis. Section 3 is devoted to a generalization of the new mixed method. In Section 4, numerical results and comparisons with the CGDESCENT and ZZL methods on test problems are reported, showing the advantage of the new methods.

2. A New Algorithm and Convergence Analysis

We discuss a new mixed conjugate gradient method where .

Algorithm 2.1.
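Step 1. Give $x_1 \in \mathbb{R}^n$, $\varepsilon \ge 0$, $d_1 = -g_1$; $k := 1$.
Step 2. If $\|g_k\| \le \varepsilon$, stop; else go to Step 3.
Step 3. Find $\alpha_k$ satisfying the Wolfe conditions (1.5) and (1.6).
Step 4. Compute the new iterate by $x_{k+1} = x_k + \alpha_k d_k$.
Step 5. Compute $\beta_{k+1}$ by (2.1), $d_{k+1} = -g_{k+1} + \beta_{k+1} d_k$, $k := k + 1$, go to Step 2.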
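The five steps can be sketched as a generic conjugate gradient loop. This is only a skeleton under stated assumptions: it does not implement formula (2.1); the parameter is supplied through a pluggable `beta_fn`, and the default used here is PRP+ [11] as a stand-in. The bisection-style Wolfe search, the restart safeguard, all function names, and the values of `eps`, `delta`, and `sigma` are likewise choices made for the example rather than the authors' settings.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def wolfe_step(f, grad, x, d, delta, sigma, max_trials=60):
    """Bisection/expansion search for a step satisfying the Wolfe conditions."""
    lo, hi, alpha = 0.0, math.inf, 1.0
    f0, slope0 = f(x), dot(grad(x), d)
    for _ in range(max_trials):
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        if f(x_new) > f0 + delta * alpha * slope0:    # decrease fails: shrink
            hi = alpha
            alpha = 0.5 * (lo + hi)
        elif dot(grad(x_new), d) < sigma * slope0:    # curvature fails: grow
            lo = alpha
            alpha = 2.0 * lo if hi == math.inf else 0.5 * (lo + hi)
        else:
            break
    return alpha


def prp_plus(g_new, g_old, d_old):
    """Stand-in parameter (NOT formula (2.1)): PRP+ = max{beta_PRP, 0}."""
    y = [a - b for a, b in zip(g_new, g_old)]
    return max(dot(g_new, y) / dot(g_old, g_old), 0.0)


def cg_minimize(f, grad, x1, beta_fn=prp_plus, eps=1e-6,
                delta=1e-4, sigma=0.1, max_iter=10000):
    x, g = list(x1), grad(x1)
    d = [-gi for gi in g]                                   # Step 1
    for k in range(1, max_iter + 1):
        if math.sqrt(dot(g, g)) <= eps:                     # Step 2
            break
        alpha = wolfe_step(f, grad, x, d, delta, sigma)     # Step 3
        x = [xi + alpha * di for xi, di in zip(x, d)]       # Step 4
        g_new = grad(x)
        beta = beta_fn(g_new, g, d)                         # Step 5
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        if dot(g_new, d) >= 0.0:
            # Safeguard added for this sketch only: restart with steepest
            # descent when the stand-in beta does not give a descent direction.
            d = [-gn for gn in g_new]
        g = g_new
    return x, k


# Hypothetical convex quadratic with minimizer (1, -2).
def f(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def grad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_star, iters = cg_minimize(f, grad, [0.0, 0.0])
```

Passing a different `beta_fn` with the same signature `(g_new, g_old, d_old)` changes the method without touching the loop, which is how mixed formulas can be compared on equal footing.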

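In order to derive the global convergence of the algorithm, we use the following assumptions.
H 2.1. The objective function $f$ is bounded from below on the level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_1)\}$, where $x_1$ is the starting point.
H 2.2. $f$ is continuously differentiable in a neighborhood $N$ of $\Omega$ and its gradient $g$ is Lipschitz continuous; that is, there exists a constant $L > 0$ such that $\|g(x) - g(y)\| \le L \|x - y\|$ for all $x, y \in N$.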

Lemma 2.2 (see Zoutendijk condition [19]). Suppose that H 2.1 and H 2.2 hold. If the conjugate gradient method satisfies $g_k^T d_k < 0$, then
$$\sum_{k \ge 1} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < +\infty.$$

Theorem 2.3. Suppose that H 2.1 and H 2.2 hold. Let $\{x_k\}$ and $\{d_k\}$ be generated by (1.2) and (1.3), where $\beta_k$ is computed by (2.1) and $\alpha_k$ satisfies the Wolfe line search conditions. Then $g_k^T d_k < 0$ holds for all $k \ge 1$.

Proof. The conclusion can be proved by induction. When , we have . Suppose that hold for . From (1.3), we have
When , it is obvious that
When , from (1.6) we have Then, from , then we can deduce that holds for all .
Thus, the theorem is proved.

Theorem 2.4. Suppose that H 2.1 and H 2.2 hold. Consider Algorithm 2.1, where $\beta_k$ is determined by (2.1). If $g_k^T d_k < 0$ holds for any $k$, then
$$\liminf_{k \to \infty} \|g_k\| = 0. \tag{2.9}$$

Proof. By contradiction, assume that (2.9) does not hold. Then there exists a constant , such that
From (2.1),
By (1.3), if , we derive
Then,
So,
From (1.3), we have
By squaring both sides of (2.15) and rearranging, we get Then, Since, From (2.10), we have Therefore, This contradicts Lemma 2.2, which proves the global convergence.

3. Generalization of the New Method and Convergence

The generalization of the new mixed method is as follows: where , .

Algorithm 3.1.
Steps 1–4 are the same as those of Algorithm 2.1.
Step 5. Compute $\beta_{k+1}$ by (3.1), $d_{k+1} = -g_{k+1} + \beta_{k+1} d_k$, $k := k + 1$, go to Step 2.

Theorem 3.2. Suppose that H 2.1 and H 2.2 hold. Let $\{x_k\}$ and $\{d_k\}$ be generated by (1.2) and (1.3), where $\beta_k$ is computed by (3.1) and $\alpha_k$ satisfies the Wolfe line search conditions. Then $g_k^T d_k < 0$ holds for all $k \ge 1$.

Proof. The conclusion can be proved by induction. When , we have . Suppose that holds. For , it is obvious that if , then
When , from (2.5) and (3.1), we have
To sum up, the theorem is proved.

Theorem 3.3. Suppose that H 2.1 and H 2.2 hold. Consider Algorithm 3.1, where $\beta_k$ is determined by (3.1). If $g_k^T d_k < 0$ holds for any $k$, then
$$\liminf_{k \to \infty} \|g_k\| = 0. \tag{3.4}$$

Proof. By contradiction, assume that (3.4) does not hold. Then there exists a constant such that
From (3.1)
By (1.3), we have
Then, From (3.6), By (3.5), we have This is a contradiction to Lemma 2.2, and the global convergence is proved.

4. Numerical Results

This section is devoted to testing the implementation of the new methods. We compare the performance of the new methods with that of the CGDESCENT and ZZL methods.

All tests in this paper are implemented on a PC with a 1.8 GHz Pentium IV processor and 256 MB SDRAM using MATLAB 6.5. The iterations stop when the termination test on the gradient is satisfied. Some classical test functions with standard starting points are selected to test the methods. These functions are widely used in the literature to test unconstrained optimization algorithms [20].

In the tables, the four reported values are the number of iterations/function evaluations/gradient evaluations/CPU time (s), and stands for the squared norm of the gradient at the final iterate. When we set , , , , , the numerical results of NEW1 and NEW2 are listed in Table 1 and those of CGDESCENT and ZZL (NEW3) are listed in Table 2. When we set , , , , , the numerical results of NEW1 (NEW4) and NEW2 (NEW5) are listed in Table 3 and those of CGDESCENT (NEW6) and ZZL (NEW7) are listed in Table 4. It can be observed from Tables 1–4 that, for most of the problems, the new methods are superior to the other methods in terms of the number of iterations, function evaluations, and gradient evaluations.

Compared with the CGDESCENT method, the new methods are effective (see Table 5).

Using the formula , where is fixed constant, let . By where ; , is the whole of classical problems’ order. If , then CGDESCENT method is regarded as better performance; if , the methods have the same performances and if , the new methods are performed better.

We use as a measure to compare the performance of CGDESCENT method and the new methods, where is the number of . If , then NEW method outperforms CGDESCENT method. The computational results are listed in Table 5.

It is obvious that , where , so we can deduce that the new methods outperform CGDESCENT method.

Acknowledgments

This work is supported in part by the NNSF (11171003) of China, Key Project of Chinese Ministry of Education (no. 211039), and Natural Science Foundation of Jilin Province of China (no. 201215102).