Special Issue: Optimization Algorithms Combining (Meta)heuristics and Mathematical Programming and Its Application in Engineering
Research Article | Open Access
A Three-Term Conjugate Gradient Algorithm with Quadratic Convergence for Unconstrained Optimization Problems
This paper further studies the WYL conjugate gradient (CG) formula with $\beta_k^{WYL} \ge 0$ and presents a three-term WYL CG algorithm that has the sufficient descent property without any additional conditions. Global convergence and linear convergence are proved; moreover, n-step quadratic convergence with a restart strategy is established if the initial step length is appropriately chosen. Numerical experiments on large-scale problems, including standard unconstrained optimization problems and engineering problems (benchmark problems), show that the new algorithm is competitive with other similar CG algorithms.
Consider the following minimization model:
$$\min \{ f(x) \mid x \in \mathbb{R}^n \}, \tag{1}$$
where $f: \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function. The CG algorithms for (1) use the iteration
$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots, \tag{2}$$
where $x_k$ is the $k$th iterate, $\alpha_k$ is the step length, and $d_k$ is the search direction defined by
$$d_{k+1} = -g_{k+1} + \beta_k d_k, \quad d_0 = -g_0, \tag{3}$$
where $g_{k+1} = \nabla f(x_{k+1})$ is the gradient and the parameter $\beta_k$ is a scalar whose choice determines the different CG formulas (see [1–8], etc.). The PRP algorithm [6, 7] is one of the most effective CG algorithms, and its convergence has been studied (see [7, 9, 10], etc.). Powell suggested that $\beta_k$ should not be less than zero; many new CG formulas were then proposed (see [11–17], etc.) to ensure that the scalar $\beta_k \ge 0$. At present, many results have been obtained for CG algorithms (see [11, 18–26], etc.), and a modified weak Wolfe-Powell line search technique has been presented to study open problems in unconstrained optimization (see [27, 28]). If a restart strategy is used, the PRP algorithm is n-step quadratically convergent (see [29–31]). Li and Tian proved that a three-term CG algorithm has quadratic convergence with a restart strategy under some inexact line searches and suitable assumptions.
Recently, Wei et al. proposed a new CG formula defined by
$$\beta_k^{WYL} = \frac{g_{k+1}^T \left( g_{k+1} - \frac{\|g_{k+1}\|}{\|g_k\|} g_k \right)}{\|g_k\|^2}, \tag{4}$$
where $g_k$ and $g_{k+1}$ are the gradients of $f$ at $x_k$ and $x_{k+1}$, respectively, and $\|\cdot\|$ denotes the Euclidean norm of vectors. By the Cauchy-Schwarz inequality, it is easy to deduce that
$$0 \le \beta_k^{WYL} \le \frac{2\|g_{k+1}\|^2}{\|g_k\|^2}.$$
The global convergence of the WYL algorithm with the exact line search, the Grippo-Lucidi Armijo line search, and the Wolfe-Powell line search has been established in [21, 33–35]. By restricting the parameter under the strong Wolfe-Powell line search, the WYL algorithm can satisfy the sufficient descent property. However, quadratic convergence is still open for this algorithm. In this paper, we further study the WYL algorithm. Building on the earlier work cited above, we propose a new WYL three-term CG formula. We show that the new CG algorithm is globally convergent for general functions and has n-step quadratic convergence for uniformly convex functions with r-step restart and the standard Armijo line search under appropriate conditions. The numerical results show that the new algorithm performs quite well. The main attributes of this algorithm are listed as follows.
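As a small illustration of the WYL parameter and its two-sided bound, the following sketch evaluates the formula on random gradient pairs. It assumes the standard WYL form from the literature, $\beta_k^{WYL} = g_{k+1}^T(g_{k+1} - \frac{\|g_{k+1}\|}{\|g_k\|} g_k)/\|g_k\|^2$; the function name `beta_wyl` and the random test data are hypothetical and are not taken from the paper.

```python
import numpy as np

def beta_wyl(g_new, g_old):
    """WYL parameter (assumed standard form): g_new^T y* / ||g_old||^2,
    with y* = g_new - (||g_new|| / ||g_old||) * g_old."""
    norm_new = np.linalg.norm(g_new)
    norm_old = np.linalg.norm(g_old)
    y_star = g_new - (norm_new / norm_old) * g_old
    return float(g_new @ y_star) / norm_old**2

# Check 0 <= beta <= 2 ||g_new||^2 / ||g_old||^2 on random (illustrative) data.
rng = np.random.default_rng(0)
for _ in range(1000):
    g_old = rng.standard_normal(5)
    g_new = rng.standard_normal(5)
    b = beta_wyl(g_new, g_old)
    upper = 2.0 * np.linalg.norm(g_new)**2 / np.linalg.norm(g_old)**2
    assert -1e-12 <= b <= upper + 1e-12
```

The lower bound follows from the Cauchy-Schwarz inequality applied to $g_{k+1}^T g_k$, which is why no explicit truncation at zero is needed for this formula.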
(i) A new WYL three-term CG algorithm is introduced, which has the sufficient descent property automatically.
(ii) The global convergence, the linear convergence rate, and the n-step quadratic convergence are established.
(iii) Numerical results show that this algorithm is competitive with the standard algorithm for the given problems.
This paper is arranged as follows. In Section 2, we review the motivation and introduce the modified WYL algorithm. In Section 3, we establish the global convergence and the r-linear convergence of the new algorithm with the standard Armijo line search. In Section 4, the n-step quadratic convergence of the given algorithm is proved. In Section 5, some numerical experiments are reported.
2. Motivation and Algorithm
In this section, we give the motivation based on the WYL formula. Consider the WYL search direction ; it is well known that satisfies the sufficient descent property when the parameter is restricted under the strong Wolfe-Powell line search . By the definition of in (4), for , we get In order to ensure that the sufficient descent property holds, the first term of the above equality should be retained. So the direct idea is to add another term that eliminates the second term of the above equality; at the same time, conjugacy should be guaranteed. Therefore the new conjugate gradient formula, called the MWYL algorithm, is defined by where and . It is easy to see that the above search direction reduces to the standard WYL direction if the exact line search is used. It is not difficult to see from (6) that is a descent direction of at ; namely, we have and Moreover we obtain and Now we list some line search techniques that will be used in the following sections.
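The construction described above (add a third term that cancels the $\beta_k g_{k+1}^T d_k$ contribution) can be sketched numerically. The explicit formula (6) is not reproduced in this text, so the sketch below assumes the form $d_{k+1} = -g_{k+1} + \beta_k^{WYL} d_k - \theta_k y_k^*$ with $y_k^* = g_{k+1} - \frac{\|g_{k+1}\|}{\|g_k\|} g_k$ and $\theta_k = g_{k+1}^T d_k / \|g_k\|^2$, which is one form consistent with the description; the name `mwyl_direction` is hypothetical.

```python
import numpy as np

def mwyl_direction(g_new, g_old, d_old):
    """Three-term direction in an ASSUMED form (not quoted from the paper):
    d_new = -g_new + beta * d_old - theta * y_star."""
    norm_old_sq = float(g_old @ g_old)
    y_star = g_new - (np.linalg.norm(g_new) / np.sqrt(norm_old_sq)) * g_old
    beta = float(g_new @ y_star) / norm_old_sq    # WYL parameter
    theta = float(g_new @ d_old) / norm_old_sq    # weight of the third term
    return -g_new + beta * d_old - theta * y_star

# The third term cancels beta * g_new^T d_old, so g_new^T d_new = -||g_new||^2
# holds for any previous direction, independent of the line search.
rng = np.random.default_rng(1)
g_old, g_new, d_old = (rng.standard_normal(4) for _ in range(3))
d_new = mwyl_direction(g_new, g_old, d_old)
print(float(g_new @ d_new) + float(g_new @ g_new))  # zero up to rounding
```

Under this assumed form, the third term vanishes whenever $g_{k+1}^T d_k = 0$, which matches the remark that the direction reduces to the WYL direction under the exact line search.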
(i) The exact line search is to find such that the function is minimized along the direction , that is, , satisfying
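(ii) The Armijo line search is to find a step length $\alpha_k = \max\{\rho^j \mid j = 0, 1, 2, \ldots\}$ which satisfies $$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{12}$$ where $\delta \in (0, 1)$ and $\rho \in (0, 1)$.
(iii) The Wolfe line search conditions are $$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \qquad g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{13}$$ where $\delta \in (0, 1)$ and $\sigma \in (\delta, 1)$.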
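The inexact line searches listed above can be sketched as follows. This is a minimal illustration, assuming the usual backtracking form of the Armijo rule (trial steps $1, \rho, \rho^2, \ldots$) and the standard Wolfe conditions; the function names, default parameter values, and the toy objective are hypothetical choices, not values from the paper.

```python
import numpy as np

def armijo_step(f, x, d, g, delta=1e-4, rho=0.5, alpha0=1.0, max_backtracks=60):
    """Backtracking Armijo rule: return the first alpha in
    {alpha0, alpha0*rho, alpha0*rho^2, ...} with
    f(x + alpha d) <= f(x) + delta * alpha * g^T d."""
    fx = f(x)
    slope = float(g @ d)          # directional derivative, negative for descent
    alpha = alpha0
    for _ in range(max_backtracks):
        if f(x + alpha * d) <= fx + delta * alpha * slope:
            return alpha
        alpha *= rho
    return alpha                  # fallback: smallest trial step

def wolfe_conditions_hold(f, grad, x, d, alpha, delta=1e-4, sigma=0.9):
    """Check sufficient decrease and the curvature condition at a given alpha."""
    g = grad(x)
    slope = float(g @ d)
    decrease = f(x + alpha * d) <= f(x) + delta * alpha * slope
    curvature = float(grad(x + alpha * d) @ d) >= sigma * slope
    return decrease and curvature

# Toy objective f(x) = ||x||^2 (illustrative only).
f = lambda x: float(x @ x)
grad = lambda x: 2.0 * x
x0 = np.array([1.0, 1.0])
d0 = -grad(x0)
alpha = armijo_step(f, x0, d0, grad(x0))
print(alpha, wolfe_conditions_hold(f, grad, x0, d0, alpha))
```

The curvature condition is what rules out very short steps: a tiny step satisfies the sufficient decrease inequality but leaves the directional derivative almost unchanged.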
In the following we will give the MWYL algorithm.
Algorithm 1 (a modified WYL three-term CG algorithm, called MWYL).
Step 0: Choose an initial point , , , ; let , 
Step 1: If , stop. Otherwise go to the next step.
Step 2: Compute step size by the Armijo line search or Wolfe line search.
Step 3: Let . If , then stop.
Step 4: Compute the search direction using (6).
Step 5: Let . Go to Step 2.
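The steps of Algorithm 1 can be put together in a short sketch. It is not the authors' implementation: the three-term direction is written in an assumed form ($d_{k+1} = -g_{k+1} + \beta_k^{WYL} d_k - \theta_k y_k^*$, with $\theta_k = g_{k+1}^T d_k/\|g_k\|^2$), the Armijo rule is a plain backtracking loop, and the function name, tolerances, and the small quadratic test problem are hypothetical.

```python
import numpy as np

def mwyl_minimize(f, grad, x0, tol=1e-6, delta=1e-4, rho=0.5, max_iter=10000):
    """Sketch of a three-term WYL-type CG loop with Armijo backtracking.
    The direction update is an assumed form, not quoted from the paper."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                    # initial steepest descent direction
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:          # stopping test
            return x, k
        # Armijo backtracking from the trial step 1
        fx, slope, alpha = f(x), float(g @ d), 1.0
        while f(x + alpha * d) > fx + delta * alpha * slope and alpha > 1e-16:
            alpha *= rho
        x_new = x + alpha * d
        g_new = grad(x_new)
        if np.linalg.norm(g_new) <= tol:
            return x_new, k + 1
        # three-term direction (assumed form)
        gg = float(g @ g)
        y_star = g_new - (np.linalg.norm(g_new) / np.sqrt(gg)) * g
        beta = float(g_new @ y_star) / gg
        theta = float(g_new @ d) / gg
        d = -g_new + beta * d - theta * y_star
        x, g = x_new, g_new
    return x, max_iter

# Illustrative strictly convex quadratic: f(x) = 0.5 x^T A x - b^T x.
A = np.diag([1.0, 2.0, 3.0])
b = np.ones(3)
x_star, iters = mwyl_minimize(lambda x: 0.5 * float(x @ A @ x) - float(b @ x),
                              lambda x: A @ x - b, np.zeros(3))
print(x_star, iters)
```

Because the assumed direction satisfies $g_k^T d_k = -\|g_k\|^2$ at every iteration, the backtracking loop always starts from a descent direction and no safeguard restart is needed in this sketch.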
3. Convergence of Algorithm 1
In this part, we will prove the global convergence and the r-linear convergence of the algorithm with the Armijo line search and Wolfe line search. The following assumptions are required.
Assumption i. The level set is bounded and, in some neighborhood of , is continuously differentiable and bounded below, and its gradient is globally Lipschitz continuous; namely, there exists a constant such that where is the closed convex hull of
Now we establish the global convergence of Algorithm 1.
Theorem 2. Let Assumption i hold and the sequence be generated by Algorithm 1. Then the relation holds.
Proof. We will prove this theorem by contradiction. Suppose that (15) does not hold, then, for all , there exists a constant satisfying Using (9) and (12), if is bounded from below, it is not difficult to get In particular, we have If , by (9) and (18), we obtain This contradicts (16); then (15) holds.
Otherwise if Then there exists an infinite index set satisfying By Step 2 of Algorithm 1, when is sufficiently large, does not satisfy (12), which implies that By (16), similar to the proof of Lemma 2.1 in , it is easy to deduce that there exists a constant such that Using (21), (9), and the mean-value theorem, we have where and the last inequality follows (26). Combining with (20), for all sufficiently large, we obtain By (21) and , then the above inequality implies that This is a contradiction too. The proof is complete.
Next, we prove the linear convergence of the sequence generated by the MWYL algorithm with the Armijo or Wolfe line search. The following additional assumption is needed.
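Assumption ii. $f$ is twice continuously differentiable and uniformly convex. In other words, there are positive constants $m \le M$ such that $$m\|z\|^2 \le z^T \nabla^2 f(x) z \le M\|z\|^2 \quad \text{for all } x, z \in \mathbb{R}^n,$$ where $\nabla^2 f(x)$ denotes the Hessian matrix of $f$ at $x$.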
It is not difficult to see that, under the Assumption ii, is continuous and is Lipschitz continuous and problem (1) has a unique solution which satisfies and
Lemma 3. Let Assumption ii hold and the sequence be generated by the MWYL algorithm with the Armijo or Wolfe line search, one has where and . In addition, if the Wolfe line search is used, the following holds:
Proof. We have from line search (12) that hold for any , because the objective function is uniformly convex of Assumption ii, is bounded below, so the inequality holds. Combining the Taylor theorem and Assumption ii, we obtain where belong to the segment . Therefore, we get By the inequalities and , we get . Using (12), (30), and Assumption ii, we obtain which includes .
It is not difficult to get that By the second inequality of (13), we get and By Assumption ii, we obtain . This completes the proof.
Lemma 4. Let the sequence be generated by the MWYL algorithm with the Armijo or Wolfe line search and Assumption ii hold; then there is a constant such that
Proof. Set where . By the mean-value theorem, we have and Therefore, by (9), (39), Lemma 3, and the Assumption ii, we get and By the above conclusion, (6), and the Lipschitz continuity of , we get If the Armijo line search is used, using the line search rule, if , then will not satisfy line search condition (12). Namely, Using the mean-value theorem and the above inequality, there exists satisfying Thus, by the above conclusion and (42), we get and letting , we have (36). If the Wolfe line search is used, from the second inequality of (13), we obtain By similar way to that for the Armijo line search, we can find a lower positive bound of ; the proof is completed.
Theorem 5. Let Assumption ii hold, be the unique solution of (1), and the sequence be generated by the MWYL algorithm with the Armijo or Wolfe line search. Then there are constants and satisfying
Proof. By (12) or the first relation of (13), we get where the first equality follows (9), the second inequality follows (26), and the last inequality (25). Setting generates By (25) again, we have and this relation shows that the proof is complete.
4. The Restart MWYL Algorithm’s N-Step Quadratic Convergence
Let be the exact line search step length; then holds. Thus, where . It is feasible to use the initial step length of the Armijo or Wolfe line search as an approximation of , where is defined by where the integer sequence as . If is a quadratic function, then and are consistent; namely, The above discussion can also be found in ; in fact, our ideas are partly motivated by that paper. Theorem 6 below shows that, for sufficiently large , the inexact line search step defined by (53) satisfies the Armijo and Wolfe conditions.
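The idea of approximating the exact step length by a gradient difference can be illustrated as follows. Formula (53) is not reproduced in this text, so the sketch assumes the finite-difference form $\alpha_k = -\epsilon_k g_k^T d_k / \big(d_k^T (g(x_k + \epsilon_k d_k) - g_k)\big)$ with a small $\epsilon_k > 0$; this form, the name `initial_step`, and the test data are assumptions made for illustration only.

```python
import numpy as np

def initial_step(grad, x, d, eps):
    """Finite-difference approximation of the exact line search step
    (ASSUMED form): -eps * g^T d / (d^T (grad(x + eps d) - g))."""
    g = grad(x)
    return -eps * float(g @ d) / float(d @ (grad(x + eps * d) - g))

# For a quadratic f(x) = 0.5 x^T A x, grad(x + eps d) - grad(x) = eps * A d,
# so the approximation coincides with the exact step -g^T d / (d^T A d).
A = np.diag([1.0, 4.0, 9.0])
grad = lambda x: A @ x
x = np.ones(3)
d = -grad(x)
exact = -float(grad(x) @ d) / float(d @ A @ d)
approx = initial_step(grad, x, d, eps=1e-3)
print(exact, approx)
```

For a general twice continuously differentiable function the two values differ by a term that shrinks with $\epsilon_k$, which is consistent with requiring the sequence to tend to zero.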
Theorem 6. Let sequence be generated by the MWYL algorithm and Assumption ii hold. Then, when is sufficiently large, satisfies the Armijo and Wolfe conditions.
Proof. Let , using Assumption ii and (10), we have and Using , Assumption ii, (47), and (55), we get For is sufficiently large, we have When is sufficiently large, satisfies the Armijo condition. Setting , we get So, for sufficiently large , we have This implies that satisfy the Wolfe line search. The proof is complete.
If a restart strategy is used, n-step quadratic convergence can be expected. Next, we use as the initial step length and give the steps of the restart MWYL algorithm.
Algorithm 7 (called RWYL).
Step 0: Given , , , , , let 
Step 1: If , stop.
Step 2: If the inequality holds, we set . Otherwise, we determine satisfying 
Step 3: Let , and .
Step 4: If , stop.
Step 5: If , we let . Go to Step 1.
Step 6: Compute by (6). Go to Step 2.
Lemma 8. Let Assumption ii hold and be generated by the RWYL algorithm. Then there exist positive numbers , , such that
Proof. Considering the first inequality of (62), we get where . By the definition of we discuss the other three inequalities of (62), respectively. Starting from , by the (39) and (62), we get By (40), (62), and the definition of , we obtain By the above conclusion and the definition of , we have The proof is complete.
In the following, we prove the n-step quadratic convergence of the RWYL algorithm. We always let Assumption ii hold and be generated by the RWYL algorithm. Let be the unique solution of problem (1); by Theorem 2, we have . By Theorem 6, the equation always holds provided that is large enough. In order to establish this convergence of the RWYL algorithm, we further need the following assumption.
Assumption iii. In some neighborhood of , is Lipschitz continuous.
Based on Assumption iii and the above lemma, we have the following remarks. Let be the second-order approximate function of in the neighborhood of the initial point , then we have Let and be the iterations and directions generated by the RWYL algorithm to minimize the quadratic function with initial point Specifically, the sequence is generated by using the following process: and where for : From the proof process of Theorem 6, it is not difficult to see that when is sufficiently large, step length can always be found. Because is a quadratic function, is the same as the step length obtained by the exact line search. Consequently, we have ; moreover there is an index such that is the exact minimizer of .
Similar to Lemmas in the paper , it is not difficult to get the following relations: