Abstract

We present a nonmonotone trust region algorithm for nonlinear equality constrained optimization problems. In our algorithm, we use a weighted average of the successive penalty function values to redefine the ratio of actual to predicted reduction. In contrast with existing nonmonotone trust region methods, our method does not depend on a nonmonotone memory parameter. We establish the global convergence of the proposed algorithm and report numerical tests that show its efficiency.

1. Introduction

In this paper, we consider the equality constrained optimization problem

\[ \min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad c(x) = 0, \qquad (1) \]

where $f : \mathbb{R}^n \to \mathbb{R}$, $c(x) = (c_1(x), \ldots, c_m(x))^T$, $m \le n$, and $f$ and $c_i$ ($i = 1, \ldots, m$) are assumed to be twice continuously differentiable.

The trust region method is one of the best-known methods for solving problem (1). Due to their strong convergence properties and robustness, trust region methods have proved efficient for both unconstrained and constrained optimization problems [1–9].

Most traditional trust region methods are descent-type methods; namely, they accept a trial point as the next iterate only if its associated merit function value is strictly less than that of the current iterate. However, as pointed out by Toint [10], nonmonotone techniques are helpful when the sequence of iterates follows the bottom of a curved narrow valley, a common occurrence in difficult nonlinear problems. Hence many nonmonotone algorithms have been proposed for unconstrained and constrained optimization problems [11–20]. Numerical tests show that nonmonotone techniques perform better than their monotone counterparts.

The nonmonotone technique was originally proposed by Grippo, Lampariello, and Lucidi [13] for unconstrained optimization problems based on Newton's method, in which the stepsize $\alpha_k$ satisfies the following condition:

\[ f(x_k + \alpha_k d_k) \le \max_{0 \le j \le m(k)} f(x_{k-j}) + \delta \alpha_k g_k^T d_k, \qquad (2) \]

where $\delta \in (0, 1)$, $m(0) = 0$, $0 \le m(k) \le \min\{m(k-1) + 1, M\}$ for $k \ge 1$, and $M$ is a prefixed nonnegative integer.
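For illustration, condition (2) amounts to comparing the trial value against a sliding maximum of recent function values. The following Python sketch shows the test; the function names and the default values of $\delta$ and $M$ are ours for illustration (the experiments in this paper use MATLAB, not this code):

```python
def gll_reference(f_hist, M):
    # Reference value in (2): the maximum of the most recent m(k)+1
    # accepted function values, with memory m(k) <= min(k, M).
    m_k = min(len(f_hist) - 1, M)
    return max(f_hist[-(m_k + 1):])

def gll_accept(f_trial, f_hist, alpha, g_dot_d, delta=1e-4, M=10):
    # Nonmonotone Armijo-type test:
    # f(x_k + alpha*d_k) <= max_{0<=j<=m(k)} f(x_{k-j}) + delta*alpha*g_k^T d_k.
    # g_dot_d should be negative for a descent direction d_k.
    return f_trial <= gll_reference(f_hist, M) + delta * alpha * g_dot_d
```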

Although the nonmonotone technique based on (2) works well in many cases, it has some drawbacks. Firstly, a good function value generated at some iteration may be essentially discarded because of the maximum in (2). Secondly, in some cases the numerical performance depends heavily on the choice of $M$ (see, e.g., [16, 21]). To overcome these drawbacks, Zhang and Hager [21] proposed another nonmonotone algorithm, in which the maximum function value in (2) is replaced by an average of function values. Their numerical tests show that the resulting nonmonotone line search algorithm uses fewer function and gradient evaluations, on average, than either the monotone or the traditional nonmonotone scheme. Recently, Mo and Zhang [16] extended Zhang and Hager's nonmonotone technique to unconstrained optimization with a trust region globalization scheme and discussed the global and local convergence of the proposed algorithm.

In this paper, we further extend the nonmonotone technique of [16, 21] to equality constrained optimization. To design our algorithm, we first introduce some notation: denote

\[ g(x) = \nabla f(x), \qquad A(x) = (\nabla c_1(x), \ldots, \nabla c_m(x)). \]

Assuming that $A(x)$ has full column rank, we define the projection matrix

\[ P(x) = I - A(x) \bigl( A(x)^T A(x) \bigr)^{-1} A(x)^T \]

and the Lagrangian function

\[ L(x, \lambda) = f(x) - \lambda^T c(x), \]

where $\lambda(x)$ is a projective (least-squares) version of the multiplier vector:

\[ \lambda(x) = \bigl( A(x)^T A(x) \bigr)^{-1} A(x)^T g(x). \]
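As a concrete illustration, the multiplier and the projected gradient $P(x)g(x) = g(x) - A(x)\lambda(x)$ can be evaluated without forming $P(x)$ explicitly. A minimal Python sketch, assuming the least-squares form of $\lambda(x)$ given above (names are ours, not the paper's):

```python
import numpy as np

def projected_gradient(g, A):
    # Least-squares multiplier lambda(x) = (A^T A)^{-1} A^T g, computed by
    # solving min ||A lam - g|| for an n x m matrix A with full column rank;
    # the projected gradient is then P(x) g = g - A lam.
    lam, *_ = np.linalg.lstsq(A, g, rcond=None)
    return lam, g - A @ lam
```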

For convenience, we denote the previous quantities at $x_k$ by $g_k$, $A_k$, $P_k$, $c_k$, $L_k$, and $\lambda_k$. At each iteration, we calculate the trust region trial step in two stages (see [22]). Firstly, we calculate the vertical (range-space) step

\[ q_k = -A_k (A_k^T A_k)^{-1} c_k, \]

which reduces the linearized constraint violation. Then we solve the trust region subproblem

\[ \min_d \; \psi_k(d) = (g_k + B_k q_k)^T d + \tfrac{1}{2} d^T B_k d \quad \text{s.t.} \quad A_k^T d = 0, \ \|d\| \le \Delta_k, \qquad (8) \]

where $B_k$ denotes (an approximation of) the Hessian matrix of the Lagrangian function $L$ at $x_k$ and $\Delta_k$ is the trust region radius. Let $\bar d_k$ be the solution of (8). The trust region trial step is taken as

\[ s_k = q_k + \bar d_k. \]

To test whether the point $x_k + s_k$ can be accepted as the next iterate, we use Fletcher's exact penalty function as the merit function:

\[ \phi(x; \sigma) = f(x) - \lambda(x)^T c(x) + \sigma \|c(x)\|^2, \]

where $\sigma > 0$ is the penalty parameter.
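A minimal sketch of the vertical step and of the merit function evaluation, under the conventions reconstructed above (illustrative Python; the solution of subproblem (8) is left to any trust region solver and is not shown):

```python
import numpy as np

def vertical_step(A, c):
    # Minimum-norm solution of A^T q = -c, i.e. q = -A (A^T A)^{-1} c,
    # which removes the linearized constraint violation.
    q, *_ = np.linalg.lstsq(A.T, -c, rcond=None)
    return q

def fletcher_penalty(f_val, c_val, lam, sigma):
    # Fletcher-type exact penalty phi(x; sigma) = f - lam^T c + sigma ||c||^2
    # (the sign convention follows the Lagrangian given above).
    return f_val - lam @ c_val + sigma * (c_val @ c_val)
```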

To define our nonmonotone algorithm, we define

\[ D_k = \begin{cases} \phi(x_0; \sigma_0), & k = 0, \\[4pt] \dfrac{\eta_{k-1} Q_{k-1} D_{k-1} + \phi(x_k; \sigma_k)}{Q_k}, & k \ge 1, \end{cases} \qquad (12) \]

where

\[ Q_k = \begin{cases} 1, & k = 0, \\ \eta_{k-1} Q_{k-1} + 1, & k \ge 1, \end{cases} \qquad (13) \]

where $\eta_{k-1} \in [\eta_{\min}, \eta_{\max}]$, and $\eta_{\min}$, $\eta_{\max}$ with $0 \le \eta_{\min} \le \eta_{\max} \le 1$ are two chosen parameters.

From (12) and (13), we observe that $D_k$ is a convex combination of the function values $\phi(x_0; \sigma_0), \phi(x_1; \sigma_1), \ldots, \phi(x_k; \sigma_k)$, so $D_k$ can be regarded as a weighted average of the merit function values.
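In code, the recursion (12)-(13) is a two-line update; a sketch (illustrative Python):

```python
def update_average(D_prev, Q_prev, phi_k, eta_prev):
    # Recursion (12)-(13): Q_k = eta_{k-1} Q_{k-1} + 1 and
    # D_k = (eta_{k-1} Q_{k-1} D_{k-1} + phi(x_k; sigma_k)) / Q_k,
    # initialized with D_0 = phi(x_0; sigma_0) and Q_0 = 1.
    Q_k = eta_prev * Q_prev + 1.0
    D_k = (eta_prev * Q_prev * D_prev + phi_k) / Q_k
    return D_k, Q_k
```

Setting $\eta_{k-1} = 0$ collapses the recursion to $D_k = \phi(x_k; \sigma_k)$, recovering the monotone method, while values close to $1$ weight the history more heavily; this is why the method needs no nonmonotone memory parameter $M$.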

The paper is organized as follows. We describe our algorithm in Section 2 and analyze the global convergence in Section 3. The numerical tests are given in Section 4, and the conclusion is presented in Section 5.

2. Algorithm

In this section, we give the details of the nonmonotone trust region algorithm. We first recall the definition of a stationary point of problem (1). A point $x^*$ is called a stationary point of problem (1) if it satisfies

\[ P(x^*) g(x^*) = 0, \qquad c(x^*) = 0. \]

We define the actual reduction from $x_k$ to $x_k + s_k$ by

\[ Ared_k = \phi(x_k; \sigma_k) - \phi(x_k + s_k; \sigma_k) \]

and the nonmonotone actual reduction by

\[ \widetilde{Ared}_k = D_k - \phi(x_k + s_k; \sigma_k). \]

The predicted reduction is defined as

\[ Pred_k = \psi_k(0) - \psi_k(\bar d_k) + \sigma_k \bigl( \|c_k\|^2 - \|c_k + A_k^T s_k\|^2 \bigr). \]

Furthermore, we define the monotone ratio by

\[ \rho_k = \frac{Ared_k}{Pred_k} \]

and the nonmonotone ratio by

\[ \hat\rho_k = \frac{D_k - \phi(x_k + s_k; \sigma_k)}{Pred_k}, \qquad (20) \]

where $D_k$ is computed by (12) and (13).
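The two ratios differ only in the reference value; a short sketch (illustrative Python names):

```python
def acceptance_ratios(phi_k, phi_trial, D_k, pred_k):
    # Monotone ratio rho_k = Ared_k / Pred_k and the nonmonotone ratio (20),
    # which replaces phi(x_k; sigma_k) by the weighted average D_k.
    rho = (phi_k - phi_trial) / pred_k
    rho_hat = (D_k - phi_trial) / pred_k
    return rho, rho_hat
```

Since $\phi(x_k; \sigma_k) \le D_k$ (Lemma 6 below), we have $\hat\rho_k \ge \rho_k$ whenever $Pred_k > 0$, so the nonmonotone test accepts a trial step at least as often as the monotone one.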

The description of the algorithm is given as follows.

Algorithm 1.
Step 0. Choose $x_0 \in \mathbb{R}^n$, $\Delta_0 > 0$, $\varepsilon \ge 0$, $0 < \tau_1 < \tau_2 < 1$, $0 < \beta_1 < 1 < \beta_2$, $\sigma_0 > 0$, $\rho > 0$, a symmetric matrix $B_0 \in \mathbb{R}^{n \times n}$, parameters $\eta_{\min}$ and $\eta_{\max}$ with $0 \le \eta_{\min} \le \eta_{\max} \le 1$, and set $k := 0$.
Step 1. If $\|P_k g_k\| + \|c_k\| \le \varepsilon$, stop; otherwise, go to Step 2.
Step 2. Compute the trust region trial step $s_k$.
Step 3. Set $\sigma_k = \sigma_{k-1}$; if

\[ Pred_k < \frac{\sigma_k}{2} \bigl( \|c_k\|^2 - \|c_k + A_k^T s_k\|^2 \bigr), \]

then set

\[ \sigma_k = \frac{2 \bigl( \psi_k(\bar d_k) - \psi_k(0) \bigr)}{\|c_k\|^2 - \|c_k + A_k^T s_k\|^2} + \rho. \]

Step 4. Compute $D_k$ by (12) and (13), and compute the nonmonotone ratio $\hat\rho_k$ by (20).
Step 5. Set

\[ x_{k+1} = \begin{cases} x_k + s_k, & \text{if } \hat\rho_k \ge \tau_1, \\ x_k, & \text{otherwise}. \end{cases} \]

Step 6. Update the trust region radius as

\[ \Delta_{k+1} = \begin{cases} \beta_1 \Delta_k, & \text{if } \hat\rho_k < \tau_1, \\ \Delta_k, & \text{if } \tau_1 \le \hat\rho_k < \tau_2, \\ \beta_2 \Delta_k, & \text{if } \hat\rho_k \ge \tau_2. \end{cases} \]

Step 7. Update $B_{k+1}$, and choose $\eta_k \in [\eta_{\min}, \eta_{\max}]$. Set $k := k + 1$; go to Step 1.
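Steps 5 and 6 can be summarized in a single helper; the following sketch uses illustrative parameter values, not the paper's:

```python
def radius_update(rho_hat, Delta, tau1=0.25, tau2=0.75, beta1=0.5, beta2=2.0):
    # Steps 5-6 in compact form: returns (step_accepted, new_radius).
    if rho_hat < tau1:
        return False, beta1 * Delta   # reject the step and shrink the region
    if rho_hat >= tau2:
        return True, beta2 * Delta    # very successful step: enlarge the region
    return True, Delta                # successful step: keep the radius
```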

3. Global Convergence

In this section, we discuss the global convergence of Algorithm 1. The following assumptions are needed in our convergence analysis:

Assumptions. (A1) The sequences $\{x_k\}$ and $\{x_k + s_k\}$ are contained in a compact set $\Omega \subset \mathbb{R}^n$. (A2) There exists a positive constant $M_0$ such that $\|B_k\| \le M_0$ for all $k$. (A3) For all $x \in \Omega$, $A(x)$ is of full column rank. We define two index sets, $I = \{k : \hat\rho_k \ge \tau_1\}$ and $J = \{k : \hat\rho_k < \tau_1\}$, collecting the successful and unsuccessful iterations, respectively. The following lemmas (Lemmas 2–5) are helpful in analyzing the convergence of Algorithm 1; the proofs are similar to those in [4].

Lemma 2. Assume that (A1)–(A3) hold. Then there exists a positive constant $b_1$ such that

\[ \|c_k\|^2 - \|c_k + A_k^T s_k\|^2 \ge b_1 \|c_k\| \min \{ \|c_k\|, \Delta_k \}. \]

Lemma 3. Assume that (A1)–(A3) hold. Then there exists a positive constant $b_2$ such that

\[ \psi_k(0) - \psi_k(\bar d_k) \ge b_2 \|P_k (g_k + B_k q_k)\| \min \bigl\{ \|P_k (g_k + B_k q_k)\|, \Delta_k \bigr\}. \]

Lemma 4. Assume that (A1)–(A3) hold. Then there exists a positive constant $b_3$ such that

\[ |Ared_k - Pred_k| \le b_3 \sigma_k \Delta_k^2. \]

Lemma 5. Assume that (A1)–(A3) hold. Then there exists a positive constant $b_4$ such that

\[ \|q_k\| \le b_4 \|c_k\|. \]

The following lemma shows the monotonicity property of the sequence $\{D_k\}$.

Lemma 6. Suppose that $\{x_k\}$ is generated by Algorithm 1. Then the following inequality holds for all $k$:

\[ \phi(x_{k+1}; \sigma_{k+1}) \le D_{k+1} \le D_k. \qquad (29) \]

Proof. We first prove that (29) holds for all $k \in I$; that is, (30) holds. For $k \in I$, according to Lemma 2 and Assumptions (A1) and (A2), we obtain (31). According to (8)–(13), we then have inequality (32). By (12) and (13), in the first case we have (33); otherwise, we have (34). So, from (32)–(34), we know that (30) holds.
Next, we prove that (29) holds for all $k \in J$. From Step 4 of Algorithm 1, we get (35) and (36) for such $k$. We consider two cases.
Case 1. According to (8), and then by (12) and (13), we obtain (37).
Case 2. In this situation, from Step 4 of Algorithm 1 together with (12) and (13), it follows that (38) holds. Applying (38) repeatedly, we can get (39); using the definitions of $D_k$ and $Q_k$, the required bounds follow through (8).
From (37) and (39), it follows that (40) holds, and from (12) and (40) we know that (41) holds. By (35), (36), and (42), we get the first inequality in (29). Now we prove the second inequality: in the first case, from (34) and (42), the conclusion is obvious; in the second case, by (12) and (13) we again obtain the desired bound. Thus (29) holds for all $k \in J$. The proof is completed.

Theorem 7. Suppose that Assumptions (A1)–(A3) hold and the sequence $\{x_k\}$ is generated by Algorithm 1. Then the algorithm is well defined; that is, the trial step cannot be rejected infinitely often at a nonstationary point.

Proof. Since the algorithm does not stop at Step 1, we have either $\|c_k\| \neq 0$ or $\|P_k g_k\| \neq 0$. We prove the conclusion by contradiction. If the conclusion is not true, then from some iteration on every trial step is rejected, so that $x_k$ stays fixed, $\Delta_k \to 0$, and

\[ \hat\rho_k < \tau_1 \qquad (43) \]

holds for all large $k$.
Case 1 ($\|c_k\| \neq 0$). Then from Lemmas 2 and 4, we have $\rho_k \to 1$ as $\Delta_k \to 0$, which means that $\rho_k \ge \tau_1$ for $k$ large enough. According to Lemma 6, we have $\phi(x_k; \sigma_k) \le D_k$, so $\hat\rho_k \ge \rho_k \ge \tau_1$, which contradicts (43).
Case 2 ($\|c_k\| = 0$ and $\|P_k g_k\| \neq 0$). In this case, $q_k = 0$ and $Pred_k = \psi_k(0) - \psi_k(\bar d_k)$. By Lemma 3, the predicted reduction is bounded below by a positive multiple of $\min\{\|P_k g_k\|, \Delta_k\}$; combining this with Lemma 4 and arguing as in Case 1, we again get a contradiction. Combining Cases 1 and 2, we obtain the conclusion.

Similar to Lemma 7.11 in [4], we obtain the following property of the penalty parameter.

Lemma 8. Under Assumption (A1), if the sequence $\{\sigma_k\}$ is bounded, then there exist an integer $k_0$ and a positive constant $\bar\sigma$ such that $\sigma_k = \bar\sigma$ for all $k \ge k_0$.

Without loss of generality, we assume that $\sigma_k = \bar\sigma$ for all $k$. The following theorem gives the convergence property of the constraint sequence $\{\|c_k\|\}$.

Theorem 9. Under Assumptions (A1)–(A3), we have

\[ \lim_{k \to \infty} \|c_k\| = 0. \qquad (47) \]

Proof. First, we prove that

\[ \liminf_{k \to \infty} \|c_k\| = 0. \qquad (48) \]

Assume by contradiction that (48) does not hold; then there exists a constant $\varepsilon_0 > 0$ such that $\|c_k\| \ge \varepsilon_0$ for all $k$. According to Lemma 6, inequality (29) holds for every $k$.
By using (13), we can prove that the weights $Q_k$ are uniformly bounded. Adding all the previous inequalities and using Lemma 2, we conclude that the accumulated decrease of $\{D_k\}$ is bounded below by a positive multiple of $\sum_k \min\{\varepsilon_0, \Delta_k\}$. By Assumption (A1), $\{D_k\}$ is bounded; letting $k \to \infty$, we obtain $\Delta_k \to 0$. Since $\|c_k\| \ge \varepsilon_0$ for all $k$, arguing as in the proof of Theorem 7, we get $\hat\rho_k \ge \tau_1$ for all large $k$; the updating rule of Step 6 then keeps $\Delta_k$ bounded away from zero, which contradicts $\Delta_k \to 0$. This contradiction shows that (48) holds.
Next, we prove (47). Assume that (47) does not hold; then there exist a subsequence $\{k_i\}$ and a positive constant $\varepsilon_1$ such that $\|c_{k_i}\| \ge \varepsilon_1$ for all $i$. On the other hand, according to (48), there exists another subsequence $\{l_i\}$ with $k_i < l_i$ such that $\|c_k\| \ge \varepsilon_1 / 2$ for $k_i \le k < l_i$ and $\|c_{l_i}\| < \varepsilon_1 / 2$. According to Lemma 2, we get a lower bound on the reduction of the merit function over each index range $k_i \le k < l_i$. By Assumption (A1), $\{D_k\}$ is bounded, so these reductions can be large only a finite number of times; hence, for $i$ large enough, the displacement $\|x_{k_i} - x_{l_i}\|$ becomes arbitrarily small. Since $c$ is continuous, for $i$ large enough we have $\|c_{k_i}\| < \varepsilon_1$, and this contradicts the assumption $\|c_{k_i}\| \ge \varepsilon_1$, which means that (47) holds.

Theorem 10. If (A1) holds, we have

\[ \liminf_{k \to \infty} \|P_k g_k\| = 0. \]

Proof. The proof is similar to that of Theorem 4 in [18].

Based on Theorems 9 and 10, we get the following global convergence result.

Theorem 11. Under Assumptions (A1)–(A3), we have

\[ \lim_{k \to \infty} \|c_k\| = 0, \qquad \liminf_{k \to \infty} \|P_k g_k\| = 0; \]

that is, some accumulation point of $\{x_k\}$ is a stationary point of problem (1).

4. Numerical Tests

In this section, we test our algorithm on some typical problems. The program code was written in MATLAB and run in the MATLAB 7.1 environment. The parameters in Algorithm 1 are kept fixed for all tests, and $B_k$ is updated by the BFGS formula

\[ B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k}, \]

where $s_k = x_{k+1} - x_k$ and $y_k$ is the corresponding difference of the Lagrangian gradients. To decide when to stop the execution of the algorithm declaring convergence, we use the stopping criterion of Step 1. We also stop the execution when 500 iterations have been completed without achieving convergence; such runs are denoted by "fail". Our test problems are chosen from [23], and the problems are numbered in the same way as in [23]; for example, HS28 is problem 28 in [23]. To test the efficiency of our algorithm, we compare it with the algorithms in [15, 18], where the nonmonotone parameter is chosen as suggested there.
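A standard implementation of this update, with the common safeguard of skipping the update when the curvature condition $y_k^T s_k > 0$ fails (the paper's exact safeguard is not specified here), can be sketched as follows:

```python
import numpy as np

def bfgs_update(B, s, y, eps=1e-12):
    # Standard BFGS update
    #   B+ = B - (B s s^T B) / (s^T B s) + (y y^T) / (y^T s),
    # skipped when y^T s is not safely positive, so that B stays
    # symmetric positive definite.
    ys = float(y @ s)
    if ys <= eps:
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / float(s @ Bs) + np.outer(y, y) / ys
```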

The test results are given in Table 1, where No. denotes the test problem, the next two columns record the numbers of gradient and function evaluations, respectively, and Time denotes the CPU time when the algorithm terminates.

From Table 1, we see that our algorithm spends more CPU time than the algorithms of [15, 18], but it uses fewer function and gradient evaluations on most of the test problems. These numerical tests show that our algorithm works quite well.

5. Conclusion

In this paper, we presented a nonmonotone trust region method for equality constrained optimization based on a weighted average of the successive penalty function values. Compared with the existing nonmonotone trust region methods for constrained optimization, our method is independent of the nonmonotone memory parameter $M$. The numerical comparison with some nonmonotone trust region methods shows the efficiency of the proposed method. How to obtain fast local convergence of our method deserves further study, and we leave it as future work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 11171221) and the National Project and Liberal Base Cultivate Fund of USST (no. 12GXM).