Abstract

We propose a new method for equality constrained optimization based on the augmented Lagrangian method. We construct an unconstrained subproblem by adding an adaptive quadratic term to the quadratic model of the augmented Lagrangian function. In each iteration, we solve this unconstrained subproblem to obtain the trial step. The main feature of the method is that this subproblem is a strictly convex quadratic problem and is therefore easy to solve. Numerical results show that the method is effective.

1. Introduction

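In this paper, we consider the following equality constrained optimization problem:

$$\min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad c(x) = 0, \tag{1}$$

where $x \in \mathbb{R}^n$, and $f:\mathbb{R}^n \to \mathbb{R}$ and $c:\mathbb{R}^n \to \mathbb{R}^m$ are twice continuously differentiable.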

The method presented in this paper is a variant of the augmented Lagrangian (AL) method. The AL method was proposed in the late 1960s by Hestenes [1] and Powell [2]. Later, Conn et al. [3, 4] presented a practical AL method and proved its global convergence under the LICQ condition. Since then, the AL method has attracted the attention of many researchers, and many variants have been presented (see [5–11]). Several software packages are based on the AL method, such as LANCELOT [4] and ALGENCAN [5, 6]. Although the AL method has been developed extensively over the past decades, its good performance continues to motivate research on the method and its applications (see [7, 8, 11–15]).

For (1), we define the Lagrangian function and the augmented Lagrangian function where is called the Lagrange multiplier and is called the penalty parameter. In this paper, $\|\cdot\|$ refers to the Euclidean norm.
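For orientation, the two functions named above can be written out as follows. This is a sketch under one common sign convention, not a transcription of the paper's own equations: the symbols $\lambda \in \mathbb{R}^m$ for the multiplier and $\sigma > 0$ for the penalty parameter are assumed names, and some authors write $+\lambda^{T} c(x)$ instead of $-\lambda^{T} c(x)$.

```latex
L(x,\lambda) = f(x) - \lambda^{T} c(x), \qquad
P(x,\lambda,\sigma) = f(x) - \lambda^{T} c(x) + \frac{\sigma}{2}\,\|c(x)\|^{2}.
```

Under this convention, and writing $\nabla c(x)$ for the $n \times m$ matrix of constraint gradients, $\nabla_x P(x,\lambda,\sigma) = \nabla f(x) - \nabla c(x)\,(\lambda - \sigma c(x))$, which is why $\lambda - \sigma c(x)$ is the usual first-order multiplier estimate in AL methods.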

In a typical AL method, at the $k$th step, for given multiplier and penalty parameter , an unconstrained subproblem is solved to find the next iteration point. The multiplier and penalty parameter are then updated by some rules. For convenience, for given and , we define

Motivated by the regularized Newton method for unconstrained optimization (see [16–19]), we construct a new subproblem for (1). At the $k$th iteration point , is approximated by the following quadratic model: where and is a positive semidefinite approximation of . Let thus we have In [14, 15], is minimized within a trust region to find the next iteration point. Following the regularized Newton method, we instead add a regularization term to the quadratic model and define where is called the regularization parameter. At the $k$th step of our algorithm, we solve the following convex unconstrained quadratic subproblem: to find the trial step .

Then, we compute the ratio between the actual reduction and the predicted reduction When is close to 1, we accept as the next iteration point; we also regard the quadratic model as a sufficiently good approximation of and reduce the value of . Conversely, when is close to zero, we set and increase the value of in order to shorten the next trial step. This technique is similar to the update rule for the trust region radius, and a sufficiently large does indeed reduce the length of the trial step .

However, the regularization parameter differs from the trust region radius. In [14, 15], the authors construct a trust region subproblem The exact solution of (12) satisfies the first-order critical conditions if there exists some such that is positive semidefinite and while the first-order critical condition of (10) is Equations (13) and (15) show the similarities and differences between the regularized subproblem (10) and the trust region subproblem (12). The parameter appears to play a role similar to that of the multiplier in the trust region subproblem, but the update rule of (see (26)) shows that is not an approximation of . The update of depends on the quality of the last trial step and has no direct relation to system (13).
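The effect of the regularization parameter on the step length can be seen on a tiny example. The sketch below is illustrative only: it assumes the first-order condition of the regularized subproblem has the generic form (H + mu*I) d = -g with H symmetric positive semidefinite, and the names `H`, `g`, `mu`, and `regularized_step` are hypothetical rather than the paper's notation.

```python
import math


def regularized_step(H, g, mu):
    """Solve (H + mu*I) d = -g for a symmetric 2x2 matrix H by Cramer's rule."""
    a, b = H[0][0] + mu, H[0][1]
    c, e = H[1][0], H[1][1] + mu
    det = a * e - b * c
    if a <= 0.0 or det <= 0.0:
        raise ValueError("H + mu*I must be positive definite")
    return [(-g[0] * e + b * g[1]) / det, (c * g[0] - a * g[1]) / det]


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(t * t for t in v))


# Hypothetical data: H is positive semidefinite but singular, so mu > 0 is needed.
H = [[2.0, 0.0], [0.0, 0.0]]
g = [1.0, 1.0]
mus = (0.1, 1.0, 10.0)
step_norms = [norm(regularized_step(H, g, mu)) for mu in mus]
```

Because H is positive semidefinite, the step satisfies ||d|| <= ||g|| / mu, so increasing `mu` shortens the step much as shrinking a trust region radius would, but without a constrained subproblem.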

To establish the global convergence of an algorithm, some kind of constraint qualification is required. There are many well-known constraint qualifications, such as LICQ, MFCQ, CRCQ, RCR, CPLD, and RCPLD. When there are only equality constraints, LICQ is equivalent to MFCQ, which requires that has full rank; CRCQ is equivalent to CPLD, which requires that any subset of maintains constant rank in a neighborhood of ; and RCR is equivalent to RCPLD, which requires that maintains constant rank in a neighborhood of . RCPLD is weaker than CRCQ, and CRCQ is weaker than LICQ. In this paper, we use RCPLD, which is defined as follows.

Definition 1. One says that RCPLD holds at a feasible point of (1), if there exists a neighborhood of such that maintains constant rank for all .

The rest of this paper is organized as follows. In Section 2, we give a detailed description of the presented algorithm. The global convergence is proved in Section 3. In Section 4, we present the numerical experiments. Some conclusions are given in Section 5.

Notations. For convenience, we abbreviate to , to , to , and to . In this paper, denotes the th component of the vector .

2. Algorithm

In this section, we give a detailed description of the proposed algorithm.

As mentioned in Section 1, we solve the unconstrained subproblem (10) to obtain the trial step . Since is at least positive semidefinite and , is positive definite as Therefore, (10) is a strictly convex unconstrained quadratic optimization problem, and solves (10) if and only if holds. Global convergence does not require the exact solution of (15), although the linear system (15) is easy to solve. Specifically, for the minimizer of (10) along the direction , we consider the following subproblem: If , then the minimizer of (16) is . Therefore, at the $k$th step, it follows that By direct calculation, we have that In Section 3, we always suppose that (18) holds.
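The one-dimensional minimization described above can be sketched in a few lines. The code assumes the regularized model has the generic form m(d) = g^T d + 0.5 d^T (H + mu*I) d and that the search direction is the negative gradient -g; the exact form of (16) to (18) is not reproduced here, and the names `cauchy_step`, `model_value`, `H`, `g`, and `mu` are hypothetical.

```python
def matvec(H, v):
    """Product of a dense square matrix (list of rows) with a vector."""
    return [sum(hij * vj for hij, vj in zip(row, v)) for row in H]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def model_value(H, g, mu, d):
    """m(d) = g^T d + 0.5 * d^T (H + mu*I) d, with m(0) = 0."""
    return dot(g, d) + 0.5 * (dot(d, matvec(H, d)) + mu * dot(d, d))


def cauchy_step(H, g, mu):
    """Minimize m(-alpha * g) over alpha >= 0; return (step, predicted reduction)."""
    gg = dot(g, g)
    if gg == 0.0:
        return [0.0] * len(g), 0.0
    curvature = dot(g, matvec(H, g)) + mu * gg  # positive when H is PSD and mu > 0
    alpha = gg / curvature
    step = [-alpha * gi for gi in g]
    predicted_reduction = 0.5 * gg * gg / curvature
    return step, predicted_reduction
```

Since the curvature term is at most (||H|| + mu) ||g||^2, this step guarantees a model decrease of at least 0.5 ||g||^2 / (||H|| + mu). Any approximate solution of the linear system that does at least as well as this step retains the same guaranteed decrease.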

In a typical AL algorithm, the update rule of depends on the improvement in constraint violation. A commonly used rule is the following: if , where , the constraint violation is regarded as sufficiently reduced and thus is a good choice. Otherwise, if , the current penalty parameter is regarded as too small to reduce the constraint violation sufficiently and is increased in the next iteration. In [20], Yuan proposed a different update rule of for a trust region algorithm. Specifically, if is increased. In (19), is an auxiliary parameter such that tends to zero. We slightly modify (19) in our algorithm. Specifically, if is increased.
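The commonly used rule described at the start of this paragraph can be sketched as follows. This illustrates only the typical AL rule, not the modified rule adopted in our algorithm; the function name `update_penalty`, the reduction factor `tau`, and the growth factor `growth` are hypothetical choices.

```python
def update_penalty(sigma, viol_new, viol_old, tau=0.5, growth=10.0):
    """Keep sigma if the constraint violation fell by the factor tau, else enlarge it.

    viol_new and viol_old stand for the constraint violation norms at the
    new and the previous iterate; tau lies in (0, 1) and growth exceeds 1.
    """
    if not 0.0 < tau < 1.0 or growth <= 1.0:
        raise ValueError("need 0 < tau < 1 and growth > 1")
    if viol_new <= tau * viol_old:
        return sigma           # sufficient reduction: keep the penalty parameter
    return growth * sigma      # insufficient reduction: increase the penalty parameter
```

For example, with the default `tau = 0.5` a drop in violation from 1.0 to 0.1 leaves `sigma` unchanged, while a drop from 1.0 to 0.9 multiplies it by `growth`.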

In a typical AL method, the next iteration point is obtained by minimizing . In most AL methods, satisfies , where is a controlling parameter that tends to zero. As when is sufficiently small, is a good estimate of the next multiplier . Since we obtain by minimizing , the critical point of has no direct relation to . Therefore, the update rule does not suit our algorithm. Instead, we obtain by approximately solving the following least squares problem:

Most AL algorithms require that be bounded to ensure global convergence. Hence, all components of are restricted to a certain interval . This technique is also used in our algorithm.
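Restricting the multiplier components to an interval amounts to a componentwise projection. A minimal sketch, assuming the interval endpoints are given as `lam_min` and `lam_max` (hypothetical names, as is `safeguard_multipliers`):

```python
def safeguard_multipliers(lam, lam_min, lam_max):
    """Project each component of the multiplier estimate onto [lam_min, lam_max]."""
    if lam_min > lam_max:
        raise ValueError("lam_min must not exceed lam_max")
    return [min(max(value, lam_min), lam_max) for value in lam]
```

Components already inside the interval are returned unchanged, so the safeguard is inactive whenever the least squares estimate is moderate in size.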

Now, we give the detailed algorithm in the following.

Algorithm 2.
Step 0 (initialization). Choose the parameters , , . Determine , , , , . Let Set .
Step 1 (termination test). If and , return as a KKT point. If , , and , return as an infeasible KKT point.
Step 2 (determine the trial step). Evaluate the trial step by solving such that (18) holds. Compute the ratio of the actual reduction to the predicted reduction where , . Set
Step 3 (update the penalty parameter). If set Otherwise, set
Step 4 (update the multiplier). If , set . Evaluate by and let If , set and .
Set and go to Step 1.
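The acceptance test and the update of the regularization parameter in Step 2 follow the familiar trust region pattern. The sketch below shows that pattern only; the thresholds `eta1` and `eta2`, the factors `shrink` and `grow`, the floor `mu_min`, and the function name are hypothetical and are not the values or formulas used in Algorithm 2.

```python
def update_regularization(mu, rho, eta1=0.25, eta2=0.75,
                          shrink=0.5, grow=4.0, mu_min=1e-8):
    """Return (accepted, new_mu) from the ratio rho = actual / predicted reduction."""
    if rho < eta1:
        # Poor agreement between model and function: reject the step and
        # increase mu so that the next trial step is shorter.
        return False, grow * mu
    if rho > eta2:
        # Very good agreement: accept the step and allow longer steps.
        return True, max(shrink * mu, mu_min)
    # Acceptable agreement: accept the step and keep mu unchanged.
    return True, mu
```

A rejected step leaves the iterate where it was, so only the regularization parameter changes before the subproblem is solved again.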

Remark 3. In practical calculation, it is not required to solve (30) exactly to find . In our implementation of Algorithm 2, we use the Matlab subroutine minres to find an approximate solution of the linear system and take it as an approximation of .

3. Global Convergence

In this section, we discuss the global convergence of Algorithm 2. We assume that Algorithm 2 generates an infinite sequence of iterates and make the following assumptions.

Assumptions 1. (A1) and are twice continuously differentiable.
(A2) and are bounded, where is a positive semidefinite approximation of .
Firstly, we give a result on the upper bound of the trial step.

Lemma 4. If solves subproblem (23), then one has

Proof. Any approximate solution of (23) satisfies . Clearly, If , then (32) holds. If , implies that or . Thus we obtain (32).

Now, we discuss convergence properties in two cases. In the first, the penalty parameter tends to infinity; in the second, it remains bounded.

3.1. The Case of

Lemma 5. Suppose that (A1)-(A2) hold and ; then there exists a constant such that .

Proof. See Lemma  3.1 in Wang and Yuan [15].

In Lemma 5, if , then any accumulation point of is infeasible. Sometimes (1) is inherently infeasible; in other words, the feasible set is empty. In this case, we wish to find a minimizer of the constraint violation. Specifically, we wish to solve The solution of this problem is characterized by In the next theorem, we show that if does not converge to zero, at least one of the accumulation points of satisfies (35).
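As a reading aid, the infeasibility problem and its stationarity condition described above take the following standard form. The notation is assumed rather than taken from the paper: $\nabla c(x)$ denotes the $n \times m$ matrix whose columns are the constraint gradients, and the factor $\tfrac{1}{2}$ is a scaling convention that may differ from the one used in (34).

```latex
\min_{x \in \mathbb{R}^n} \ \tfrac{1}{2}\,\|c(x)\|^{2},
\qquad
\nabla\!\left(\tfrac{1}{2}\,\|c(x)\|^{2}\right) = \nabla c(x)\, c(x) = 0 .
```

A point that satisfies the second relation with $c(x) \neq 0$ is stationary for the constraint violation without being feasible, which is presumably the situation that the second termination test in Step 1 of Algorithm 2 is designed to detect.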

Theorem 6. Suppose that (A1)-(A2) hold and . If , then

Proof. We prove this result by contradiction. Suppose that there exists some such that By the definition of in (7), we know that As and are bounded, we can deduce the boundedness of by (A2); that is, there exists some such that By (37), (38), and (39), we can conclude that holds for all sufficiently large . By the boundedness of and , we can conclude that there exists , such that holds for all sufficiently large , where is defined by (7). By (18), (40), and (41), holds for all sufficiently large . By the update rule of and the fact that , we have that holds for infinitely many . As holds for all sufficiently large by (40), it is easy to see that (42) contradicts (43) as and is convergent. This proves the desired result.

Lemma 7. Suppose that (A1)-(A2) hold, , and ; then

Proof. Assume that there exists such that Then, by (18) and (41), we know that, for all sufficiently large , By the update rule of and , holds for infinitely many . We will prove that (47) contradicts (46). Let be the index set containing all such that (47) holds. Therefore, for all , If there exists an infinite subset such that holds for all , then, by (48), it holds that As , , and , (50) implies that holds for all sufficiently large . If there exists an infinite subset such that holds for all , then by (48) we have that for all . Equation (52) also implies (51) as and . From (28) and (29), it follows that holds for all . Therefore, by (49) we know that, for all , As , (53) implies that holds for all sufficiently large . Thus from (47), we obtain (51) and (54), which contradict (46). This completes the proof.

Theorem 8. Suppose that (A1)-(A2) hold. If and , then there exists one cluster point of such that is a KKT point of (1) or the RCPLD condition does not hold at .

Proof. Under the assumptions of this theorem, Lemma 7 implies that there exists an index set such that converges to some , where is defined in (7). By Theorem 2 in Andreani et al. [21], (55) implies that is a KKT point or the RCPLD condition does not hold at .

3.2. The Case of Being Bounded

In this subsection, without loss of generality, we assume that for all . Thus by the update rule (29), we have that and holds for all . As remains constant, it follows from (A1) and (A2) that and are all bounded. If we define the index set then for .

Lemma 9. Suppose that (A1)-(A2) hold and for all . If , as , and there exists some constant such thatthen is divergent.

Proof. We prove this lemma by contradiction. We will show that if is convergent, then holds for all sufficiently large , which contradicts the fact that , as .
Suppose that and (32) imply thatholds for all sufficiently large . By the definition of ,Equations (59)–(61) imply that is convergent. LetIt is clear that . By Taylor’s theorem, it holds thatwhere is a convex combination of and . According to (60), we have thatholds for all sufficiently large and thusThe convergence of and the boundedness of imply that . Therefore, for all sufficiently large . This implies that and .

Lemma 10. Suppose that (A1)-(A2) hold and for all ; then we have that

Proof. Firstly, we prove that the sum of is bounded. Define the index set where is defined by Steps 0 and 4 in Algorithm 2. From Step 4 of Algorithm 2, we know that if , then and . Hence we have where is the upper bound of . From Step 4 and (67), we have which implies Then, we have where is defined by (5).
Secondly, we proveby contradiction. Suppose that there exists some such thatEquations (56) and (73) imply thatConsidering the sum of on the index set (see (57)), we have by (71) thatIt can be deduced by (74) and (75) that and thusIf is a finite set, then it follows from (57) and Step  2 that and hold for all sufficiently large . Therefore, , as . If is an infinite set, the second inequality in (77) implies that , as and . From Step  2, we know that if , then . Hence, we have , as . The fact that and (74) imply that holds for all sufficiently large . Hence it can be deduced by Lemma 9 that is divergent which contradicts the first part in (77).
Finally, we prove (66). If is a finite set, then is convergent. Thus, (72) implies (66). From now on we assume that is an infinite set. Suppose that (66) does not hold; then there exist an infinite index set () and a constant such thatBy (72), there also exists an infinite index set () such that ,Let ; then and is an infinite index set,Therefore, by (75), we have thatWith the help of (56), (80), and (83), we obtain thatA direct conclusion which can be drawn by (84) isThus by Lemma 4, we have that, for all sufficiently large ,Therefore, for sufficiently large ,Equations (84) and (87) imply that , as . Therefore (79) contradicts (81). Thus we complete the proof.

Lemma 11. Suppose that (A1)-(A2) hold and for all ; then we have

Proof. Suppose that (88) does not hold. Then there exists , such that By (18), we have that As is bounded above, similar to the second part in the proof of Lemma 10, we can conclude that and thus By (75) and (92), we have that is convergent and thus is also convergent as is bounded. However, Lemma 9, (91), (92), and the boundedness of imply the divergence of . This contradiction completes the proof.

With the help of Lemmas 10 and 11, we can easily obtain the following result.

Theorem 12. Suppose that (A1)-(A2) hold and for all ; then there exists an accumulation point of at which the KKT condition holds.

Note that, in Theorem 12, we do not suppose that RCPLD holds.

4. Numerical Experiments

In this section, we investigate the performance of Algorithm 2 and compare it with the well-known Fortran package ALGENCAN. In our computer program, the parameters in Algorithm 2 are chosen as follows: We set to be the exact Hessian of the Lagrangian at the point . The Matlab subroutine minres is used to solve (15). All algorithms are terminated when one of the following conditions holds: and ; and ; . All test problems are chosen from the CUTEst collection [22].

The numerical results are listed in Table 1, where the name of the problem is denoted by , the number of its variables by , the number of constraints by , the number of function evaluations by , and the number of gradient evaluations by . Table 1 lists the results for 38 test problems. In terms of the number of function evaluations (), Algorithm 2 is better than ALGENCAN in 30 cases (78.9%). In terms of the number of gradient evaluations (), Algorithm 2 is better than ALGENCAN in 31 cases (81.6%).

5. Conclusions

In this paper, we present a new algorithm for equality constrained optimization. We add an adaptive quadratic term to the quadratic model of the augmented Lagrangian function. In each iteration, we solve a simple unconstrained subproblem to obtain the trial step. Global convergence is established under reasonable assumptions.

From the numerical results and the theoretical analysis, we believe that the new algorithm can efficiently solve equality constrained optimization problems.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work is supported by NSFC (11771210, 11471159, 11571169, and 61661136001) and the Natural Science Foundation of Jiangsu Province (BK20141409).