Abstract
We present a new iterative method, based on the line search filter method with a nonmonotone strategy, for solving systems of nonlinear equations. The equations are divided into two groups: some equations are treated as constraints and the others act as the objective function, and the two groups are updated only at the iterations where an update is actually needed. We apply the nonmonotone idea to both the sufficient reduction conditions and the filter technique, which leads to flexibility and an acceptance behavior comparable to monotone methods. The new algorithm is shown to be globally convergent, and numerical experiments demonstrate its effectiveness.
1. Introduction
We consider the following system of nonlinear equations: where each is a twice continuously differentiable function. It is one of the most basic problems in mathematics and has lots of applications in many scientific fields such as physics, chemistry, and economics.
In the context of solving nonlinear equations, a well-known method is the Newton method, which is known to exhibit local second-order convergence near a regular solution, but its global behavior is unpredictable. To improve the global properties, some important algorithms [1] for nonlinear equations proceed by minimizing a least squares problem: which can also be handled by the Newton method. However, Powell [2] gives a counterexample showing the unsatisfactory fact that the iterates generated for the above least squares problem may converge to a nonstationary point of .
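The Newton iteration and the least squares merit value mentioned above can be sketched as follows. This is a minimal illustration only, not the authors' MATLAB code: the 2-by-2 restriction, the helper names (`merit`, `solve_2x2`, `newton`), the factor 0.5 in the merit value, and the demo system are all assumptions introduced here.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def merit(F: Callable[[Vector], Vector], x: Vector) -> float:
    """Least squares merit value 0.5 * ||F(x)||^2 (the 0.5 factor is a convention)."""
    return 0.5 * sum(r * r for r in F(x))


def solve_2x2(A: Matrix, b: Vector) -> Vector:
    """Solve a 2x2 linear system by Cramer's rule; raises if A is singular."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if abs(det) < 1e-14:
        raise ZeroDivisionError("Jacobian is (numerically) singular")
    return [
        (b[0] * A[1][1] - A[0][1] * b[1]) / det,
        (A[0][0] * b[1] - b[0] * A[1][0]) / det,
    ]


def newton(F, J, x0: Vector, tol: float = 1e-10, max_iter: int = 50) -> Tuple[Vector, int]:
    """Plain (undamped) Newton iteration: solve J(x) d = -F(x), then x <- x + d."""
    x = list(x0)
    for k in range(max_iter):
        r = F(x)
        if max(abs(v) for v in r) <= tol:
            return x, k
        d = solve_2x2(J(x), [-v for v in r])
        x = [xi + di for xi, di in zip(x, d)]
    return x, max_iter


# Hypothetical demo system (not from the paper): x^2 + y^2 = 4, x * y = 1.
def F_demo(x: Vector) -> Vector:
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] * x[1] - 1.0]


def J_demo(x: Vector) -> Matrix:
    return [[2.0 * x[0], 2.0 * x[1]], [x[1], x[0]]]


x_star, iters = newton(F_demo, J_demo, [2.0, 0.5])
print(x_star, iters, merit(F_demo, x_star))
```

Started close to a root, the undamped iteration converges in a handful of steps. Nothing in this sketch safeguards it far from a solution, which is the gap that globalization strategies such as merit functions or filters are meant to close.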
However, as is well known, there are several difficulties in using penalty functions as merit functions to test the acceptability of the iterates. Hence the filter, a concept first introduced by Fletcher and Leyffer [3] for constrained nonlinear optimization problems in a sequential quadratic programming (SQP) trust-region algorithm, replaces the merit functions, avoiding the penalty parameter estimation and the difficulties related to nondifferentiability. Furthermore, Fletcher et al. [4, 5] established the global convergence of the trust-region filter-SQP method, and Ulbrich [6] obtained its superlinear local convergence. Consequently, the filter method has been applied in many optimization techniques, for instance the pattern search method [7], the SLP method [8], the interior method [9], the bundle approaches [10, 11], and so on. Also combined with the trust-region search technique, Gould et al. extended the filter method to systems of nonlinear equations and nonlinear least squares in [12], and to the unconstrained optimization problem with a multidimensional filter technique in [13]. In addition, Wächter and Biegler [14, 15] presented line search filter methods for nonlinear equality-constrained programming and established their global and local convergence.
In fact, the filter method exhibits a certain degree of nonmonotonicity. The idea of the nonmonotone technique can be traced back to Grippo et al. [16] in 1986, where it was combined with the line search strategy. Owing to its excellent numerical performance, many nonmonotone techniques have been developed in recent years, for example [17, 18]. In particular, in [17] a nonmonotone line search multidimensional filter-SQP method for general nonlinear programming is presented, based on the Wächter and Biegler methods [14, 15].
Recently, some other approaches to problem (1.1) were proposed (see [19–23]). These papers share two common features: the filter approach is utilized, and the system of nonlinear equations is transformed into a constrained nonlinear programming problem in which the equations are divided into two groups, some treated as constraints and the others acting as the objective function. In those methods the two groups of equations are updated at every iteration. For instance, combined with the filter line search technique [14, 15], the system of nonlinear equations in [23] becomes the following optimization problem with equality constraints: The choice of the two sets and is given as follows: for some positive constant , it is defined that , then and .
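To make the idea of dividing the equations into an objective group and a constraint group concrete, here is a minimal sketch. The splitting rule used (the `n_obj` equations with the largest residual magnitude form the objective group) is an assumption chosen for illustration and is not taken from the text; the names `split_equations`, `objective`, and `violation` are likewise hypothetical.

```python
from typing import List, Sequence, Tuple


def split_equations(residuals: Sequence[float], n_obj: int) -> Tuple[List[int], List[int]]:
    """Return (s1, s2): s1 holds the n_obj indices with largest |F_i|, s2 the rest.

    Assumed rule for illustration only; ties keep their original index order.
    """
    order = sorted(range(len(residuals)), key=lambda i: -abs(residuals[i]))
    s1 = sorted(order[:n_obj])
    s2 = sorted(order[n_obj:])
    return s1, s2


def objective(residuals: Sequence[float], s1: Sequence[int]) -> float:
    """Sum of squared residuals of the equations acting as the objective."""
    return sum(residuals[i] ** 2 for i in s1)


def violation(residuals: Sequence[float], s2: Sequence[int]) -> float:
    """Euclidean norm of the residuals of the equations treated as constraints."""
    return sum(residuals[i] ** 2 for i in s2) ** 0.5


# Hypothetical residual vector F(x) at some iterate.
res = [0.1, -3.0, 0.5, 2.0]
s1, s2 = split_equations(res, n_obj=2)
print(s1, s2, objective(res, s1), violation(res, s2))
```

Because the split depends on the current residuals, recomputing it at every iteration changes the optimization problem each time; keeping it fixed between updates avoids that bookkeeping.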
In this paper we present an algorithm for solving the system of nonlinear equations that combines the nonmonotone technique with the line search filter method. We also divide the equations into two groups: one contains the equations that are treated as equality constraints, and the squares of the other equations are regarded as the objective function. But unlike the methods in [19–23], we update the two groups only at the iterations where an update is actually needed, which reduces the computational cost to some degree. Another merit of our approach is that the nonmonotone idea is applied to the sufficient reduction conditions and to the filter, which leads to flexibility and an acceptance behavior comparable to monotone methods. Moreover, in our algorithm the two groups of equations cannot be changed after an f-type iteration; thus in the case that , the two groups are fixed after a finite number of iterations. The filter is also not updated after an f-type iteration, so the global convergence is naturally discussed separately according to whether the number of filter updates is infinite or not. Furthermore, global convergence is established under some reasonable conditions. Finally, numerical experiments show that the proposed method is effective.
The paper is outlined as follows. In Section 2, we describe and analyze the nonmonotone line search filter method. In Section 3 we prove the global convergence of the proposed algorithm. Finally, some numerical tests are given in Section 4.
2. A Nonmonotone Line Search Filter Algorithm
Throughout this paper, we use the notations and . In addition, we denote the set of indices of those iterations in which the filter has been augmented by .
The linearization of the KKT condition of (1.3) at the th iteration is as follows: where is the Hessian or approximate Hessian matrix of , and . Then the iterate formation is , where is the solution of (2.1) and is a step size chosen by line search.
Now we describe the nonmonotone Armijo rule. Let be a nonnegative integer. For each , let satisfy for . For fixed constants , we might consider a trial point to be acceptable, if it leads to sufficient progress toward either goal, that is, if where , .
For the convenience we set , and . In order to avoid the case of convergence to a feasible but nonoptimal point, we consider the following switching condition: with . If the switching condition holds, the trial point has to satisfy the Armijo nonmonotone reduction condition, where is a fixed constant.
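The nonmonotone Armijo idea can be illustrated with a short backtracking sketch. It assumes the classical Grippo-type test, in which the trial value is compared with the maximum of a few recent objective values rather than with the current value alone; the function name, the parameters `eta`, `shrink`, and `alpha_min`, and the demo function are assumptions introduced here and need not match condition (2.4) term by term.

```python
from collections import deque
from typing import Callable, Deque


def nonmonotone_armijo(phi: Callable[[float], float],
                       slope: float,
                       history: Deque[float],
                       eta: float = 1e-4,
                       shrink: float = 0.5,
                       alpha_min: float = 1e-12) -> float:
    """Backtrack until phi(alpha) <= max(history) + eta * alpha * slope.

    phi(alpha) is the objective along the search direction, slope < 0 is the
    directional derivative at alpha = 0, and history holds recent objective
    values (its length plays the role of the memory bound).
    """
    if slope >= 0.0:
        raise ValueError("search direction is not a descent direction")
    reference = max(history)  # nonmonotone reference value
    alpha = 1.0
    while alpha >= alpha_min:
        if phi(alpha) <= reference + eta * alpha * slope:
            return alpha
        alpha *= shrink
    raise RuntimeError("step size fell below alpha_min")


# Hypothetical demo: f(x) = x^2 at x = 1 with direction d = -2.2 (overshoots the minimum).
x, d = 1.0, -2.2
phi_demo = lambda a: (x + a * d) ** 2
slope_demo = 2.0 * x * d  # f'(x) * d

# Monotone case: the history contains only the current value f(x) = 1.
alpha_monotone = nonmonotone_armijo(phi_demo, slope_demo, deque([1.0], maxlen=1))
# Nonmonotone case: an older, larger value 4.0 is still remembered.
alpha_nonmonotone = nonmonotone_armijo(phi_demo, slope_demo, deque([4.0, 1.0], maxlen=3))
print(alpha_monotone, alpha_nonmonotone)
```

In the demo the full step increases the objective from 1 to 1.44, so the monotone test rejects it and halves the step, while the nonmonotone test accepts it because 1.44 is still well below the remembered value 4.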
To ensure that the algorithm cannot cycle, it maintains a filter, a "taboo region" for each iteration . The filter contains those combinations of constraint violation value and the objective function value , that are prohibited for a successful trial point in iteration . During the line search, a trial point is rejected, if . We then say that the trial point is not acceptable to the current filter, which is also called .
If a trial point satisfies the switching condition (2.3) and the reduction condition (2.4), then this trial point is called an f-type point, and accordingly this iteration is called an f-type iteration. An f-type point should be accepted as with no updating of the filter, that is
If, on the other hand, a trial point does not satisfy the switching condition (2.3) but satisfies (2.2), we call it an h-type point, and accordingly the iteration an h-type iteration. An h-type point should be accepted as with updating of the filter, that is
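The filter bookkeeping described above can be sketched as a small data structure. This is a simplified monotone variant written under assumptions: entries are stored as corner points of forbidden regions, a trial pair is rejected when some entry is no larger in both measures, and the margins `gamma_h` and `gamma_f` follow the usual line search filter convention. The class name and parameters are hypothetical, and the nonmonotone conditions (2.2) and (2.6) may differ in detail.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (constraint violation h, objective value f)


class Filter:
    """Minimal filter: a list of corner points, each spanning a forbidden region."""

    def __init__(self) -> None:
        self.entries: List[Pair] = []

    def is_acceptable(self, h: float, f: float) -> bool:
        """Accept (h, f) only if it improves on every entry in at least one measure."""
        return all(h < h_j or f < f_j for h_j, f_j in self.entries)

    def augment(self, h: float, f: float,
                gamma_h: float = 1e-5, gamma_f: float = 1e-5) -> None:
        """Add the region {h' >= (1 - gamma_h) h, f' >= f - gamma_f h} to the filter."""
        corner = ((1.0 - gamma_h) * h, f - gamma_f * h)
        # Drop old entries whose forbidden region lies inside the new one.
        self.entries = [e for e in self.entries
                        if not (e[0] >= corner[0] and e[1] >= corner[1])]
        self.entries.append(corner)


# Hypothetical usage: an h-type iteration adds the pair (h, f) = (1.0, 5.0).
filt = Filter()
filt.augment(1.0, 5.0)
worse_in_both = filt.is_acceptable(2.0, 6.0)    # dominated by the entry
less_infeasible = filt.is_acceptable(0.5, 6.0)  # smaller violation
lower_objective = filt.is_acceptable(2.0, 4.0)  # smaller objective
filt.augment(0.5, 4.0)                          # supersedes the first entry
print(worse_in_both, less_infeasible, lower_objective, len(filt.entries))
```

An f-type iteration would call only `is_acceptable`, while an h-type iteration or a restoration step would also call `augment`, which is why the filter changes only in those iterations.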
In some cases it is not possible to find a trial step size that satisfies the above criteria. We approximate a minimum desired step size using linear models of the involved functions. For this, we define If the nonmonotone line search encounters a trial step size with , the algorithm reverts to a feasibility restoration phase. Here, we try to find a new iterate which is acceptable to the current filter and for which (2.2) holds, by reducing the constraint violation with some iterative method.
The corresponding algorithm can be written as follows.
Algorithm 2.1. Step 1. Initialization: choose an initial guess , , , and . Compute , , , , and for . Set , , , and .
Step 2. If then stop.
Step 3. Compute (2.1) to obtain . If there exists no solution to (2.1), go to Step 8. If then stop.
Step 4. Use nonmonotone line search. Set and .
  Step 4.1. If , where the is obtained by (2.7), go to Step 8. Otherwise we get . If , go to Step 4.3.
  Step 4.2. Check sufficient decrease with respect to the current iterate.
  Step . If the switching condition (2.3) and the nonmonotone reduction condition (2.4) hold, set and go to Step 5. If only the switching condition (2.3) is satisfied, go to Step 4.3.
  Step . The switching condition (2.3) is not satisfied. If the nonmonotone filter condition (2.2) holds, set , augment the filter using (2.6) and go to Step 6. Otherwise, go to Step 4.3.
  Step 4.3. Choose . Let and go to Step 4.1.
Step 5. Set , and . Go to Step 7.
Step 6. Compute and by (1.3). If , set and .
Step 7. Compute , , and . Let and go to Step 2.
Step 8 (restoration stage). Find such that is acceptable to and . Set and augment the filter by (2.6). Let , and go to Step 2.
In a restoration algorithm, the infeasibility is reduced and it is, therefore, desired to decrease the value of . The direct way is to utilize the Newton method or the similar ways to attack . We now give the restoration algorithm.
Restoration Algorithm
Step R1. Let , , , , , , .
Step R2. If is acceptable to and , then let and stop.
Step R3. Compute
to get . Let .
Step R4. If , set ; if , set ; otherwise, . Let , be updated to , and go to Step R2.
The above restoration algorithm is an SQP method for . Of course, there are other restoration algorithms, such as the Newton method, interior point restoration algorithm, SLP restoration algorithm, and so on.
3. Global Convergence of Algorithm
In this section, we present a proof of global convergence of Algorithm 2.1. We first state the following assumptions in technical terms.
Assumptions. (A1) All points that are sampled by algorithm lie in a nonempty closed and bounded set .
(A2) The functions , are all twice continuously differentiable on an open set containing .
(A3) There exist two constants such that the matrices sequence satisfies for all and .
(A4) has full column rank and for all with a positive constant .
In the remainder of this section, we will not consider the case where Algorithm 2.1 terminates successfully in Step 2, since in this situation the global convergence is trivial.
Lemma 3.2. Under Assumption A1, there exists the solution to (2.1) with exact (or inexact) line search which satisfies the following descent conditions: where , and are all positive constants independent of .
Proof. By virtue of the Taylor expansion of with , we obtain
where the last inequality can be done by Assumption A1 and . Furthermore, from (2.1) we immediately obtain , that is, . With , thereby,
then the first inequality consequently holds.
According to the Taylor expansion of (i.e., ), we then have
where the last inequality follows from Assumption A1 and . That is to say,
which is just (3.2).
Lemma 3.3. Let be a subsequence of iterates for which (2.3) holds and has the same and . Then there exists some such that
Proof. Because have the same and , it follows that are fixed and by (2.3) is a descent direction. Hence there exists some satisfying (3.7).
Theorem 3.4. Suppose that is an infinite sequence generated by Algorithm 2.1 and , one has namely, every limit point is a solution to (1.1) or a local infeasible point. If the gradients of are linearly independent for all and , then a solution to the system of nonlinear equations (1.1) is obtained.
Proof. From , we know that the filter is updated only a finite number of times, so there exists such that for the filter is not updated. Since both h-type iterations and the restoration algorithm require updating the filter, for the algorithm performs only f-type iterations. We then have that for all both conditions (2.3) and (2.4) are satisfied for and .
Then by (2.4) we get . We first show that for all , it holds
where . We prove (3.9) by induction.
If , we have . Suppose that the claim is true for , then we consider two cases.
Case 1. If , it is clear that
Case 2. If , let . By the fact that , , we have
Moreover, since is bounded below as , we get , that is, . By Lemma 3.3, there exists a such that . Then together with and Assumption A1, we have . From it is easy to obtain that . This completes the proof.
Lemma 3.5. Under Assumptions A1 and A2, if for a positive constant independent of and for all and with , then there exists , so that for all and .
Proof. Choose , then implies that . So we note from (3.2) that
Let , then implies that . So from (3.1), we obtain
We further point out a fact that follows from the definition of the filter. If and , , we obtain . Thus from , , and , we have .
Lemma 3.6. If for a positive constant independent of , then there exists a constant , for all and such that
Proof. Let . In view of (3.2), and , we know which shows that the assertion of the lemma follows.
Theorem 3.7. Suppose that is an infinite sequence generated by Algorithm 2.1 and . Then there exists at least one accumulation point which is a solution to (1.1) or a local infeasible point. Namely, one has If the gradients of are linearly independent for all and , then a solution to (1.1) is obtained.
Proof. We prove that first.
Suppose by contradiction that there exists an infinite subsequence of such that for some . At each iteration , is added to the filter, which means that no other can be added to the filter at a later stage within the area:
and the area of each of these squares is at least .
By Assumption A1 we have . Since and , then associated with the filter are restricted to
Thereby is completely covered by at most a finite number of such areas, in contradiction to the infinite subsequence satisfying . Therefore, .
By Assumption A1 and , there exists an accumulation point , that is, , . It follows from that
which implies . If , then (3.16) is true. Otherwise, there exists a subsequence of and a constant so that for all ,
The choice of implies
According to , Assumption A1 as well as , we have
Since and as , we obtain
for sufficiently large . Similarly, we have
and thus
for sufficiently large . This means the condition (2.3) is satisfied for sufficiently large . Therefore, the reason for accepting must have been that satisfies the nonmonotone Armijo condition (2.4). In fact, let , then ; by Lemma 3.6 the nonmonotone Armijo condition (2.4) is satisfied. Consequently, the filter is not augmented in iteration , which is a contradiction to (3.21). The whole proof is completed.
4. Numerical Experiments
In this section, we test our algorithm on some typical test problems. In the whole process, the program is coded in MATLAB and we assume the error tolerance in this paper is always . The selected parameter values are , , , , , and . In the following tables, the notations NIT, NOF, and NOG mean the number of iterates, number of functions, and number of gradients, respectively.
Example 4.1. Find a solution of the nonlinear equations system as follows:
The only solution of Example 4.1 is . Define the line . If the starting point , the iterates of the Newton method [24] are confined to . We choose two starting points which belong to in the experiments and then the is obtained. Table 1 shows the results.
Example 4.2. Consider the system of nonlinear equations:
The solution to Example 4.2 is . The numerical results of Example 4.2 are given in Table 2.
Example 4.3. Find a solution of the nonlinear equations system:
The unique solution is . It has been proved in [2] that, under initial point , the iterates converge to the point , which is not a stationary point. Utilizing our algorithm, a sequence of points converging to is obtained. The detailed numerical results for Example 4.3 are listed in Table 3.
Example 4.4. Consider the following system of nonlinear equations:
The above example has three solutions, , , and . The numerical results of Example 4.4 are given in Table 4.
Example 4.5. Consider the system of nonlinear equations: with the initial point , . The solution to Example 4.5 is . The numerical results of Example 4.5 are given in Table 5.
For the above problems, running Algorithm 2.1 with different starting points yields the results in the corresponding tables, which together show that our proposed algorithm is practical and effective. In terms of computational efficiency, our algorithm is competitive with the method in [22]. The results in Table 5 show that our method also succeeds in cases where more equations are active.
Constrained optimization approaches attacking the system of nonlinear equations are exceedingly interesting and are further developed by using the nonmonotone line search filter strategy in this paper. Moreover, the local property of the algorithm is a further topic of interest.
Acknowledgment
The research is supported by the National Natural Science Foundation of China (no. 11126060) and Science & Technology Program of Shanghai Maritime University (no. 20120060).