Abstract

Schubert’s method is an extension of Broyden’s method for solving sparse nonlinear equations; it preserves the zero-nonzero structure defined by the sparse Jacobian matrix while retaining many good properties of Broyden’s method. In particular, Schubert’s method has been proved to be locally and q-superlinearly convergent. In this paper, we globalize Schubert’s method by using a nonmonotone line search. Under appropriate conditions, we show that the proposed algorithm converges globally and superlinearly. Some preliminary numerical experiments are presented, which demonstrate that our algorithm is effective for large-scale problems.

1. Introduction

In this paper, we consider the quasi-Newton method [1] for solving the general nonlinear equation
\[ F(x) = 0, \qquad (1) \]
where the function \(F: \mathbb{R}^n \to \mathbb{R}^n\) is continuously differentiable. The ordinary quasi-Newton method for solving (1) generates a sequence \(\{x_k\}\) by the following iterative scheme:
\[ x_{k+1} = x_k + d_k, \]
where the quasi-Newton direction \(d_k\) is obtained by solving the system of linear equations
\[ B_k d_k = -F(x_k). \]
Here the matrix \(B_k\) is an approximation to the Jacobian matrix \(F'(x_k)\) of \(F\) at \(x_k\), which usually satisfies the quasi-Newton condition (i.e., the secant condition)
\[ B_{k+1} s_k = y_k, \]
where \(s_k = x_{k+1} - x_k\) and \(y_k = F(x_{k+1}) - F(x_k)\). The matrix \(B_k\) can be updated by different quasi-Newton update formulae. However, the quasi-Newton method is not applicable to large-scale problems because the matrices \(B_k\) are in general dense. Fortunately, a large-scale problem usually has the property of sparsity, and it is then natural to extend some known quasi-Newton methods based on this property. As early as 1970, Schubert [2] modified Broyden’s method and proposed a sparse Broyden’s method, that is, Schubert’s method [3], with \(B_{k+1}\) defined in the following way:
\[ e_i^T B_{k+1} = \begin{cases} e_i^T B_k + \dfrac{e_i^T (y_k - B_k s_k)}{\|P_i s_k\|^2} (P_i s_k)^T, & P_i s_k \neq 0, \\[1ex] e_i^T B_k, & P_i s_k = 0, \end{cases} \qquad i = 1, \dots, n, \]
where \(Z_i\) and \(Z\) are the sparsity patterns of the Jacobian matrix (defined precisely in Section 2), \(P_i\) is the projection onto \(Z_i\), and \(e_i\) denotes the \(i\)th column of the identity matrix. It has been proved by Broyden [4] that Schubert’s method is locally convergent when the Jacobian satisfies a Lipschitz condition. Lam [5] further showed the local and superlinear convergence of Schubert’s method in the special case when \(P_i s_k \neq 0\) for each \(i\) at each iteration. As an improvement, Marwil considered the following update formula:
\[ B_{k+1} = B_k + \sum_{i=1}^{n} \gamma\bigl(\|P_i s_k\|^2\bigr)\, e_i e_i^T (y_k - B_k s_k) (P_i s_k)^T, \qquad (7) \]
where, for a scalar \(t\),
\[ \gamma(t) = \begin{cases} t^{-1}, & t \neq 0, \\ 0, & t = 0. \end{cases} \]
Marwil established stronger local and superlinear convergence, which contains the results in [5] as a special case.
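To make the row-wise structure of the update concrete, the following Python sketch applies a Schubert-type update to a dense array with a boolean sparsity mask. The function name, the mask representation, and the dense storage are illustrative assumptions (a genuinely large-scale implementation would use a sparse matrix format), and the safeguards of Marwil's nonsingular variant, introduced as (12) in Section 2, are omitted.

```python
import numpy as np

def schubert_update(B, s, y, pattern):
    """Row-wise sparse Broyden (Schubert-type) update.

    B       : (n, n) current Jacobian approximation
    s       : step s_k = x_{k+1} - x_k
    y       : y_k = F(x_{k+1}) - F(x_k)
    pattern : (n, n) boolean mask of the Jacobian sparsity structure

    Rows with P_i s = 0 are left unchanged, mirroring gamma(0) = 0 in (7).
    """
    B_new = B.copy()
    r = y - B @ s                          # secant residual y_k - B_k s_k
    for i in range(len(s)):
        Ps = np.where(pattern[i], s, 0.0)  # P_i s: zero out s outside row i's pattern
        denom = Ps @ Ps                    # ||P_i s||^2
        if denom > 0.0:
            B_new[i, :] += (r[i] / denom) * Ps
    return B_new
```

Each corrected row stays inside its sparsity pattern because the correction is a multiple of \(P_i s\), and a direct computation shows that \(e_i^T B_{k+1} s_k = e_i^T y_k\) whenever \(P_i s_k \neq 0\), so the secant condition holds row by row.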

Note that the update formula (7) is not symmetric; therefore, its use is restricted to problems where the symmetry of the updated matrix is not important. The sparse and symmetric quasi-Newton update has attracted much attention [6–8]. Toint [6] and Fletcher [8] proposed symmetric updates which meet the sparsity and secant conditions simultaneously. Yamashita [9] proposed a new sparse quasi-Newton update, called the Matrix Completion Quasi-Newton (MCQN) update [10], which exploits the sparsity of the Hessian and guarantees positive definiteness.

So far, most studies of the convergence of sparse quasi-Newton methods have focused on their local behavior. Few studies are concerned with the global convergence of those methods; it is a relatively more difficult research topic than its counterpart in optimization. To the author’s knowledge, the main work related to the general global convergence of sparse quasi-Newton methods is based on the work in [10–12]. The purpose of this paper is to study the global and superlinear convergence of Schubert’s method.

The remainder of this paper is organized as follows. In Section 2, we review some properties of Schubert’s method. In Section 3, by using a nonmonotone line search [13], we globalize Schubert’s method and prove its global and superlinear convergence under appropriate conditions. In particular, we will show that after finitely many iterations the unit step length is always accepted. In Section 4, some preliminary numerical results are presented. Finally, we give some remarks in Section 5.

2. Schubert’s Update

In this section, we present some useful properties of Schubert’s update. For the sake of convenience, we introduce some notation.
(1) The subspace that identifies the sparsity structure of the \(i\)th row of the Jacobian matrix is defined as \(Z_i = \{ v \in \mathbb{R}^n : e_j^T v = 0 \text{ whenever } \partial F_i / \partial x_j \equiv 0 \}\).
(2) \(Z\) denotes the sparsity structure of the Jacobian matrix, and it is defined as \(Z = \{ A \in \mathbb{R}^{n \times n} : A^T e_i \in Z_i,\ i = 1, \dots, n \}\).
(3) The projection operators \(P_i\), \(i = 1, \dots, n\), project vectors onto the subspaces \(Z_i\), which in particular makes \(e_i^T A v = e_i^T A P_i v\) for every \(A \in Z\) and \(v \in \mathbb{R}^n\).
(4) \(\|\cdot\|\) denotes the Euclidean vector norm and \(\|\cdot\|_F\) the Frobenius matrix norm.

Schubert’s update (7) is the unique solution to the following minimization problem [3]:
\[ \min \bigl\{ \|B - B_k\|_F : B \in Z,\ e_i^T B s_k = e_i^T y_k \text{ whenever } P_i s_k \neq 0 \bigr\}. \]
Specifically, the following inequality [3] holds for any \(A \in Z\) with \(A s_k = y_k\):
\[ \|B_{k+1} - A\|_F \le \|B_k - A\|_F. \]
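The inequality is a direct consequence of the least-change characterization: \(B_{k+1}\) is the Frobenius-norm projection of \(B_k\) onto the affine set \(\mathcal{S} = \{ B \in Z : e_i^T B s_k = e_i^T y_k \text{ whenever } P_i s_k \neq 0 \}\), and any \(A \in Z\) with \(A s_k = y_k\) belongs to \(\mathcal{S}\). A worked one-line argument follows (the name \(\mathcal{S}\) is ours, introduced only for this sketch):

```latex
% Orthogonal projection in the Frobenius inner product: the cross term
% <B_k - B_{k+1}, B_{k+1} - A>_F vanishes for every A in S, hence
\[
  \|B_k - A\|_F^2
    = \|B_k - B_{k+1}\|_F^2 + \|B_{k+1} - A\|_F^2
    \ge \|B_{k+1} - A\|_F^2 .
\]
```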

The following theorem states the local and superlinear convergence of Schubert’s method, which has been proved by Marwil [3].

Theorem 1. Suppose that \(F\) satisfies the following conditions.
(1) \(F\) is continuously differentiable in an open convex set \(D\).
(2) There exists an \(x^* \in D\) such that \(F(x^*) = 0\) and \(F'(x^*)\) is nonsingular.
(3) There exists an \(L > 0\) such that, for all \(x \in D\),
\[ \|F'(x) - F'(x^*)\| \le L \|x - x^*\|. \]
Then there exist constants \(\varepsilon > 0\) and \(\delta > 0\) such that if the initial point \(x_0\) and the nonsingular matrix \(B_0 \in Z\) satisfy \(\|x_0 - x^*\| \le \varepsilon\) and \(\|B_0 - F'(x^*)\| \le \delta\), then
(i) Schubert’s method generates \(\{x_k\}\) with \(B_k\) nonsingular for all \(k\);
(ii) the sequence \(\{x_k\}\) converges to \(x^*\);
(iii) the convergence is superlinear.

It is noticed that the matrix \(B_{k+1}\) determined by (7) may be singular even if \(B_k\) is nonsingular. To overcome such a difficulty, Marwil [3] proposed a nonsingular Schubert’s method by using a safeguarded version of (7), referred to as update (12) in what follows. Here, we have omitted the subscript \(k\) and used \(B\) to represent \(B_k\) and \(\bar B\) to represent \(B_{k+1}\). The parameters \(\theta_1, \dots, \theta_n\) are chosen so that \(\bar B\) is nonsingular when \(B\) is nonsingular, and the details are given below.

Set \(\theta_1\) first and then define \(\theta_i\) for \(i = 2, \dots, n\) recursively, each depending on the preceding ones. For a scalar \(\gamma \in (0, 1]\), the \(\theta_i\) can be chosen so that \(\bar B\) is nonsingular whenever \(B\) is nonsingular, while \(\bar B\) remains close to the unmodified update (7); the precise recursive formulas can be found in [3].

The dependence of the \(\theta_i\) on the iteration index \(k\) is suppressed, but \(\gamma\) is independent of \(k\) [3].

3. The Algorithm

In this section, we will globalize Schubert’s method. To this end, we introduce a derivative-free line search proposed by Li and Fukushima [13] to determine a step length \(\lambda_k\).

Algorithm 2. Given constants \(\beta \in (0, 1)\) and \(\sigma_1 > 0\). Let \(\lambda_k = \beta^{i_k}\), where \(i_k\) is the smallest nonnegative integer \(i\) such that
\[ \|F(x_k + \beta^i d_k)\| \le \|F(x_k)\| - \sigma_1 \|\beta^i d_k\|^2 + \varepsilon_k \|F(x_k)\|, \qquad (17) \]
and \(\{\varepsilon_k\}\) is a given positive sequence satisfying
\[ \sum_{k=0}^{\infty} \varepsilon_k < \infty. \qquad (18) \]
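A minimal Python sketch of this backtracking rule follows. The acceptance test mirrors (17); the default constants and the cap on the number of backtracks are illustrative assumptions rather than values prescribed by the paper.

```python
import numpy as np

def li_fukushima_search(F, x, d, eps_k, beta=0.5, sigma1=1e-4, max_backtracks=50):
    """Derivative-free nonmonotone backtracking in the spirit of Algorithm 2.

    Returns lambda = beta**i for the smallest i >= 0 such that
      ||F(x + lambda*d)|| <= ||F(x)|| - sigma1*||lambda*d||**2 + eps_k*||F(x)||.
    """
    Fx_norm = np.linalg.norm(F(x))
    lam = 1.0
    for _ in range(max_backtracks):
        step = lam * d
        if np.linalg.norm(F(x + step)) <= Fx_norm - sigma1 * (step @ step) + eps_k * Fx_norm:
            return lam
        lam *= beta
    return lam  # safeguard cap; since eps_k > 0, the test succeeds for small lam
```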
It is not difficult to see that Algorithm 2 is well defined. Moreover, for each \(k\), we have
\[ \|F(x_{k+1})\| \le (1 + \varepsilon_k) \|F(x_k)\|. \]
By using Algorithm 2, we give our algorithm as follows.

Algorithm 3. Consider the following.
Step 0. Given constants \(\rho \in (0, 1)\) and \(\sigma_2 > 0\), select a positive sequence \(\{\varepsilon_k\}\) satisfying (18). Then choose an initial point \(x_0 \in \mathbb{R}^n\) and a nonsingular matrix \(B_0 \in Z\). Let \(k := 0\).
Step 1. Stop if \(F(x_k) = 0\). Otherwise, solve the following system of linear equations to get \(d_k\):
\[ B_k d_k = -F(x_k). \qquad (20) \]
Step 2. If
\[ \|F(x_k + d_k)\| \le \rho \|F(x_k)\| - \sigma_2 \|d_k\|^2, \qquad (21) \]
then let \(\lambda_k = 1\) and go to Step 4. Else, go to Step 3.
Step 3. Let \(\lambda_k\) be determined by Algorithm 2.
Step 4. Set \(x_{k+1} = x_k + \lambda_k d_k\).
Step 5. Compute \(B_{k+1}\) by (12).
Step 6. Set \(k := k + 1\). Go to Step 1.
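Putting the pieces together, a compact sketch of Algorithm 3 might look as follows. It reuses `schubert_update` and `li_fukushima_search` from the earlier sketches, takes \(\varepsilon_k = 1/(k+1)^2\) as an assumed summable sequence, and substitutes the plain Schubert update for Marwil's nonsingular variant (12); it is therefore an illustration, not the safeguarded algorithm analyzed below.

```python
import numpy as np

def globalized_schubert(F, x0, B0, pattern, rho=0.9, sigma2=1e-4,
                        tol=1e-10, max_iter=500):
    """Sketch of the globalized Schubert iteration (Steps 0-6 of Algorithm 3)."""
    x, B = np.asarray(x0, dtype=float), np.asarray(B0, dtype=float)
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:                   # Step 1: stopping test
            return x, k
        d = np.linalg.solve(B, -Fx)                     # Step 1: B_k d_k = -F(x_k)
        if np.linalg.norm(F(x + d)) <= rho * np.linalg.norm(Fx) - sigma2 * (d @ d):
            lam = 1.0                                   # Step 2: unit step accepted, (21)
        else:
            lam = li_fukushima_search(F, x, d, eps_k=1.0 / (k + 1) ** 2)  # Step 3
        x_new = x + lam * d                             # Step 4
        B = schubert_update(B, x_new - x, F(x_new) - Fx, pattern)         # Step 5
        x = x_new                                       # Step 6
    return x, max_iter
```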
We then show some useful properties of Algorithm 3.

Lemma 4. Let the level set
\[ \Omega = \bigl\{ x \in \mathbb{R}^n : \|F(x)\| \le e^{M} \|F(x_0)\| \bigr\}, \qquad M = \sum_{k=0}^{\infty} \varepsilon_k, \]
be bounded and let \(\{x_k\}\) be generated by Algorithm 3. Then \(\{x_k\}\) is contained in \(\Omega\). Moreover, it holds that
\[ \sum_{k=0}^{\infty} \|\lambda_k d_k\|^2 < \infty. \qquad (22) \]

Proof. By the line search (17), we have for any \(k\)
\[ \|F(x_{k+1})\| \le (1 + \varepsilon_k) \|F(x_k)\| \le \prod_{j=0}^{k} (1 + \varepsilon_j)\, \|F(x_0)\| \le e^{M} \|F(x_0)\|. \]
This implies that the sequence \(\{x_k\}\) generated by Algorithm 3 is contained in \(\Omega\) and the sequence \(\{\|F(x_k)\|\}\) is bounded. Moreover, combined with (17) and (21), we can get for each \(k\)
\[ \sigma \|\lambda_k d_k\|^2 \le \|F(x_k)\| - \|F(x_{k+1})\| + \varepsilon_k \|F(x_k)\|, \]
where \(\sigma = \min\{\sigma_1, \sigma_2\}\). Making summation on both sides for \(k\) from 0 to \(\infty\), we obtain (22).
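To spell out the final summation step, a worked version under the reconstruction above:

```latex
% Telescoping the per-iteration decrease estimate over k = 0, ..., N:
\[
  \sigma \sum_{k=0}^{N} \|\lambda_k d_k\|^2
  \le \sum_{k=0}^{N} \bigl( \|F(x_k)\| - \|F(x_{k+1})\| \bigr)
      + \sum_{k=0}^{N} \varepsilon_k \|F(x_k)\|
  \le \|F(x_0)\| + e^{M} \|F(x_0)\| \sum_{k=0}^{N} \varepsilon_k .
\]
% The right-hand side is bounded uniformly in N because the series of the
% epsilon_k converges, so letting N tend to infinity gives (22).
```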

Similar to Lemma 2.4 in [13], we have the following result.

Lemma 5. Let the level set \(\Omega\) be bounded and let \(\{x_k\}\) be generated by Algorithm 3. Then the sequence \(\{\|F(x_k)\|\}\) is convergent.

In order to establish the global convergence of Algorithm 3, we need the following assumptions.

Assumption 6. (i) The level set \(\Omega\) defined in Lemma 4 is bounded.
(ii) \(F\) is continuously differentiable in an open set \(D\) containing \(\Omega\), and there exists an \(x^* \in D\) such that \(F(x^*) = 0\) and \(F'(x^*)\) is nonsingular.
(iii) \(F'\) is Lipschitz continuous on \(D\); that is, there exists an \(L > 0\) such that
\[ \|F'(x) - F'(y)\| \le L \|x - y\|, \quad \forall x, y \in D. \]
(iv) \(F'(x)\) is nonsingular for any \(x \in \Omega\).

Assumption 6 (iv) is the same as that in [13]; it is not as strong as the assumption adopted in [14], where the uniform nonsingularity of \(F'\) is assumed.

We first introduce some notation. Define
\[ \bar A_k = \int_0^1 F'(x_k + t s_k)\, dt, \]
and then we have \(y_k = \bar A_k s_k\). Let \(E_k = B_k - F'(x^*)\).

The following lemma is an extension of Lemma 2.5 in [13].

Lemma 7. Let the sequence \(\{x_k\}\) be generated by Algorithm 3. Suppose that the conditions (i)–(iii) in Assumption 6 hold. If
\[ \sum_{k=0}^{\infty} \|s_k\|^2 < \infty, \]
then one has
\[ \liminf_{k \to \infty} \frac{\|E_k s_k\|}{\|s_k\|} = 0. \]
In particular, there is a subsequence of \(\{\|E_k s_k\| / \|s_k\|\}\) tending to zero. If
\[ \sum_{k=0}^{\infty} \|s_k\| < \infty, \]
then one has
\[ \lim_{k \to \infty} \frac{\|E_k s_k\|}{\|s_k\|} = 0. \]
In particular, the whole sequence \(\{\|E_k s_k\| / \|s_k\|\}\) converges to zero.

Proof. By the Lipschitz continuity of \(F'\), we have
\[ \|\bar A_k - F'(x^*)\| \le L \max\{\|x_k - x^*\|, \|x_{k+1} - x^*\|\}. \]
Denote \(\eta_k = \max\{\|x_k - x^*\|, \|x_{k+1} - x^*\|\}\). According to the update (12), we have an equality expressing \(B_{k+1}\) in terms of \(B_k\), \(s_k\), and \(y_k = \bar A_k s_k\). Subtracting \(F'(x^*)\) from both sides of this equality gives a recursion for \(E_{k+1}\). Taking norms yields a bounded-deterioration estimate that bounds \(\|E_{k+1}\|_F\) by \(\|E_k\|_F\) plus a constant multiple of \(\eta_k\). Making summation on both sides, \(k = 0, 1, \dots\), yields the boundedness of \(\{\|E_k\|_F\}\) under the assumptions of the lemma. Then it follows that the conditions of Lemma 2.5 of [13] are fulfilled. According to Lemma 2.5 of [13], we get the limits stated in the lemma. This completes the proof.

According to Algorithm 3, we have the following lemma.

Lemma 8. Let \(\{x_k\}\) be generated by Algorithm 3. If there are finitely many \(k\) for which \(\lambda_k\) is determined by (21), then one has
\[ \liminf_{k \to \infty} \frac{\|E_k s_k\|}{\|s_k\|} = 0. \]
In particular, there is an infinite index set \(K\) such that the subsequence \(\{\|E_k s_k\| / \|s_k\|\}_{k \in K}\) converges to zero.

Proof. Since there are finitely many \(k\) for which \(\lambda_k\) is determined by (21), we can know that there exists an index \(\bar k\) such that, for \(k \ge \bar k\), \(\lambda_k\) is determined by the line search (17) of Algorithm 2. This implies, together with Lemma 4, that
\[ \sum_{k=0}^{\infty} \|s_k\|^2 < \infty. \]
According to Lemma 7, we can easily prove the result.

Lemma 9. Suppose that there are finitely many \(k\) for which \(\lambda_k\) is determined by (21) and that there is an accumulation point \(\bar x\) of \(\{x_k\}_{k \in K}\) at which \(F'(\bar x)\) is nonsingular, where \(K\) is the index set specified by Lemma 8. Then there exists a constant \(\kappa > 0\) such that the following inequality holds for all \(k \in K\) sufficiently large:
\[ \|d_k\| \le \kappa \|F(x_k)\|. \qquad (44) \]

Proof. Without loss of generality, we suppose that \(\{x_k\}_{k \in K}\) converges to \(\bar x\). Since \(F'\) is continuous, it is clear that when \(k \in K\) is sufficiently large, \(F'(x_k)\) is nonsingular. Moreover, there is a constant \(M > 0\) such that the inequality \(\|F'(x_k)^{-1}\| \le M\) holds for all \(k \in K\) sufficiently large. It then follows from Lemma 8 that \(\|(B_k - F'(x_k)) d_k\| / \|d_k\| \to 0\) as \(k \to \infty\) with \(k \in K\). Therefore, there exists an index \(\bar k\) such that the inequality \(\|(B_k - F'(x_k)) d_k\| \le \|d_k\| / (2M)\) holds for all \(k \in K\) with \(k \ge \bar k\). Consequently, we get from the definition of \(d_k\) that for all \(k \in K\) sufficiently large
\[ \|F(x_k)\| = \|B_k d_k\| \ge \|F'(x_k) d_k\| - \|(B_k - F'(x_k)) d_k\| \ge \frac{\|d_k\|}{M} - \frac{\|d_k\|}{2M} = \frac{\|d_k\|}{2M}. \]
The last inequality implies (44) with \(\kappa = 2M\).

We show the global convergence of Algorithm 3 in the following theorem.

Theorem 10. Let Assumption 6 hold and let the index set \(K\) be specified by Lemma 8. Then the sequence \(\{x_k\}\) generated by Algorithm 3 converges to the unique solution \(x^*\) of (1).

Proof. We first verify that
\[ \lim_{k \to \infty} \|F(x_k)\| = 0. \]
If there are infinitely many \(k\) for which \(\lambda_k\) is determined by (21), then
\[ \|F(x_{k+1})\| \le \rho \|F(x_k)\| \]
holds for infinitely many \(k\). Let \(I\) be the index set for which (21) holds and let \(m_k\) be the number of indices \(i \in I\) with \(i \le k\). Then we can know that \(m_k \to \infty\) when \(k \to \infty\). For each \(k \notin I\), we have \(\|F(x_{k+1})\| \le (1 + \varepsilon_k) \|F(x_k)\|\), and then for all sufficiently large \(k\) we have
\[ \|F(x_{k+1})\| \le \rho^{m_k} \prod_{j=0}^{k} (1 + \varepsilon_j)\, \|F(x_0)\| \le C \rho^{m_k}, \]
where \(C = e^{M} \|F(x_0)\|\). This implies \(\lim_{k \to \infty} \|F(x_k)\| = 0\), and hence the conclusion is true.
If there are finitely many \(k\) for which \(\lambda_k\) is determined by (21), then, by Lemma 8, there exists a subsequence of \(\{\|E_k s_k\| / \|s_k\|\}\) that converges to zero. Similar to the proof of Lemma 9, it is not difficult to show that (44) holds for all \(k \in K\) sufficiently large, where \(K\) denotes the index set of this subsequence. In particular, the subsequence \(\{d_k\}_{k \in K}\) is bounded. Without loss of generality, we suppose that \(\{x_k\}_{k \in K}\) converges to some \(\bar x\).
Denote \(\bar \lambda = \limsup_{k \to \infty,\, k \in K} \lambda_k\). It is clear that \(\bar \lambda \ge 0\). If \(\bar \lambda > 0\), then it follows from (22) that \(\liminf_{k \in K} \|d_k\| = 0\), and hence it follows from (20) and the boundedness of \(\{\|B_k\|\}\) that \(\liminf_{k \to \infty} \|F(x_k)\| = 0\); by Lemma 5, \(\lim_{k \to \infty} \|F(x_k)\| = 0\). Suppose \(\bar \lambda = 0\), or equivalently \(\lim_{k \in K} \lambda_k = 0\). By the line search rule, when \(k \in K\) is sufficiently large, \(\lambda_k / \beta\) does not satisfy (17), and hence
\[ \Bigl\| F\Bigl( x_k + \frac{\lambda_k}{\beta} d_k \Bigr) \Bigr\| > \|F(x_k)\| - \sigma_1 \Bigl\| \frac{\lambda_k}{\beta} d_k \Bigr\|^2 + \varepsilon_k \|F(x_k)\| \ge \|F(x_k)\| - \sigma_1 \Bigl\| \frac{\lambda_k}{\beta} d_k \Bigr\|^2. \]
Multiplying both sides by \(\beta \bigl( \|F(x_k + (\lambda_k/\beta) d_k)\| + \|F(x_k)\| \bigr) / \lambda_k\) and then taking the limit as \(k \to \infty\) with \(k \in K\) (passing to a further subsequence so that \(d_k \to \bar d\)), we obtain
\[ F(\bar x)^T F'(\bar x)\, \bar d \ge 0. \qquad (49) \]
On the other hand, taking the limit in (20) as \(k \to \infty\) with \(k \in K\) yields \(F'(\bar x)\, \bar d = -F(\bar x)\). This together with (49) implies \(-\|F(\bar x)\|^2 \ge 0\); that is, \(F(\bar x) = 0\), and hence \(\lim_{k \to \infty} \|F(x_k)\| = 0\). Since the level set \(\Omega\) is bounded and \(x^*\) is the unique solution of (1) in \(\Omega\), the whole sequence \(\{x_k\}\) converges to \(x^*\).

In the remaining part of this section, we establish the superlinear convergence of Algorithm 3.

Theorem 11. Let the conditions in Theorem 10 hold. Then there exist a constant \(\bar\delta > 0\) and an index \(\bar k\) such that, whenever \(k \ge \bar k\) and \(\|d_k\| \le \bar\delta\), the inequality
\[ \|F(x_k + d_k)\| \le \rho \|F(x_k)\| - \sigma_2 \|d_k\|^2 \qquad (51) \]
holds; that is, the unit step length is accepted in Step 2 of Algorithm 3 for all such \(k\).

Proof. By Theorem 10, \(\{x_k\}\) converges to the solution of (1), say \(x^*\), and there exists a constant \(M > 0\) such that \(\|F'(x_k)^{-1}\| \le M\) holds for all \(k\) sufficiently large. Similar to the proof of Lemma 9, when \(k\) is large enough we can show that
\[ \|(B_k - F'(x_k)) d_k\| = o(\|d_k\|). \qquad (52) \]
And from (20) we have \(F(x_k) = -B_k d_k\), and this implies
\[ \|F(x_k + d_k)\| \le \|F(x_k + d_k) - F(x_k) - F'(x_k) d_k\| + \|(F'(x_k) - B_k) d_k\| \le \frac{L}{2} \|d_k\|^2 + o(\|d_k\|), \]
where \(L\) is the Lipschitz constant of \(F'\); the second and third estimates follow from the Lipschitz continuity of \(F'\) and (52), respectively. It then follows that
\[ \|F(x_k + d_k)\| = o(\|d_k\|). \]
On the other hand, by the nonsingularity of \(F'(x^*)\) and the fact that \(x_k \to x^*\), there is a constant \(c > 0\) such that
\[ \|F(x_k)\| \ge c \|d_k\| \qquad (56) \]
holds for all \(k\) sufficiently large. Therefore, we deduce from (52) and (56) that, when \(\|d_k\|\) is sufficiently small,
\[ \|F(x_k + d_k)\| \le (\rho c - \sigma_2 \|d_k\|) \|d_k\| \le \rho \|F(x_k)\| - \sigma_2 \|d_k\|^2. \]
Let \(\bar\delta > 0\) be such that the above holds whenever \(\|d_k\| \le \bar\delta\). Then we know that when \(k \ge \bar k\) and \(\|d_k\| \le \bar\delta\), (51) holds.

The following theorem establishes the superlinear convergence of Algorithm 3.

Theorem 12. Let the conditions in Theorem 10 hold. Then the sequence \(\{x_k\}\) generated by Algorithm 3 converges to the unique solution \(x^*\) of (1) superlinearly.

Proof. By Theorem 11 and the Dennis-Moré characterization of superlinear convergence, it suffices to show that \(\|E_k s_k\| / \|s_k\| \to 0\) as \(k \to \infty\).
Let \(\bar\delta\) and \(\bar k\) be as specified by Theorem 11. It follows from Theorem 10 and the argument of Lemma 9 that \(\|d_k\| \le \kappa \|F(x_k)\| \to 0\), so there is an index \(k_0 \ge \bar k\) such that the following inequality holds for all \(k \ge k_0\):
\[ \|d_k\| \le \bar\delta. \]
By Theorem 11, this shows that, for any \(k \ge k_0\), (21) is satisfied and \(\lambda_k = 1\). Let \(I\) be the set of indices for which (21) holds and let \(m_k\) be the number of elements in \(I \cap \{0, 1, \dots, k\}\); then \(m_k \ge k - k_0\). On the other hand, for each \(k \notin I\), we have
\[ \|F(x_{k+1})\| \le (1 + \varepsilon_k) \|F(x_k)\|. \qquad (60) \]
Multiplying inequalities (21) with \(k \in I\) and (60) with \(k \notin I\), we can obtain for any \(k\)
\[ \|F(x_{k+1})\| \le \rho^{m_k} \prod_{j=0}^{k} (1 + \varepsilon_j)\, \|F(x_0)\|, \]
or equivalently
\[ \|F(x_{k+1})\| \le e^{M} \rho^{\,k - k_0} \|F(x_0)\| \quad \text{for all } k \ge k_0. \]
So, we have
\[ \sum_{k=0}^{\infty} \|F(x_k)\| < \infty. \]
This together with (56) implies
\[ \sum_{k=0}^{\infty} \|s_k\| \le \sum_{k=0}^{\infty} \|d_k\| \le \frac{1}{c} \sum_{k=0}^{\infty} \|F(x_k)\| < \infty. \]
It then follows from Lemma 7 that \(\|E_k s_k\| / \|s_k\| \to 0\) as \(k \to \infty\). The proof is completed.

4. Numerical Experiments

In this section, we will present some numerical results to show the efficiency of Algorithm 3 for a class of sparse nonlinear equations.

In each experiment, we employ a termination criterion that stops the iteration as soon as the residual norm \(\|F(x_k)\|\) falls below a prescribed tolerance. The parameters in Algorithm 3 are specified as in [13].

The numerical experiments are done by using MATLAB version 7.10 on a Core (TM) 2 PC with Windows XP. The details of the problems are given as follows, where \(x_0\) denotes the initial point.

Problem 1 (Broyden tridiagonal function [15]). The elements of \(F(x)\) are
\[ F_1(x) = (3 - 2x_1) x_1 - 2x_2 + 1, \]
\[ F_i(x) = (3 - 2x_i) x_i - x_{i-1} - 2x_{i+1} + 1, \quad i = 2, \dots, n - 1, \]
\[ F_n(x) = (3 - 2x_n) x_n - x_{n-1} + 1, \]
and the initial point is \(x_0 = (-1, -1, \dots, -1)^T\).
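For concreteness, here is a Python encoding of this test problem in the format used by the driver sketched in Section 3. The formula and starting point follow the standard statement of the Broyden tridiagonal function in [15]; the dimension and the choice \(B_0 = I\) are illustrative assumptions.

```python
import numpy as np

def broyden_tridiagonal(x):
    """F_i(x) = (3 - 2*x_i)*x_i - x_{i-1} - 2*x_{i+1} + 1 (missing end terms are zero)."""
    xl = np.concatenate(([0.0], x[:-1]))   # shifted copy playing the role of x_{i-1}
    xr = np.concatenate((x[1:], [0.0]))    # shifted copy playing the role of x_{i+1}
    return (3.0 - 2.0 * x) * x - xl - 2.0 * xr + 1.0

n = 1000
pattern = np.abs(np.subtract.outer(np.arange(n), np.arange(n))) <= 1  # tridiagonal mask
x0 = -np.ones(n)   # customary starting point for this problem
B0 = np.eye(n)     # one possible choice of B_0; it respects the tridiagonal pattern
# x_sol, iters = globalized_schubert(broyden_tridiagonal, x0, B0, pattern)
```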

Problem 2 (Trigexp function [16]). The elements of \(F(x)\) are

Problem 3 (tridiagonal exponential problem [17]). The elements of \(F(x)\) are

Problem 4 (discrete boundary value problem [18]). The elements of \(F(x)\) are

Problem 5 (exponential problem 1 [19]). The elements of \(F(x)\) are

Problem 6 (exponential problem 2 [19]). The elements of \(F(x)\) are

Problem 7 (penalty I function [19]). The elements of \(F(x)\) are

Problem 8 (exponential function [19]). The elements of \(F(x)\) are

Problem 9 (minimal function [19]). The elements of \(F(x)\) are

Problem 10 (extended Rosenbrock function (\(n\) is even) [20]). The elements of \(F(x)\) are

Problem 11 (logarithmic function [19]). The elements of \(F(x)\) are

Problem 12 (strictly convex function 1 [21]). \(F\) is the gradient of \(f\). The elements of \(F(x)\) are

Problem 13 (tridimensional valley function (\(n\) is a multiple of 3) [22]). The elements of \(F(x)\) are defined as follows, where we denote

Problem 14 (extended Freudenstein and Roth function (\(n\) is even) [17]). The elements of \(F(x)\) are

The sparsity patterns of most of the problems are tridiagonal, and the dimension of the problems varies from 50 to 20000. The results are given in Tables 1 and 2, and each column is specified as follows:
Pro: the problem;
\(n\): the dimension of the problem;
\(\|F(x_0)\|\): the initial Euclidean norm of \(F\);
\(\|F(x_{\text{final}})\|\): the final Euclidean norm of \(F\);
Iter: the total number of iterations;
Times: the CPU time in seconds.

It can be seen from the tables that Algorithm 3 terminated successfully for all tested problems. The numerical results show that Algorithm 3 becomes increasingly desirable as \(n\) increases.

In Tables 1 and 2, we list the results of Algorithm 3 for solving Problems 1 to 14 with two choices of the initial matrix \(B_0\). Because \(B_0\) is very important for the performance of Broyden’s method, we present the results for both choices. We can see that Algorithm 3 can be applied to solve a class of nonlinear equations whose dimension can be up to 20000. Since Schubert’s update formula (7) maintains the sparsity pattern of the Jacobian matrix exactly, Algorithm 3 is especially effective for solving large-scale nonlinear equations with a sparse Jacobian matrix, such as a tridiagonal or block diagonal Jacobian matrix.

5. Remarks

In this paper, based on the work of Schubert, Broyden, and Marwil, we have globalized Schubert’s method and proposed a global algorithm by using a nonmonotone line search. We have established the global and superlinear convergence. Numerical results showed that the algorithm is especially effective for large-scale problems.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work is supported by the National Natural Science Foundation of China through Grant 11371154.