Abstract
The Jacobi–Davidson iteration method is very efficient for solving Hermitian eigenvalue problems. If the correction equation involved in the Jacobi–Davidson iteration is solved accurately, the simplified Jacobi–Davidson iteration is equivalent to the Rayleigh quotient iteration, which achieves a locally cubic convergence rate. When the linear system involved is solved by an iterative method, the two methods remain equivalent. In this paper, we present a convergence analysis of the simplified Jacobi–Davidson method and an estimate of the number of iterations required for the inner correction equation. Furthermore, the convergence factor shows how the accuracy of the inner iteration controls the outer iteration.
1. Introduction
Let $A$ be a large, sparse, Hermitian $n \times n$ matrix. We want to compute the smallest eigenvalue $\lambda_1$ of $A$ and the associated eigenvector $x_1$, i.e., to solve the eigenproblem
$Ax = \lambda x, \qquad \|x\| = 1.$ (1)
Here and in the following, $\|\cdot\|$ denotes the Euclidean norm of a vector or the induced spectral norm of a matrix. Many methods have been developed in the literature, including gradient-type methods and subspace methods. The Lanczos, Arnoldi, Davidson, and Jacobi–Davidson methods are classical and effective methods for solving eigenproblem (1), and many convergence results have been developed for them. It should be mentioned that, for the Lanczos and Arnoldi methods, the projection subspace is restricted to a Krylov subspace. For more details about these methods and their variants, see [1–7].
The main framework of the subspace methods is to generate a sequence of expanding subspaces that contain more and more information about the desired eigenpair of the matrix $A$. The central task, which can be accomplished by the Rayleigh–Ritz procedure, is to extract an approximation to the desired eigenvalue or eigenvector from the projection subspace. For more details on the Rayleigh–Ritz procedure, see [4].
As is well known, for Hermitian matrices the exact Rayleigh quotient iteration (RQI) [4] converges cubically. However, in each step an ill-conditioned linear system must be solved exactly, which is expensive once the shift is close to the target eigenvalue. The idea of replacing the exact solution by a cheaper approximate one leads to the inexact Rayleigh quotient iteration (IRQI) [8, 9]; this replacement, however, may destroy the local convergence property of the RQI.
The Jacobi–Davidson (JD) iteration method [6], proposed by Sleijpen and van der Vorst, can overcome these difficulties. A so-called correction equation
$(I - uu^{*})(A - \theta I)(I - uu^{*})\,t = -r, \qquad t \perp u,$ (2)
where $u$ is the current approximation to the desired eigenvector with unit norm, $\theta = u^{*}Au$ is the Rayleigh quotient, and $r = Au - \theta u$ is the current residual, is solved to expand the projection subspace, which theoretically contains more and more information about the desired eigenvector. As is well known, if (2) is solved exactly, the JD method converges as fast as the RQI method. In practice, however, one is interested in an iterative solver with a suitable preconditioner. Although the shifted matrix $A - \theta I$ becomes ill conditioned as $\theta$ approaches the desired eigenvalue, correction equation (2) remains well conditioned, thanks to the projection onto the orthogonal complement of $u$. As Notay pointed out in [10], with proper implementation the potential indefiniteness of the coefficient matrix in system (2) does not spoil the method.
At present, we are not aware of an analysis of the connection between the solution of the correction equation and the convergence of the approximate vector towards the wanted eigenvector. In this paper, we study the convergence of the simplified JD iteration. Through the convergence factor, we analyze how the inner linear system controls the outer iteration. Moreover, we show that, under some assumptions, the JD method attains a locally quadratic convergence rate, and that a higher convergence rate can be gained by increasing the accuracy of the inner solves. In the last part of this paper, we give a convergence analysis in terms of residual norms, from which we can see that the results are asymptotically identical to those derived in the preceding section; we also give an analysis of the iteration number of the inner linear system.
We should mention that some analyses of the JD method or, more generally, the Newton method have been established so far; see, e.g., [7, 11–13] and the references therein. In particular, Bai and Miao [11] presented the convergence of the JD iteration method: they proved that the JD method attains local quadratic convergence when the involved correction equation is solved by a Krylov subspace method, and attains a cubic convergence rate when the correction equation is solved to a prescribed precision proportional to the norm of the current residual vector. In this paper, we use a completely different technique to demonstrate the convergence of the JD method from another point of view. In addition, we further study the connection between the iteration steps of the inner correction equation and the convergence of the outer iteration.
This paper is organized as follows. In Section 2, we give some preliminaries of the JD method. In Section 3, we recall some known results and then establish several new results concerning the convergence of the simplified JD method when the involved correction equation is solved inexactly by Krylov solvers. In Section 4, we estimate the iteration number of the inner linear system. Finally, we give some concluding remarks.
2. Preliminaries
Let $x_1, x_2, \ldots, x_n$ be the eigenvectors of the Hermitian matrix $A$ associated with the eigenvalues in ascending order $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$. In the following discussion, we want to compute the eigenvector $x_1$ associated with the simple smallest eigenvalue $\lambda_1$. Denote by $\alpha$ the angle between the current approximation $u$ and the wanted eigenvector $x_1$.
We first present the algorithmic description of the simplified JD method in Algorithm 1.

As noted above, if the correction equation in Algorithm 1 is solved accurately, the simplified JD method is equivalent to the RQI method. Indeed, according to (2), we have $(A - \theta I)(u + t) = \varepsilon u$ for some scalar $\varepsilon$ or, equivalently, $u + t = \varepsilon (A - \theta I)^{-1}u$; the new approximation is thus the Rayleigh quotient iteration vector. In fact, Theorem 4.2 in [14] tells us that the inexact simplified JD method and the IRQI method are also equivalent when the inner linear system is solved by Krylov subspace methods. Here, an "inexact" method means that the inner linear system is solved by an iterative method.
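The equivalence between one exact simplified JD step and one RQI step can be checked numerically. The following sketch is not the paper's algorithm verbatim; it assumes the standard JD correction equation $(I - uu^{*})(A - \theta I)(I - uu^{*})t = -r$, $t \perp u$, solved here with a dense minimum-norm least-squares solve for illustration:

```python
import numpy as np

def rqi_step(A, u):
    # One Rayleigh quotient iteration step: solve (A - theta*I) y = u, normalize.
    theta = u @ A @ u
    y = np.linalg.solve(A - theta * np.eye(len(u)), u)
    return y / np.linalg.norm(y)

def jd_step_exact(A, u):
    # One simplified JD step with the correction equation solved exactly:
    #   (I - u u^T)(A - theta*I)(I - u u^T) t = -r,  t orthogonal to u.
    n = len(u)
    theta = u @ A @ u
    r = A @ u - theta * u                      # eigenresidual, orthogonal to u
    P = np.eye(n) - np.outer(u, u)             # projector onto the complement of u
    M = P @ (A - theta * np.eye(n)) @ P
    # M is singular along u; the minimum-norm least-squares solution is the
    # exact solution lying in the orthogonal complement of u.
    t, *_ = np.linalg.lstsq(M, -r, rcond=None)
    v = u + t
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = (B + B.T) / 2                              # real symmetric test matrix
u = rng.standard_normal(6)
u /= np.linalg.norm(u)
v1, v2 = rqi_step(A, u), jd_step_exact(A, u)
# the two updates span the same direction (up to sign)
print(min(np.linalg.norm(v1 - v2), np.linalg.norm(v1 + v2)))
```

The underlying identity is the one used in the text: the exact correction satisfies $(A - \theta I)(u + t) = \varepsilon u$, so $u + t$ is parallel to the RQI vector $(A - \theta I)^{-1}u$.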
Based on the property of the correction equation, the inner linear system in Algorithm 1 is solved approximately by Krylov subspace methods, and we will use the obtained vector to update the current eigenvector approximation.
Suppose that we have obtained an approximate eigenvector $u$ in Algorithm 1 that is close to the wanted eigenvector $x_1$; we decompose it as
$u = x_1\cos\alpha + w\sin\alpha,$ (3)
where $w \perp x_1$ with $\|w\| = 1$ and $\alpha$ is the angle between the vectors $u$ and $x_1$. Throughout this paper, saying that the approximate vector $u$ is close to the wanted eigenvector $x_1$ means that $\alpha$ is small.
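A consequence of this decomposition of $u$ into the wanted eigenvector $x_1$ and a unit vector $w \perp x_1$, used repeatedly below, is the expansion of the residual (the symbols $\theta$, $r$, $\alpha$ are the standard JD names we assume for the elided ones):

```latex
r = (A - \theta I)\,u
  = (\lambda_1 - \theta)\cos\alpha \; x_1 \;+\; \sin\alpha \,(A - \theta I)\,w,
\qquad \theta = u^{*}Au ,
```

so that $\|r\| = O(\sin\alpha)$ as $\alpha \to 0$: the residual norm and the angle to the wanted eigenvector vanish at the same rate.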
In the following, we give a lemma which reveals the relation between two Krylov subspaces.
Lemma 2.1. Let be a normalized vector, , and . Denote by and . Then, for , the two Krylov subspaces and are related by
where
and
Proof. We prove this lemma by induction over . Obviously, for , . Assume that, for all , we have . Next, we prove that the relation also holds for . Denote ; based on the fact and the induction hypothesis, we have . Then, we have . Thus, the relation is verified.
3. Convergence Analysis of the JD Iteration
In the following discussion, we solve the correction equation approximately by a Krylov subspace method, such as CG or MINRES with zero initial vector, to obtain an approximate solution with which to update the new eigenvector approximation . That is to say, the solution satisfies the following equation:
where is the residual at step .
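The inner-outer loop described here can be sketched with a plain CG inner solver applied to the projected operator. This is a minimal illustration, not the paper's implementation: the test matrix, starting guess, and fixed inner step count `k` are our assumptions, and CG is legitimate because the projected operator is positive definite on the complement of `u` for a good enough approximation (cf. Lemma 4.1):

```python
import numpy as np

def projected_cg(apply_M, b, k):
    # k steps of plain CG (zero starting vector) for M t = b, where M is the
    # projected operator t -> (I - u u^T)(A - theta I)(I - u u^T) t,
    # assumed positive definite on the subspace orthogonal to u.
    t = np.zeros_like(b)
    r = b.copy()                       # residual of the zero starting vector
    p = r.copy()
    for _ in range(k):
        Mp = apply_M(p)
        alpha = (r @ r) / (p @ Mp)
        t = t + alpha * p
        r_new = r - alpha * Mp
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return t

rng = np.random.default_rng(1)
n = 40
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.arange(1.0, n + 1.0)) @ Q.T   # eigenvalues 1, 2, ..., 40
u = Q[:, 0] + 0.01 * rng.standard_normal(n)      # rough guess near x_1
u /= np.linalg.norm(u)

for _ in range(8):                               # outer (simplified JD) sweeps
    theta = u @ A @ u                            # Rayleigh quotient
    res = A @ u - theta * u                      # eigenresidual
    proj = lambda v, u=u: v - (u @ v) * u        # application of I - u u^T
    apply_M = lambda v: proj(A @ proj(v)) - theta * proj(v)
    t = projected_cg(apply_M, -res, k=25)        # inexact inner solve
    u = u + proj(t)                              # update and renormalize
    u /= np.linalg.norm(u)

print(u @ A @ u)   # approaches the smallest eigenvalue 1.0
```

Note that the operator is applied matrix-free through the two projections, so the (rank-deficient) projected matrix is never formed explicitly.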
For convenience of analysis, (7) can also be written in other equivalent forms such as (30).
As is well known, the JD method is an "inner-outer" iteration. For methods of this type, it is essential to know how the inner iteration controls the outer iteration; in other words, to what accuracy, or for how many steps, the inner correction equation should be solved to ensure the convergence of the outer iteration. As Sleijpen and van der Vorst pointed out in [6], this question could not be answered at the time. All these problems rest on the convergence analysis of the algorithm. In the following, we first recall some known convergence results.
Lemma 3.1. (see [7]). If Algorithm 1 is applied to seek the smallest simple eigenvalue of the Hermitian matrix and we assume that the approximate solution satisfies the relation in (30), then for , we asymptotically have
Lemma 3.2. (see [13]). If Algorithm 1 is applied to seek the smallest simple eigenvalue of the Hermitian matrix and we assume that the approximate solution satisfies the relation in (30), then for , we asymptotically have
The two convergence results above are stated in terms of the angle between the vectors and . Proposition 1 in [15] gives another convergence result, based on the metric . However, none of these results answer the question asked above, nor do they tell us how the convergence order varies with the accuracy to which the inner correction equation is solved.
Theorem 3.1. Let be a simple eigenpair of the Hermitian matrix in Algorithm 1, let be the eigenpairs of the shifted matrix with the ascending order , let be the approximate solution obtained by the unpreconditioned Krylov subspace method with zero starting vector, and let be the associated residual. Then, the following estimate holds:
where is the angle between the new approximation and , and .
Proof. Suppose that is the current approximation to the wanted eigenvector , decomposed as in (3). If we solve the correction equation inexactly with zero starting vector by the unpreconditioned Krylov subspace method, then the approximate solution of the linear system at step lies in the Krylov subspace , where and are defined as in Lemma 2.1. Based on Lemma 2.1, we have for . Then, the solution , and we get the new approximation
where .
The residual of the correction equation at step is
with and .
According to the decomposition in (3), we have
Thus, we obtain
Note that ; that is, ; then,
Based on (3) and (12), we get
Thus, we have
In addition, it obviously holds that
Thereby, based on (15), (17), (18), and (19), we have the estimate
Theorem 3.1 gives us a preliminary convergence analysis of the simplified JD method. Since the simplified JD method is the JD method without subspace acceleration, the convergence factor of the former is an upper bound for that of the latter. That is to say, to obtain the convergence property of the JD method, it suffices to analyze its simplified form.
Next, we analyze the convergence factor of Theorem 3.1. The current approximation is very near to the wanted vector ; that is to say, with a constant smaller than one. Combined with the fact , it is clear that the second term of the convergence factor can be bounded; it is therefore the first term that plays the vital role in the analysis of the convergence.
For the purpose of analysis, we present the following lemma.
Lemma 3.3. Let be the orthonormal eigenvectors of Hermitian matrix . Assume that is a normalized vector which is near to such that the corresponding Rayleigh quotient satisfies
Then, the following estimate holds:
where is the angle between the vectors and , and denotes the smallest singular value of restricted to the subspace span .
Proof. With , we decompose and as and , where . Then, we have
It follows from that ; equivalently, we have . Moreover, we have
The first assumption in (21) indicates that and hold; then, we have the estimate
Here, we used the fact . Combining the above estimate with the second assumption in (21), we finally obtain the estimate in (22).
In the following, we give an estimate of with .
Thus, combining (26) and the correction equation in (7), we have
It further indicates that
and, at last, we obtain
To see the convergence behavior of the JD method clearly, the stopping criterion we adopt requires that the norm of the current inner residual be reduced by a prescribed factor from that of the initial residual. That is, satisfies the following equation:
where is the residual direction and is the stopping factor.
Theorem 3.2. Given , , if , then the JD method converges linearly as follows:
under the assumption
If , then the JD method converges quadratically as follows:
under the assumption
where , ,
and
Proof. According to the properties of Krylov subspace methods, e.g., the conjugate gradient method, we have
In particular, we have ; combining this with the factorization in (3), we have
Then, we further have
By straightforward computations, we have
which implies that
If , according to the estimate of , we get
Thereby, we obtain
On the other hand, under the assumption in (32), we have
Thus, combining with the estimate in (10), we have the following estimate:
Similarly, if , according to the estimate of , we get
Note that
and, combining with the assumption in (34), we obtain
Utilizing the estimate in (10) again, we have the following estimate:
Note that the assumptions in (32) and (34) are easily satisfied if is not too small, because the right-hand sides of the two assumptions contain the factor , which is small when the current approximate eigenvector is near the desired one.
Theorem 3.1 alone does not fully describe the convergence of the JD method, since it gives only a preliminary analysis and contains some unexplored factors. We therefore examined these factors further in Theorem 3.2 and established the convergence of the JD method, from which one can see clearly how the inner correction equation controls the convergence of the outer iteration.
In addition, from Theorem 3.2, we can see that the JD method converges linearly if the accuracy of the correction equation is roughly and converges quadratically if the accuracy of the correction equation is . Moreover, by inspecting the convergence factor, we can see that the method can ideally attain a cubic convergence rate.
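The two tolerance regimes can be sketched as a simple selection rule. The function name, the cap 0.5, and the proportionality constant `c` are illustrative assumptions, not quantities from the theorem; the paper's actual thresholds involve the elided constants of Theorem 3.2:

```python
def inner_tolerance(res_norm, mode="quadratic", eps_fixed=0.1, c=1.0):
    # Stopping factor for the inner correction-equation solve:
    #   "fixed"     : constant tolerance          -> linear outer convergence
    #   "quadratic" : tolerance ~ c * ||r||      -> quadratic outer convergence
    # (capped at 0.5 so the inner solve always makes some progress)
    if mode == "fixed":
        return eps_fixed
    return min(0.5, c * res_norm)
```

As the outer residual norm shrinks, the "quadratic" rule automatically tightens the inner solves, which is exactly the mechanism by which the higher convergence order is obtained.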
4. Estimate for the Iteration Number
In this section, we first give bounds on the residual norms of the JD iteration method. Through these bounds, we may analyze the relation between the outer iteration and the inner iteration; more precisely, we can see how the inner iteration controls the convergence of the outer iteration.
Theorem 4.1. Let be the approximate eigenpair obtained by Algorithm 1 with , , and the residual . If we solve the JD correction equation by a Krylov subspace method with zero initial vector, then the following estimate holds:
where and is the new approximation defined in (11).
Proof. Using the minimal residual property of Ritz values (Fact 1.9 in [4]), we have
Based on the relation in (12) for the residual equation, we have
Making use of the estimate of , we get
From the above theorem, we can choose a suitable stopping factor to obtain a higher convergence rate, as follows.
Corollary 4.1. Under the assumptions of Theorem 4.1, if the stopping factor satisfies (equivalently, ) with , then the following relation holds:
Next, we give a rough estimate for ; based on and , we have
which further indicates that
We can see that, under some assumptions, e.g., and , the asymptotic convergence properties of Theorem 4.1 and Corollary 4.1 are identical to those of Theorem 3.2.
Theorem 4.1 provides an asymptotic description of the convergence from the point of view of the residual norm. At the start of the iteration, when the residual is not yet small relative to the stopping precision, the stopping factor need not be very small, and a fast convergence rate is obtained with only a few inner iterations. Once the approximate eigenvector is near the desired eigenvector , a relatively high accuracy is required of the inner iteration to attain an ideal convergence rate.
In theory, we can predict whether the JD method converges through the factor . From this factor, we can see that it is and the inner residual factor that determine whether the outer residual norm decreases. Hence, we can use as the inner iteration stopping criterion. Alternatively, we may estimate the convergence by judging the norm of the new approximation .
In the following discussion, we give a rough estimate of the iteration number when the conjugate gradient method is used to solve the JD correction equation.
First, we show that it is reasonable to solve the correction equation by the conjugate gradient method when the approximation is close to the desired vector .
Lemma 4.1. For any , the following inequality holds:
where .
From the above lemma, we can see that if , which is satisfied under the assumption , then is positive definite on the subspace span .
Next, we give a simple convergence property of the conjugate gradient method for the linear equation with being positive definite.
Lemma 4.2. When the symmetric positive definite linear system of equations is solved by the conjugate gradient algorithm, the following estimate holds for the residual norm:
where is the approximate solution obtained at the th step of the conjugate gradient algorithm, is the corresponding residual, and is the condition number of .
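The bound of Lemma 4.2 is the classical CG estimate; since its statement is partly elided here, the formula below is our reconstruction of the standard form, with the $\sqrt{\kappa}$ prefactor that converts the usual $A$-norm error bound into a bound on the Euclidean residual norm:

```python
import math

def cg_residual_bound(kappa, k):
    # Classical bound for CG on an SPD system with condition number kappa:
    #   ||r_k|| / ||r_0|| <= 2*sqrt(kappa) * ((sqrt(kappa)-1)/(sqrt(kappa)+1))**k
    s = math.sqrt(kappa)
    return 2.0 * s * ((s - 1.0) / (s + 1.0)) ** k

# the bound decays geometrically with rate (sqrt(kappa)-1)/(sqrt(kappa)+1)
for k in (0, 10, 20, 40):
    print(k, cg_residual_bound(100.0, k))
```

Note that the bound is often pessimistic: for clustered spectra, actual CG residuals decrease much faster than the worst-case Chebyshev rate used here.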
Proof. According to Theorem 6.6 in [5], we have
where is the exact solution and denotes the energy norm defined by .
Based on the minimax theorem, we know that
and by straightforward computations, we finally obtain
Thus, the residual norm admits the estimate
which completes the proof.
Next, we combine Theorem 4.1 with Lemma 4.2 to give a rough estimate of the iteration number of the correction equation.
In fact, the coefficient operator of the correction equation can be seen as the operator restricted to the subspace , which is denoted by .
Assume that, at the current outer iterate, all approximations obtained by the conjugate gradient method satisfy for , where is the maximum iteration number of the correction equation.
Theorem 4.2. When the correction equation in the JD method is solved by the conjugate gradient method with zero initial vector, in order to ensure the convergence of the algorithm, the iteration number of the correction equation should satisfy
where is the condition number of the matrix .
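Since the explicit bound of the theorem is elided here, the following sketch derives an iteration-count estimate of the same flavor directly from the classical CG bound of Lemma 4.2: the smallest $k$ with $2\sqrt{\kappa}\,((\sqrt{\kappa}-1)/(\sqrt{\kappa}+1))^{k} \le \varepsilon$. The function name and the handling of $\kappa = 1$ are our assumptions:

```python
import math

def cg_steps_needed(kappa, eps):
    # Smallest k with 2*sqrt(kappa)*((sqrt(kappa)-1)/(sqrt(kappa)+1))**k <= eps:
    # a rough a-priori estimate of the inner (CG) iteration count needed to
    # reduce the correction-equation residual by the factor eps.
    s = math.sqrt(kappa)
    if s <= 1.0:
        return 1                      # kappa == 1: CG converges in one step
    rho = (s - 1.0) / (s + 1.0)
    k = math.log(2.0 * s / eps) / (-math.log(rho))
    return max(1, math.ceil(k))

# a well-conditioned projected operator needs far fewer inner steps
print(cg_steps_needed(10.0, 1e-2), cg_steps_needed(1000.0, 1e-2))
```

Solving the bound for $k$ gives $k \ge \ln(2\sqrt{\kappa}/\varepsilon)\,/\,\ln\!\big((\sqrt{\kappa}+1)/(\sqrt{\kappa}-1)\big)$, which grows only like $\sqrt{\kappa}$ up to a logarithmic factor.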
Proof. By assumption, the convergence factor in Theorem 4.1 satisfies
Thus, if , that is,
then the JD method converges. This indicates that the residual norm of the correction equation should satisfy
Based on Lemma 4.2, we know that if , the JD method converges. By straightforward computations, we finally get
We remark that the convergence theories in Theorem 4.1 and Corollary 4.1 are identical to those of Theorem 3.2, interpreting the convergence of the JD method from the point of view of the residual norm. They indicate that, at the start of the iteration process, the inner correction equation needs to be solved with only a small number of iterations; once a good approximate eigenvector has been obtained, however, the inner correction equation should be solved to high accuracy.
From Theorem 4.2, we can see that the iteration number of the correction equation solved by the CG method, which ensures the decrease of the outer residual, depends on . If does not fluctuate violently, the inner iteration number remains roughly constant as the outer iteration proceeds.
5. Concluding Remarks
We have proved that the inexact simplified Jacobi–Davidson iteration method for Hermitian eigenvalue problems can attain a locally cubic convergence rate and asymptotically converges as fast as the Rayleigh quotient iteration. Thus, both the exact and the inexact simplified Jacobi–Davidson methods are competitive with the exact and inexact Rayleigh quotient iterations. Moreover, we give an estimate of the iteration number of the inner correction equation. These theoretical results show clearly how the accuracy of the inner correction equation controls the outer iteration.
Data Availability
No data were used to support this study.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
The authors are grateful to Dr. Cunqiang Miao for his helpful discussions. This work was supported by the National Science Foundation of China (nos. 1206030 and 11661031) and Key Project of Teaching Reform in Shanxi Province of China (no. J2015111).