Journal of Electrical and Computer Engineering
Volume 2010 (2010), Article ID 972794, 33 pages
http://dx.doi.org/10.1155/2010/972794
Review Article

Milestones in the Development of Iterative Solution Methods

Institute of Geonics AS CR, 70800 Ostrava, Czech Republic

Received 10 June 2010; Accepted 27 August 2010

Academic Editor: Christian Schlegel

Copyright © 2010 Owe Axelsson. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Iterative solution methods to solve linear systems of equations were originally formulated as basic iteration methods of defect-correction type, commonly referred to as Richardson's iteration method. These methods developed further into various versions of splitting methods, including the successive overrelaxation (SOR) method. Later, immensely important developments included convergence acceleration methods, such as the Chebyshev and conjugate gradient iteration methods and preconditioning methods of various forms. A major aim has been to find methods with a total computational complexity of optimal order, that is, proportional to the number of degrees of freedom involved in the equation. Methods that have turned out to have been particularly important for the further development of linear equation solvers are surveyed. Some of them are presented in greater detail.

1. Introduction

In many applications of quite different types appearing in various sciences, engineering, and finance, large-scale linear algebraic systems of equations arise. A particular type of problem appears in signal processing. This also includes nonlinear systems of equations, which are normally solved by linearization at each outer nonlinear iteration step, but they will not be further discussed in this paper.

Due to their high demand of computer memory and computer time, which can grow rapidly with increasing problem size, direct solution methods, such as Gaussian elimination, are in general not feasible unless the size of the problem is relatively small. In the early computer age, when available size of computer central memories was very small and the speed of arithmetic operations slow, this was found to be the case even for quite modest-sized problems.

Even for modern computers with exceedingly large memories and very fast arithmetic it is still an issue, because nowadays one wants to solve much more involved problems of much larger sizes, for instance to enable a sufficient resolution of partial differential equation problems with highly varying (material) coefficients, such as are found in heterogeneous media. Presently, problems with up to billions of degrees of freedom (d.o.f.) are solved. For instance, if an elliptic equation of elasticity type is discretized and solved on a 3D mesh with 512 meshpoints in each coordinate direction, then an equation of that size arises.

A basic iteration method to solve a linear system
$A\mathbf{x} = \mathbf{b}, \qquad (1)$
where $A$ is nonsingular, has the following form.

Given an initial approximation $\mathbf{x}^0$, for $k = 0, 1, \dots$ until convergence, let $\mathbf{r}^k = A\mathbf{x}^k - \mathbf{b}$, $\mathbf{e}^k = -\tau\mathbf{r}^k$, $\mathbf{x}^{k+1} = \mathbf{x}^k + \mathbf{e}^k$. Here, $\tau > 0$ is a parameter to be chosen.

This method can be described either as a defect ($\mathbf{r}^k$)-correction ($\mathbf{e}^k$) method or, alternatively, as a method to compute the stationary solution of the evolution equation
$\dfrac{d\mathbf{x}(t)}{dt} = -(A\mathbf{x}(t) - \mathbf{b}), \quad t > 0, \quad \mathbf{x}(0) = \mathbf{x}^0, \qquad (2)$
by timestepping with time-step $\tau$, that is,
$\mathbf{x}(t + \tau) = \mathbf{x}(t) - \tau(A\mathbf{x}(t) - \mathbf{b}), \quad t = 0, \tau, \dots. \qquad (3)$
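To make the basic defect-correction iteration (2)-(3) concrete, the following is a minimal Python/NumPy sketch (not from the original paper); the test matrix, right-hand side, and stopping tolerance are illustrative assumptions, and the choice of the parameter is the optimal stationary one for a symmetric positive definite matrix mentioned later in this section.

```python
import numpy as np

def richardson(A, b, tau, x0=None, tol=1e-8, max_iter=10_000):
    """Basic (stationary) Richardson iteration x_{k+1} = x_k - tau*(A x_k - b)."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    for k in range(max_iter):
        r = A @ x - b                  # defect (residual)
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        x = x - tau * r                # correction step
    return x, max_iter

# Illustrative use: 1D diffusion matrix; tau = 2/(lambda_min + lambda_max) is the
# optimal stationary parameter for a symmetric positive definite matrix.
n = 30
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
lam = np.linalg.eigvalsh(A)
x, iters = richardson(A, b, tau=2.0 / (lam[0] + lam[-1]))
print(iters, np.linalg.norm(A @ x - b))
```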

Such methods are commonly referred to as Richardson iteration methods (e.g., see [1–4]). However, already in 1823 Gauss [5] wrote, “Fast jeden Abend mache ich eine neue Auflage des Tableau, wo immer leicht nachzuhelfen ist. Bei der Einförmigkeit des Messungsgeschäfts gibt dies immer eine angenehme Unterhaltung; man sieht daran auch immer gleich, ob etwas Zweifelhaftes eingeschlichen ist, was noch wünschenswert bleibt usw. Ich empfehle Ihnen diesen Modus zur Nachahmung. Schwerlich werden Sie je wieder direct eliminieren, wenigstens nicht, wenn Sie mehr als zwei Unbekannte haben. Das indirecte Verfahren läßt sich halb im Schlafe ausführen oder man kann während desselben an andere Dingen denken.” (Freely translated, “I recommend this modus operandi. You will hardly eliminate directly anymore, at least not when you have more than two unknowns. The indirect method can be pursued while half asleep or while thinking about other things.”)

It holds that
$\mathbf{x}^{k+1} = \mathbf{x}^k - \tau\left(A\mathbf{x}^k - \mathbf{b}\right), \qquad (4)$
or
$\mathbf{e}^{k+1} = (I - \tau A)\mathbf{e}^k, \qquad (5)$
where $\mathbf{e}^k = \mathbf{x} - \mathbf{x}^k$ is the iteration error and $\mathbf{x}$ is the solution of (1).

Hence,
$\mathbf{e}^k = (I - \tau A)^k\mathbf{e}^0, \quad k = 0, 1, \dots. \qquad (6)$

For convergence of the method, that is $\mathbf{e}^k \to 0$, the parameter $\tau$ must in general be chosen such that $\rho = \|I - \tau A\| < 1$, where $\|\cdot\|$ is a matrix norm, subordinate to the chosen vector norm. (We remark here that this is not possible if $A$ is indefinite.)

Let $\rho(\cdot)$ denote the spectral radius of a matrix, that is, the maximal absolute value of the eigenvalues of the matrix.

If $A$ is self-adjoint, then it can be shown that $\rho(A) = \|A\|_2 = \sqrt{\rho(A^*A)}$, where $\|\cdot\|_2$ denotes the matrix norm subordinate to the Euclidean vector norm. For general, nonsymmetric matrices it has been shown (e.g., see [6, page 162]) that there exist matrix norms that are arbitrarily close to the spectral radius. These can, however, correspond to an unnatural scaling of the matrix.

The rate of convergence is determined by the convergence factor $\rho$. For symmetric positive definite matrices, the optimal value of $\tau$ to minimize $\rho$ is $\tau = 2/(\lambda_1 + \lambda_n)$, where $\lambda_1, \lambda_n$ are the extreme eigenvalues of $A$. Normally, however, the eigenvalues are not available.

As an example, for second order elliptic diffusion type problems in $\Omega \subset \mathbb{R}^d$ ($d = 2, 3$), using a standard central difference or a finite element method, the spectral condition number $\lambda_n/\lambda_1 = O(h^{-2})$, where $h$ is the (constant) meshsize parameter. Hence, the number of iterations to reach a relative accuracy $\varepsilon$ is of order $O(h^{-2}|\log\varepsilon|)$, $h \to 0$.

Since each iteration uses $O(h^{-d})$ elementary arithmetic operations, this shows that the total number of operations needed to reduce the error to a given tolerance is of order $O(h^{-(d+2)})$. This is in general smaller than for a direct solution method when $d \ge 2$, but still far from the optimal order, $O(h^{-d})$, that we aim at.

To improve on this, often a splitting of the matrix $A$ is used. It is readily shown that for any initial vector, the number of iterations required to get a relative residual, $\|\mathbf{r}^k\|/\|\mathbf{r}^0\| < \varepsilon$, for some $\varepsilon$, $0 < \varepsilon < 1$, is at most $k_{it} = \lfloor\ln(1/\varepsilon)/\ln(1/\rho)\rfloor + 1$, where $\lfloor\cdot\rfloor$ denotes the integer part. Frequently, $\rho = 1 - c\delta^r$, where $c$ is a constant, $r$ is a positive integer, often $r = 2$, and $\delta$ is a small number, typically $\delta = 1/n$, which decreases with increasing problem size $n$. This implies that the number of iterations is proportional to $(1/\delta)^r$, which increases rapidly when $\delta \to 0$.

For $\tau = 1$, the splitting $A = C - R$ of $A$ in two terms is used, where $C$ is nonsingular. The iterative method (4) then takes the form
$C\mathbf{x}^{k+1} = R\mathbf{x}^k + \mathbf{b}, \quad k = 0, 1, \dots. \qquad (7)$
Method (7) is convergent if $\rho(C^{-1}R) < 1$. Splitting methods will be discussed in Section 2.

Let $B = C^{-1}R$. If $\|B\|$ is known and $\|B\| < 1$, we can use the following theorem to get a test when the iteration error is small enough, that is, when to stop the iterations.

Theorem 1. Let $\|B\| < 1$, $B = C^{-1}R$, and let $\mathbf{x}^m$ be defined by (7). Then,
$\|\mathbf{x} - \mathbf{x}^m\| \le \dfrac{\|B\|}{1 - \|B\|}\|\mathbf{x}^m - \mathbf{x}^{m-1}\|, \quad m = 1, 2, \dots. \qquad (8)$

Proof. From (7) follows $\mathbf{x}^{m+1} - \mathbf{x}^m = B(\mathbf{x}^m - \mathbf{x}^{m-1})$ and, by recursion,
$\mathbf{x}^{m+k+1} - \mathbf{x}^{m+k} = B^{k+1}\left(\mathbf{x}^m - \mathbf{x}^{m-1}\right), \quad k = 0, 1, \dots. \qquad (9)$
Note now that $\mathbf{x}^{m+p} - \mathbf{x}^m = \sum_{k=0}^{p-1}(\mathbf{x}^{m+k+1} - \mathbf{x}^{m+k})$. Hence, by the triangle inequality and (9),
$\|\mathbf{x}^{m+p} - \mathbf{x}^m\| \le \sum_{k=0}^{p-1}\|B^{k+1}\|\,\|\mathbf{x}^m - \mathbf{x}^{m-1}\| \le \dfrac{\|B\| - \|B\|^{p+1}}{1 - \|B\|}\|\mathbf{x}^m - \mathbf{x}^{m-1}\|. \qquad (10)$
Letting $p \to \infty$ and noting that $\mathbf{x}^{m+p} \to \mathbf{x}$, (8) follows.
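As a small illustration of the stopping test suggested by Theorem 1, the sketch below (an assumption, not taken from the paper) uses a Jacobi-type splitting $C = D$, $R = C - A$ on an arbitrary diagonally dominant test matrix:

```python
import numpy as np

def splitting_iteration_with_stop(A, b, tol=1e-8, max_iter=10_000):
    """Iterate C x_{m+1} = R x_m + b with C = diag(A), R = C - A (Jacobi splitting),
    stopping when the Theorem 1 bound ||B||/(1-||B||)*||x_m - x_{m-1}|| <= tol."""
    C = np.diag(np.diag(A))
    R = C - A
    B = np.linalg.solve(C, R)          # B = C^{-1} R
    nB = np.linalg.norm(B, 2)
    assert nB < 1, "Theorem 1 requires ||B|| < 1"
    x = np.zeros_like(b)
    for m in range(1, max_iter + 1):
        x_new = np.linalg.solve(C, R @ x + b)
        if nB / (1 - nB) * np.linalg.norm(x_new - x) <= tol:
            return x_new, m
        x = x_new
    return x, max_iter

n = 50
A = 3 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # diagonally dominant
b = np.ones(n)
x, m = splitting_iteration_with_stop(A, b)
print(m, np.linalg.norm(x - np.linalg.solve(A, b)))
```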

The basic iteration method (4) or the splitting methods in Section 2, can be improved in various ways. This will be the major topic of this paper.

Note first that application of the splitting in (7) requires in general that the matrix 𝑅 is given in explicit form, which can make the method less viable.

The most natural way to improve (4) is to introduce an approximation $C$ of $A$, to be used when the correction $\mathbf{e}^k$ in (4) is computed. The relation $\mathbf{e}^k = -\tau\mathbf{r}^k$ is then replaced by $C\mathbf{e}^k = -\tau\mathbf{r}^k$.

Such a matrix is mostly called a preconditioner since, by a proper choice, it can significantly improve the condition number $\mathcal{K}$ of $A$, that is,
$\mathcal{K}\left(C^{-1}A\right) \ll \mathcal{K}(A), \qquad (11)$
where $\mathcal{K}(B) = \|B\|\,\|B^{-1}\|$.

Clearly, in practice, the matrix 𝐶 must be chosen such that the linear systems with 𝐶 can be solved with relatively little expense compared to a solution method for 𝐴 . In particular, the expense for 𝐶 is much smaller than that for 𝐴 using a direct solution method.

For badly scaled matrices 𝐴 a simple, but often practically useful, choice of 𝐶 is the (block) diagonal part 𝐷 of 𝐴 . Much more efficient choices will be discussed later in the paper.

Early suggestions to use such a matrix 𝐶 can be found in papers by D'Yakonov [7] and Gunn [8]. For an automatic scaling procedure, see [9] and references therein.

In the present paper, we will survey various choices of $C$ which have proven to be useful in practice. The paper attempts to give a more personal account of the development of iterative solution methods. It is not our ambition to present the current state-of-the-art but rather to describe the unfolding of the field.

In the remainder of the paper, we discuss, in order, methods based on splitting of the given matrix, the accelerated iterative methods of the Chebyshev and (generalized) conjugate gradient types, pointwise and block incomplete factorization preconditioning methods, symmetrized preconditioners of SSOR and ADI type, approximate inverses, and augmented subspace preconditioning methods. If space had allowed it, it would have been followed by presentation of geometric and algebraic multigrid methods, two-level and multilevel methods, elementwise preconditioning methods, and domain decomposition methods. Also, iteration error estimates and influence of rounding errors, and preconditioners for matrices of saddle point type would have been included. The paper ends with some concluding remarks.

The following relations will be used: if $A = [a_{ij}]$, then $A^T = [a_{ji}]$ denotes the transpose of $A$, while $A^* = [\bar{a}_{ji}]$ denotes the Hermitian transpose.

2. Splitting Methods

A comprehensive, early presentation of splitting methods, and much more on iterative solution methods, is found in Varga [10]. Often there is a natural splitting of the given matrix as
$A = C - R, \qquad (12)$

where $C$ is nonsingular. This can be used to formulate an iterative solution method in the form
$C\mathbf{x}^{k+1} = R\mathbf{x}^k + \mathbf{b}, \quad k = 0, 1, \dots. \qquad (13)$

This method converges if $\rho(C^{-1}R) < 1$.

Definition 1. (a) A matrix $C$ is said to be monotone if $C$ is nonsingular and $C^{-1} \ge 0$ (componentwise).
(b) $A = C - R$ is called a regular splitting [10], if $C$ is monotone and $R \ge 0$,
(c) a weak regular splitting [11], if $C$ is monotone and $C^{-1}R \ge 0$,
(d) a nonnegative splitting [12], if $C$ is nonsingular and $C^{-1}R \ge 0$.
The following holds, see, for example, [6].

Theorem 2. Let $A = C - R$ be a nonnegative splitting of $A$. Then, the following properties are equivalent: (a) $\rho(B) < 1$, that is, $A = C - R$ is a convergent splitting; (b) $I - B$ is monotone; (c) $A$ is nonsingular and $G = A^{-1}R \ge 0$; (d) $A$ is nonsingular and $\rho(B) = \rho(G)/[1 + \rho(G)]$, where $G = A^{-1}R$.

Corollary 1. If 𝐴 = 𝐶 𝑅 is a weak regular splitting, then the splitting is convergent if and only if 𝐴 is monotone.

Proof. (see, e.g., [6]).

A splitting method that became popular in the fifties is the SOR method. Here, $A = D - L - U$, where $D$ is the (block) diagonal and $L, U$ are the (block) lower and upper triangular parts of $A$, respectively. The successive relaxation method takes the form
$\left(\dfrac{1}{\omega}D - L\right)\mathbf{x}^{k+1} = \left[\left(\dfrac{1}{\omega} - 1\right)D + U\right]\mathbf{x}^k + \mathbf{b}, \quad k = 0, 1, \dots, \qquad (14)$
where $\omega \ne 0$ is a parameter, called the relaxation parameter. For $\omega = 1$ one gets the familiar Gauss-Seidel method (Gauss 1823 [5] and Seidel 1874 [13]) and for $\omega > 1$ the successive overrelaxation (SOR) method (Frankel 1950 [14] and Young 1950 [15]).
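A minimal componentwise SOR sweep in Python/NumPy may clarify how (14) is applied in practice (a sketch assuming a pointwise splitting; the test matrix and the values of the relaxation parameter are illustrative):

```python
import numpy as np

def sor(A, b, omega, x0=None, tol=1e-8, max_iter=10_000):
    """SOR iteration for A x = b with A = D - L - U split pointwise, that is,
    ((1/omega)D - L) x_{k+1} = ((1/omega - 1)D + U) x_k + b, implemented as a
    forward sweep over the unknowns with in-place updates."""
    n = len(b)
    x = np.zeros_like(b) if x0 is None else x0.copy()
    for k in range(max_iter):
        for i in range(n):
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
        if np.linalg.norm(A @ x - b) <= tol * np.linalg.norm(b):
            return x, k + 1
    return x, max_iter

n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
# omega = 1 gives Gauss-Seidel; omega > 1 gives overrelaxation.
for omega in (1.0, 1.7):
    x, iters = sor(A, b, omega)
    print(omega, iters)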

For the iteration matrix in (14),
$\mathcal{L}_\omega = \left(\dfrac{1}{\omega}D - L\right)^{-1}\left(\left(\dfrac{1}{\omega} - 1\right)D + U\right), \qquad (15)$

it holds that $\rho(\mathcal{L}_\omega) \ge |\omega - 1|$, where the bound is sharp. Therefore, the relaxation method is divergent for $\omega \le 0$ and $\omega \ge 2$ (see, e.g., [6, 16]).

An optimal value of $\omega$ can be determined as follows. Assume then that $A$ has property ($A^\pi$), that is, there exists a permutation matrix $P$ such that $PAP^T$ is a block tridiagonal matrix. The following lemma holds.

Lemma 1 (see [15]). Assume that $A$ has property ($A^\pi$) and let $\omega \ne 0$. Let $B = D^{-1}(L + U)$. Then, (a) if $\lambda \ne 0$ is an eigenvalue of $\mathcal{L}_\omega$ and $\mu$ satisfies
$\mu^2 = \dfrac{(\lambda + \omega - 1)^2}{\omega^2\lambda}, \qquad (16)$
then $\mu$ is an eigenvalue of $B$; (b) if $\mu$ is an eigenvalue of $B$ and $\lambda$ satisfies
$\lambda + \omega - 1 = \omega\mu\lambda^{1/2}, \qquad (17)$
then $\lambda$ is an eigenvalue of $\mathcal{L}_\omega$.

Proof. For a short proof, see [6].

Theorem 3. Assume that (a) $A$ has property ($A^\pi$), and (b) the block matrix $B = I - D^{-1}A$ has only real eigenvalues. Then, the SOR method converges for any initial vector if and only if $\rho(B) < 1$ and $0 < \omega < 2$. Further, we have
$\omega_{\mathrm{opt}} = \dfrac{2}{1 + \sqrt{1 - \rho(B)^2}}, \qquad (18)$
for which the asymptotic convergence factor is given as
$\min_\omega\rho\left(\mathcal{L}_\omega\right) = \rho\left(\mathcal{L}_{\omega_{\mathrm{opt}}}\right) = \omega_{\mathrm{opt}} - 1 = \dfrac{1 - \sqrt{1 - \rho(B)^2}}{1 + \sqrt{1 - \rho(B)^2}}. \qquad (19)$

Proof. For a short proof, see [6].

The eigenvalues of $C^{-1}A$ are in general complex, and for $\omega = \omega_{\mathrm{opt}}$ it can be shown that they are distributed around a circle in the complex plane. This implies that the method cannot be polynomially accelerated. (See Section 3 for a presentation of polynomial acceleration methods.) Further, the efficiency of the SOR method turns out to be critically dependent on the choice of $\omega$.

A similar result as in Theorem 3 has been shown in [6], see also [17], that holds even if 𝐴 does not have property ( 𝐴 𝜋 ) , but is Hermitian.

Theorem 4. Let $A$ be Hermitian and positive definite and let
$\widetilde{\mathcal{L}}_\omega = D^{1/2}\mathcal{L}_\omega D^{-1/2} = \left(\dfrac{1}{\omega}I - \widetilde{L}\right)^{-1}\left(\left(\dfrac{1}{\omega} - 1\right)I + \widetilde{L}^*\right), \qquad (20)$
where $\widetilde{L} = D^{-1/2}LD^{-1/2}$, and let $0 < \omega < 2$. Then,
$\rho\left(\mathcal{L}_\omega\right)^2 = \rho\left(\widetilde{\mathcal{L}}_\omega\right)^2 \le 1 - \dfrac{(2/\omega - 1)\delta}{(1/\omega - 1/2)^2 + \gamma + (1/\omega)\delta}, \qquad (21)$
where
$\gamma = \sup_{\mathbf{x} \ne 0}\dfrac{|(\mathbf{x}, \widetilde{L}\mathbf{x})|^2/(\mathbf{x}, \mathbf{x}) - (1/4)(\mathbf{x}, \mathbf{x})}{(\widetilde{A}\mathbf{x}, \mathbf{x})}, \qquad \delta = \lambda_{\min}\left(\widetilde{A}\right) = \min_{\mathbf{x} \ne 0}\dfrac{(\widetilde{A}\mathbf{x}, \mathbf{x})}{(\mathbf{x}, \mathbf{x})}, \qquad (22)$
with $\widetilde{A} = D^{-1/2}AD^{-1/2}$. Further, if $|(\mathbf{x}, \widetilde{L}\mathbf{x})| \le \frac{1}{2}(\mathbf{x}, \mathbf{x})$, then
$\omega = \dfrac{2}{1 + \sqrt{2\delta}}, \qquad (23)$
minimizes the upper bound of $\rho(\mathcal{L}_\omega)$ and we have
$\rho\left(\mathcal{L}_\omega\right)^2 = \dfrac{1 - \sqrt{\delta/2}}{1 + \sqrt{\delta/2}}. \qquad (24)$

For a proof, see [6].

In Section 4, we present a symmetric version of the SOR method where acceleration is possible.

3. Accelerated Iterative Methods

In this section, the important Chebyshev and conjugate gradient iteration methods are presented.

Consider first the iterative method (4) with variable time-steps $\tau_k$,
$\mathbf{x}^{k+1} = \mathbf{x}^k - \tau_k C^{-1}\mathbf{r}^k, \quad \mathbf{r}^k = A\mathbf{x}^k - \mathbf{b}, \quad k = 0, 1, \dots. \qquad (25)$
Here, $\{\tau_k\}$ is a sequence of iteration (acceleration) parameters. If $\tau_k = \tau$, $k = 0, 1, \dots$, we talk about a stationary iterative method, otherwise about a nonstationary or semi-iterative method.

Let $\mathbf{e}^k = \mathbf{x} - \mathbf{x}^k$ be the iteration error. Then, it follows from (25) that $\mathbf{e}^{k+1} = (I - \tau_k C^{-1}A)\mathbf{e}^k$, $k = 0, 1, \dots$, so $\mathbf{e}^m = P_m(C^{-1}A)\mathbf{e}^0$ (and $\mathbf{r}^m = AP_m(C^{-1}A)A^{-1}\mathbf{r}^0 = P_m(AC^{-1})\mathbf{r}^0$). Here, $P_m(\lambda) = \prod_{k=0}^{m-1}(1 - \tau_k\lambda)$ is a polynomial of degree $m$ having zeros at $1/\tau_k$ and satisfying $P_m(0) = 1$.

We want to choose the parameters $\{\tau_k\}$ such that $\|\mathbf{e}^m\|$ is minimized. However, this would mean that in general the parameters would depend on $\mathbf{e}^0$, which is not known. Also the eigenvalues of $C^{-1}A$ are not known. We then take the approach of minimizing $\|\mathbf{e}^m\|/\|\mathbf{e}^0\|$ for all $\mathbf{e}^0$; that is, we want to minimize $\|P_m(C^{-1}A)\|$.

3.1. The Chebyshev Iterative Method

In case the eigenvalues of $C^{-1}A$ are real and positive and a positive lower bound $a$ and an upper bound $b$ of the spectrum are known, then we see that $\{\tau_k\}$ should be chosen such that $\max_{a \le \lambda \le b}|P_m(\lambda)|$ is minimized over all $P_m \in \pi_m^1$, that is, over the set of polynomials of degree $m$ satisfying $P_m(0) = 1$.

The solution to this min-max problem is well known,
$P_m(\lambda) = \dfrac{T_m((b + a - 2\lambda)/(b - a))}{T_m((b + a)/(b - a))}, \qquad (26)$
where $T_m(z) = (1/2)[(z + (z^2 - 1)^{1/2})^m + (z - (z^2 - 1)^{1/2})^m] = \cos(m\arccos z)$ are the Chebyshev polynomials of the first kind. The corresponding values of $\tau_k$ satisfy
$\dfrac{1}{\tau_k} = \dfrac{b - a}{2}\cos\Theta_k + \dfrac{b + a}{2}, \quad \Theta_k = \dfrac{2k - 1}{2m}\pi, \quad k = 1, 2, \dots, m, \qquad (27)$
which are the zeros of the polynomial. The corresponding method is referred to as the Chebyshev (one-step) acceleration method, see, for example, [10, 18]. It is an easy matter to show that
$\dfrac{1}{T_m((b + a)/(b - a))} \le 2\varrho^m, \quad \text{where }\varrho = \dfrac{b^{1/2} - a^{1/2}}{b^{1/2} + a^{1/2}}. \qquad (28)$

This implies that if the number of iterations satisfies $m \ge \ln(2/\varepsilon)/\ln(\varrho^{-1})$, that is, in particular if
$m \ge \dfrac{1}{2}\left(\dfrac{b}{a}\right)^{1/2}\ln(2/\varepsilon), \quad \varepsilon > 0, \qquad (29)$
then $\|\mathbf{e}^m\|/\|\mathbf{e}^0\| \le \varepsilon$.

The disadvantage with this method is that to make it effective one needs accurate estimates of $a$ and $b$, and we need to determine $m$ beforehand (which, however, can be done by (29)). The method cannot utilize any special distribution of the eigenvalues in the spectrum (as opposed to the conjugate gradient method, see below). More important, however, is that this method is actually numerically unstable (similarly to an explicit time stepping method for initial value problems when the time steps are too large). This is due to the fact that $\|I - \tau_k C^{-1}A\|$ is much larger than unity for several of the values $\tau_k$. However, one may prove that with some particular permutation of the parameters, their instability effect can be avoided.

There is an alternative to the choice (25). Namely, there exists a three-term form of the Chebyshev acceleration method
$\mathbf{x}^{k+1} = \alpha_k\mathbf{x}^k + \left(1 - \alpha_k\right)\mathbf{x}^{k-1} - \beta_k C^{-1}\mathbf{r}^k, \quad k = 1, 2, \dots, \qquad (30)$

where $\mathbf{x}^1 = \mathbf{x}^0 - (1/2)\beta_0 C^{-1}\mathbf{r}^0$.

Here, the parameters are chosen as $\beta_0 = 4/(a + b)$,
$\alpha_k = \dfrac{a + b}{2}\beta_k, \quad \beta_k^{-1} = \dfrac{a + b}{2} - \left(\dfrac{b - a}{4}\right)^2\beta_{k-1}, \quad k = 1, 2, \dots. \qquad (31)$

Hence, we do not have to determine the number of steps beforehand. More importantly, it has been shown in [18] that this method is numerically stable. (For some related remarks, see [6]). A similar form of the method was proposed a long time ago, see Golub and Varga [19] and the references cited therein.
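A minimal sketch of the three-term Chebyshev acceleration (30)-(31) in Python/NumPy may be helpful (an illustration, not taken from the paper; the unpreconditioned case $C = I$, the test problem, and the iteration count are assumptions):

```python
import numpy as np

def chebyshev_three_term(A, b, a_min, b_max, x0=None, n_iter=300):
    """Three-term Chebyshev acceleration (30)-(31) for A x = b with C = I.
    a_min, b_max are lower/upper bounds for the (positive, real) spectrum of A."""
    x_old = np.zeros_like(b) if x0 is None else x0.copy()
    beta = 4.0 / (a_min + b_max)                       # beta_0
    r = A @ x_old - b
    x = x_old - 0.5 * beta * r                         # x^1
    for k in range(1, n_iter):
        beta = 1.0 / ((a_min + b_max) / 2.0 - ((b_max - a_min) / 4.0) ** 2 * beta)
        alpha = (a_min + b_max) / 2.0 * beta
        r = A @ x - b
        x, x_old = alpha * x + (1.0 - alpha) * x_old - beta * r, x
    return x

n = 50
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
lam = np.linalg.eigvalsh(A)
x = chebyshev_three_term(A, b, lam[0], lam[-1])
print(np.linalg.norm(A @ x - b))
```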

It is interesting to note that the parameters approach stationary values. If $C^{-1}A = I - B$ and $B$ has eigenvalues in $[-\varrho, \varrho]$, $\varrho = \varrho(B) < 1$ (the spectral radius of $B$), then $a = 1 - \varrho$, $b = 1 + \varrho$, and
$\alpha_k = \dfrac{a + b}{2}\beta_k \to \dfrac{2}{1 + \left(1 - \varrho^2\right)^{1/2}}, \qquad (32)$
which is recognized as the parameter $\omega_{\mathrm{opt}}$ of the optimal SOR method (see Section 2). Young [20] has proven that the asymptotic rate of convergence is retained even if one uses the stationary values throughout the iterations.

For the case of complex eigenvalues of $C^{-1}A$ with positive real parts and contained in an ellipse, one may choose the parameters similarly. See [6, 21, 22] for details. For comments on the optimality of the method, see [23]. For application of the method to nonsymmetric problems, see [6, 24].

Perhaps the main thrust during the 1970s has been in using the conjugate gradient method as an acceleration method. Already much has been written on the subject; we refer to [25–27] for a historical account, to [18, 28–33] for an exposition of the preconditioned conjugate gradient (PCG) method and to [18, 34] for a survey of generalized and truncated gradient methods for nonsymmetric and indefinite matrix problems.

The advantage with conjugate gradient methods is that they are self adaptive; the optimal parameters are calculated by the algorithm so that the error in energy norm $\|e^l\|_{A^{1/2}} = \{(e^l)^T A e^l\}^{1/2}$ is minimized. This applies to a problem where $C$ and $A$ are symmetric and positive definite (SPD) or, more generally, if $C^{-1}A$ is similarity equivalent to an SPD matrix. Hence, there is no need to know any bounds for the spectrum. Since the method converges at least as fast as the Chebyshev method, it follows that $\|x - x^m\|_{A^{1/2}} \le \varepsilon\|x - x^0\|_{A^{1/2}}$ if
$m = \mathrm{int}\left\{\dfrac{1}{2}\mathcal{K}^{1/2}\ln\left(\dfrac{2}{\varepsilon}\right)\right\} + 1. \qquad (33)$

We describe now the conjugate gradient method. Thereby we follow the presentations in [29, 35].

3.2. The Preconditioned Conjugate Gradient Method

During the past 40 years or so, the preconditioned conjugate gradient method has become the major iterative solution method for linear systems of algebraic equations, in particular those arising in science and engineering. The author of these notes became interested in the method at the beginning of the 1970s (cf. [18]).

The conjugate gradient algorithm to solve a system of linear equations $A\mathbf{x} = \mathbf{b}$, where $A$ ($n \times n$) is symmetric and positive definite, was originally introduced by Hestenes and Stiefel [25] in 1950. Before we discuss the properties of the CG method, we describe its implementation. Namely, the algorithm consists of the steps in Algorithm 1.

Algorithm 1: Standard conjugate gradient algorithm.
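The algorithm box itself is an image in the source and is not reproduced here; the following Python/NumPy sketch shows a standard textbook form of the CG iteration (an illustrative reconstruction under that assumption, not a transcription of the paper's Algorithm 1):

```python
import numpy as np

def cg(A, b, x0=None, tol=1e-10, max_iter=None):
    """Standard (unpreconditioned) conjugate gradient method for s.p.d. A."""
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    x = np.zeros_like(b) if x0 is None else x0.copy()
    g = A @ x - b                              # gradient / residual
    d = -g                                     # first search direction
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol * np.linalg.norm(b):
            return x, k
        Ad = A @ d
        tau = -(d @ g) / (d @ Ad)              # line search parameter, cf. (38)
        x = x + tau * d
        g_new = g + tau * Ad                   # updated gradient, cf. (39)
        beta = (g_new @ g_new) / (g @ g)       # Fletcher-Reeves form of beta_k
        d = -g_new + beta * d                  # new search direction, cf. (41)
        g = g_new
    return x, max_iter
```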

What one sees from a first glance is that the CG algorithm is quite simple. Each iteration consists of one matrix-vector multiplication, two vector updates and two scalar products. Apart from the initial guess 𝐱 ( 0 ) (which can be taken to be the zero vector) and stopping tolerance, there are no other method parameters to be determined or tuned by the user. Thus, the method is easily programmable, cheap in terms of arithmetic operations and performs as a black box.

For some problems the standard (unpreconditioned) CG method performs impressively well and this can be explained by some particular properties of this powerful algorithm.

The CG method is best described as a method to minimize the quadratic functional
$f(\mathbf{x}) = \dfrac{1}{2}\mathbf{x}^T A\mathbf{x} - \mathbf{b}^T\mathbf{x} + \mathbf{c}, \qquad (34)$
over a set of vectors. If $A$ is nonsingular, then we can rewrite $f$ in the form
$f(\mathbf{x}) = \dfrac{1}{2}(A\mathbf{x} - \mathbf{b})^T A^{-1}(A\mathbf{x} - \mathbf{b}) - \dfrac{1}{2}\mathbf{b}^T A^{-1}\mathbf{b} + \mathbf{c}, \qquad (35)$
so minimizing the quadratic functional is equivalent to solving the system $A\mathbf{x} = \mathbf{b}$. If $A$ is singular and $A^{-1}$ in (35) is replaced by a generalized inverse of $A$, then the above equivalence still holds if the minimization takes place on a subspace in the orthogonal complement to the null-space of $A$.

Given an initial approximation $\mathbf{x}^{(0)}$ and the corresponding residual $\mathbf{r}^{(0)} = A\mathbf{x}^{(0)} - \mathbf{b}$, the minimization in the conjugate gradient method takes place successively on a subspace
$\mathcal{K}_k = \left\{\mathbf{r}^{(0)}, A\mathbf{r}^{(0)}, A^2\mathbf{r}^{(0)}, \dots, A^{k-1}\mathbf{r}^{(0)}\right\}, \qquad (36)$
of growing dimension. This subspace is referred to as the Krylov set.

In the derivation of the algorithm, the next approximate solution is constructed as
$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \tau_k\mathbf{d}^{(k)}, \qquad (37)$
where $\tau_k$ is chosen as
$\tau_k = -\dfrac{\mathbf{d}^{(k)T}\mathbf{g}^{(k)}}{\mathbf{d}^{(k)T}A\mathbf{d}^{(k)}} = -\dfrac{\mathbf{d}^{(k)T}\left(A\mathbf{x}^{(k)} - \mathbf{b}\right)}{\mathbf{d}^{(k)T}A\mathbf{d}^{(k)}}, \qquad (38)$

which minimizes the function $f(\mathbf{x}^{(k)} + \tau\mathbf{d}^{(k)})$, $-\infty < \tau < \infty$. Also, the gradient of $f$ at $\mathbf{x}^{(k+1)}$ is made orthogonal to the search direction $\mathbf{d}^{(k)}$. This is seen from the following relations:
$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \tau_k\mathbf{d}^{(k)},$
$A\mathbf{x}^{(k+1)} - \mathbf{b} = A\mathbf{x}^{(k)} - \mathbf{b} + \tau_k A\mathbf{d}^{(k)},$
$\mathbf{g}^{(k+1)} = \mathbf{g}^{(k)} + \tau_k A\mathbf{d}^{(k)},$
$\mathbf{d}^{(k)T}\mathbf{g}^{(k+1)} = \mathbf{d}^{(k)T}\mathbf{g}^{(k)} + \tau_k\mathbf{d}^{(k)T}A\mathbf{d}^{(k)} = 0. \qquad (39)$

As in Fourier type minimization methods, it turns out to be efficient to work with orthogonal ($A$-orthogonal) search directions $\mathbf{d}^{(k)}$ which, since $A$ is symmetric, can be determined from a three-term recursion
$\mathbf{d}^{(0)} = \mathbf{r}^{(0)}, \quad \mathbf{d}^{(k+1)} = A\mathbf{d}^{(k)} + \widetilde{\beta}_k\mathbf{d}^{(k)} + \widetilde{\gamma}_k\mathbf{d}^{(k-1)}, \quad k = 1, 2, \dots, \qquad (40)$

or, equivalently, from
$\mathbf{d}^{(k+1)} = \mathbf{r}^{(k+1)} + \beta_k\mathbf{d}^{(k)}. \qquad (41)$

This recursive choice of search directions is done so that at each step the solution has smallest error in the $A$-norm, $\|\mathbf{x} - \mathbf{x}^{(k)}\|_A = \{\mathbf{e}^{(k)T}A\mathbf{e}^{(k)}\}^{1/2}$, where $\mathbf{e}^{(k)} = \mathbf{x} - \mathbf{x}^{(k)}$ is the iteration error. As mentioned, the minimization takes place over the set of (Krylov) vectors $\mathcal{K}_k$ and, as is readily seen,
$\mathcal{K}_k = \left\{\mathbf{x}^{(1)} - \mathbf{x}^{(0)}, \mathbf{x}^{(2)} - \mathbf{x}^{(0)}, \dots, \mathbf{x}^{(k)} - \mathbf{x}^{(0)}\right\} = \left\{\mathbf{g}^{(0)}, \mathbf{g}^{(1)}, \dots, \mathbf{g}^{(k-1)}\right\} = \left\{\mathbf{d}^{(0)}, \mathbf{d}^{(1)}, \dots, \mathbf{d}^{(k-1)}\right\}. \qquad (42)$

To summarize, the CG method possesses the following remarkable properties.

Theorem 5. Let the CG Algorithm 1 be applied to a symmetric positive definite matrix $A$. Then, in exact arithmetic the following properties hold: (1) the iteratively constructed residuals $\mathbf{g}$ are mutually orthogonal, that is, $\mathbf{g}^{(k)T}\mathbf{g}^{(j)} = 0$, $j < k$; (2) the search directions $\mathbf{d}$ are $A$-orthogonal (or conjugate), that is, $\mathbf{d}^{(k)T}A\mathbf{d}^{(j)} = 0$, $j < k$; (3) as long as the method has not converged, that is, $\mathbf{g}^{(k)} \ne 0$, the algorithm proceeds with no breakdown and (42) holds; (4) as long as the method has not converged, the newly constructed approximation $\mathbf{x}^{(k)}$ is the unique point in $\mathbf{x}^{(0)} + \mathcal{K}_k$ that minimizes $\|\mathbf{e}^{(k)}\|_A = \|\mathbf{x} - \mathbf{x}^{(k)}\|_A$; (5) the convergence is monotone in the $A$-norm, that is, $\|\mathbf{e}^{(k)}\|_A < \|\mathbf{e}^{(k-1)}\|_A$, and $\mathbf{e}^{(m)} = 0$ will be achieved for some $m \le n$.

For a proof of the above theorem consult, for instance, [29] or [6].

Since the method is optimal, that is, it gives the smallest error on a subspace of growing dimension, it terminates with the exact solution (ignoring round-off errors) in at most $n$ steps (the dimension of the whole vector space, $\mathbf{x} \in \mathbb{R}^n$). In fact, it can be readily seen that the CG algorithm terminates after $m$ steps, where $m$ is the degree of the minimal polynomial $Q_m$ of $A$ with respect to the initial residual vector; in other words, $Q_m$ has the smallest degree of all polynomials $Q$ for which $Q(A)\mathbf{r}^{(0)} = 0$. Therefore, the CG method can be viewed also as a direct solution method. However, in practice we want convergence to occur to an acceptable accuracy in much fewer steps than $n$ or $m$. Thus, we use CG as an iterative method.

For further discussions of the CG methods, see [6, 34, 35].

When one experiments with CG to solve systems with various matrices one observes some phenomena which need special attention. This can be illustrated by a simple example.

Consider the solution of $A\mathbf{x} = \mathbf{b}$ by the standard conjugate gradient method, where
$A = \begin{bmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix}. \qquad (43)$

The exact solution is $\widehat{\mathbf{x}} = [1, 1, \dots, 1]^T$. Starting with $\mathbf{x}^{(0)} = [0, 0, \dots, 0]^T$, one finds that after $k$ iterations
$\mathbf{x}^{(k)} = \left[\dfrac{k}{k+1}, \dfrac{k-1}{k+1}, \dots, \dfrac{1}{k+1}, 0, \dots, 0\right]^T, \qquad (44)$
for $1 \le k \le n - 1$ and $\mathbf{x}^{(n)} = \widehat{\mathbf{x}}$. Hence, the information travels one step at a time from left to right and it takes $n$ steps before the last component has changed at all. The algorithm converges exactly in $n$ steps and terminates due to the finite termination property of the method.
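This behaviour is easy to reproduce numerically. The snippet below is an illustration only; it assumes the cg sketch shown after Algorithm 1 above and builds the matrix of (43):

```python
import numpy as np

n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A[-1, -1] = 1.0                               # last diagonal entry is 1, as in (43)
b = np.zeros(n); b[0] = 1.0

for k in (1, 2, 3, n):
    x_k, _ = cg(A, b, max_iter=k, tol=0.0)    # run exactly k CG steps
    print(k, np.round(x_k, 3))                # matches (44); x^(n) = [1, ..., 1]
```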

Another detail one observes is that the norm of the error, $\|\mathbf{x} - \mathbf{x}^{(k)}\|$, can be much larger than the norm of the iteratively computed residual.

These examples illustrate the fact that although the method has an optimal order of convergence rate in the energy norm, its actual convergence rate in spectral norm can be different and depends both on the distribution of the eigenvalues of the (preconditioned) system matrix and on the initial approximation (or residual). (For comparison, we note that the rate of convergence for steepest descent depends only on the ratio of the extremal eigenvalues of 𝐴 .) Faster convergence for the CG method is expected when the eigenvalues are clustered.

One way to get a better eigenvalue distribution is to precondition $A$ by a proper preconditioner $B$. Hence, in order to achieve a better eigenvalue distribution it is crucial in practice to use some form of preconditioning, that is, a matrix $B$ which approximates $A$ in some sense, with which it is relatively cheap to solve systems, and for which the spectrum of $B^{-1}A$ (equivalently $B^{-1/2}AB^{-1/2}$ if $B$ is s.p.d.) is more favorable for the convergence of the CG method. As it turns out, if $B$ is symmetric and positive definite, the corresponding preconditioned version, the PCG method, is best derived by replacing the inner product with $(\mathbf{u}, \mathbf{v}) = \mathbf{u}^T B\mathbf{v}$. It takes the following form, see Algorithm 2.

Algorithm 2: Preconditioned conjugate gradient algorithm.

Here, $[B]^{-1}$ denotes the action of $B^{-1}$, that is, one does not multiply with the inverse matrix $B^{-1}$, but normally solves a linear system with the matrix $B$.
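Algorithm 2 is likewise an image in the source; the following Python/NumPy sketch shows one common form of the PCG iteration (an illustrative reconstruction, not a transcription of the paper's Algorithm 2). The function solve_B and the Jacobi choice of the preconditioner in the usage lines are assumptions.

```python
import numpy as np

def pcg(A, b, solve_B, x0=None, tol=1e-8, max_iter=1000):
    """Preconditioned CG: solve_B(r) returns the action of [B]^{-1} on r,
    that is, it solves a linear system with the preconditioner B."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    g = A @ x - b
    h = solve_B(g)                 # preconditioned residual
    d = -h
    gh = g @ h
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol * np.linalg.norm(b):
            return x, k
        Ad = A @ d
        tau = gh / (d @ Ad)
        x = x + tau * d
        g = g + tau * Ad
        h = solve_B(g)
        gh_new = g @ h
        beta = gh_new / gh
        d = -h + beta * d
        gh = gh_new
    return x, max_iter

n = 100
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
diag_A = np.diag(A)
x, iters = pcg(A, b, solve_B=lambda r: r / diag_A)    # Jacobi preconditioner
print(iters, np.linalg.norm(A @ x - b))
```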

In order to understand what is wanted of a good preconditioning matrix, we discuss first some issues of major importance related to the rate of convergence of the CG method. Thereby it becomes clear that the standard spectral condition number is often too simple to explain the detailed convergence behaviour. In particular we discuss the sub- and superlinear convergence phases frequently observed in the convergence history of the conjugate gradient method.

A preconditioner can be applied in two different manners, namely, as $B^{-1}A$ or $BA$. The first form implies the necessity to solve a system with $B$ at each iteration step while the second form implies a matrix-vector multiplication with $B$ (a multiplicative preconditioner). In the latter case, $B$ can be seen as an approximate inverse of $A$. One can also use a hybrid form $\alpha B_1^{-1} + \beta B_2$.

The presentation here is limited to symmetric positive semidefinite matrices. It is based mainly on the articles [29, 31].

3.3. On the Rate of Convergence Estimates of the Conjugate Gradient Method

Let 𝐴 be symmetric, positive semidefinite and consider the solution of 𝐴 𝐱 = 𝐛 by a preconditioned conjugate gradient method. In order to understand how an efficient preconditioner to 𝐴 should be chosen we must first understand some general properties of the rate of convergence of conjugate gradient methods.

3.3.1. Rate of Convergence Estimates Based on Minimax Approximation

As is well known (see, e.g., [6, 30]), the conjugate gradient method is a norm minimizing method. For the preconditioned standard CG method, we have
$\|\mathbf{e}^k\|_A = \min_{P_k \in \pi_k^1}\|P_k(B)\mathbf{e}^0\|_A, \qquad (45)$
where $\|\mathbf{u}\|_A = \{\mathbf{u}^T A\mathbf{u}\}^{1/2}$, $\mathbf{e}^k = \mathbf{x} - \mathbf{x}^k$ is the iteration error and $\pi_k^1$ denotes the set of polynomials of degree $k$ which are normalized at the origin, that is, $P_k(0) = 1$. This is a norm on the subspace orthogonal to the nullspace of $A$, that is, on the whole space if $A$ is nonsingular.

Consider the $C$-inner product $(\mathbf{u}, \mathbf{v}) = \mathbf{u}^T C\mathbf{v}$, and note that $B = C^{-1}A$ is symmetric with respect to this inner product; let $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n$ be orthonormal eigenvectors and let $\lambda_i$, $i = 1, \dots, n$ be the corresponding eigenvalues of $B$. Let
$\mathbf{e}^0 = \sum_{j=1}^n\alpha_j\mathbf{v}_j, \qquad (46)$
be the eigenvector expansion of the initial vector, where $\alpha_j = (\mathbf{e}^0, \mathbf{v}_j)$, $j = 1, \dots, n$. Note further that the eigenvectors are both $A$- and $C$-orthogonal. Then, by the construction of the CG method, it follows that
$\mathbf{e}^k = \sum_{j=1}^n\alpha_j P_k\left(\lambda_j\right)\mathbf{v}_j, \qquad (47)$

and, using the nonnegativity of the eigenvalues, we find
$\|\mathbf{e}^k\|_A = \left\|\sum_{j=1}^n\alpha_j P_k(\lambda_j)\mathbf{v}_j\right\|_A = \left\{\sum_{j=1}^n\alpha_j^2\lambda_j P_k^2\left(\lambda_j\right)\right\}^{1/2} \le \left\{\sum_{j:\,\lambda_j > 0}\alpha_j^2\lambda_j\right\}^{1/2}\max_{1 \le i \le n,\,\lambda_i > 0}\left|P_k\left(\lambda_i\right)\right| = \max_{1 \le i \le n,\,\lambda_i > 0}\left|P_k\left(\lambda_i\right)\right|\,\|\mathbf{e}^0\|_A. \qquad (48)$
Here we have used the $A$-orthogonality of the eigenvectors. Similarly, using the $C$-orthogonality, we find
$\|\mathbf{e}^k\|_C = \left\{\sum_{j=1}^n\alpha_j^2 P_k^2\left(\lambda_j\right)\right\}^{1/2} \le \max_{1 \le i \le n}\left|P_k\left(\lambda_i\right)\right|\,\|\mathbf{e}^0\|_C. \qquad (49)$

Due to the minimization property (45), there follows from (48) the familiar bound
$\|\mathbf{e}^k\|_A \le \min_{P_k \in \pi_k^1}\max_{1 \le i \le n,\,\lambda_i > 0}\left|P_k\left(\lambda_i\right)\right|\,\|\mathbf{e}^0\|_A. \qquad (50)$
Estimate (50) is sharp in the respect that for every $k$ there exists an initial vector for which equality is attained. In fact, for such a vector we necessarily have that $\alpha_j \ne 0$ if and only if $\lambda_j$ belongs to a set of $k + 1$ points (the so-called Haar condition) where $\max_i|P_k(\lambda_i)|$ is taken. For such an initial vector, (49) shows that, if the eigenvalues are positive, we have also
$\|\mathbf{e}^k\|_C = \min_{P_k \in \pi_k^1}\max_{1 \le i \le n}\left|P_k\left(\lambda_i\right)\right|\,\|\mathbf{e}^0\|_C. \qquad (51)$

The rate of convergence of the iteration error $\|\mathbf{e}^k\|_A$ is measured by the average convergence factor
$\left\{\dfrac{\|\mathbf{e}^k\|_A}{\|\mathbf{e}^0\|_A}\right\}^{1/k}. \qquad (52)$

Inequality (50) shows that this can be majorized with an estimate of the rate of convergence of a best polynomial approximation problem (namely, the best approximation of the function $0$ by polynomials in $\pi_k^1$) in maximum norm on the discrete set formed by the spectrum of $B$. Clearly, multiple eigenvalues are treated as single, so the actual approximation problem is
$\min_{P_k \in \pi_k^1}\max_{1 \le i \le m}\left|P_k\left(\widetilde{\lambda}_i\right)\right|, \qquad (53)$
where the distinct positive eigenvalues $\widetilde{\lambda}_i$ have been ordered in increasing value, $0 < \widetilde{\lambda}_1 < \cdots < \widetilde{\lambda}_m$, and $m$ is the number of such eigenvalues. However, the solution of this problem requires knowledge of the spectrum, which is not available in general. Even if it is known, the estimate (53) can be troublesome in practice, since it involves approximation on a general discrete set of points.

Besides being costly to apply, such estimates do not give any qualitative insight in the behaviour of the conjugate gradient method for various typical eigenvalue distributions.

That is why we make some further assumptions on the spectrum in order to simplify the approximation problem and at the same time, present estimates which can be used both to estimate the number of iterations and to give some insight in the qualitative behaviour of the iteration method.

3.3.2. Standard Condition Number: Linear Convergence

If the eigenvalues are (densely) located in an interval $[a, b]$, where $a > 0$, we can majorize the best approximation problem on the discrete set with the best approximation problem on the interval and frequently still get a good estimate. We have
$\min_{P_k \in \pi_k^1}\max_{1 \le i \le m}\left|P_k\left(\widetilde{\lambda}_i\right)\right| \le \min_{P_k \in \pi_k^1}\max_{a \le x \le b}\left|P_k(x)\right|. \qquad (54)$

The solution to this min-max problem is well known and uses Chebyshev polynomials. One finds that
$\min_{P_k \in \pi_k^1}\max_{a \le x \le b}\left|P_k(x)\right| = \dfrac{1}{T_k((b + a)/(b - a))} = \dfrac{2\sigma^k}{1 + \sigma^{2k}}, \qquad (55)$
where $\sigma = (1 - \sqrt{a/b})/(1 + \sqrt{a/b})$, $T_k(x) = (1/2)[(x + \sqrt{x^2 - 1})^k + (x - \sqrt{x^2 - 1})^k]$, and $a = \widetilde{\lambda}_1$, $b = \widetilde{\lambda}_m$. Hence, the average rate of convergence of the upper bound approaches $\sigma$ as $k \to \infty$. Also, it is readily found (see [35]) that the relative iteration error $\|\mathbf{e}^k\|_A/\|\mathbf{e}^0\|_A \le \varepsilon$ if
$k = k(a, b, \varepsilon) = \left\lceil\dfrac{\ln\left(1/\varepsilon + \sqrt{1/\varepsilon^2 - 1}\right)}{\ln\sigma^{-1}}\right\rceil. \qquad (56)$
Here $\lceil\xi\rceil$ denotes the smallest integer not less than $\xi$.
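A small numerical illustration comparing the bound (56) with the simpler Chebyshev estimate (29) (a sketch; the function name and the chosen condition numbers are arbitrary assumptions):

```python
import numpy as np

def chebyshev_bounds(kappa, eps):
    """Iteration bounds for eigenvalues in [a, b] with b/a = kappa:
    the bound (56) and the simpler bound (1/2)*sqrt(kappa)*ln(2/eps) of (29)."""
    sigma = (1 - 1 / np.sqrt(kappa)) / (1 + 1 / np.sqrt(kappa))
    k56 = int(np.ceil(np.log(1 / eps + np.sqrt(1 / eps**2 - 1)) / np.log(1 / sigma)))
    k29 = int(np.ceil(0.5 * np.sqrt(kappa) * np.log(2 / eps)))
    return k56, k29

for kappa in (1e2, 1e4, 1e6):
    print(kappa, chebyshev_bounds(kappa, eps=1e-6))
```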

It turns out that the above holds more generally if $A$ is nonsymmetric but the eigenvalues are contained in an ellipse with foci $a, b$, where $b - a > 0$, if one replaces $\sigma$ with $\widehat{\sigma} = \sigma\sqrt{(1 + \delta)/(1 - \delta)}$, where $\delta$ is the eccentricity of the ellipse (i.e., the ratio of the semiaxes) and $\delta < 2\sqrt{a/b}/(1 + a/b)$. Also, in a similar way, the case of eigenvalues contained in two separate intervals or ellipses can be analysed, see, for example, [35] for further details.

When $b/a \to \infty$, $\delta = 0$, and $\varepsilon \to 0$, the following upper bound becomes an increasingly accurate replacement of (56),
$k \ge \left\lceil\dfrac{1}{2}\sqrt{\dfrac{b}{a}}\ln\dfrac{2}{\varepsilon}\right\rceil. \qquad (57)$
The above estimates of the rate of convergence and of the number of iterations show that they depend only on the condition number $b/a$ and on the eccentricity of the ellipse containing the eigenvalues. Therefore, except in special cases, this estimate is not very accurate. When we use more detailed information about the spectrum and the initial error vector, sometimes substantially better estimates can be derived. This holds for instance when there are well separated small and/or large eigenvalues. Before we consider this important case, we mention briefly another similar minimax result which holds when we use different norms for the iteration error vector and for the initial vector.

By (48), we have
$\|\mathbf{e}^k\|_A = \left\{\sum\alpha_j^2\lambda_j P_k^2\left(\lambda_j\right)\right\}^{1/2} = \left\{\sum\alpha_j^2\lambda_j^{1-2s}\lambda_j^{2s}P_k^2\left(\lambda_j\right)\right\}^{1/2} \le \min_{P_k \in \pi_k^1}\max_{1 \le j \le n}\left|\lambda_j^s P_k\left(\lambda_j\right)\right|\left\{\sum\alpha_j^2\lambda_j^{1-2s}\right\}^{1/2} = \min_{P_k \in \pi_k^1}\max_{1 \le j \le n}\left|\lambda_j^s P_k\left(\lambda_j\right)\right|\,\|\mathbf{e}^0\|_{A^{1-2s}}. \qquad (58)$

If the initial vector is such that the Fourier coefficients for the highest eigenvalue modes are dominating, then $\|\mathbf{e}^0\|_{A^{1-2s}}$ may exist and take not too large values even for some $s \ge 1/2$. We consider the interesting case where $s \ge 1/2$, for which the following theorem holds (see [6, 36]).

Theorem 6. Let $\pi_k^1$ denote the set of polynomials of degree $k$ such that $P_k(0) = 1$. Then for $k = 1, 2, \dots$ and for any $s \ge 1/2$ such that $2s$ is an integer, it holds that
$\dfrac{\|\mathbf{e}^k\|_A}{\|\mathbf{e}^0\|_{A^{1-2s}}} \le \min_{P_k \in \pi_k^1}\max_{0 \le x \le 1}\left|x^s P_k(x)\right| \le \left(\dfrac{s}{k + s}\right)^{2s}. \qquad (59)$

Remark 1. For $s = 1/2$ it holds that
$\max_{0 \le x \le 1}\left|x^{1/2}P_k(x)\right| = \dfrac{1}{2k + 1}, \qquad (60)$
for $P_k(x) = U_{2k}(\sqrt{1 - x})/(2k + 1)$, and for $s = 1$ it holds that
$\max_{0 \le x \le 1}\left|xP_k(x)\right| = \dfrac{1}{k + 1}\tan\dfrac{\pi}{4k + 4} < \dfrac{1}{(k + 1)^2}, \qquad (61)$
for $P_k(x) = \dfrac{(-1)^k}{(k + 1)x}\tan\left(\dfrac{\pi}{4k + 4}\right)T_{k+1}\left(\left(1 + \cos\dfrac{\pi}{2k + 2}\right)x - \cos\dfrac{\pi}{2k + 2}\right)$, where $T_k(x)$ and $U_k(x)$ are the Chebyshev polynomials of $k$th degree of the first and second kind, respectively.

For other values, (59) is an upper bound only, that is, not sharp. At any rate, it shows that the error $\|\mathbf{e}^k\|_A$ converges (initially) at least as fast as $(s/(k + s))^{2s}$, that is, as $1/(2k + 1)$ for $s = 1/2$ and as $(1/(k + 1))^2$ for $s = 1$.

Note that this convergence rate does not depend on the eigenvalues, in particular not on the spectral condition number.

Conclusion. By computing the initial approximation vector from a coarse mesh, the components of $\mathbf{e}^0$ in the first Fourier modes will be small and $\|\mathbf{e}^0\|_{A^{1-2s}}$ may take on values that are not very large even when $s = 1$ or bigger. Therefore, there is an initial decay of the residual as $O(k^{-2s})$, independent of the condition number. Note, however, that the actual errors may not have decayed sufficiently even if the residual has.

We consider now upper bound estimates which show how the convergence history may enter a superlinear rate.

An Estimate to Show a Superlinear Convergence Rate Based on the K-Condition Number
A somewhat rough, but simple and illustrative superlinear convergence estimate can be obtained in terms of the so called $K$-condition number (see [37, 38])
$K = K(B) = \dfrac{((1/n)\operatorname{tr}(B))^n}{\det(B)} = \left(\dfrac{1}{n}\sum_{i=1}^n\lambda_i\right)^n\left(\prod_{i=1}^n\lambda_i\right)^{-1}, \qquad (62)$
where we assume that $B$ is s.p.d.
Note that 𝐾 1 / 𝑛 equals the quotient between the arithmetic and geometric averages of the eigenvalues. This quantity is similar to the spectral condition number 𝜅 ( 𝐵 ) in that it is never smaller than 1, and is equal to 1 if and only if 𝐵 = 𝛼 𝐼 , 𝛼 > 0 (recall that 𝐵 is symmetrizable).
Based on the 𝐾 -condition number, a superlinear convergence result can be obtained as follows.

Theorem 7. Let $k < n$ be even and $k \ge 3\ln K$. Then,
$\dfrac{\|\mathbf{e}^k\|_A}{\|\mathbf{e}^0\|_A} \le \left(\dfrac{3\ln K}{k}\right)^{k/2}. \qquad (63)$

Proof. Let $k = 2m$ and let the polynomial $P_k$ be of the simplest possible form, that is, let it vanish at the $m$ smallest and $m$ largest eigenvalues of $B$. As follows from (48), we have then
$\dfrac{\|\mathbf{e}^k\|_A}{\|\mathbf{e}^0\|_A} \le \max_{\lambda_1 \le \lambda \le \lambda_n}\left|\prod_{i=1}^m\left(1 - \dfrac{\lambda}{\lambda_i}\right)\left(1 - \dfrac{\lambda}{\lambda_{n+1-i}}\right)\right| \le \prod_{i=1}^m\max_{\lambda_i \le \lambda \le \lambda_{n+1-i}}\left(\dfrac{\lambda}{\lambda_i} - 1\right)\left(1 - \dfrac{\lambda}{\lambda_{n+1-i}}\right) = \prod_{i=1}^m\left(\dfrac{\left(\lambda_i + \lambda_{n+1-i}\right)^2}{4\lambda_i\lambda_{n+1-i}} - 1\right) \le \left(\left(\prod_{i=1}^m\dfrac{\left(\lambda_i + \lambda_{n+1-i}\right)^2}{4\lambda_i\lambda_{n+1-i}}\right)^{1/m} - 1\right)^m. \qquad (64)$
The latter follows from $\left(\prod_{i=1}^m(1 - \Theta_i)\right)^{1/m} + \left(\prod_{i=1}^m\Theta_i\right)^{1/m} \le 1$ with $\Theta_i = 4\lambda_i\lambda_{n+1-i}/(\lambda_i + \lambda_{n+1-i})^2$. Using now twice the inequality between the arithmetic and geometric mean values, one has
$\prod_{i=1}^m\dfrac{\left(\lambda_i + \lambda_{n+1-i}\right)^2}{4\lambda_i\lambda_{n+1-i}} \le \dfrac{\left(\dfrac{1}{n - 2m}\sum_{i=m+1}^{n-m}\lambda_i\right)^{n-2m}}{\prod_{i=m+1}^{n-m}\lambda_i}\times\dfrac{\prod_{i=1}^m\left(\left(\lambda_i + \lambda_{n+1-i}\right)/2\right)^2}{\prod_{i=1}^m\lambda_i\lambda_{n+1-i}} \le \dfrac{\left(\dfrac{1}{n}\left(\sum_{i=m+1}^{n-m}\lambda_i + \sum_{i=1}^m\left(\lambda_i + \lambda_{n+1-i}\right)\right)\right)^n}{\prod_{i=m+1}^{n-m}\lambda_i\prod_{i=1}^m\lambda_i\lambda_{n+1-i}} = K. \qquad (65)$
Using $\exp(2x) \le 1 + 2x/(1 - x)$, $0 \le x < 1$, we get the required estimate,
$\dfrac{\|\mathbf{e}^k\|_A}{\|\mathbf{e}^0\|_A} \le \left(K^{2/k} - 1\right)^{k/2} \le \left(\dfrac{2\ln K}{k - \ln K}\right)^{k/2} = \left(\dfrac{2\ln K}{2k/3 + (k/3 - \ln K)}\right)^{k/2} \le \left(\dfrac{3\ln K}{k}\right)^{k/2}. \qquad (66)$

A somewhat better result of the same type exists. The estimate is of similar type, that is,
$\dfrac{\|\mathbf{r}^k\|_{C^{-1}}}{\|\mathbf{r}^0\|_{C^{-1}}} \le \left(K^{1/k} - 1\right)^{k/2}, \qquad (67)$
where $\mathbf{r}^k = A\mathbf{e}^k$; it was obtained using more complicated techniques, see [6, 37], and the references quoted therein. Note that here, as $k/\ln K \to \infty$, we have
$\dfrac{\|\mathbf{e}^k\|_{C^{-1}}}{\|\mathbf{e}^0\|_{C^{-1}}} \le \left(e^{(1/k)\ln K} - 1\right)^{k/2} \approx \left(1 + \dfrac{\ln K}{2k}\right)^{k/2}\left(\dfrac{\ln K}{k}\right)^{k/2} \approx K^{1/4}\left(\dfrac{\ln K}{k}\right)^{k/2}, \qquad (68)$
that is, the simpler upper bound in (63) is asymptotically worse than this bound (albeit in a different norm) by the factor $3^{k/2}/K^{1/4}$.

The upper bounds in the above estimates involve a convergence factor which decreases with increasing iteration number and hence show a superlinear rate of convergence. Note, however, that $K^{1/n} \le \kappa(B) \le 4K$ (see [6]), where $\kappa(B) = \lambda_n/\lambda_1$ is the spectral condition number, so the $K$-condition number may take very large values when $\kappa(B)$ is large.

The estimates based on the 𝐾 -condition number involve only “integral” characteristics of the preconditioned matrix (the trace and the determinant). Sometimes, it is possible to obtain a practical estimate of 𝐾 ( 𝐵 ) which can be useful for the a priori construction of good preconditioners and for the a posteriori assessment of their quality, see Section 5 for further details.

The estimate
$\dfrac{\|\mathbf{r}^k\|_{C^{-1}}}{\|\mathbf{r}^0\|_{C^{-1}}} \le \left(K^{1/k} - 1\right)^{k/2} \le \varepsilon \qquad (69)$

shows that
$K^{1/2}\left(1 - K^{-1/k}\right)^{k/2} \le \varepsilon \qquad (70)$

or
$\dfrac{k}{2}\log_2\dfrac{1}{1 - K^{-1/k}} \ge \log_2\dfrac{K^{1/2}}{\varepsilon}, \qquad (71)$

which holds if
$k > \log_2 K + 2\log_2\dfrac{1}{\varepsilon}. \qquad (72)$
Hence, when $K \ge \varepsilon^{-2}$, the estimated number of iterations depends essentially only on $\log_2 K$, that is, depends little on the relative accuracy $\varepsilon$, which indicates a fast superlinear convergence when $k > \log_2 K$.
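The following sketch (an illustration only; the function names and the test matrix are assumptions) computes the $K$-condition number (62) and the resulting iteration estimate (72):

```python
import numpy as np

def k_condition_number(B):
    """K(B) = ((tr(B)/n)^n) / det(B) for a symmetric positive definite matrix B,
    computed via eigenvalues and logarithms to avoid overflow in the determinant."""
    lam = np.linalg.eigvalsh(B)
    n = len(lam)
    log_K = n * np.log(lam.mean()) - np.sum(np.log(lam))
    return np.exp(log_K), log_K

def iteration_bound(log_K, eps):
    """Superlinear-phase bound (72): k > log2(K) + 2*log2(1/eps)."""
    return int(np.ceil(log_K / np.log(2) + 2 * np.log2(1 / eps)))

n = 100
B = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # s.p.d. test matrix
K, log_K = k_condition_number(B)
print(log_K, iteration_bound(log_K, 1e-6))
```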

When actually estimating the number of iterations, Theorem 7 gives a useful result only when $k > O(\ln K) = n\ln(K^{1/n})$; that is, the quotient between the arithmetic and geometric averages of the eigenvalues, which equals $K^{1/n}$, must be close to unity and the eigenvalues must be very well clustered, so that $K^{1/n} = 1 + O(n^{-\varepsilon})$ for some $\varepsilon > 0$; otherwise the estimated number of iterations will be $O(n)$, which is normally a useless result. The next example illustrates this further.

Example 1. Consider a geometric distribution of eigenvalues of $A$, $\lambda_j = j^s$, $j = 1, 2, \dots, n$, for some positive $s$. Here, asymptotically,
$\operatorname{tr}(A) = \sum_1^n j^s \approx \dfrac{1}{s + 1}n^{s+1}, \quad n \to \infty. \qquad (73)$
Using Stirling's formula, we find
$\det(A) = \prod_1^n\lambda_j = \left(\prod_1^n j\right)^s \approx (2\pi n)^{s/2}\left(\dfrac{n}{e}\right)^{ns}, \quad n \to \infty, \qquad (74)$
so
$K(A)^{1/n} \to \dfrac{e^s}{s + 1}, \quad n \to \infty. \qquad (75)$
Hence, $s$ must be sufficiently small for the estimate in Theorem 7 to be useful. On the other hand, the spectral condition number $\kappa(A) = n^s$, and the simple estimate based on $\kappa(A)$ leads to $k \approx O(n^{s/2})$ and gives hence, asymptotically, a smaller upper bound when $s < 2$. For further discussions on superlinear rate of convergence, see [39].

3.4. Generalized Conjugate Gradient Methods

The rate of convergence estimates given above hold for a restricted class of matrices: symmetric or, more generally, normal matrices.

To handle more general classes of problems for which optimal rate of convergence results such as (45) hold, one needs more involved methods. Much work has been devoted to this problem. This includes methods like the generalized minimum residual (GMRES) method, see Saad and Schultz [40], the generalized conjugate residual (GCR) method, and the generalized conjugate gradient (GCG) method, see [6] and, for further details, [41]. As opposed to the standard conjugate gradient method, they require a long version of updates for the search directions, as the newest search direction at each stage is not, in general, automatically (in exact precision) orthogonal to the previous search directions, but must be orthogonalized at each step. This makes the computational expense per step grow linearly, and the total expense grows quadratically with the iteration index. In addition, due to finite precision, there is a tendency of loss of orthogonality, even for symmetric problems, when many iterations are required. One remedy which has been suggested is to use the method only for a few steps, say 10, and restart the method with the current approximation as initial approximation.

Clearly, however, in this way the optimal convergence property over the whole Krylov set of vectors is lost. For this and other possibilities, see, for example, [42].

Another important version of the generalized conjugate gradient methods occurs when one uses variable preconditioners. Variable preconditioners, that is, a preconditioner changed from one iteration to the next iteration step, are used in many contexts.

For instance, one can use variable drop tolerance, computed adaptively, in an incomplete factorization method (see Section 4). When the given matrix is partitioned in two by two blocks, it can be efficient to use inner iterations when solving arising systems for one, or both, of the diagonal block matrices, see, for example, [43], and the flexible conjugate gradient method in Saad, [44, 45].

Due to space limitations, the above topics will not be further discussed in this paper.

4. Incomplete Factorization Methods

There exist two classes of preconditioning methods that are closely related to direct solution methods. In this paper, we survey only their main ingredients and omit many of the particular aspects.

The first method is based on incomplete factorization, where some entries arising during a triangular factorization are neglected to save memory. The deletion can be based on some drop tolerance criterion or on a, normally a priori chosen, sparsity pattern. The factorization based on a drop tolerance takes the following form. During the elimination (or, equivalently, triangular factorization), the off-diagonal entries are accepted only if they are not too small. For instance,
$a_{ij} := \begin{cases}a_{ij} - a_{ir}a_{rr}^{-1}a_{rj}, & \text{if }\left|a_{ij} - a_{ir}a_{rr}^{-1}a_{rj}\right| \ge \varepsilon\sqrt{a_{ii}a_{jj}},\\ 0, & \text{otherwise.}\end{cases} \qquad (76)$
Here, $\varepsilon$, $0 < \varepsilon \ll 1$, is the drop-tolerance parameter. Such methods may lead to too much fill-in (i.e., $a_{ij} \ne 0$ in positions where the original entry was occupied by a zero), because, to be robust, they may require near machine-precision drop tolerances. Furthermore, as direct solution methods, they are difficult to parallelize efficiently.
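A compact (dense, unoptimized) sketch of such a drop-tolerance incomplete factorization in Python/NumPy may make the rule (76) concrete. The function names, the use of the original diagonal as reference scale, and the dense storage are illustrative assumptions; real implementations work on sparse storage.

```python
import numpy as np

def ilu_drop_tolerance(A, eps=1e-2):
    """Incomplete LU factorization with a relative drop tolerance:
    updated entries smaller than eps*sqrt(a_ii*a_jj) are dropped, cf. (76)."""
    n = A.shape[0]
    F = A.astype(float)                 # will hold L (below diag) and U (on/above diag)
    d = np.abs(np.diag(A)).copy()       # reference scale for the drop criterion
    for r in range(n - 1):
        for i in range(r + 1, n):
            if F[i, r] == 0.0:
                continue
            lir = F[i, r] / F[r, r]
            F[i, r] = lir               # multiplier stored in the L part
            for j in range(r + 1, n):
                new_val = F[i, j] - lir * F[r, j]
                if i != j and abs(new_val) < eps * np.sqrt(d[i] * d[j]):
                    F[i, j] = 0.0       # drop small fill-in
                else:
                    F[i, j] = new_val
    return F

def apply_ilu(F, r):
    """Solve (L U) y = r with unit-lower L and upper U stored together in F."""
    n = len(r)
    y = r.astype(float).copy()
    for i in range(n):                  # forward substitution, unit diagonal in L
        y[i] -= F[i, :i] @ y[:i]
    for i in range(n - 1, -1, -1):      # backward substitution
        y[i] = (y[i] - F[i, i + 1:] @ y[i + 1:]) / F[i, i]
    return y
```

Such an apply_ilu could then serve, for instance, as the solve_B argument of the PCG sketch shown earlier.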

The incomplete factorization method can readily be extended to matrices partitioned in block form. Often, instead of a drop tolerance, one prescribes the sparsity pattern of the triangular factors in the computed preconditioner, that is, entries arising outside the chosen pattern are ignored. An early presentation of such incomplete factorization methods was given by Meijerink and van der Vorst [46]. One can make a diagonal compensation of the neglected entries, that is, add them to the diagonal entries in the same row, possibly first multiplied by some scalar $\Theta$, $0 < \Theta \le 1$. For discussions of such approaches, see [29, 30, 47, 48]. This frequently moves small eigenvalues, corresponding to the smoother harmonics, to cluster near the origin, in this way sometimes improving the spectral condition number by an order of magnitude (see [6, 47]).

The other class of methods is based on approximate inverses $G$, for instance such that minimize a Frobenius norm of the error matrix $I - GA$; see Section 5 for further details. To be sufficiently accurate, these methods frequently lead to nearly full matrices. This can be understood since the matrices we want to approximate are often sparse discretizations of diffusion problems. The inverse of such an operator is a discrete Green's function which, as is well known, often has a significantly sized support on nearly the whole domain of definition.

However, we can use an additive approximation of the inverse involving two, or more, terms which approximate the inverse on different vector subspaces. By defining in this way the preconditioner recursively on a sequence of lower dimensional subspaces, it may preserve the accurate approximation property of the full inverse method while still needing only actions of sparse operators.

Frequently, the given matrices are partitioned in a natural way in a two by two block form. For such matrices, it can be seen that the two approaches are similar. Consider namely
$A = \begin{bmatrix} A_1 & A_{12} \\ A_{21} & A_2 \end{bmatrix}, \qquad (77)$
where we assume that $A_1$ and the Schur complement matrix $S = A_2 - A_{21}A_1^{-1}A_{12}$ are nonsingular. (This holds, in particular, if $A$ is symmetric and positive definite.) We can construct either a block approximate factorization of $A$ or an approximation of the inverse of $A$ in additive form. As the following shows, the approaches are related. First, a block matrix factorization of $A$ is
$A = \begin{bmatrix} A_1 & 0 \\ A_{21} & S \end{bmatrix}\begin{bmatrix} I_1 & A_1^{-1}A_{12} \\ 0 & I_2 \end{bmatrix}, \qquad (78)$
where $I_1, I_2$ denote the unit matrices of proper order. For its inverse, it holds
$A^{-1} = \begin{bmatrix} I_1 & -A_1^{-1}A_{12} \\ 0 & I_2 \end{bmatrix}\begin{bmatrix} A_1^{-1} & 0 \\ -S^{-1}A_{21}A_1^{-1} & S^{-1} \end{bmatrix} = \begin{bmatrix} A_1^{-1} + A_1^{-1}A_{12}S^{-1}A_{21}A_1^{-1} & -A_1^{-1}A_{12}S^{-1} \\ -S^{-1}A_{21}A_1^{-1} & S^{-1} \end{bmatrix}, \qquad (79)$
or
$A^{-1} = \begin{bmatrix} A_1^{-1} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} -A_1^{-1}A_{12} \\ I_2 \end{bmatrix}S^{-1}\begin{bmatrix} -A_{21}A_1^{-1} & I_2 \end{bmatrix}. \qquad (80)$
A straightforward computation reveals that $\widetilde{V}^T A\widetilde{V} = S$ and, hence,
$A^{-1} = \begin{bmatrix} A_1^{-1} & 0 \\ 0 & 0 \end{bmatrix} + \widetilde{V}\left(\widetilde{V}^T A\widetilde{V}\right)^{-1}\widetilde{V}^T, \qquad (81)$
where
$\widetilde{V} = \begin{bmatrix} -A_1^{-1}A_{12} \\ I_2 \end{bmatrix}. \qquad (82)$

Let $M_1 \approx A_1$ be an approximation of $A_1$ (for which linear systems are simpler to solve than for $A_1$) and let $G_1 \approx A_1^{-1}$ be a sparse approximate inverse. Possibly $G_1 = M_1^{-1}$. Then,
$M = \begin{bmatrix} M_1 & 0 \\ A_{21} & B_2 \end{bmatrix}\begin{bmatrix} I_1 & M_1^{-1}A_{12} \\ 0 & I_2 \end{bmatrix} = \begin{bmatrix} M_1 & A_{12} \\ A_{21} & A_2 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & B_2 + A_{21}M_1^{-1}A_{12} - A_2 \end{bmatrix} \qquad (83)$
is a preconditioner to $A$ and
$B = \begin{bmatrix} G_1 & 0 \\ 0 & 0 \end{bmatrix} + VB_2^{-1}V^T \qquad (84)$
is an approximate inverse, where $V = \begin{bmatrix} -G_1A_{12} \\ I_2 \end{bmatrix}$ and $B_2$ is an approximation of $S$. If $B_2 = V^T\widetilde{A}V$, where $\widetilde{A} = \begin{bmatrix} G_1^{-1} & A_{12} \\ A_{21} & A_2 \end{bmatrix}$, then
$B = \begin{bmatrix} G_1 & 0 \\ 0 & 0 \end{bmatrix} + V\left(V^T\widetilde{A}V\right)^{-1}V^T = \begin{bmatrix} G_1 & 0 \\ 0 & 0 \end{bmatrix} + V\widetilde{S}(\widetilde{A})^{-1}V^T, \qquad (85)$
where $\widetilde{S}(\widetilde{A}) = A_2 - A_{21}G_1A_{12}$. If $M_1 = G_1^{-1}$, then in this case
$B = M^{-1}. \qquad (86)$
Hence, a convergence estimate for one method can be directly applied for the other method as well. For further discussions of block matrix preconditioners, see, for example, [49–52]. As can be seen from the above, Schur complement matrices play a major role in both matrix factorizations. For sparse approximations of the Schur complement matrix, in particular element-by-element type approximations, see, for example, [53–55].
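The following sketch illustrates the additive approximate inverse (84) for a two by two block matrix (an illustration only; the function name, the block sizes, the diagonal choice of $G_1$, and the test matrix are assumptions):

```python
import numpy as np

def block_approximate_inverse(A, n1, G1=None):
    """Build the additive approximate inverse B of (84) for A partitioned with a
    leading n1-by-n1 block: B = [[G1,0],[0,0]] + V B2^{-1} V^T,
    with V = [-G1*A12; I] and B2 = A2 - A21*G1*A12, an approximate Schur complement."""
    A1, A12 = A[:n1, :n1], A[:n1, n1:]
    A21, A2 = A[n1:, :n1], A[n1:, n1:]
    n2 = A.shape[0] - n1
    if G1 is None:
        G1 = np.diag(1.0 / np.diag(A1))          # simple sparse approximate inverse of A1
    V = np.vstack([-G1 @ A12, np.eye(n2)])
    B2 = A2 - A21 @ G1 @ A12                     # S~(A~) of (85)
    B = np.zeros_like(A, dtype=float)
    B[:n1, :n1] = G1
    return B + V @ np.linalg.solve(B2, V.T)      # add V B2^{-1} V^T explicitly

n = 60
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
B = block_approximate_inverse(A, n1=n // 2)
# Quality check: the eigenvalues of B A should cluster around 1 for a good preconditioner.
print(np.sort(np.linalg.eigvals(B @ A).real)[[0, -1]])
```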

We consider now multilevel extensions of the additive approximate inverse subspace correction method. It is illustrative to consider first the exact inverse and its relation to Gaussian (block matrix) elimination.

4.1. The Exact Inverse on Additive Form

Let then $A^{(0)} = A$ and consider a matrix
$A^{(k)} = \begin{bmatrix} A_1^{(k)} & A_{12}^{(k)} \\ A_{21}^{(k)} & A_2^{(k)} \end{bmatrix}, \qquad (87)$
in a sequence defined by
$A^{(k+1)} = S_2^{(k)} = A_2^{(k)} - A_{21}^{(k)}\left(A_1^{(k)}\right)^{-1}A_{12}^{(k)}, \quad k = 0, 1, \dots, k_0, \qquad (88)$
where each $A_1^{(k)}$ is nonsingular, being a block diagonal of a symmetric positive definite matrix. Hence, the following recursion holds
$\left(A^{(k)}\right)^{-1} = \begin{bmatrix} \left(A_1^{(k)}\right)^{-1} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} -\left(A_1^{(k)}\right)^{-1}A_{12}^{(k)} \\ I_2^{(k+1)} \end{bmatrix}\left(A^{(k+1)}\right)^{-1}\begin{bmatrix} -A_{21}^{(k)}\left(A_1^{(k)}\right)^{-1} & I_2^{(k+1)} \end{bmatrix}, \qquad (89)$
$k = 0, 1, \dots, k_0$. Here, $I_2^{(k+1)}$ is the identity matrix on level $k + 1$. Note that in this example the dimensions decrease with increasing level number and the final matrix (i.e., Schur complement) in the sequence is $A^{(k_0+1)} = S_2^{(k_0)}$. The above recursion can be rewritten in compact form