A Polynomial Preconditioned Global CMRH Method for Linear Systems with Multiple Right-Hand Sides
The restarted global CMRH method (Gl-CMRH(m)) (Heyouni, 2001) is an attractive method for linear systems with multiple right-hand sides. However, Gl-CMRH(m) may converge slowly or even stagnate due to a limited Krylov subspace. To ameliorate this drawback, a polynomial preconditioned variant of Gl-CMRH(m) is presented. We give a theoretical result for the square case that assures that the number of restarts can be reduced with increasing values of the polynomial degree. Numerical experiments from real applications are used to validate the effectiveness of the proposed method.
We consider the linear systems of the form where is a nonsingular and sparse matrix of order , with usually . Such a situation arises from, for instance, wave-scattering problems, image restorations, recursive least squares computations, numerical methods for integral equations, and structural mechanics problems; see [1, 2] and references therein.
Numerical solvers for (1) can be roughly divided into two categories. The first is the direct method; for instance, the LU factorization is competent since is only factorized once to recast (1) into a triangular system which is easy to solve. However, if the coefficient matrix is large and sparse or sometimes not readily available, then iterative solvers may become the only choice and possibly fall into the following three classes.
The first class is the block method. For symmetric problems, the first block methods are due to O’Leary, including the block conjugate gradient method and block biconjugate gradient method . For nonsymmetric cases, the block generalized minimal residual method [4–6], the block BiCGSTAB method , the block quasiminimal residual method , the block least squares method , the block Lanczos method , and the block IDR() method  have been proposed recently. In general, the block solvers are more suitable for dense systems with precondition.
The second class is the seed method. The main idea of this kind of methods is briefed below. We first select a single system (seed system) and develop the corresponding Krylov subspace. Then we project all the residuals of the other systems onto the same Krylov subspace to find new approximations as initial guess; see [2, 12, 13] for details.
The last class is the global method. To our knowledge, the term global is at least due to Saad [14, Chapter 10] and has been further populated by Jbilou et al.  with the global FOM and global GMRES methods for matrix equations. Following the work , many other global methods have been developed, including, to name just a few of them, the global BiCG and global BiCGSTAB methods [16, 17], the global Hessenberg and global CMRH (changing minimal residual method based on the Hessenberg process) methods  and their weighted variants , the skew-symmetric methods , and the global SCD method . Generally, the global methods are more appropriate for large and sparse systems.
It is well known that the performance of the above Krylov subspace methods can be reinforced with a suitable preconditioner  or through effective matrix splitting techniques [22, 23]. In this paper, we are interested in preconditioning the global methods. Specifically, we aim at improving the convergence behavior of the restarted global CMRH method (Gl-CMRH()) , which is originally proposed to reduce the increasing storage requirement in its full version. However, because of the use of a small subspace (say ), Gl-CMRH() is likely to slow down or even stalls out. Heyouni and Essai give a weighted version of Gl-CMRH() (WGl-CMRH())  to alleviate such disadvantage. Instead, we propose a different approach, that is, by polynomial preconditioner to improve Gl-CMRH() in this paper.
The remainder of this work is organized as follows. In Section 2, we first recall some notations and properties of the global method, and then we sketch the Gl-CMRH() method. In Section 3, we construct the polynomial preconditioner tailored to Gl-CMRH() by exploiting the relation between the Krylov matrix and the global basis. For square right-hand side matrices, we also give a theoretical result that justifies the use of the proposed polynomial preconditioner. In Section 4, several numerical examples are employed to substantiate the effectiveness of the proposed method. Some concluding remarks and potential future work are briefed in the last section.
2. Notations and the Global CMRH Method
In this section, we first give some notations and properties used in the global methods, which will henceforth be adopted extensively in deriving the main results. Then we present a brief introduction of the Gl-CMRH() method . More details about the global methods can be found, for instance, in [15, 18, 19].
2.1. Notations and Properties
Throughout this paper, the following notations will be used. The norms and represent the vector 2-norm and matrix -norm, respectively. Let be the set of rectangular matrices. If , then stands for its transpose. For a square matrix , indicates the inverse of if existed. Unless otherwise stated, subscripts denote the corresponding iteration step; for example, denotes the th iterate of the matrix (vector) . Moreover, the entries of matrices and are denoted by and , respectively. If a column or a row of a matrix is invoked, then we denote it in a dot format; that is, and mean correspondingly the th column and the th row of . Besides, extracts the submatrix from to rows and from to columns of .
Next we present some notations and basic properties used in the global methods . Given the block matrix , where , , then we define the matrix product as where . For any matrix , we define analogously the product by
It can be verified that such matrix product satisfies the following properties: where .
2.2. The Global CMRH Method
The Gl-CMRH method  is an efficient extension of the CMRH method  for solving (1). It is based on the global Hessenberg process . As for the numerical performance, Gl-CMRH is in general competitive with the classic global GMRES method (Gl-GMRES) .
Now we give a brief sketch of Gl-CMRH. Let be the initial guess of (1) with the associated residual matrix . The th iteration is searched in the affine subspace ; that is, , where is the th correction matrix. The matrix Krylov subspace is defined as . Using the basis of given by the global Hessenberg process , we get where . The global Hessenberg process in  also yields where is an upper Hessenberg matrix, is obtained by deleting the last row of , and 0 is the zero matrix in . Thus it follows immediately that where and . To obtain the vector , a restriction is imposed on the Gl-CMRH method; that is,
As in the Gl-GMRES method , a restarting strategy is used to address the problem that the computational and storage requirements increase with iterations. Algorithm 1 gives a framework of the restarted version of Gl-CMRH (Gl-CMRH()). We refer to [18, 19] for elaborate explanation for the Gl-CMRH method.
3. A New Polynomial Preconditioned Gl-CMRH() Method
In many cases, the accuracy of Gl-CMRH() is sufficient. Due to the limited dimension of the matrix Krylov subspace , however, Gl-CMRH() may suffer from slow convergence or even stagnation in practice, just like in GMRES()  and Gl-GMRES() . To remedy this drawback, some accelerating techniques are demanded, for instance, a weighting strategy exploited in . Besides, a polynomial preconditioner can also be adapted to improve the convergence . In this section we focus on constructing an efficient polynomial preconditioner pertinent to the Gl-CMRH() method.
The essence of the polynomial preconditioned method is to devise a polynomial such that an easier system is solved instead of solving the original system (1). In what follows, we obtain a polynomial preconditioner by extracting some useful information from Gl-CMRH().
Now suppose that the block Krylov matrix is of the form where is the initial residual matrix. By comparing the last columns of the second equation in (6), we have
The equality (14) can be rearranged as
Let us consider the relationship between and the basis . Since and span the same space, it follows that where is an upper triangular matrix. The relation (16), however, does not shed too much light because how to compute still remains unclear. Fortunately, an explicit recurrence for can be derived in terms of and . By combining (16) and (4), we get
Since , then
Therefore, in (16) can be updated recursively by (20). Recall that in (5) is updated on the basis . Here we will show another way to update which is based on the block Krylov matrix . It follows from (5), (16), and (4) that where and is solved from (10). Denote by a polynomial in of degree ; that is, . Hence (21) can be recast as
Remark 2. In (23), the term becomes smaller with growing and hence the upper bound diminishes correspondingly, which in turn implies that approximates asymptotically. This justifies the use of the polynomial preconditioner. In general, (23) assures that the number of restarts will be reduced correspondingly with increasing . Yet this does not necessarily mean that the CPU time will be reduced simultaneously since the time saved from the reduction of restarts may be offset by the extra time spent in constructing the polynomial. In practice, we are often more concerned with the CPU time than the restarting number. Therefore, we restrict ourselves to small values of . For , an inequality similar to (23) is generally unavailable. Nevertheless, numerical examples seem to demonstrate that the asymptotical property of (23) is also shared by the case ; see Example 3 in Section 4 for more discussions.
By putting all together, we propose the new polynomial preconditioned global CMRH method (PGl-CMRH()) that is shown in Algorithm 2.
4. Numerical Examples
In this section, we present some numerical experiments which are coded with MATLAB 7.8.0. For fair comparisons, some other global solvers mentioned earlier like Gl-CMRH() , WGl-CMRH() , and Gl-GMRES()  have also been implemented. From now on we drop the parameters and in brackets without ambiguity. In all examples, we assume that . The terminating criterion for the th iteration is . Though other alternatives are possible, we use as the the weighting matrix for WGl-CMRH, which is also preferred in . The coefficient matrices in the first two examples are derived from the discretizations of the Poisson’s equation and convection-diffusion problems which occur frequently in applied science and engineering. The coefficient matrices in the third example are quoted from the Matrix Market .
Example 1. We consider the linear systems of (1) in which its coefficient matrix is obtained from the discretization of on the unit square with on the boundary. It can be discretized through the centered difference scheme at the gird points with , where the mesh size for . This yields a block tridiagonal matrix of size . The right-hand side matrix is chosen with entries uniformly distributed on ; see [14, Chapter 2] for more details about (24). Related parameters are given by , and . The number of restarts and CPU time for matrices of different sizes are given in Table 1. As observed from Table 1, PGl-CMRH improves the original CMRH method by time ratios from 17.8% to 55.4%. Compared with WGl-CMRH, PGl-CMRH requires less number of restarts and CPU time to achieve the required accuracy. Note that WGl-CMRH does not speed up the convergence of Gl-CMRH. This indicates that a different weighting matrix should be used. To find the optimal weighting matrix, however, remains an open problem .
Example 2. Consider the linear systems of (1) where its coefficient matrix is obtained from the discretization of the three-dimensional convection-diffusion problem on the unit cube . Here is a constant coefficient and (25) subjects to Dirichlet-type boundary conditions. This equation can be discretized by applying seven-point finite difference discretizations. For instance, we use the centered difference to the diffusive terms and the first-order upwind approximations to the convective terms. This approach yields a coefficient matrix of size , where the equidistant mesh size is used, and the natural lexicographic ordering is adopted to the unknowns; we refer to [22, Section 4] for more details. The right-hand side matrix is chosen with entries uniformly distributed on . Here, , , and . The number of restarts and CPU time for and is given in Table 2. For this large problem, as expected, PGl-CMRH performs better than CMRH and other variants concerning CPU time.
Example 3. In practice, the degree of the polynomial preconditioner has a great impact on the numerical performance of PGl-CMRH. Thus it deserves our attention to investigate how to choose the “optimal” degree (if existed) for generic matrices. Nevertheless, theoretical analysis to this end can be very hard. Instead, we show empirically how to choose a range of degrees for the polynomial such that PGl-CMRH at least yields a modest performance. To this end, we use ten unsymmetrical testing matrices from  and illustrate how PGl-CMRH performs for each matrix with varying from 2 to 15; see Figure 1. Some properties of these testing matrices are listed in Table 3. The right-hand side matrix is chosen with entries uniformly distributed on . Since we are only concerned with the value of that makes PGl-CMRH performs stably with the shortest CPU time, we have normalized values of CPU time by dividing the maximum value of CPU time for a certain curve. Take the matrix pde2961 for example. The longest time is 4.2 seconds (with ); then we divide all values of CPU time by 4.2 for pde2961 and plot the result in Figure 1. This approach facilitates our comparison since different curves become more clustered now. Some remarks can be made from Figure 1. First, the curves seem rather problem-dependent and are not necessarily nonincreasing with increasing values of for instance, the curve of rdb2048l is rather irregular and hence unpredictable. However, this does not contradict Theorem 1 where it is stated that can approximate better with growing values of In other words, Theorem 1 explains theoretically that the total number of restarts will be reduced with increasing values of However, this does not apply to the change of CPU time. In fact, it is likely that PGl-CMRH with high degree preconditioner takes more CPU time in generating the polynomial preconditioner (even with less number of restarts) and hence uses more time to converge than that of its low degree counterpart. Second, most curves locate the corresponding shortest CPU time point with between and . This can be the first reason for favoring small values of Finally, more rounding errors can be introduced in developing high-degree polynomial preconditioners from the numerical point of view. This is the second reason for the approval of low-degree polynomial preconditioners. Therefore it is useful to test with from to 10. Under extreme situations, however, higher degree may be demanded if a low-degree preconditioner fails to bring the required accuracy.
To remedy the slow convergence of the original Gl-CMRH() method, a new variant of Gl-CMRH() for linear systems with multiple right-hand sides is developed. The proposed method often yields better performance than its predecessor Gl-CMRH() and other global variants in terms of CPU time. We show experimentally that polynomial preconditioners with degree lower than 10 should be considered if no prior knowledge is known.
The authors would like to thank Professor Jinyun Yuan and the referees for their valuable remarks that improved this paper. The work is supported by the National Natural Science Foundation (11371243), the Key Disciplines of Shanghai Municipality (S30104), the Innovation Program of Shanghai Municipal Education Commission (13ZZ068), and the Anhui Provincial Natural Science Foundation (1308085QF117).
B. Vital, Etude de quelques méthodes de résolution de problµemes linéaires de grande taille sur multiprocesseurs [Ph.D. thesis], Universit de Rennes, Rennes, France, 1990.
K. Jbilou and H. Sadok, “Global Lanczos-based methods with applications,” Tech. Rep. LMA 42, Université du Littoral, Calais, France, 1997.View at: Google Scholar
Matrix Market, http://math.nist.gov/MatrixMarket/.