Abstract

A preconditioned gradient-based iterative method is derived by judicious selection of two auxiliary matrices. The strategy is based on Newton's iteration method and can be regarded as a generalization of the splitting iterative method for systems of linear equations. We analyze the convergence of the method and illustrate that the approach is able to considerably accelerate the convergence of the gradient-based iterative method.

1. Introduction

In this paper, we consider preconditioned iterative methods for solving Sylvester equations of the form

$AX + XB = C$, (1.1)

where $A \in \mathbb{R}^{m \times m}$, $B \in \mathbb{R}^{n \times n}$, and $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m \times n}$, with $m \neq n$ in general. A Lyapunov equation is a special case of (1.1) with $m = n$, $B = A^T$, and $C = C^T$. Such problems frequently arise in many areas of application, such as control and system theory [1], stability of linear systems [2], analysis of bilinear systems [3], power systems [4], signal and image processing [5], and so forth.

Throughout the paper, we assume that the Sylvester equation (1.1) possesses a unique solution, that is,

$\lambda(A) \cap \lambda(-B) = \emptyset$, (1.2)

where $\lambda(A)$ and $\lambda(-B)$ denote the spectra of $A$ and $-B$, respectively [6]. In theory, the exact solution of (1.1) can be computed by "linearization," that is, by solving an equivalent system of linear equations of the form

$\mathcal{A}\,\mathrm{vec}(X) = \mathrm{vec}(C)$, (1.3)

where $\mathcal{A} = I_n \otimes A + B^T \otimes I_m \in \mathbb{R}^{mn \times mn}$, $\mathrm{vec}(X) = [x_1^T, \ldots, x_n^T]^T$ with $X = [x_1, \ldots, x_n] \in \mathbb{R}^{m \times n}$, and $\otimes$ is the Kronecker product. However, this direct method requires considerable computational effort due to the high dimension of the resulting system.
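For illustration, the linearization (1.3) can be formed and solved directly in Matlab for a small problem. The following is a minimal sketch with hypothetical sizes and random test data; it also shows why the approach is impractical for large problems, since the Kronecker system has order $mn$.

% Solving AX + XB = C through the linearization (1.3); a minimal sketch
% with small, hypothetical test data.
m = 4; n = 3;                                  % assumed (small) dimensions
A = rand(m) + m*eye(m); B = rand(n) + n*eye(n);% hypothetical test matrices
Xstar = rand(m, n); C = A*Xstar + Xstar*B;     % consistent right-hand side
Acal = kron(eye(n), A) + kron(B.', eye(m));    % I_n (x) A + B^T (x) I_m
x = Acal \ C(:);                               % C(:) is vec(C)
X = reshape(x, m, n);                          % recover X from vec(X)
disp(norm(X - Xstar, 'fro'))                   % close to machine precision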

For small-to-medium-scale problems of the form (1.1), direct approaches such as the Bartels-Stewart method [7, 8] and the Hessenberg-Schur method [6, 9] have been the methods of choice. The main idea of these approaches is to transform the original linear system into a structured system that can be solved efficiently by forward or backward substitution.

In the numerical linear algebra community, iterative methods are becoming more and more popular. Several iterative schemes for Sylvester equations have been proposed; see, for example, [10–15]. Recently, some gradient based iterative methods [3–5, 16–26] have been investigated for solving general coupled matrix equations and general matrix equations. For Sylvester equations of the form (1.1), the gradient based iterative methods use a hierarchical identification principle to compute the approximate solution. The convergence of these methods is investigated in [16–18], where it is proved that they converge under certain conditions. However, we observe that the convergence of the gradient based iterative methods is generally very slow, similar to the behavior of classical iterative methods applied to systems of linear equations. In this paper, we consider preconditioning schemes for solving Sylvester equations of the form (1.1). We illustrate that preconditioned gradient based iterative methods can be derived by judicious selection of two auxiliary matrices. The selection of preconditioners is natural from the viewpoint of splitting iteration methods for systems of linear equations. The convergence of the preconditioned method is proved and the optimal relaxation parameter is derived. The performance of the method is compared with that of the original method in several examples. Numerical results show that preconditioning is able to considerably speed up the convergence of the gradient based iterative method.

The paper is organized as follows. In Section 2, the gradient based iterative method is recalled. In Section 3, the preconditioned gradient based method is introduced and analyzed. In Section 4, the performance of the preconditioned gradient based method is compared with that of the unpreconditioned one, and the influence of the iteration parameter is studied experimentally. Finally, we conclude the paper in Section 5.

2. A Brief Review of the Gradient Based Iterative Method

We first recall an iterative method proposed by Ding and Chen [18] for solving (1.1). The basic idea is to regard (1.1) as two linear matrix equations:

$AX = C - XB$, $\quad XB = C - AX$. (2.1)

Then two recursive sequences are defined:

$X_k^{(1)} = X_{k-1}^{(1)} + \kappa A^T \left( C - A X_{k-1}^{(1)} - X_{k-1}^{(1)} B \right)$, (2.2)

$X_k^{(2)} = X_{k-1}^{(2)} + \kappa \left( C - A X_{k-1}^{(2)} - X_{k-1}^{(2)} B \right) B^T$, (2.3)

where $\kappa$ is the iteration step size. The above procedures can be regarded as two separate iterative procedures for solving the two matrix equations in (2.1).

With $X_k^{(1)}$ and $X_k^{(2)}$ at hand, the $k$th approximate solution $X_k$ can be defined by taking the average of the two approximate solutions, that is,

$X_k = \dfrac{X_k^{(1)} + X_k^{(2)}}{2}$. (2.4)

By selecting an appropriate initial approximate solution $X_0$, and substituting $X_{k-1}$ for $X_{k-1}^{(1)}$ in (2.2) and for $X_{k-1}^{(2)}$ in (2.3), the procedures (2.2)-(2.4) constitute the gradient based iterative method proposed in [18]. It is shown in [18] that the gradient based iterative algorithm converges as long as

$0 < \kappa < \dfrac{2}{\lambda_{\max}(AA^T) + \lambda_{\max}(B^T B)}$, (2.5)

where $\lambda_{\max}(AA^T)$ denotes the largest eigenvalue of $AA^T$.
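As a minimal Matlab sketch, the iteration (2.2)-(2.4), with $X_{k-1}$ substituted into both updates as described above, may be written as follows. Here $A$, $B$, $C$ are assumed given, and the iteration budget and tolerance are illustrative choices.

% Gradient based iteration (2.2)-(2.4); a sketch under the stated
% substitution of X_{k-1} for X^{(1)}_{k-1} and X^{(2)}_{k-1}.
kappa = 1/(max(eig(A*A')) + max(eig(B'*B)));   % satisfies the bound (2.5)
X = zeros(size(C)); maxit = 1000;              % X_0 and iteration budget
for k = 1:maxit
    R  = C - A*X - X*B;                        % common residual
    X1 = X + kappa*A'*R;                       % update (2.2)
    X2 = X + kappa*R*B';                       % update (2.3)
    X  = (X1 + X2)/2;                          % averaging step (2.4)
    if norm(R, 'fro') <= 1e-6*norm(C, 'fro'), break, end
end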

3. Preconditioned Gradient Based Iterative Method

We start with a general algebraic equation [27]

$f(x) = 0, \quad x \in \mathbb{R}$. (3.1)

Suppose $x_n$ is an approximate solution of (3.1), that is, $f(x_n) \approx 0$ but $f(x_n) \neq 0$. Then the accuracy of $x_n$ can be improved by the following scheme:

$x_{n+1} = x_n + \lambda f(x_n)$ (3.2)

with $\lambda$ being a Lagrange multiplier. It is well known that the optimal value of $\lambda$ is determined by

$\dfrac{dx_{n+1}}{dx_n} = 0$. (3.3)

Differentiating (3.2) gives $1 + \lambda f'(x_n) = 0$, that is, $\lambda = -1/f'(x_n)$, from which Newton's iteration follows:

$x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)}$. (3.4)

Now, let us consider a general system of linear equations of the form

$Kx = b$, (3.5)

where $K \in \mathbb{R}^{n \times n}$ is a general square matrix. Let $f(x) = b - Kx$, and let $x_0$ be an initial approximate solution. Then by one step of Newton's iteration (3.4), we can theoretically find the exact solution:

$x_1 = x_0 + K^{-1}(b - Kx_0) = K^{-1} b$. (3.6)

However, this procedure is just as difficult as solving (3.5) by direct inversion.

By utilizing the idea of preconditioning, the iteration formula (3.6) can be rebuilt by replacing $K^{-1}$ with an approximation $M^{-1}$. Taking $x_0$ as a starting vector and $r_k = b - Kx_k$, it follows that

$x_{k+1} = x_k + M^{-1} r_k, \quad k = 0, 1, 2, \ldots$, (3.7)

which is the preconditioned iterative method. Formula (3.7) can also be derived from the splitting iteration [28] with

$K = M - N$. (3.8)

More generally, a relaxation parameter can be introduced as follows:

$x_{k+1} = x_k + \kappa M^{-1} r_k, \quad k = 0, 1, 2, \ldots$. (3.9)
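In Matlab, iteration (3.9) for a sparse system $Kx = b$ might look as follows. Taking an ILU factorization as $M$ is an illustrative assumption; any $M \approx K$ whose linear systems are cheap to solve will do.

% Relaxed preconditioned iteration (3.9); a sketch assuming K is sparse
% and kappa, maxit are given.
setup = struct('type', 'nofill');              % ILU(0), no fill-in
[L, U] = ilu(K, setup);                        % M = L*U approximates K
x = zeros(size(b));
for k = 1:maxit
    r = b - K*x;                               % residual r_k
    x = x + kappa*(U\(L\r));                   % x_{k+1} = x_k + kappa*M^{-1}*r_k
    if norm(r) <= 1e-6*norm(b), break, end
end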

Recall that the Sylvester equation (1.1) can be rewritten as the following two fictitious matrix equations:

$AX = E_1$, $\quad XB = E_2$, (3.10)

where $E_1 = C - XB$ and $E_2 = C - AX$.

Following (3.9), we define the preconditioned iterations

$X_k^{(1)} = X_{k-1}^{(1)} + \kappa M_1^{-1}\left( C - A X_{k-1}^{(1)} - X_{k-1}^{(1)} B \right)$, (3.11)

$X_k^{(2)} = X_{k-1}^{(2)} + \kappa \left( C - A X_{k-1}^{(2)} - X_{k-1}^{(2)} B \right) M_2^{-1}$, (3.12)

where $M_1$ and $M_2$ are well-chosen preconditioners for $A$ and $B$, respectively. The $k$th approximate solution can be defined by taking the average of the two approximate solutions [16, 18], that is,

$X_k = \dfrac{X_k^{(1)} + X_k^{(2)}}{2}$. (3.13)

By selecting two initial approximate solutions, the procedures (3.11)-(3.13) constitute a framework in the spirit of Newton's iteration method.

The above process can be accomplished by the following algorithm.

Algorithm 3.1. Preconditioned gradient based iterative algorithm (PGBI) can be given as follows.
(1) Choose two initial approximate solutions $X_0^{(1)}$ and $X_0^{(2)}$.
(2) for $k = 1, 2, \ldots$, until convergence
(3) $X_{k-1} = \left( X_{k-1}^{(1)} + X_{k-1}^{(2)} \right)/2$
(4) $X_k^{(1)} = X_{k-1} + \kappa M_1^{-1}\left( C - A X_{k-1} - X_{k-1} B \right)$
(5) $X_k^{(2)} = X_{k-1} + \kappa \left( C - A X_{k-1} - X_{k-1} B \right) M_2^{-1}$
(6) end
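A Matlab sketch of Algorithm 3.1 is given below. Following Section 4, $M_1$ and $M_2$ are taken as ILU(0) factorizations of $A$ and $B$; this is an illustrative choice, with $A$ and $B$ assumed sparse and kappa, maxit assumed given.

% Algorithm 3.1 (PGBI); a sketch with ILU(0) preconditioners.
setup = struct('type', 'nofill');
[L1, U1] = ilu(A, setup);                      % M1 = L1*U1, preconditioner for A
[L2, U2] = ilu(B, setup);                      % M2 = L2*U2, preconditioner for B
X1 = zeros(size(C)); X2 = X1;                  % step (1): initial approximations
for k = 1:maxit                                % step (2)
    X = (X1 + X2)/2;                           % step (3): averaging
    R = C - A*X - X*B;                         % shared residual
    X1 = X + kappa*(U1\(L1\R));                % step (4): X + kappa*M1^{-1}*R
    X2 = X + kappa*((R/U2)/L2);                % step (5): X + kappa*R*M2^{-1}
    if norm(R, 'fro') <= 1e-6*norm(C, 'fro'), break, end
end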

Remark 3.2. (1) The above algorithm follows the framework of the gradient based iterative methods proposed in [16–18]. However, in [16, 18] the matrix $M_1^{-1}$ is chosen as $A^T$ ($M_2^{-1} = B^T$), and in [17] the matrix $M_1^{-1}$ is chosen as $A^T A$ ($M_2^{-1} = B^T B$). Such selections of the matrices are clearly not ideal. In this paper, we have illustrated that $M_1$ and $M_2$ should be reasonable preconditioners for $A$ and $B$, respectively.
(2) Replacing $C - A X_{k-1} - X_{k-1} B$ by $C - A X_{k-1} B - X_{k-1}$, the above algorithm can be used to solve generalized Sylvester equations of the form

$AXB + X = C$. (3.14)

We will present an example of the form (3.14) in the last section of the paper.
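In the PGBI sketch after Algorithm 3.1, this modification amounts to changing only the residual line (a one-line sketch, with the same assumed variables):

R = C - A*X*B - X;                             % residual of AXB + X = C in (3.14)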

Lemma 3.3 (see [29]). Let $A \in \mathbb{R}^{n \times n}$. Then $A$ is convergent, that is, $A^k \to 0$ as $k \to \infty$, if and only if $\rho(A) < 1$.

Based on Theorem 3.4 in [18], we have the following theorem.

Theorem 3.4. Suppose that the Sylvester equation (1.1) has a unique solution $X$ and that

$\max_i \left| 1 - \dfrac{\kappa \lambda_i}{2} \right| < 1$, (3.15)

where the $\lambda_i$ are the eigenvalues of the matrix $\Phi$ defined in (3.20) below. Then the iterative sequence $X_k$ generated by Algorithm 3.1 converges to $X$; that is, the error $X_k - X$ converges to zero for any initial values $X_0^{(1)}$ and $X_0^{(2)}$.

Proof. Since $X_k = \frac{1}{2}\left( X_k^{(1)} + X_k^{(2)} \right)$, it follows that

$X_k = X_{k-1} + \dfrac{\kappa}{2} M_1^{-1}\left( C - A X_{k-1} - X_{k-1} B \right) + \dfrac{\kappa}{2}\left( C - A X_{k-1} - X_{k-1} B \right) M_2^{-1}$. (3.16)
Letting $E_k = X - X_k$ and substituting $C = AX + XB$ into the above formula, we have

$X_k = X_{k-1} + \dfrac{\kappa}{2} M_1^{-1}\left( A E_{k-1} + E_{k-1} B \right) + \dfrac{\kappa}{2}\left( A E_{k-1} + E_{k-1} B \right) M_2^{-1}$. (3.17)
Subtracting both sides of the above equation from $X$, we have

$E_k = E_{k-1} - \dfrac{\kappa}{2} M_1^{-1}\left( A E_{k-1} + E_{k-1} B \right) - \dfrac{\kappa}{2}\left( A E_{k-1} + E_{k-1} B \right) M_2^{-1}$. (3.18)
Let $\mathrm{vec}(X)$ denote the vector formed by stacking the columns of $X$ on top of one another, that is,

$\mathrm{vec}(X) = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$. (3.19)
Then (3.18) is equivalent to

$\mathrm{vec}(E_k) = \left( I - \dfrac{\kappa}{2} \Phi \right) \mathrm{vec}(E_{k-1})$, (3.20)

where $\Phi = I_n \otimes (M_1^{-1} A) + B^T \otimes M_1^{-1} + M_2^{-T} \otimes A + (M_2^{-T} B^T) \otimes I_m$. Therefore, the iteration is convergent if and only if $\rho\left( I - (\kappa/2)\Phi \right) < 1$, that is,

$\max_i \left| 1 - \dfrac{\kappa \lambda_i}{2} \right| < 1$. (3.21)

The proof is complete.

Remark 3.5. The choice of the parameter $\kappa$ is an important issue. We will experimentally study its influence on the convergence. However, the parameter is problem dependent, so seeking a parameter that is suitable for a broad range of problems is a difficult task.

4. Numerical Examples

In the following tests, the parameter $\kappa$ is set to

$\kappa = \dfrac{1}{\lambda_{\max}(AA^T) + \lambda_{\max}(BB^T)}$ (4.1)

in the gradient based iterative method (GBI), while $\kappa$ is chosen experimentally in the preconditioned gradient based iterative method (PGBI). In all the tests, the stopping tolerance is set to $10^{-6}$. The exact solution is set as X = rand(m,n) + speye(m,n)*2, the right-hand side is computed as $C = AX + XB$, and the initial approximate solution is set to X0 = ones(m,n)*1e-6. The relative error norms are recorded and plotted by the Matlab command semilogy [30]. We always choose ILU(0) [31] as the preconditioner; however, other suitable preconditioners can also be adopted.
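For concreteness, the common test setup described above corresponds to the following Matlab fragment (with m, n, A, B assumed given):

% Common setup for the numerical tests (a sketch).
Xstar = rand(m, n) + speye(m, n)*2;            % prescribed exact solution
C = A*Xstar + Xstar*B;                         % consistent right-hand side
X0 = ones(m, n)*1e-6;                          % initial approximate solution
kappa = 1/(max(eig(full(A*A'))) + max(eig(full(B*B'))));  % GBI choice (4.1)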

Example 4.1. The coefficient matrices used in this example are taken from [18]. We outline their construction for completeness:

m = 50; n = 30; rand('state', 0);
A = triu(rand(m,m), 1) + diag(alpha + diag(rand(m)));
B = triu(rand(n,n), 1) + diag(alpha + diag(rand(n))); (4.2)

Here the parameter $\alpha$ can be used to change the weight of the diagonal entries; we consider the case $\alpha = 3$. For $\kappa = 0.1$, the behavior of the error norms is recorded in Figure 1(a). From this figure, we can see that the convergence of the GBI method tends to slow down after some iteration steps, whereas the PGBI method converges linearly and much faster. In Figure 1(b), the convergence curves for different values of $\kappa$ are recorded. For this problem, the optimal value of $\kappa$ is very close to 0.5. Comparing with the convergence of the GBI method, PGBI is able to converge within 300 iteration steps, whereas GBI needs more than 1000 iteration steps. Therefore, even without the optimal $\kappa$, the convergence of the PGBI method is still much faster than that of the GBI method.

Example 4.2. In this example, we test a problem with $B = A$, where the coefficient matrix $A$ is generated from the discretization of the following two-dimensional Poisson problem:

$-\Delta u = f$ (4.3)

posed on the unit square $\Omega = \{0 \le x, y \le 1\}$ with Dirichlet boundary condition

$u(x, y) = 0$. (4.4)

Discretizing this problem by the standard second-order finite difference (FD) scheme on a uniform grid with step size $h = 1/(m+1)$ in each direction, we obtain the coefficient matrix

$A = I_m \otimes D + \kappa_2 (S \otimes I_m)$ (4.5)

with

$D = \mathrm{tridiag}(-\kappa_1, d, -\kappa_1), \quad S = \mathrm{tridiag}(-1, 0, -1)$, (4.6)

and $d = 2(\kappa_1 + \kappa_2)$, $\kappa_1 = \kappa_2 = 1$. Choosing $m = 30$, we compare the performance of the GBI method and the preconditioned GBI method. The convergence curves are recorded in Figure 2(a). In Figure 2(b), the influence of the parameter $\kappa$ is investigated. From this figure, we can see that the optimal $\kappa$ is close to 0.4. Comparing with the convergence of the GBI method, it is easy to see that for a wide range of $\kappa$ the preconditioned GBI method performs much better than the GBI method.
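Under the sign convention reconstructed above, the matrix (4.5)-(4.6) can be assembled in Matlab as follows; the minus signs are assumptions based on the standard five-point stencil.

% Discrete 2D Poisson matrix (4.5)-(4.6); a sketch with k1 = k2 = 1.
m = 30; k1 = 1; k2 = 1; d = 2*(k1 + k2);
e = ones(m, 1);
D = spdiags([-k1*e, d*e, -k1*e], -1:1, m, m);  % tridiag(-k1, d, -k1)
S = spdiags([-e, 0*e, -e], -1:1, m, m);        % tridiag(-1, 0, -1)
A = kron(speye(m), D) + k2*kron(S, speye(m));  % A = I (x) D + k2 (S (x) I)
B = A;                                         % Example 4.2 uses B = A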

Example 4.3. In this example, we consider the convection diffusion equation with Dirichlet boundary conditions [32]:

$-\Delta u + 2\nu u_x + 2\nu u_y = f$ in $[0, 1]^2$, $\quad u = g$ on $\partial [0, 1]^2$. (4.7)

Discretizing the equation by central finite differences on $[0, 1]^2$, with mesh size $h = 1/(m+1)$ in the $x$-direction and $p = 1/(n+1)$ in the $y$-direction, produces two tridiagonal matrices $A$ and $B$ given by

$A = \dfrac{1}{h^2}\,\mathrm{tridiag}\{-1 + \nu h, 2, -1 - \nu h\}, \quad B = \dfrac{1}{p^2}\,\mathrm{tridiag}\{-1 + \nu p, 2, -1 - \nu p\}$. (4.8)

Taking $\nu = 3$, $m = 60$, and $n = 40$, we test the performance of the GBI method and the preconditioned GBI method. The convergence curves are recorded in Figure 3(a). From this figure, we can see that the GBI method converges very slowly and nearly stagnates, while the preconditioned GBI method performs much better. We also investigate the influence of the parameter $\kappa$ on this problem. The convergence behavior for different $\kappa$ is recorded in Figure 3(b). From this figure, we can see that the iteration becomes very sensitive to the parameter when it is larger than the optimal value. Therefore, a relatively small parameter is more reliable.
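The tridiagonal matrices (4.8) can be generated as follows; the signs follow the reconstruction above and should be treated as an assumption.

% Convection-diffusion matrices (4.8); a sketch with nu = 3.
nu = 3; m = 60; n = 40;
h = 1/(m + 1); p = 1/(n + 1);
e = ones(m, 1); f = ones(n, 1);
A = spdiags([(-1+nu*h)*e, 2*e, (-1-nu*h)*e], -1:1, m, m)/h^2;
B = spdiags([(-1+nu*p)*f, 2*f, (-1-nu*p)*f], -1:1, n, n)/p^2;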

Example 4.4. In this example, we test the algorithm for solving the generalized Sylvester equation (3.14). All the initializations are the same except that $C = AXB + X$. The coefficient matrix $A$ has the following structure:

$A = \mathrm{tridiag}(1 - d, 4, 1 + d)$ (4.9)

and $B = A$. We set $d = 3$ in this example. The test results are shown in Figure 4. The faster convergence behavior of the preconditioned method is observable from Figure 4(a).

5. Conclusions

In this paper, a preconditioned gradient based iteration (PGBI) method is proposed. The convergence of PGBI is analyzed. The choice of the parameter $\kappa$ is an important issue, and its influence is experimentally studied. The principal idea of this paper can be extended to more general settings, such as coupled Sylvester matrix equations.

Acknowledgment

This work is supported by NSF no. 11101204, and XJTLU RDF.