#### Abstract

The computational cost of solving large-scale Sylvester equations is a frequent obstacle in many complex control problems. In this work, a parallel preconditioned algorithm is proposed that combines a parameter iterative preconditioning method with a modified conjugate gradient (MCG) method. Furthermore, Schur's inequality and the modified conjugate gradient method are employed to overcome the main difficulties involved, namely the determination of the parameter and the computation of matrix inverses. Finally, several numerical results show that the proposed parallel algorithm performs well both in convergence rate and in parallel efficiency.

#### 1. Introduction

Sylvester equations play vital roles in a number of applications such as control theory [1], matrix eigendecompositions [2], model reduction [3–5], and image processing [6]. In particular, when parabolic and elliptic PDEs are solved by discretization, a large-scale Sylvester matrix equation often has to be solved. Two types of methods are currently used for such matrix equations. The first is the standard direct method proposed by Bartels and Stewart [7]; however, it is not applicable in large-scale settings. To overcome this limitation, iterative methods have been studied, such as the simple iterative schemes proposed by Ding and Chen [8]. The ADI method was first introduced by Peaceman and Rachford [9] for solving parabolic and elliptic PDEs and was later adapted to the Sylvester equation by Wachspress [10]. Although the ADI method requires optimal parameters, these are too difficult to find for large-scale problems. The parameter iterative method was therefore proposed in [11], together with a practical choice of the parameter for large-scale problems. Recently, effective methods based on Krylov subspaces [12–14] have been discussed. Among them, the conjugate gradient (CG) method [15] is fundamental; it requires little storage and computation and is well suited to parallelization, but it applies only when the coefficient matrix is symmetric positive definite. In other cases the modified conjugate gradient (MCG) method [16, 17] can be chosen. The convergence speed of MCG is closely related to the singular values of the coefficient matrix: the more concentrated the singular values are, the faster MCG converges. To enhance the convergence speed, the idea of preconditioning [18] is introduced, which concentrates the singular values of the coefficient matrix.
All of this, together with the prohibitive computational time, calls for an efficient parallel algorithm, which motivates us to propose a parallel preconditioned algorithm for solving large Sylvester matrix equations.

The remainder of the paper is organized as follows. Section 2 introduces the MCG algorithm for solving the large matrix equation $AX + XB = C$; our proposed preconditioned MCG algorithm is then presented in Section 3, followed by the implementation of the parallel algorithm in Section 4. The convergence analysis is given in Section 5. Finally, some numerical examples are discussed, with results analysis and comparison, followed by concluding comments.

Throughout this paper we use the following notations. Let $A$ and $B$ stand for two matrices; $A^T$, $\rho(A)$, and $I$ represent the transpose of the matrix $A$, the spectral radius of $A$, and the identity matrix, respectively. At the same time, we define the inner product of two matrices $A, B \in \mathbb{R}^{m \times n}$ as $\langle A, B \rangle = \operatorname{tr}(B^T A)$; then the matrix norm of $A$ induced by this inner product is the Frobenius norm, denoted by $\|A\|$; that is, $\|A\| = \sqrt{\langle A, A \rangle}$.
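As a quick numerical sanity check of this notation, the following sketch (numpy assumed; variable names are ours) verifies that the trace inner product agrees with the elementwise sum of products and that the norm it induces is exactly the Frobenius norm:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((4, 3))

# inner product <A, B> = tr(B^T A), i.e. the elementwise sum of products
inner = np.trace(B.T @ A)
assert np.isclose(inner, np.sum(A * B))

# the norm induced by this inner product is the Frobenius norm
norm_from_inner = np.sqrt(np.trace(A.T @ A))
assert np.isclose(norm_from_inner, np.linalg.norm(A, 'fro'))
```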

#### 2. Introduction of the MCG Algorithm for Solving $AX + XB = C$

A Sylvester matrix equation takes the form
$$AX + XB = C, \qquad (1)$$
where $A \in \mathbb{R}^{m \times m}$, $B \in \mathbb{R}^{n \times n}$, and $C \in \mathbb{R}^{m \times n}$ are given matrices and $X \in \mathbb{R}^{m \times n}$ is the unknown matrix. The MCG algorithm [15, 19, 20] can be expressed as follows.

*Algorithm 1. * Choose an initial matrix $X_0$; let $k = 0$, and calculate
$$R_0 = C - AX_0 - X_0B, \qquad P_0 = A^T R_0 + R_0 B^T.$$

If $\|R_k\| = 0$, or $\|R_k\| \neq 0$ but $\|P_k\| = 0$, stop; otherwise, calculate
$$X_{k+1} = X_k + \frac{\|R_k\|^2}{\|P_k\|^2}\, P_k, \qquad R_{k+1} = R_k - \frac{\|R_k\|^2}{\|P_k\|^2}\,(A P_k + P_k B).$$

Then perform
$$P_{k+1} = A^T R_{k+1} + R_{k+1} B^T + \frac{\|R_{k+1}\|^2}{\|R_k\|^2}\, P_k,$$
let $k := k + 1$, and return to the stopping test.
Furthermore, the convergence condition of the MCG algorithm, as given in [15, 19, 20], is stated as Theorem 2.

Theorem 2. *Assume that (1) is consistent; then, for any arbitrary initial matrix $X_0$, a solution of (1) can be obtained within finitely many iteration steps in the absence of round-off error.*

It is important to accelerate the convergence. From the matrix equation (1), we know that the speed of convergence depends on how concentrated the singular values of the coefficient matrix are. In general, it is difficult to calculate these singular values directly. In order to enhance the speed of convergence easily, we propose a preconditioned MCG algorithm, described in the following section.
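The MCG iteration of this section can be sketched in code. The sketch below implements a CG-type iteration on the normal equations of the Sylvester operator, a standard variant in the same family as Algorithm 1 (the exact step sizes used in [15, 19, 20] may differ); `mcg_sylvester`, `L`, and `Lt` are our own illustrative names:

```python
import numpy as np

def mcg_sylvester(A, B, C, tol=1e-10, maxit=10000):
    """Solve A X + X B = C by a CG-type iteration on the normal
    equations of the Sylvester operator (one standard MCG-family
    variant; the exact recurrences of [15, 19, 20] may differ)."""
    L  = lambda X: A @ X + X @ B        # the Sylvester operator
    Lt = lambda R: A.T @ R + R @ B.T    # its adjoint w.r.t. <.,.>
    X = np.zeros((A.shape[0], B.shape[1]))
    R = C - L(X)                        # residual
    S = Lt(R)
    P = S.copy()                        # search direction
    gamma = np.sum(S * S)
    for _ in range(maxit):
        if np.linalg.norm(R, 'fro') < tol:
            break
        Q = L(P)
        t = gamma / np.sum(Q * Q)       # step size
        X += t * P
        R -= t * Q
        S = Lt(R)
        gamma_new = np.sum(S * S)
        P = S + (gamma_new / gamma) * P
        gamma = gamma_new
    return X

# small consistency check on a well-conditioned diagonal problem
rng = np.random.default_rng(1)
A = np.diag([2.0, 3.0, 5.0])
B = np.diag([1.0, 4.0])
X_true = rng.standard_normal((3, 2))
C = A @ X_true + X_true @ B
X = mcg_sylvester(A, B, C)
assert np.allclose(X, X_true, atol=1e-6)
```

Every operation in the loop is a matrix product, a matrix addition, or an inner product, which is what makes this family of iterations attractive for parallelization.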

#### 3. Proposed Preconditioned MCG Algorithm

In order to choose a preconditioner for MCG, we introduce the parameter iterative method in [11] for solving (1), which is formed as follows:
$$X_{k+1} = M(A - \alpha I)\,X_k\,(B - \alpha I)N + 2\alpha MCN,$$
where $M = (A + \alpha I)^{-1}$, $N = (B + \alpha I)^{-1}$, and $\alpha$ is a parameter. The speed of convergence depends on the value of $\alpha$, since the convergence factor is governed by $\left|\frac{\lambda - \alpha}{\lambda + \alpha}\right| < 1$ for the eigenvalues $\lambda$ of $A$, and likewise for those of $B$. It is quite simple to obtain the best convergence effect once an optimal value of $\alpha$ is determined. However, the two inverse matrices $(A + \alpha I)^{-1}$ and $(B + \alpha I)^{-1}$ must be found to get the matrices $M$ and $N$. Letting $M = (A + \alpha I)^{-1}$ and $N = (B + \alpha I)^{-1}$, we can find $M$ and $N$ by solving the matrix equations $(A + \alpha I)M = I$ and $(B + \alpha I)N = I$.

The next subsection gives the methods for determining the parameter $\alpha$ and for solving for the matrices $M$ and $N$, respectively.

##### 3.1. Techniques for Determining the Parameter $\alpha$ and Computing the Matrix Inverses

###### 3.1.1. Determination of the Parameter $\alpha$

If the negative real numbers $\lambda_1, \ldots, \lambda_m$ are the eigenvalues of $A$ and the negative real numbers $\mu_1, \ldots, \mu_n$ are the eigenvalues of $B$, then, based on the optimal parameter measurement in [11], the optimal parameter $\alpha$ is determined either by the extreme eigenvalues of $A$ or by those of $B$.

On the basis of *Schur's inequality* in [21], we determine an approximate value of the parameter.

Theorem 3 (Schur's inequality). *Let $A = (a_{ij}) \in \mathbb{R}^{m \times m}$, and let the eigenvalues of $A$ be $\lambda_1, \ldots, \lambda_m$; then it holds that*
$$\sum_{i=1}^{m} |\lambda_i|^2 \le \sum_{i=1}^{m} \sum_{j=1}^{m} |a_{ij}|^2 = \|A\|^2.$$

*Thus every eigenvalue satisfies $|\lambda_i| \le \|A\|$; we then choose either $\alpha = \|A\|$ or $\alpha = \|B\|$ (with the sign matched to that of the eigenvalues), and it is very easy for us to find $\alpha$.*
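Schur's inequality makes the Frobenius norm a cheap, computable bound on every eigenvalue, which is what makes it usable for choosing the parameter without any eigenvalue computation. A minimal numerical illustration (the specific rule $\alpha = \|A\|$ is one plausible choice, not necessarily the paper's exact formula):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))

eigs = np.linalg.eigvals(A)
fro = np.linalg.norm(A, 'fro')

# Schur's inequality: sum_i |lambda_i|^2 <= ||A||_F^2 ...
assert np.sum(np.abs(eigs) ** 2) <= fro ** 2 + 1e-12

# ... hence every eigenvalue is bounded in modulus by ||A||_F
assert np.max(np.abs(eigs)) <= fro

# a hypothetical parameter choice based on this computable bound
alpha = fro
assert alpha > 0
```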

###### 3.1.2. Algorithm for Solving the Matrix Equation $(A + \alpha I)M = I$

We use the MCG algorithm to solve the matrix equations $(A + \alpha I)M = I$ and $(B + \alpha I)N = I$. The specific algorithm is given as follows.

*Algorithm 4. * Given an initial matrix $M_0$, let $k = 0$ and calculate
$$R_0 = I - (A + \alpha I)M_0, \qquad P_0 = (A + \alpha I)^T R_0.$$

If $\|R_k\| = 0$, or $\|R_k\| \neq 0$ but $\|P_k\| = 0$, stop; otherwise, do the following:
$$M_{k+1} = M_k + \frac{\|R_k\|^2}{\|P_k\|^2}\, P_k, \qquad R_{k+1} = R_k - \frac{\|R_k\|^2}{\|P_k\|^2}\,(A + \alpha I)P_k.$$

Then perform
$$P_{k+1} = (A + \alpha I)^T R_{k+1} + \frac{\|R_{k+1}\|^2}{\|R_k\|^2}\, P_k.$$

Let $k := k + 1$ and turn back to the stopping test.

Through Algorithm 4, we can get the matrix $M$. In a similar manner, the inverse $N$ of $B + \alpha I$ can be obtained. Therefore, all the matrices required by the preconditioned equation can be obtained.
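Algorithm 4 amounts to running an MCG-type iteration on the linear matrix equation $(A + \alpha I)M = I$. A sketch (CG on the normal equations; the paper's exact recurrences may differ slightly, and all names here are ours):

```python
import numpy as np

def inverse_by_mcg(G, tol=1e-10, maxit=10000):
    """Solve G M = I for M (here G = A + alpha*I) by a CG-type
    iteration on the normal equations -- a sketch of what
    Algorithm 4 does."""
    n = G.shape[0]
    M = np.zeros((n, n))
    R = np.eye(n) - G @ M          # residual of G M = I
    S = G.T @ R                    # gradient direction
    P = S.copy()
    gamma = np.sum(S * S)
    for _ in range(maxit):
        if np.linalg.norm(R, 'fro') < tol:
            break
        Q = G @ P
        t = gamma / np.sum(Q * Q)  # step size
        M += t * P
        R -= t * Q
        S = G.T @ R
        gamma_new = np.sum(S * S)
        P = S + (gamma_new / gamma) * P
        gamma = gamma_new
    return M

A = np.array([[4.0, 1.0],
              [0.0, 3.0]])
alpha = 2.0
M = inverse_by_mcg(A + alpha * np.eye(2))
# M is the inverse of A + alpha*I
assert np.allclose(M @ (A + alpha * np.eye(2)), np.eye(2), atol=1e-6)
```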

##### 3.2. Algorithm for Solving the New Matrix Equation

We still use the MCG algorithm for solving the preconditioned matrix equation (5), which is obtained from the parameter iterative method introduced above; writing $M = (A + \alpha I)^{-1}$, $N = (B + \alpha I)^{-1}$, $E = M(A - \alpha I)$, $F = (B - \alpha I)N$, and $G = 2\alpha MCN$, it reads $X - EXF = G$. Thereby the preconditioned MCG is developed; it can be expressed as follows.

*Algorithm 5. * Given an initial matrix $X_0$, calculate
$$R_0 = G - X_0 + EX_0F, \qquad P_0 = R_0 - E^T R_0 F^T.$$

For $k = 0, 1, 2, \ldots$, calculate:
Step 1: if $\|R_k\| = 0$, or $\|R_k\| \neq 0$ but $\|P_k\| = 0$, stop; otherwise, turn to Step 2;
Step 2: $\delta_k = \|R_k\|^2 / \|P_k\|^2$;
Step 3: $X_{k+1} = X_k + \delta_k P_k$;
Step 4: $R_{k+1} = R_k - \delta_k (P_k - E P_k F)$, $\beta_k = \|R_{k+1}\|^2 / \|R_k\|^2$;
Step 5: $P_{k+1} = R_{k+1} - E^T R_{k+1} F^T + \beta_k P_k$;
Step 6: let $k := k + 1$ and turn back to Step 1.

From every step of the algorithms above, we can see that they involve only matrix multiplication, matrix addition, and inner products, which shows that the algorithm has very good parallelism. Furthermore, at each iteration, the amount of calculation of the preconditioned modified conjugate gradient method is the same as that of the modified conjugate gradient method; the only additional computational cost lies in obtaining the inverses of the two matrices. The numerical results will also confirm its advantage both in convergence rate and in parallel efficiency.
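To make the whole pipeline concrete, the sketch below assumes the preconditioned equation (5) is the Smith-type transform $X - EXF = G$ with $E = (A+\alpha I)^{-1}(A-\alpha I)$, $F = (B-\alpha I)(B+\alpha I)^{-1}$, and $G = 2\alpha(A+\alpha I)^{-1}C(B+\alpha I)^{-1}$ (our reconstruction, since the equation is not reproduced here); for brevity it forms $M$ and $N$ with a direct solve rather than Algorithm 4:

```python
import numpy as np

def pmcg(A, B, C, alpha, tol=1e-12, maxit=1000):
    """Preconditioned MCG sketch for A X + X B = C.

    Assumes the preconditioned equation is the Smith-type transform
    X - E X F = G (a reconstruction of equation (5)); M and N are
    formed with a direct solve here instead of Algorithm 4."""
    n, m = A.shape[0], B.shape[0]
    I_n, I_m = np.eye(n), np.eye(m)
    M = np.linalg.solve(A + alpha * I_n, I_n)   # (A + alpha I)^{-1}
    N = np.linalg.solve(B + alpha * I_m, I_m)   # (B + alpha I)^{-1}
    E = M @ (A - alpha * I_n)
    F = (B - alpha * I_m) @ N
    G = 2.0 * alpha * M @ C @ N
    L  = lambda X: X - E @ X @ F                # preconditioned operator
    Lt = lambda R: R - E.T @ R @ F.T            # its adjoint
    X = np.zeros((n, m))
    R = G - L(X)
    S = Lt(R)
    P = S.copy()
    gamma = np.sum(S * S)
    for _ in range(maxit):
        if np.linalg.norm(R, 'fro') < tol:
            break
        Q = L(P)
        t = gamma / np.sum(Q * Q)
        X += t * P
        R -= t * Q
        S = Lt(R)
        gamma_new = np.sum(S * S)
        P = S + (gamma_new / gamma) * P
        gamma = gamma_new
    return X

A = np.diag([1.0, 10.0])
B = np.diag([2.0, 8.0])
X_true = np.array([[1.0, -2.0],
                   [0.5, 3.0]])
C = A @ X_true + X_true @ B
X = pmcg(A, B, C, alpha=4.0)
# the computed X solves the original Sylvester equation
assert np.allclose(A @ X + X @ B, C, atol=1e-8)
```

The transform rests on the identity $(A+\alpha I)X(B+\alpha I) - (A-\alpha I)X(B-\alpha I) = 2\alpha(AX+XB)$, so any solution of the transformed equation solves (1).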

#### 4. Parallel Implementation of Proposed Algorithm

For convenience of description, let the number of processors be $p$, and let $P_i$ denote the $i$th processor. The coefficient matrices are partitioned into sub-block matrices by rows (and, where needed, by columns). Because sub-blocks in both partitions need to be stored on the processors, the data are redistributed during the preconditioning process; the details can be seen in the preconditioning process below. In addition, since the storage manner directly influences the way matrix multiplication is parallelized, two parallel matrix-multiplication algorithms proposed in [22] are used in this paper: Multiplication 1, matrix multiplication with block row-row storage; and Multiplication 2, matrix multiplication with block row-column storage.

Its detailed implementation can be expressed as follows.

*(1) Storage Method.* Store the corresponding sub-block matrices on processor $P_i$.

*(2) Preconditioning Process.* Solve the matrix equations $(A + \alpha I)M = I$ and $(B + \alpha I)N = I$ by Algorithm 4 in parallel to obtain the matrices $M$ and $N$. The local sub-block products are computed on each $P_i$; all the processors then pass their blocks around in turn, so that the sub-block matrices in both required storage ways are obtained. Thus the two storage ways are realized, and the remaining sub-block matrices are computed on $P_i$.

*(3) Calculation Process.* The required matrix products are formed in sequence: the first two by Multiplication 1, then a further product by Multiplication 1, and the last by Multiplication 2, so that all the matrices of the preconditioned equation are obtained.

*(4) Cycle Process.*
(1) On $P_i$, calculate the local inner products; the global values are obtained after an all-reduce. If the stopping test is satisfied, stop; otherwise, turn to (2).
(2) Calculate the step size in each processor.
(3) On $P_i$, calculate the update of the iterate.
(4) Then update the residual: processor $P_i$ calculates its local contributions, and the global inner products are obtained after an all-reduce.
(5) After these are done, calculate the new search direction on $P_i$.
(6) Calculate the correction factor in each processor; then, on processor $P_i$, renew the stored values.
(7) Let $k := k + 1$ and turn back to (1).
Note that if the matrix dimension is not divisible by $p$, some processors store one extra row-block of the distributed matrices, so that the load of each processor is balanced and the waiting time is short.
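The idea behind Multiplication 1 (block row-row storage) can be simulated serially: each "processor" holds a row-block of both factors, and the row-blocks of the second factor are passed around a ring until every processor has accumulated its rows of the product. This is only an illustration of the communication pattern; the actual scheme of [22] may differ in detail:

```python
import numpy as np

# serial simulation of block row-row matrix multiplication:
# "processor" i holds row-block A_i of A and row-block B_i of B;
# the B-blocks circulate so each processor accumulates its rows of A@B
p, n = 4, 8
rng = np.random.default_rng(3)
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

A_blocks = np.array_split(A, p, axis=0)          # rows of A per processor
B_blocks = np.array_split(B, p, axis=0)          # rows of B per processor
col_splits = np.cumsum([b.shape[0] for b in B_blocks])[:-1]

C_blocks = [np.zeros((a.shape[0], n)) for a in A_blocks]
for step in range(p):                            # p ring-passing rounds
    for i in range(p):
        j = (i + step) % p                       # B-block held this round
        A_cols = np.split(A_blocks[i], col_splits, axis=1)[j]
        C_blocks[i] += A_cols @ B_blocks[j]      # local partial product

C = np.vstack(C_blocks)
assert np.allclose(C, A @ B)
```

Each round multiplies the matching column-slice of the local $A$-block with the currently held $B$-block, so after $p$ rounds every processor owns its complete row-block of the product with no global gather of $B$.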

#### 5. Convergence Analysis

Let the matrices $A$ and $B$ both be symmetric positive definite, and let $\lambda_i$ $(i = 1, \ldots, m)$ and $\mu_j$ $(j = 1, \ldots, n)$ be the eigenvalues of $A$ and $B$, respectively, ordered so that $0 < \lambda_1 \le \cdots \le \lambda_m$ and $0 < \mu_1 \le \cdots \le \mu_n$.

For the matrix equation (1), the speed of convergence depends on the spread of the singular values of the coefficient operator $X \mapsto AX + XB$, while for our proposed preconditioned form the convergence rate is decided by the spread of the singular values of the preconditioned operator. The following comparison indicates that our algorithm is more effective.

Theorem 6. *If the eigenvalues of the matrices $A$ and $B$ satisfy the positivity and ordering assumptions above, then there exists a parameter $\alpha > 0$ such that the singular values of the preconditioned operator are more concentrated than those of the original operator; that is, the corresponding convergence-factor inequality holds.*

*Proof. * Since $A$ and $B$ are symmetric positive definite, all $\lambda_i$ and $\mu_j$ are positive. From [17], the bounds on the convergence factors of the MCG iteration are known, and combining them with the spectral bounds on the preconditioned operator yields the stated inequality.

From the theorem above, we can observe that the proposed preconditioned MCG is better than the traditional MCG whenever the singular values of $A$ and $B$ are only slightly concentrated, since the preconditioning concentrates them.
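The claim of this section can be checked empirically by forming the Kronecker (vec) representations of the two operators and comparing how spread out their singular values are. The Smith-type transform is assumed for the preconditioned operator, and the small diagonal test matrices are ours:

```python
import numpy as np

# compare the singular-value spread (condition number) of the original
# operator X -> A X + X B against the preconditioned operator
# X -> X - E X F (Smith-type transform assumed)
A = np.diag([1.0, 10.0])
B = np.diag([1.0, 10.0])
alpha = np.sqrt(1.0 * 10.0)       # classical choice sqrt(l_min * l_max)
n = A.shape[0]
I = np.eye(n)

E = np.linalg.solve(A + alpha * I, A - alpha * I)
F = (B - alpha * I) @ np.linalg.solve(B + alpha * I, I)

# Kronecker (vec) representations of the two operators
S_orig = np.kron(np.eye(n), A) + np.kron(B.T, np.eye(n))
S_prec = np.eye(n * n) - np.kron(F.T, E)

# the preconditioned singular values are far more concentrated
assert np.linalg.cond(S_prec) < np.linalg.cond(S_orig)
```

For these matrices the original operator has condition number 10, while the preconditioned one stays below 2, which matches the qualitative statement of Theorem 6.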

#### 6. Numerical Examples and Results Analysis

In this section, three methods are compared for solving two examples: the parameter iterative method, the MCG algorithm, and the proposed algorithm (PMCG). The parallel platform is a Lenovo Shenteng 1800 cluster. In the following test cases, the same initial matrix and termination condition are used for all methods.

*Example 7. * Consider an elliptic partial differential equation on a rectangular domain with given boundary conditions.

Let the step size be $h$, and use the five-point difference scheme to discretize the above equation; this finally transforms the problem into a Sylvester matrix equation of the form (1). Since the eigenvalues of the matrices $A$ and $B$ are all greater than zero, we take $\alpha = \|A\|$ and $\alpha = \|B\|$, respectively. The numerical results are shown in Table 1.
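A small sketch of this setup: the 1D second-difference matrix $T = h^{-2}\,\mathrm{tridiag}(-1, 2, -1)$ is symmetric positive definite, and the five-point discretization of the 2D problem (with zero boundary data and an arbitrary right-hand side chosen here for illustration) is exactly a Sylvester equation $TX + XT = F$, solvable for small sizes through the Kronecker form:

```python
import numpy as np

# 1D second-difference matrix T = (1/h^2) tridiag(-1, 2, -1)
n = 10
h = 1.0 / (n + 1)
T = (2.0 * np.eye(n)
     - np.eye(n, k=1)
     - np.eye(n, k=-1)) / h**2

# T is symmetric positive definite, so the parameter choice for
# positive spectra applies
assert np.all(np.linalg.eigvalsh(T) > 0)

# the five-point discretization gives the Sylvester equation
# T X + X T = F; solve it through the Kronecker form for small n
rng = np.random.default_rng(4)
F = rng.standard_normal((n, n))
S = np.kron(np.eye(n), T) + np.kron(T.T, np.eye(n))
X = np.linalg.solve(S, F.flatten(order='F')).reshape((n, n), order='F')
assert np.allclose(T @ X + X @ T, F, atol=1e-8)
```

The dense Kronecker solve is only feasible for tiny grids; for realistic sizes it is exactly the role of the iterative methods compared in this section.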

*Example 8. * Consider matrices $A$ and $B$ constructed from an arbitrarily given matrix. The comparison of the three different methods for solving the resulting Sylvester matrix equation is discussed here.

Since the eigenvalues of the matrices $A$ and $B$ include not only negative real numbers but also positive ones, there are four choices for the parameter: $\alpha = \|A\|$, $\alpha = -\|A\|$, $\alpha = \|B\|$, and $\alpha = -\|B\|$, respectively. The numerical results are shown in Table 2.

##### 6.1. Results Analysis

From the results shown in the tables above, we observe the following.
(i) Compared with the modified conjugate gradient method and the parameter iterative method, the iteration count and the computational time of our algorithm are significantly reduced. This shows that the preconditioning enhances both validity and convergence rate; obviously, better parallel efficiency is obtained with our algorithm.
(ii) When the eigenvalues of the matrices $A$ and $B$ are all positive real numbers (Example 7), the parallel efficiency of our algorithm is much higher than that of the traditional MCG, and the numerical results are consistent with the theory. When the eigenvalues include not only negative real numbers but also positive ones (Example 8), the parallel efficiency of our algorithm is still higher than that of MCG, which means the preconditioning factor is efficient.
(iii) From the results in the tables above, we can also observe that the parameter method is ineffective in Example 8. This illustrates that our algorithm has a wider range of applications than the parameter method.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

This work has been supported by the Basic Project Foundation of Northwestern Polytechnical University (Grant no. JC20120241), by the Natural Science Foundation of Shaanxi Province (Grant no. 2009JM1008), and by the National Natural Science Foundation of China (Grant no. 11302173). The authors also thank the High Performance Computing Center of Northwestern Polytechnical University for the implementation of the above two examples.