Abstract

We present the generalized low-rank alternating direction implicit method and the low-rank cyclic Smith method to solve projected generalized continuous-time Sylvester equations with low-rank right-hand sides. Such equations arise in control theory including the computation of inner products and norms and the model reduction based on balanced truncation for descriptor systems. The requirements of these methods are moderate with respect to both computational cost and memory. Numerical experiments presented in this paper show the effectiveness of the proposed methods.

1. Introduction

In this paper we consider the projected generalized continuous-time Sylvester equation of the form where , , , and are the sought-after solution. Here, and are the spectral projectors onto the right deflating subspaces corresponding to the finite eigenvalues of the pencils and , respectively; and are the spectral projectors onto the right deflating subspaces corresponding to the eigenvalue at infinity. The spectral projectors onto the left deflating subspaces corresponding to the finite eigenvalues of and are denoted by and , respectively, while and are the spectral projectors onto the left deflating subspaces corresponding to the eigenvalue at infinity. We assume that the matrices and are singular, but the pencils and are regular; that is, and are not identically zero. Under the assumption, the pencils and have the Weierstrass canonical forms: there exist nonsingular matrices and matrices such that where , , , and are block diagonal matrices with each diagonal block being a Jordan block. The eigenvalues of and are the finite eigenvalues of the pencils and , respectively. and correspond to the eigenvalue at infinity. The index of nilpotency of is called the index of the pencil . Using (2), , , , and can be expressed as

A number of numerical solution methods have been proposed for the standard/generalized Lyapunov and Sylvester equations. Two classical direct methods are the Bartels-Stewart method [1] and the Hammarling method [2]. These methods need to compute the real Schur forms/generalized real Schur forms of the underlying matrices/matrix pencils by means of the QR/QZ algorithm [3]. Besides direct methods, we mention, among several iterative methods, the Smith method [4], the alternating direction implicit iteration (ADI) method [5, 6], the Smith() method [7], the low-rank Smith methods [79], the Cholesky factor-alternating direction implicit method [1013], and the (generalized) matrix sign function method [1425].

As , , and , (1) is referred to as the projected generalized continuous-time Lyapunov equation, which arises in stability analysis and control design problems for descriptor systems including the characterization of controllability and observability properties, balanced truncation model order reduction, determining the minimal and balanced realizations, and computing and Hankel norms; see [2631] and the references therein. If the pencil is c-stable, that is, all its finite eigenvalues have negative real part, then the projected generalized Lyapunov equation has a unique solution for each , and if, additionally, is symmetric and positive semidefinite, then the solution is symmetric and positive semidefinite; see, for example, [32] for details. Recently, several numerical methods have been proposed in the literature for solving the projected generalized Lyapunov equation. In [33], two direct methods, the generalized Bartels-Stewart method and the generalized Hammarling method, were proposed for the projected generalized Lyapunov equation. The generalized Hammarling method is designed to obtain the Cholesky factor of the solution. These two methods are based on the generalized real Schur form of the pencil and require flops and memory. Iterative methods to solve the projected generalized Lyapunov equation have also been proposed. Stykel [31] extended the ADI method and the Smith method to the projected equation. Moreover, their low-rank versions were also presented, which could be used to compute low-rank approximations to the solution. These methods are especially suitable for large sparse equations with low-rank . Another iterative method for the projected generalized Lyapunov equation is the modified generalized matrix sign function method [25]. Unlike the classical generalized matrix sign function method, the variant converges quadratically independent of the index of the underlying matrix pencil; see [25] for more details.

The projected generalized Sylvester equation (1) also plays an important role in control theory for descriptor systems.

Let and be two c-stable systems. As shown in [34], and can be decomposed into where Here, are called the strictly proper parts of , while are called the polynomial parts of , respectively. Define the inner product of and by where the notation denotes the trace of a matrix. Then, the norm of induced by the inner product is For ,  , it can be shown that could be expressed as where is the solution of the projected generalized continuous-time Sylvester equation (1) with . Therefore, the norm can be computed by where is the solution of the following projected generalized continuous-time Sylvester equation:

We now present the application of the projected generalized continuous-time Sylvester equation in model reduction of c-stable descriptor systems. Let and denote the proper controllability and observability Gramians of the continuous-time descriptor system , respectively. For the definitions of these Gramians and their applications in model reduction of , the reader is referred to [30]. The proper Gramians and are the unique symmetric, positive semidefinite solutions of of the projected generalized continuous-time controllability and observability Lyapunov equations associated with the system , respectively.

There exists another kind of Gramians, called cross-Gramians; for a standard square system, see [35]. We now extend the definition to square descriptor systems. The proper cross-Gramian for a c-stable square descriptor system is the solution of the projected generalized continuous-time Sylvester equation (10). It can been shown that the proper cross-Gramian satisfies By making use of the relation above and a similar approach as proposed in [35], we can design a cross-Gramian-based model reduction method for the square descriptor system .

In this paper, we propose two iterative methods for solving the projected generalized Sylvester equation (1). This work presented here is an extension of [31]. Numerical experiments show the effectiveness of the proposed methods.

We note that the perturbation analysis for (1) has been presented in [36]. The following theorem, for example, [37], gives sufficient conditions for the existence, uniqueness, and analytic formula of the solution of the projected generalized continuous-time Sylvester equation (1).

Theorem 1. Let and be regular pencils with finite eigenvalues and counted according to their multiplicities, respectively. Then, the projected generalized continuous-time Sylvester equation (1) has a unique solution for every if for any and .
Moreover, if and are c-stable, that is, all their finite eigenvalues have negative real part, then can be expressed as

Throughout this paper, we adopt the following notations. The square identity and zero matrices are denoted by and , respectively. The spaces of real matrices are denoted by . The -norm and the Frobenius matrix norm are denoted by and , respectively. The superscript “” denotes the transposition of a vector or a matrix. is the spectral radius of square matrix , and is the spectral condition number of . The open left and right half-plane are denoted by and , respectively. We will also adopt MATLAB-like convention to access the entries of vectors and matrices. For a matrix , is ’s th entry; ’s submatrices , , and consist of intersections of row to row and column to column , row to row , and column to column , respectively.

The remainder of the paper is organized as follows. In Section 2, we propose the generalized low-rank alternating direction implicit method for the solution of (1). In Section 3, we discuss the choice of shift parameters. In Section 4, we show how to solve the projected Sylvester equations by the low-rank cyclic Smith method. Section 5 is devoted to some numerical tests. Some concluding remarks are given in the last section.

2. Alternating Direction Implicit Method

The alternating direction implicit (ADI) method was first introduced by Peaceman and Rachford [38] to solve linear systems arising from the discretization of elliptic boundary value problems and then used in [57, 12, 13, 31] to solve Lyapunov or Sylvester matrix equations. In this section, we consider how to generalize the ADI method for iteratively solving (1).

We always assume that the pencils and are c-stable; that is, all their finite eigenvalues have negative real part. Hence, the matrices and are nonsingular. Multiplying the first equation in (1) on the left by and on the right by , we get the following projected standard Sylvester equation: The iterates of the ADI iteration for (13) are usually generated by the alternating solution of two linear systems with multiple right-hand sides: where and the shift parameters and are elements of and , respectively. These two equations are equivalent to the following single iteration step: We have from that , that is, the second equation in (1) and (13) is satisfied exactly. We can rewrite iteration (15) as Let denote the exact solution of (1). Then it is easily verified that the error matrix obeys the recursion: where By using the Weierstrass canonical form (2)–(3), we have with This implies that if contains all finite eigenvalues (multiple eigenvalues counted by their algebraic multiplicities) of or if contains all finite eigenvalues of , then . This is due to the Cayley-Hamilton theorem [39], which states that for ’s characteristic polynomial and for . From (18), we can see that the parameters and should be chosen to achieve ; we will discuss the details of the choice of parameters in the next section.

About the th approximate solution of the ADI method, by using (18)–(20), we have the following estimate.

Theorem 2. Assume that the pencils and are in Weierstrass form (2) with and being diagonal. Then

Note that the solution is explicitly computed by the ADI iteration (17), so the storage requirement is . One should notice that in many cases the storage requirement is the limiting factor rather than the amount of computation. We note that low-rank schemes are the only existing methods that can effectively solve large-scale Lyapunov/Sylvester equations. Assume that the low-rank right-hand side has the factored form with and . Instead of explicitly forming the solution , the low-rank method computes and stores approximate solutions in low-rank factored form. If the numerical rank of is much smaller than , that is, , then the storage is reduced from to or .

The key idea in the low-rank version of the ADI iteration is to rewrite the iteration in (17) as an outer product: This is always possible since starting with the initial guess . The low-rank alternating direction implicit (LR-ADI) method is based on (17). Replacing with , (17) can be reformulated in terms of the low-rank factors as where From the fact that , , and are all zero matrices, it can be seen that is , is , and is . Thus the rank of is no more than . Since the order of the ADI parameters and is not important, the ordering of and can be reversed. As shown in [12], we have the following iterative scheme: and so we have with The algorithm is summarized as in Algorithm 1.

Input: , , and with
  and are c-stable. The number of ADI steps and the ADI shifts
   and computed by Algorithm 2.
Output: , and such that is an approximate solution of the projected
  generalized Sylvester equation (1) with .
(1) Let , and .
(2) For
   ,
   ,
   .
End For

We make a few comments on Algorithm 1.(1)Since the computation of the Weierstrass canonical form is sensitive under small perturbations, we should make use of the generalized real Schur factorization to compute the spectral projectors; see, for example, [33]. For large-scale problems, the computation of the spectral projectors by the generalized real Schur factorization may be very expensive. However, in some applications the spectral projectors can be expressed in explicit form by using the special block structure of the matrices , , , and ; see numerical examples in this paper or [31]. (2)If the number of shift parameters is smaller than the number of iterations required to obtain a prescribed tolerance, then we reuse these parameters in a cyclic manner.

3. Choice of Shift Parameters

The selection of good parameters is vitally important to the successful application of the generalized LR-ADI iteration. The rate of convergence is dominated by spectral radii of matrices and given by (20). The minimization of these spectral radii with respect to the parameters and leads to the generalized ADI minimax problem where and denote two sets of finite eigenvalues of the pencils and , respectively. In practice since the eigenvalues of the pencils and are unknown and expensive to compute, and are often replaced by domains that contain the eigenvalues of and , respectively. Noting that and are nonsingular, this problem is equivalent to find and such that where and denote spectrums of the matrices and , respectively.

To compute the suboptimal ADI shift parameters for the standard Lyapunov equation with and , a heuristic algorithm has been proposed in [7]. It chooses suboptimal ADI parameters from a set , which is taken to be the union of the Ritz values of and the reciprocals of the Ritz values of , obtained by two Arnoldi processes, with and . As shown in [7], the Ritz values obtained by the Arnoldi process with tend to be located near the “outer” eigenvalues, that is, the eigenvalues near the convex hull of the spectrum. In particular, the eigenvalues of large magnitude are usually approximated well. In contrast, they are generally poor approximations to the eigenvalues near the origin. Therefore, one computes the reciprocals of the Ritz values obtained by the Arnoldi process with to approximate the eigenvalues near the origin.

In [12], this idea has been extended to the case for Sylvester equations. A heuristic procedure which is easy to implement has been proposed in [12]. It can also be naturally extended to the generalized problem (30). Here, we need to compute the largest and smallest (in modulus) nonzero approximate eigenvalues of and , respectively. Noting that and are assumed to be singular, inverses of and do not exist. In [31], Stykel proposed a strategy to overcome this difficulty. Define where , , , and are the transformation matrices as in (2). A simple calculation gives that Then it is clear that the reciprocals of the smallest nonzero eigenvalues of and are the largest eigenvalues of and , respectively. Thus, we can run two Arnoldi processes with the matrices and to compute the smallest nonzero eigenvalues of and , respectively. In some applications, similar to the projectors , , and , the matrices , and can be also obtained in explicit form by making use of the special block structure of the matrices , , , and . The algorithm for choosing and is summarized as in Algorithm 2.

Input: , with and c-stable,
  and .
Output: ADI parameters and .
(1) Run Arnoldi process with on to give the set of Ritz values,
(2) Run Arnoldi process with on to give the set of Ritz values,
(3) ,
(4) Run Arnoldi process with on to give the set of Ritz values,
(5) Run Arnoldi process with on to give the set of Ritz values,
(6) ,
(7) Set ,
(8) For
  Set ,
  where is with deleted, and similarly for ,
End For

For more details about Algorithm 2, the interesting reader is referred to [12].

4. Smith Method

The Smith method [4] is derived from the projected discrete-time Sylvester equation with which is equivalent to (1) for any two parameters and . Under this assumption, the sequence generated by the iteration converges to the solution .

If all of the shifts in the generalized ADI iteration (17) are constant, that is, , (), then the generalized ADI iteration reduces to the Smith method. In [7], Penzl illustrated that the ADI iteration with a single shift (Smith method) converges very slowly, while a moderate increase in the number of shifts accelerates the convergence nicely. However, it is also observed that the speed of convergence is hardly improved by a further increase of ; see Table  2.1 in [7]. These observations lead to the idea of the cyclic Smith() iteration, a special case of ADI where different shifts are used in a cyclic manner.

The low-rank scheme based on the Smith() iteration was also introduced in [7]. The method is called the cyclic low-rank Smith method (LR-Smith()) and is a special case of LR-ADI where number of shifts are reused in a cyclic manner. This idea can be generalized for (1). The generalized LR-Smith method consists of two steps. First the matrices , , and are obtained by an step generalized LR-ADI iteration. Then one solves the discrete-time Sylvester equation where and are as in (19) with , respectively.

The generalized LR-Smith method for solving the projected generalized Sylvester (1) is described as in Algorithm 3.

Input: , , and . and
  are c-stable.
Output: , and such that is an approximate solution of the projected
  generalized Sylvester equation (1) with .
(1) Compute , and by using Algorithm 1 with .
(2) Set , and .
(3) For
  For
     ,
     ,
  End For
   ,
   ,
 End For

5. Numerical Experiments

In this section, we present some numerical examples to illustrate the performance of the generalized LR-ADI method for the projected generalized Sylvester equation (1). Let LR-ADI and LR-Smith() denote Algorithms 1 and 3, respectively. In the following examples, we compare the numerical behavior of LR-ADI with LR-Smith() with respect to the number of iterations (Iter for short), CPU time (CPU for short), and the relative residuals (Res for short). All numerical experiments are performed on a PC with the usual double precision, where the floating point relative accuracy is .

The stopping criterion for both methods is

Let mvp denote the number of matrix vector products and let lss denote the number of linear system solvers at every iteration for the two methods. Table 1 has a rough count of the expenses of LR-ADI and LR-Smith() at every iteration. Only the major expenses are considered. From Table 1, we can see that LR-Smith() needs more mvp and lss than LR-ADI at every iteration, so LR-ADI is more efficient than LR-Smith() for the computation cost at every iteration.

Example 1. For the first experiment, we consider the 2D instationary Stokes equation that describes the flow of an incompressible fluid in a domain. The spatial discretization of this equation by the finite difference method on a uniform staggered grid leads to the descriptor systems The matrix coefficients in (38) are given by These matrices are sparse and have a special block structure. Using this structure, the projectors and onto the left and right deflating subspaces of the pencil can be computed as where is the orthogonal projector onto the kernel of along the image of ; see [40]. The matrices and have full rank, and the pencil is of index . Analogously, we can get the projectors and of the pencil . In our experiment the state space dimensions of the problems are and , respectively. The matrix in (1) is with and .
See Figure 1 for a graph of the convergence. The results in Table 2 clearly indicate that the LR-ADI method is more efficient than LR-Smith() for this example.

Example 2. For the second experiment, we consider a holonomically constrained damped mass-spring system with masses as in [25]. The th mass is connected to the th mass by a spring and a damper and also to the ground by another spring and damper. Moreover, the first mass is connected to the last one by a rigid bar, and it can be influenced by a control. The vibration of this system is described by the descriptor system (38) with the matrices where is the symmetric positive definite mass matrix, is the stiffness matrix, is the damping matrix, and is the matrix of constraints. If has full row rank, the pencil is of index , and the spectral projectors and can be computed as Here and are a projector onto the kernel of along the image of ; see [41] for details. The matrix has the form , where denotes the th column of the identity matrix . Analogously, we can get the projectors and of the pencil . In this experiment the state space dimensions of the problems are and , respectively. The matrix in (1) is with and .
The computational results were reported in Figure 2 and Table 3. We note that the iteration steps of the LR-Smith(6) method is while the steps of LR-ADI is , but the CPU time of LR-Smith(6) method is much more than that of LR-ADI method. This is because the computation cost at every iteration of two methods is very different.

6. Conclusions

In this paper, we have proposed the generalized low-rank alternating direction implicit method and the generalized low-rank cyclic Smith method to solve the projected generalized Sylvester equations. Numerical experiments presented in this paper show the effectiveness of the proposed methods.

Acknowledgments

Liang Bao is supported by the National Natural Science Foundation of China under Grants 10926150 and 11101149 and the Fundamental Research Funds for the Central Universities. Yiqin Lin is supported by the Construct Program of the Key Discipline in Hunan University of Science and Engineering and the Chinese Postdoctoral Science Foundation under Grant 2012M511386.