Abstract

This note is concerned with the linear matrix equation $X = AX^{\top}B + C$, where the operator $(\cdot)^{\top}$ denotes the transpose of a matrix. The first part of this paper sets forth the necessary and sufficient conditions for the unique solvability of the solution $X$. The second part provides a comprehensive treatment of the relationship between the theory of the generalized eigenvalue problem and the theory of the linear matrix equation. The final part starts with a brief review of numerical methods for solving the linear matrix equation. In relation to these computational methods, knowledge of the residual is discussed. An expression related to the backward error of an approximate solution is obtained; it shows that a small backward error implies a small residual. As in the discussion of other linear matrix equations, perturbation bounds for solving the linear matrix equation are also proposed in this work.

1. Introduction

The purpose of this work is to study the so-called $\top$-Stein matrix equation $X = AX^{\top}B + C$ (1), where $A$, $B$, and $C$ are known $n \times n$ matrices and $X$ is an unknown $n \times n$ matrix to be determined. Our interest in the $\top$-Stein equation originates from the study of completely integrable mechanical systems, that is, the analysis of the $\top$-Sylvester equation $AX + X^{\top}B = C$ (2), where $A$, $B$, and $C$ are $n \times n$ matrices [1, 2]. By means of generalized inverses or the QZ decomposition [3], the solvability conditions of (2) are studied in [1, 2, 4]. Suppose that the matrix pencil $A - \lambda B^{\top}$ is regular; that is, $aA + bB^{\top}$ is invertible for some scalars $a$ and $b$. Adding $a$ times (2) to $b$ times the transpose of (2), the $\top$-Sylvester equation (2) can be written as $(aA + bB^{\top})X + X^{\top}(aB + bA^{\top}) = aC + bC^{\top}$ (3).

Premultiplying both sides of (3) by $(aA + bB^{\top})^{-1}$, we have $X = UX^{\top}V + D$ (4), where $U = -(aA + bB^{\top})^{-1}$, $V = aB + bA^{\top}$, and $D = (aA + bB^{\top})^{-1}(aC + bC^{\top})$. This is of the form (1). In other words, numerical approaches for solving (2) can be obtained by transforming (2) into the form of (1) and then applying numerical methods to (1) for the solution [4–6]. With this in mind, in this note we are interested in the study of the $\top$-Stein matrix equation (1).

Our major purpose in this work can be divided into three parts. First, we determine necessary and sufficient conditions for the unique solvability of the solution to (1). To this end, Zhou et al. [7] transform (1) to the standard Stein equation $X = AB^{\top}XA^{\top}B + AC^{\top}B + C$ (5) with respect to the unknown matrix $X$ and give the following sufficient condition: $\lambda\mu \neq 1$ for all $\lambda, \mu \in \sigma(AB^{\top})$ (6). Here, $\sigma(AB^{\top})$ is the set of all eigenvalues of $AB^{\top}$. Zhou and his coauthors show that if (5) has a unique solution, then (1) has a unique solution. However, a counterexample is provided in [7] to show that the relation (6) is only a sufficient condition for the unique solvability of (1).

In [4, 8], the periodic QZ (PQZ) decomposition [3] is applied to derive necessary and sufficient conditions for the unique solvability of (1); however, the conditions given in [8] overlook the possibility that a unique solution exists when $-1$ is a simple eigenvalue of $AB^{\top}$. This case is included in our subsequent discussion, and the following remark is provided to support our observation.

Remark 1. Let $n = 1$ and let $A = a$ and $B = b$ be scalars with $ab = -1$; that is, $\sigma(AB^{\top}) = \{-1\}$. It is clear that the scalar equation $x = -x + c$ has a unique solution $x = c/2$. But condition (6) is not satisfied by choosing $\lambda = \mu = -1$.

It can also be observed from Remark 1 that even if (1) is uniquely solvable, it does not follow that (5) (which reduces to $x = x$ in the scalar case above) is uniquely solvable. The conditions in [4, Equation (4.6)] characterize the unique solvability of (1) via a structured algorithm. In our work, through a complete analysis of square coefficient matrices in terms of the spectrum of the matrix $AB^{\top}$, a new approach to the condition of unique solvability of the $\top$-Stein equation (1) is obtained.

Second, we present the invariant subspace method and, more generally, the deflating subspace method to solve the $\top$-Stein equation. Our methods are based on the analysis of the eigeninformation of a matrix pencil. We carry out a thorough discussion of the various kinds of eigeninformation encountered in the subspace methods. These ideas can be implemented in algorithms easily.

Finally, we take full account of the error analysis of (1). Expressions and implications such as the residual, the backward error, and perturbation bounds are derived in this work. Note that for an approximate solution of (1), the backward error tells us how much the matrices $A$, $B$, and $C$ must be perturbed for that approximation to become an exact solution. An important point found in Section 5 is that a small backward error indicates a small residual, but the reverse is not usually true.

Beginning in Section 2, we formulate the necessary and sufficient conditions for the existence of the solution of (1) directly by means of the spectrum analysis. In Section 3 we provide a deflating subspace method for computing the solution of (1). Numerical methods for solving (1) and the related residual analysis are discussed in Section 4. The associated error analysis of (1) is given in Section 5 and concluding remarks are given in Section 6.

2. Solvability Conditions of the Matrix Equation (1)

In order to formalize our discussion, let $A \otimes B$ be the Kronecker product of matrices $A$ and $B$, let $I_n$ be the $n \times n$ identity matrix, and let $\|\cdot\|_F$ denote the Frobenius norm.

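With the Kronecker product, (1) can be written as the enlarged linear system $(I_{n^2} - (B^{\top} \otimes A)P)\,\mathrm{vec}(X) = \mathrm{vec}(C)$ (7), where $\mathrm{vec}(X)$ stacks the columns of $X$ into a column vector and $P$ is the Kronecker permutation matrix [9] which maps $\mathrm{vec}(X)$ into $\mathrm{vec}(X^{\top})$; that is, $P = \sum_{1 \le i, j \le n} e_j e_i^{\top} \otimes e_i e_j^{\top}$ (8), where $e_i$ denotes the $i$th column of the identity matrix $I_n$. Due to the specific structure of $P$, it has been shown in [10, Corollary 4.3.10] that $P^{\top}(A \otimes B)P = B \otimes A$ (9). It then follows that $((B^{\top} \otimes A)P)^2 = (B^{\top}A) \otimes (AB^{\top})$ (10), since $P^{2} = I_{n^2}$ and $P^{\top} = P$. Note that the eigenvalues of the matrices $AB^{\top}$ and $B^{\top}A$ are the same. By (10) and the property of the Kronecker product [11, Theorem 4.8], we know that $\sigma(((B^{\top} \otimes A)P)^2) = \{\lambda_i\lambda_j \mid \lambda_i, \lambda_j \in \sigma(AB^{\top}),\ 1 \le i, j \le n\}$ (11). That is, the eigenvalues of $(B^{\top} \otimes A)P$ are related to the square roots of the eigenvalues of $(B^{\top}A) \otimes (AB^{\top})$, but (10) alone gives no information for deciding which sign of each square root occurs among the eigenvalues of $(B^{\top} \otimes A)P$. A question immediately arises as to whether it is possible to obtain an explicit expression for the eigenvalues of $(B^{\top} \otimes A)P$, provided the eigenvalues of $AB^{\top}$ are given. In the following two lemmas, we first review the periodic QZ decomposition for two matrices and then apply it to discuss the eigenvalues of $(B^{\top} \otimes A)P$.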
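
As a minimal numerical sketch of the enlarged linear system described above, assuming that (1) reads $X = AX^{\top}B + C$ with real square matrices and that vec stacks columns, the following Python code builds the Kronecker permutation matrix and solves for $X$ directly. The helper names `commutation_matrix`, `vec`, and `solve_t_stein_kron` are hypothetical and are not taken from the cited references; the approach is only sensible for small $n$, since the system has $n^2$ unknowns.

```python
import numpy as np

def commutation_matrix(n):
    """Return the n^2-by-n^2 permutation P with P @ vec(X) == vec(X.T)."""
    P = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            # Column-major vec: X[i, j] is stored at index i + j*n,
            # so entry (i, j) of X.T is X[j, i], stored at j + i*n.
            P[i + j * n, j + i * n] = 1.0
    return P

def vec(X):
    """Stack the columns of X into one vector."""
    return X.flatten(order="F")

def solve_t_stein_kron(A, B, C):
    """Solve X = A X^T B + C through (I - (B^T kron A) P) vec(X) = vec(C)."""
    n = A.shape[0]
    M = np.eye(n * n) - np.kron(B.T, A) @ commutation_matrix(n)
    return np.linalg.solve(M, vec(C)).reshape((n, n), order="F")

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

X = solve_t_stein_kron(A, B, C)
residual = np.linalg.norm(X - A @ X.T @ B - C)
print("Frobenius norm of the residual:", residual)
```

The dense solve costs on the order of $n^6$ operations, which is why structured approaches such as those reviewed in Section 4 are preferred in practice.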

Lemma 2 (see [3]). Let $A$ and $B$ be two matrices in $\mathbb{C}^{n \times n}$. Then, there exist unitary matrices $P$ and $Q$ such that $PAQ$ and $Q^{H}BP^{H}$ are two upper triangular matrices.

Lemma 3. Let and be two matrices in . Then (1), (2). Here, denotes the principal square root of a complex number .

Proof. Part 1 follows immediately from Lemma 2 since and for some unitary matrices and ; that is,
Let the diagonal entries of and be denoted by and , respectively. Then, is an upper triangular matrix with given diagonal entries, specified by and . After multiplying by from the right, the position of the entry is changed to be in the th row and the th column of the matrix . They are then reshuffled by a sequence of permutation matrices to form a block upper triangular matrix with diagonal entries arranged in the following order: Note that the reshuffling process is not hard to see, when , , and , we have However, it is conceptually simple and regular but operationally tedious to reorder to show this result even for and that will be left as an exercise.
By (13), it can be seen that where for .
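
The eigenvalue description can be checked numerically. The sketch below rests on our reading of the result, namely that for eigenvalues $\lambda_1, \ldots, \lambda_n$ of $AB^{\top}$ the spectrum of $(B^{\top} \otimes A)P$ consists of $\lambda_i$ for each $i$ together with $\pm\sqrt{\lambda_i\lambda_j}$ for $i < j$; this reading, and the helper names, are assumptions rather than a quotation of the lemma. The two spectra are compared through their characteristic polynomial coefficients to avoid matching complex eigenvalues one by one.

```python
import numpy as np

def commutation_matrix(n):
    """Return the permutation P with P @ vec(X) == vec(X.T) (column-major vec)."""
    P = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            P[i + j * n, j + i * n] = 1.0
    return P

def predicted_spectrum(A, B):
    """lambda_i for all i, and +/- sqrt(lambda_i * lambda_j) for i < j."""
    lam = np.linalg.eigvals(A @ B.T).astype(complex)
    values = list(lam)
    n = lam.size
    for i in range(n):
        for j in range(i + 1, n):
            root = np.sqrt(lam[i] * lam[j])  # principal square root
            values.extend([root, -root])
    return np.array(values)

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

K = np.kron(B.T, A) @ commutation_matrix(n)
coeffs_computed = np.poly(np.linalg.eigvals(K))
coeffs_predicted = np.poly(predicted_spectrum(A, B))
spectra_agree = np.allclose(coeffs_computed, coeffs_predicted,
                            rtol=1e-6, atol=1e-8)
print("spectra agree:", spectra_agree)
```
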
Before demonstrating the unique solvability conditions, we need a definition: a subset $\Lambda$ of complex numbers is said to be $\top$-reciprocal free if and only if whenever $\lambda \in \Lambda$, $1/\lambda \notin \Lambda$. This definition also regards $0$ and $\infty$ as reciprocals of each other. Then, we have the following solvability conditions of (1).

Theorem 4. The $\top$-Stein matrix equation (1) is uniquely solvable if and only if the following conditions are satisfied: (1) the set $\sigma(AB^{\top}) \setminus \{-1\}$ is $\top$-reciprocal free; (2) $-1$ can be an eigenvalue of the matrix $AB^{\top}$ but must be simple.

Proof. From (7), we know that the -Stein matrix equation (1) is uniquely solvable if and only if By Lemma 3, if , then . Otherwise, . On the other hand, if and is not a simple eigenvalue, then . This verifies (16) and the proof of the theorem is complete.

It is worth noting that condition (1) of Theorem 4 is implied by condition (6) (which also appears in [7, Theorem 1]), the necessary and sufficient condition for the unique solvability of the Stein equation (5). However, as mentioned in Remark 1 or [7, Example 2], condition (1) alone is just a necessary condition for the unique solvability of (1). The $\top$-Stein matrix equation (1) is uniquely solvable provided that both conditions (1) and (2) of Theorem 4 are satisfied.
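
The two conditions can be tested in floating point along the following lines. This is a sketch under assumptions: it takes the conditions to mean that no eigenvalue of $AB^{\top}$ equals $1$ and that no two eigenvalues (counted with multiplicity, at distinct positions) multiply to $1$, which allows $-1$ only as a simple eigenvalue. The function name `is_uniquely_solvable` and the tolerance `tol` are hypothetical, and a fixed tolerance is a crude substitute for exact spectral information.

```python
import numpy as np

def is_uniquely_solvable(A, B, tol=1e-10):
    """Heuristic spectral test for unique solvability of X = A X^T B + C."""
    lam = np.linalg.eigvals(A @ B.T).astype(complex)
    n = lam.size
    # An eigenvalue equal to 1 is its own reciprocal and is never allowed.
    if np.any(np.abs(lam - 1.0) < tol):
        return False
    # Two eigenvalues at distinct positions must not be reciprocals;
    # this also rules out -1 as a repeated eigenvalue.
    for i in range(n):
        for j in range(i + 1, n):
            if abs(lam[i] * lam[j] - 1.0) < tol:
                return False
    return True

# Scalar case of Remark 1: AB^T = -1, yet x = -x + c is uniquely solvable.
print(is_uniquely_solvable(np.array([[1.0]]), np.array([[-1.0]])))
# -1 as a double eigenvalue violates the simplicity requirement.
print(is_uniquely_solvable(np.eye(2), -np.eye(2)))
```
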

3. The Connection between Deflating Subspace and (1)

The relationship between solutions of matrix equations and matrix eigenvalue problems has been widely studied in many applications. It is well known that solutions of Riccati and polynomial matrix equations can be found by computing invariant subspaces of matrices and deflating subspaces of matrix pencils [12]. This fact motivates us to look for algorithms that compute the solution of (1) based on the numerical computation of invariant or deflating subspaces.

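Given a pair of $n \times n$ matrices $A$ and $B$, recall that the function $A - \lambda B$ in the variable $\lambda$ is said to be the matrix pencil related to the pair $(A, B)$. A $k$-dimensional subspace $\mathcal{X}$ is called a deflating subspace for the pencil $A - \lambda B$ if there exists a $k$-dimensional subspace $\mathcal{Y}$ such that $A\mathcal{X} \subseteq \mathcal{Y}$ and $B\mathcal{X} \subseteq \mathcal{Y}$ (17); that is, $AU = VS$ and $BU = VT$ (18), where $U$ and $V$ are two full rank $n \times k$ matrices whose columns span the spaces $\mathcal{X}$ and $\mathcal{Y}$, respectively, and $S$ and $T$ are $k \times k$ matrices. In particular, if in (18), $B = I_n$ and $T = I_k$ for an identity matrix $I_k$, then $V = U$ and we have the simplified formula $AU = US$ (19). Here, the space $\mathcal{X}$ spanned by the columns of the matrix $U$ is called an invariant subspace for $A$ and satisfies $A\mathcal{X} \subseteq \mathcal{X}$ (20). One strategy to analyze the eigeninformation is to transform a matrix pencil to a simplified and equivalent form. That is, two matrix pencils $A_1 - \lambda B_1$ and $A_2 - \lambda B_2$ are said to be equivalent if and only if there exist two nonsingular matrices $E$ and $F$ such that $E(A_1 - \lambda B_1)F = A_2 - \lambda B_2$ (21). In the subsequent discussion, we will use the notation $\sim$ to describe this equivalence relation; that is, $A_1 - \lambda B_1 \sim A_2 - \lambda B_2$.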

Our task in this section is to identify eigenvectors of problem (18) and then associate these eigenvectors (left and right) with the solution of (1). We begin this analysis by studying the eigeninformation of two matrices and , where is a regular matrix pencil.

Note that for the ordinary eigenvalue problem, if the eigenvalues are different, then the eigenvectors are linearly independent. This property is also true for every regular matrix pencil and is demonstrated as follows. For a detailed proof, the reader is referred to [13, Theorem 7.3] and [14, Theorem 4.2].

Theorem 5. Given a pair of matrices $A$ and $B$, if the matrix pencil $A - \lambda B$ is regular, then its Jordan chains corresponding to all finite and infinite eigenvalues carry the full spectral information about the matrix pencil and consist of linearly independent vectors.

Lemma 6. Let be a regular matrix pencil. Assume that matrices , , are full rank and satisfy the following equations:where and , , are square matrices of size . Then (i) are regular matrix pencils for ,(ii)if , then the matrix is full rank.

We also need the following useful lemma.

Lemma 7. Given two regular matrix pencils , , consider the following equations with respect to Then, if , (23a) has the unique solution .

Proof. For , we get where , . We may without loss of generality assume that ; then and thus . Now, for any , consider the generalized Schur decomposition of . We can assume that and are upper triangular matrices (i.e., ). Denote that the th columns of and are and , respectively. Thus,for .
If , we obtained from the above discussion. Given an integer such that and assume that for , we claim , indeed, from (25a) and (25b), we have Again, the case is following the special case . By using mathematical induction, we prove this lemma.

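Corollary 8. Given $A \in \mathbb{C}^{m \times m}$ and $B \in \mathbb{C}^{n \times n}$, if $\sigma(A) \cap \sigma(B) = \emptyset$, then the equation $AX = XB$ with respect to $X \in \mathbb{C}^{m \times n}$ has the unique solution $X = 0$.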

Now we have enough tools to analyze the solution of (1) associated with some deflating spaces. We first establish an important matrix pencil; let the matrix pencil be defined as it is clear that a direct calculation shows that is a solution of the (1) if and only if or if and only if its dual form is

Armed with the property given in Theorem 5 and Lemma 7, we can now tackle the problem of determining how the deflating subspace is related to the solution of (1).

Theorem 9. Let , , and be given in (1) and letwhere is full rank, . Assume that the set of is -reciprocal free. Then, one has(1) if ,(2) and are nonsingular if . Moreover, if is nonsingular, then is the unique solution of (1).

Proof. From (32a) and (32b) we get(i) It follows from (33a) and (33c) that since , we have by Lemma 7. (ii) It can be seen that there exist two nonsingular matrices and such that Hence, together with (32a) and (32b) we have Since and , by Theorem 5 and Lemma 6, the matrix is nonsingular. Together with (33c), and are nonsingular.
Let ; then from (33b) and (33d) or Since the set of is $\top$-reciprocal free, together with we get . If is nonsingular, it is easy to verify that the two matrices and both satisfy the $\top$-Stein equation (1). The proof of part (ii) is complete.

Remark 10. (1) It is easily seen that and both span the unique deflating subspace of corresponding to the set of . Otherwise, in part (ii) we know that is nonsingular. We then are able to transform the formulae defined in (32a) and (32b) into the generalized eigenvalue problem as follows:
That is, some numerical methods for the computation of the eigenspace of corresponding to the set of can be designed and solved (1).
(2) Since the transpose of the unique solution of (1) is equal to the unique solution of the following matrix equation analogous to the consequences of Theorem 9, similar results can be obtained with respect to (40) if is nonsingular. However, we point out that (1) can be solved by computing deflating subspaces of other matrix pencils. For instance we let Assume that the set of is $\top$-reciprocal free; it can be shown that and it has similar results to the conclusion of Theorem 9. The unique solution of (1) can be found by computing deflating subspaces of the matrix pencil without the assumption of the nonsingularity of and .

4. Computational Methods for Solving (1)

Numerical methods for solving (1) have received great attention in theory and in practice; see [5, 6] for Krylov subspace methods and [15–17] for Smith-type iterative methods. In particular, Smith-type iterative methods are only workable in the case $\rho(AB^{\top}) < 1$, where $\rho(AB^{\top})$ denotes the spectral radius of $AB^{\top}$. In recent years, a structured algorithm has been studied for (1) [4] via the PQZ decomposition, which consists of transforming the coefficient matrices into Schur form by a PQZ decomposition and then solving the resulting triangular system by back-substitution. In this section, we revisit these numerical methods and point out the advantages and drawbacks of the algorithms.

4.1. Krylov Subspace Methods

Since the $\top$-Stein equation is essentially a linear system (7), we can certainly use Krylov subspace methods to solve (7); see, for example, [5, 6] and the references cited therein. The general idea for applying the Krylov subspace methods is to define the $\top$-Stein operator $\mathcal{S}: X \mapsto X - AX^{\top}B$ and its adjoint linear operator $\mathcal{S}^{*}$ such that $\langle \mathcal{S}(X), Y \rangle = \langle X, \mathcal{S}^{*}(Y) \rangle$. Here, $X$ and $Y$ are $n \times n$ matrices, and $\langle \cdot, \cdot \rangle$ denotes the Frobenius inner product. Then, the iterative method based on the Krylov subspaces for (1) is as follows.

(i) The conjugate gradient (CG) method [5]: with an initial matrix and the corresponding initial conditions

Note that when the solvability conditions of Theorem 4 are met, the CG method is guaranteed, in exact arithmetic, to converge in a finite number of iterations for any initial matrix $X_0$.
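
A conjugate gradient iteration of this kind can be sketched as follows. The code is not the scheme of [5]; it is the textbook CG method applied to the normal equations $\mathcal{S}^{*}\mathcal{S}(X) = \mathcal{S}^{*}(C)$ (often called CGNR), written under the assumption that $A$, $B$, and $C$ are real, in which case the adjoint of $X \mapsto X - AX^{\top}B$ with respect to the Frobenius inner product is $Y \mapsto Y - BY^{\top}A$. The function names and the scaling of the test matrices are illustrative choices.

```python
import numpy as np

def stein_op(A, B, X):
    """The T-Stein operator S(X) = X - A X^T B."""
    return X - A @ X.T @ B

def stein_adj(A, B, Y):
    """Adjoint of S for real data: S*(Y) = Y - B Y^T A."""
    return Y - B @ Y.T @ A

def cg_t_stein(A, B, C, X0=None, tol=1e-12, max_iter=500):
    """CG on the normal equations S*S(X) = S*(C); returns (X, iterations)."""
    X = np.zeros_like(C) if X0 is None else X0.copy()
    R = C - stein_op(A, B, X)          # residual of the original equation
    Z = stein_adj(A, B, R)             # residual of the normal equations
    D = Z.copy()                       # search direction
    zz = np.sum(Z * Z)
    stop = tol * max(np.linalg.norm(C), 1.0)
    iterations = 0
    while iterations < max_iter and np.linalg.norm(R) > stop:
        W = stein_op(A, B, D)
        alpha = zz / np.sum(W * W)
        X = X + alpha * D
        R = R - alpha * W
        Z = stein_adj(A, B, R)
        zz_new = np.sum(Z * Z)
        D = Z + (zz_new / zz) * D
        zz = zz_new
        iterations += 1
    return X, iterations

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)); A *= 0.3 / np.linalg.norm(A, 2)
B = rng.standard_normal((n, n)); B *= 0.3 / np.linalg.norm(B, 2)
C = rng.standard_normal((n, n))

X, iterations = cg_t_stein(A, B, C)
true_residual = np.linalg.norm(C - stein_op(A, B, X))
print("iterations:", iterations, "residual:", true_residual)
```

Working with the normal equations squares the condition number of the operator, so this variant is a reasonable illustration for well-conditioned problems rather than a recommended general-purpose solver.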

4.2. The Bartels-Stewart-Like Algorithm [18]

In this section, we focus on the Bartels-Stewart algorithm, which is known to be a numerically stable algorithm, for solving $\top$-Stein equations. This method solves (1) by means of the PQZ decomposition [18]. Its approach has been discussed in [4] and can be summarized as follows. From Lemma 2, we know that there exist two unitary matrices and (see [3] for the computation procedure) such that With and being upper triangular, the transformed equation looks like with . We then have Thus, the Bartels-Stewart algorithm can easily be constructed by first solving from (46), using to obtain and from (47) and (48), and then repeating the same discussion as (46)–(48) by taking advantage of the property of and being lower triangular matrices from (49).

4.3. Smith-Type Iterative Methods

Recently, a class of methods referred to as the Smith accelerative iteration has been studied for solving the Sylvester matrix equation [15]. The Smith accelerative iteration has attracted much interest in many papers (see [7, 17] and the references therein) because of its nice numerical behavior: a quadratic convergence rate, low computational cost, and high numerical reliability, despite the lack of a rigorous error analysis. Since the Sylvester matrix equation can be transformed into the Stein matrix equation with a suitable transformation, Zhou et al. proposed Smith-type iterative methods (including the Smith accelerative iteration, the Smith($l$) iteration, and the $r$-Smith iteration) for solving the Stein matrix equation [17] and applied Smith-type iterative methods to (1) [7]. In this section, we explain the Smith accelerative iteration based on the invariant subspace method and summarize the recent results from [7].

Originally, the Smith-type iterative methods were developed to solve the standard Stein equation $X = AXB + C$ (50). As mentioned before, the unknown $X$ is closely related to the generalized eigenspace problems or

Premultiplying (51a) by the matrix and postmultiplying (51b) by the matrix , we get Then, for any positive integer $k$, we obtain where the sequence $\{A_k, B_k, X_k\}$ is defined by $A_{k+1} = A_k^{2}$, $B_{k+1} = B_k^{2}$, and $X_{k+1} = X_k + A_kX_kB_k$, with $A_0 = A$, $B_0 = B$, and $X_0 = C$. The explicit expression of $X_k$ is given as follows: $X_k = \sum_{i=0}^{2^{k}-1} A^{i}CB^{i}$. Under the condition $\rho(A)\rho(B) < 1$, it is easy to see that $\{X_k\}$ is convergent; that is, $X_k$ converges quadratically to $X$ as $k \to \infty$. This iterative method (54a) and (54b) is called the Smith accelerative iteration [15]. In recent years, several modified methods, the so-called Smith-type iterations, have been proposed; they are based on the Smith iteration and improve its speed of convergence. See, for example, [17] and the references cited therein.

Since the condition $\rho(AB^{\top}) < 1$ implies that the assumptions of Theorem 4 hold, (1) is equivalent to (5). We can therefore apply the Smith iteration to (1) with the substitution $(A, B, C) \to (AB^{\top}, A^{\top}B, C + AC^{\top}B)$. One possible drawback of the Smith-type iterative methods is that they cannot always handle the case in which this spectral radius condition fails, even if the unique solution exists. Based on the solvability conditions given in this work, it is possible to develop a specific technique for this particular case, and it is a subject currently under investigation.
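
The squaring recursion and the substitution just described can be sketched in a few lines. The sketch assumes real matrices, the standard form $X = FXG + H$ with $\rho(F)\rho(G) < 1$, and that (5) reads $X = AB^{\top}XA^{\top}B + AC^{\top}B + C$; the function names `smith_stein` and `smith_t_stein`, the stopping rule, and the test scaling are illustrative assumptions rather than the implementation of [7, 15, 17].

```python
import numpy as np

def smith_stein(F, G, H, tol=1e-14, max_iter=50):
    """Smith accelerative iteration for X = F X G + H (needs rho(F)*rho(G) < 1)."""
    X, Fk, Gk = H.copy(), F.copy(), G.copy()
    for _ in range(max_iter):
        update = Fk @ X @ Gk       # doubles the number of summed terms
        X = X + update
        Fk = Fk @ Fk               # F^(2^k)
        Gk = Gk @ Gk               # G^(2^k)
        if np.linalg.norm(update) <= tol * np.linalg.norm(X):
            break
    return X

def smith_t_stein(A, B, C):
    """Apply the iteration to X = A X^T B + C via the associated Stein equation."""
    return smith_stein(A @ B.T, A.T @ B, C + A @ C.T @ B)

rng = np.random.default_rng(4)
n = 4
A = rng.standard_normal((n, n)); A *= 0.7 / np.linalg.norm(A, 2)
B = rng.standard_normal((n, n)); B *= 0.7 / np.linalg.norm(B, 2)
C = rng.standard_normal((n, n))

X = smith_t_stein(A, B, C)
residual = np.linalg.norm(X - A @ X.T @ B - C)
print("Frobenius norm of the residual:", residual)
```

Each step squares the iteration matrices, so $k$ steps account for $2^{k}$ terms of the series, which is the source of the quadratic convergence.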

5. Error Analysis

Error analysis is a way of testing the stability of a numerical algorithm and evaluating the accuracy of an approximate solution. In the subsequent discussion, we consider the backward error and perturbation bounds for solving (1).

As indicated in (44), matrices and are both upper triangular. We can then apply the error analysis for triangular linear systems in [19, Section 3.1] and [20] to obtain where is a constant depending on the dimensions and and is the unit roundoff. Since the PQZ decomposition is a stable process, it is true that with a modest multiple .

Note that the inequality of the form (58) can serve as a stopping criterion for terminating iterations generated from the Krylov subspace methods [5, 6] and Smith-type iterative methods [15–17]. In what follows, we will derive the error associated with numerical algorithms, following the development in [20, 21].

5.1. Backward Error

As in the discussion of the ordinary Sylvester equation [20], the normwise backward error of an approximate solution of (1) is defined by where , , and . Let , which implies that . It can be seen that the residual satisfies From (60), we know that a small backward error indeed implies a small relative residual . Since the perturbations of the coefficient matrices enter (1) nonlinearly, it appears to be an open problem to express the backward error theoretically in terms of the residual. Again, similar to the Sylvester equation discussed in [20, Section 16.2], the conditions under which a $\top$-Stein equation has a well-conditioned solution remain unknown.

5.2. Perturbation Bounds

Consider the perturbed equation Let be the corresponding $\top$-Stein operator. We then have . Taking norms, it follows that where . When is small enough so that , we can rearrange the previous result to be With and the condition number , we arrive at the standard perturbation result Thus the relative error in $X$ is controlled by those in $A$, $B$, and $C$, magnified by the condition number.
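
The qualitative statement that the relative error is bounded by the relative perturbations times a condition number can be illustrated numerically. The sketch below does not reproduce the bound derived in this section; instead it applies the classical perturbation bound for a linear system $Mx = c$, namely $\|\delta x\|/\|x\| \le \kappa(\epsilon_M + \epsilon_c)/(1 - \kappa\epsilon_M)$ with $\kappa = \|M\|_2\|M^{-1}\|_2$, to the Kronecker form $M = I - (B^{\top} \otimes A)P$ of (1). Real matrices, the perturbation size `delta`, and all helper names are assumptions made for the illustration.

```python
import numpy as np

def commutation_matrix(n):
    """Return the permutation P with P @ vec(X) == vec(X.T) (column-major vec)."""
    P = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            P[i + j * n, j + i * n] = 1.0
    return P

def kron_matrix(A, B):
    """Coefficient matrix I - (B^T kron A) P of the vectorized equation."""
    n = A.shape[0]
    return np.eye(n * n) - np.kron(B.T, A) @ commutation_matrix(n)

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n)); A *= 0.5 / np.linalg.norm(A, 2)
B = rng.standard_normal((n, n)); B *= 0.5 / np.linalg.norm(B, 2)
C = rng.standard_normal((n, n))

delta = 1e-6  # size of the random perturbations (illustrative choice)
dA = delta * rng.standard_normal((n, n))
dB = delta * rng.standard_normal((n, n))
dC = delta * rng.standard_normal((n, n))

M, c = kron_matrix(A, B), C.flatten(order="F")
M_pert, c_pert = kron_matrix(A + dA, B + dB), (C + dC).flatten(order="F")

x = np.linalg.solve(M, c)
x_pert = np.linalg.solve(M_pert, c_pert)

kappa = np.linalg.cond(M, 2)
eps_M = np.linalg.norm(M_pert - M, 2) / np.linalg.norm(M, 2)
eps_c = np.linalg.norm(c_pert - c) / np.linalg.norm(c)

rel_err = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
bound = kappa * (eps_M + eps_c) / (1.0 - kappa * eps_M)
print("relative error:", rel_err, "bound:", bound)
```

Because the Frobenius norm of $X$ equals the 2-norm of $\mathrm{vec}(X)$, the quantity `rel_err` is also the relative error of the solution matrix in the Frobenius norm.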

On the other hand, we can also drop the high-order terms in the perturbation to obtain We then rewrite the system in terms of where . Let . It can be shown that where .

A possible disadvantage of the perturbation bound (67), which ignores the consideration of the underlying structure of the problem, is that it overestimates the effect of the perturbation on the data. But this “universal” perturbation bound is accessible to any given matrices , , and of (1).

Unlike the perturbation bound (67), it is desirable to obtain an a posteriori error bound by assuming and in (61). This assumption gives rise to In numerical computation, the bound given in (68) provides a simpler way of estimating the error of the solution of (1).

6. Conclusion

In this note, we propose a novel approach to the necessary and sufficient conditions for the unique solvability of the $\top$-Stein equation for square coefficient matrices in terms of the analysis of the spectrum of $AB^{\top}$. Solvability conditions have been derived and algorithms have been proposed in [4, 8] by using the PQZ decomposition. On the other hand, one common procedure for solving Stein-type equations is the invariant subspace method. We believe that our discussion is the first to implement the techniques of the deflating subspace for solving the $\top$-Stein matrix equation, and it might also give rise to the possibility of developing an advanced and effective solver in the future. Also, we obtain the theoretical residual analysis, backward error analysis, and perturbation bounds for measuring the error in the computed solution of (1).

Acknowledgments

The author wishes to thank Professor Eric King-wah Chu (Monash University, Australia), Professor Matthew M. Lin (National Chung Cheng University, Taiwan), and two anonymous referees for many interesting and valuable suggestions on the paper. This research work is partially supported by the National Science Council (NSC101-2115-M-150-002) and the National Center for Theoretical Sciences in Taiwan.