Abstract

We analyze the best approximation $AN$ (in the Frobenius sense) to the identity matrix in an arbitrary matrix subspace $AS$ ($A \in \mathbb{R}^{n\times n}$ nonsingular, $S$ being any fixed subspace of $\mathbb{R}^{n\times n}$). Some new geometrical and spectral properties of the orthogonal projection $AN$ are derived. In particular, new inequalities for the trace and for the eigenvalues of matrix $AN$ are presented for the special case that $AN$ is symmetric and positive definite.

1. Introduction

The set of all $n \times n$ real matrices is denoted by $\mathbb{R}^{n\times n}$, and $I$ denotes the identity matrix of order $n$. In the following, $A^T$ and $\operatorname{tr}(A)$ denote, as usual, the transpose and the trace of matrix $A \in \mathbb{R}^{n\times n}$. The notations $\langle \cdot, \cdot \rangle_F$ and $\|\cdot\|_F$ stand for the Frobenius inner product and matrix norm, defined on the matrix space $\mathbb{R}^{n\times n}$. Throughout this paper, the terms orthogonality, angle, and cosine will be used in the sense of the Frobenius inner product.

Our starting point is the linear system

$Ax = b, \qquad A \in \mathbb{R}^{n\times n}, \quad x, b \in \mathbb{R}^{n}, \qquad (1)$

where $A$ is a large, nonsingular, and sparse matrix. Such systems are usually solved by iterative methods based on Krylov subspaces (see, e.g., [1, 2]). The coefficient matrix $A$ of the system (1) is often extremely ill-conditioned and highly indefinite, so that in this case Krylov subspace methods are not competitive without a good preconditioner (see, e.g., [2, 3]). Then, to improve the convergence of these Krylov methods, the system (1) can be preconditioned with an adequate nonsingular preconditioning matrix $M$, transforming it into either of the equivalent systems

$MAx = Mb, \qquad AMy = b, \quad x = My, \qquad (2)$

the so-called left and right preconditioned systems, respectively. In this paper, we address only the case of the right-hand side preconditioned matrices $AM$, but analogous results can be obtained for the left-hand side preconditioned matrices $MA$.

The preconditioning of the system (1) is often performed in order to get a preconditioned matrix $AM$ as close as possible to the identity in some sense, and the preconditioner $M$ is called an approximate inverse of $A$. The closeness of $AM$ to $I$ may be measured by using a suitable matrix norm like, for instance, the Frobenius norm [4]. In this way, the problem of obtaining the best preconditioner $N$ (with respect to the Frobenius norm) of the system (1) in an arbitrary subspace $S$ of $\mathbb{R}^{n\times n}$ is equivalent to the minimization problem (see, e.g., [5])

$\min_{M \in S} \|AM - I\|_F = \|AN - I\|_F. \qquad (3)$
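To make the minimization (3) concrete, the following small NumPy sketch (ours, purely illustrative; the function and variable names are not taken from the paper or from any particular library) computes the Frobenius-optimal $N$ by writing $M = \sum_i c_i M_i$ for a chosen basis of $S$ and solving a linear least-squares problem for the coefficients $c_i$.

```python
import numpy as np

def best_approximate_inverse(A, basis):
    """Solve min_{M in span(basis)} ||A M - I||_F by linear least squares.

    A     : (n, n) nonsingular array
    basis : list of (n, n) arrays spanning the subspace S
    returns the Frobenius-optimal approximate inverse N of A in S
    """
    n = A.shape[0]
    # Columns of the least-squares matrix are vec(A M_i); the target is vec(I).
    C = np.column_stack([(A @ M).reshape(-1) for M in basis])
    c, *_ = np.linalg.lstsq(C, np.eye(n).reshape(-1), rcond=None)
    return sum(ci * Mi for ci, Mi in zip(c, basis))

# Tiny usage example: S = diagonal matrices (sparsity pattern of the identity).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)
basis = [np.diag(np.eye(4)[i]) for i in range(4)]
N = best_approximate_inverse(A, basis)
print(np.linalg.norm(A @ N - np.eye(4), "fro"))   # minimum residual ||AN - I||_F
```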

The solution $N$ to the problem (3) will be referred to as the “optimal” or the “best” approximate inverse of matrix $A$ in the subspace $S$. Since matrix $AN$ is the best approximation to the identity in subspace $AS$, it will also be referred to as the orthogonal projection of the identity matrix onto the subspace $AS$. Although many of the results presented in this paper are also valid when matrix $N$ is singular, from now on we assume that the optimal approximate inverse $N$ (and thus also the orthogonal projection $AN$) is a nonsingular matrix. The solution $N$ to the problem (3) has been studied as a natural generalization of the classical Moore-Penrose inverse in [6], where it has been referred to as the $S$-Moore-Penrose inverse of matrix $A$.

The main goal of this paper is to derive new geometrical and spectral properties of the best approximations $AN$ (in the sense of formula (3)) to the identity matrix. Such properties could be used to analyze the quality and theoretical effectiveness of the optimal approximate inverse $N$ as preconditioner of the system (1). However, it is important to highlight that the purpose of this paper is purely theoretical, and we are not looking for immediate numerical or computational approaches (although our theoretical results could potentially be applied to the preconditioning problem). In particular, the term “optimal (or best) approximate inverse” is used in the sense of formula (3) and not in any other sense of this expression.

Among the many different works dealing with practical algorithms that can be used to compute approximate inverses, we refer the reader to, for example, [4, 7–9] and to the references therein. In [4], the author presents an exhaustive survey of preconditioning techniques and, in particular, describes several algorithms for computing sparse approximate inverses based on Frobenius norm minimization, like, for instance, the well-known SPAI and FSAI algorithms. A different approach (also focused on approximate inverses obtained by minimizing $\|AM - I\|_F$) can be found in [7], where an iterative descent-type method is used to approximate each column of the inverse, and the iteration is carried out with “sparse matrix by sparse vector” operations. When the system matrix is expressed in block-partitioned form, some preconditioning options are explored in [8]. In [9], the idea of “target” matrix is introduced, in the context of sparse approximate inverse preconditioners, and generalized Frobenius norms $\|\cdot\|_{F,H}$ ($H$ symmetric positive definite) are used, for minimization purposes, as an alternative to the classical Frobenius norm.

The last results of our work are devoted to the special case that matrix $AN$ is symmetric and positive definite. In this sense, let us recall that the cone of symmetric and positive definite matrices has a rich geometrical structure and that, in this context, the angle that any symmetric and positive definite matrix forms with the identity plays a very important role [10]. In [10], the authors extend this geometrical point of view and analyze the geometrical structure of the subspace of symmetric matrices of order $n$, including the location of all orthogonal matrices, not only the identity matrix.

This paper has been organized as follows. In Section 2, we present some preliminary results required to make the paper self-contained. Sections 3 and 4 are devoted to obtaining new geometrical and spectral relations, respectively, for the orthogonal projections $AN$ of the identity matrix. Finally, Section 5 closes the paper with its main conclusions.

2. Some Preliminaries

Now, we present some preliminary results concerning the orthogonal projection $AN$ of the identity onto the matrix subspace $AS$. For more details about these results and for their proofs, we refer the reader to [5, 6, 11].

Taking advantage of the prehilbertian character of the matrix Frobenius norm, the solution $N$ to the problem (3) can be obtained using the orthogonal projection theorem. More precisely, the matrix product $AN$ is the orthogonal projection of the identity onto the subspace $AS$, and it satisfies the conditions stated by the following lemmas; see [5, 11].

Lemma 1. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Then,

$\operatorname{tr}(AN) = \|AN\|_F^{2}, \qquad (4)$

$\|AN - I\|_F^{2} = n - \|AN\|_F^{2}. \qquad (5)$

An explicit formula for matrix $N$ can be obtained by expressing the orthogonal projection $AN$ of the identity matrix onto the subspace $AS$ by its expansion with respect to an orthonormal basis of $AS$ [5]. This is the idea of the following lemma.

Lemma 2. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular. Let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$ of dimension $d$ and $\{M_1, \dots, M_d\}$ a basis of $S$ such that $\{AM_1, \dots, AM_d\}$ is an orthogonal basis of $AS$. Then, the solution $N$ to the problem (3) is

$N = \sum_{i=1}^{d} \frac{\operatorname{tr}(AM_i)}{\|AM_i\|_F^{2}}\, M_i, \qquad (6)$

and the minimum (residual) Frobenius norm is

$\|AN - I\|_F^{2} = n - \sum_{i=1}^{d} \frac{\left(\operatorname{tr}(AM_i)\right)^{2}}{\|AM_i\|_F^{2}}. \qquad (7)$

Let us mention two possible options, both taken from [5], for choosing in practice the subspace $S$ and its corresponding basis $\{M_i\}_{i=1}^{d}$. The first example consists of considering the subspace $S$ of matrices with a prescribed sparsity pattern, that is,

$S = \left\{ M \in \mathbb{R}^{n\times n} : m_{ij} = 0 \ \ \forall (i,j) \notin K \right\}, \qquad K \subseteq \{1, \dots, n\} \times \{1, \dots, n\}.$

Then, denoting by $M_{i,j}$ the $n \times n$ matrix whose only nonzero entry is $m_{ij} = 1$, a basis of subspace $S$ is clearly $\{ M_{i,j} : (i,j) \in K \}$, and then $\{ AM_{i,j} : (i,j) \in K \}$ will be a basis of subspace $AS$ (since we have assumed that matrix $A$ is nonsingular). In general, this basis of $AS$ is not orthogonal, so that we just need to use the Gram-Schmidt procedure to obtain an orthogonal basis of $AS$, in order to apply the orthogonal expansion (6).
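The orthogonal expansion (6), combined with a Gram-Schmidt pass in the Frobenius inner product, translates directly into the short sketch below (ours; all identifiers are illustrative and not taken from [5]): it builds the basis $\{M_{i,j}\}$ for a prescribed sparsity pattern, orthogonalizes the images $\{AM_{i,j}\}$, and assembles $N$.

```python
import numpy as np

def frobenius_inner(B, C):
    return np.sum(B * C)          # <B, C>_F = tr(B^T C)

def approx_inverse_sparsity(A, pattern):
    """Optimal approximate inverse of A on S = {M : m_ij = 0 for (i, j) not in pattern},
    via Gram-Schmidt on {A M_ij} followed by the orthogonal expansion (6)."""
    n = A.shape[0]
    basis = []
    for (i, j) in pattern:
        M = np.zeros((n, n))
        M[i, j] = 1.0
        basis.append(M)
    # Gram-Schmidt: update each M_k so that the images A M_k become F-orthogonal.
    Qs, AQs = [], []
    for M in basis:
        AM = A @ M
        for Q, AQ in zip(Qs, AQs):
            coef = frobenius_inner(AM, AQ) / frobenius_inner(AQ, AQ)
            M, AM = M - coef * Q, AM - coef * AQ
        Qs.append(M)
        AQs.append(AM)
    # Orthogonal expansion (6): N = sum_k tr(A M_k) / ||A M_k||_F^2 * M_k.
    N = np.zeros((n, n))
    for Q, AQ in zip(Qs, AQs):
        N += (np.trace(AQ) / frobenius_inner(AQ, AQ)) * Q
    return N

A = np.array([[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]])
pattern = [(i, j) for i in range(3) for j in range(3) if abs(i - j) <= 1]  # tridiagonal S
N = approx_inverse_sparsity(A, pattern)
print(np.linalg.norm(A @ N - np.eye(3), "fro"))   # minimum residual for this pattern
```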

For the second example, consider a linearly independent set of real symmetric matrices $\{P_1, \dots, P_d\}$ and the corresponding subspace $S'$, which clearly satisfies $AS' \subseteq S_n$, where $S_n$ denotes the subspace of symmetric matrices of order $n$.

Hence, we can explicitly obtain the solution $N$ to the problem (3) for subspace $S'$, from its basis, as follows. If the image of this basis under $A$ is an orthogonal basis of subspace $AS'$, then we just use the orthogonal expansion (6) for obtaining $N$. Otherwise, we again use the Gram-Schmidt procedure to obtain an orthogonal basis of subspace $AS'$, and then we apply formula (6). The interest of this second example lies in the possibility of using the conjugate gradient method for solving the preconditioned linear system when the symmetric matrix $AN$ is positive definite; a small numerical sketch of one such construction is given below. For a more detailed exposition of the computational aspects related to these two examples, we refer the reader to [5].
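As a concrete illustration of this second example, the sketch below takes the subspace spanned by matrices of the form $P_k A^T$ with each $P_k$ symmetric, so that every image $A(P_k A^T) = A P_k A^T$ is symmetric; this particular construction, like all identifiers in the code, is our own illustrative choice and should not be read as the exact subspace used in [5].

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 5, 3
A = rng.standard_normal((n, n)) + n * np.eye(n)

# Symmetric generators P_k; the subspace S' is spanned by P_k @ A.T,
# so each image A @ (P_k @ A.T) = A P_k A^T is symmetric by construction.
Ps = [np.eye(n)] + [(B + B.T) / 2 for B in rng.standard_normal((d - 1, n, n))]
basis = [P @ A.T for P in Ps]

# Frobenius-optimal N in S' (least squares on the vectorized images, as in (3)).
C = np.column_stack([(A @ M).reshape(-1) for M in basis])
c, *_ = np.linalg.lstsq(C, np.eye(n).reshape(-1), rcond=None)
N = sum(ck * Mk for ck, Mk in zip(c, basis))

AN = A @ N
print(np.allclose(AN, AN.T))                     # True: AN is symmetric
print(np.linalg.norm(AN - np.eye(n), "fro"))     # residual of problem (3)
```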

Now, we present some spectral properties of the orthogonal projection $AN$. From now on, we denote by $\{\lambda_i\}_{i=1}^{n}$ and $\{\sigma_i\}_{i=1}^{n}$ the sets of eigenvalues and singular values, respectively, of matrix $AN$, arranged, as usual, in nonincreasing order, that is,

$|\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_n| > 0, \qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n > 0. \qquad (11)$

The following lemma [11] provides some inequalities involving the eigenvalues and singular values of the preconditioned matrix $AN$.

Lemma 3. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Then,

$\sum_{i=1}^{n} \lambda_i^{2} \;\le\; \sum_{i=1}^{n} \sigma_i^{2} \;=\; \|AN\|_F^{2} \;=\; \operatorname{tr}(AN) \;=\; \sum_{i=1}^{n} \lambda_i \;\le\; n. \qquad (12)$

The following fact [11] is a direct consequence of Lemma 3.

Lemma 4. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Then, the smallest singular value $\sigma_n$ and the smallest eigenvalue's modulus $|\lambda_n|$ of the orthogonal projection $AN$ of the identity onto the subspace $AS$ are never greater than $1$. That is,

$\sigma_n \le 1, \qquad |\lambda_n| \le 1. \qquad (13)$
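The relations collected so far are easy to check numerically. The following script (ours; it reuses the diagonal-subspace choice only for convenience) verifies (4), (5), the chain (12), and Lemma 4 on a random example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)

# Optimal N in the subspace of diagonal matrices (least squares on vec(A M_i)).
basis = [np.diag(np.eye(n)[i]) for i in range(n)]
C = np.column_stack([(A @ M).reshape(-1) for M in basis])
c, *_ = np.linalg.lstsq(C, np.eye(n).reshape(-1), rcond=None)
AN = A @ sum(ci * Mi for ci, Mi in zip(c, basis))

lam = np.linalg.eigvals(AN)                    # eigenvalues of AN
sig = np.linalg.svd(AN, compute_uv=False)      # singular values of AN
fro2 = np.linalg.norm(AN, "fro") ** 2

print(np.isclose(np.trace(AN), fro2))                                     # (4)
print(np.isclose(np.linalg.norm(AN - np.eye(n), "fro") ** 2, n - fro2))   # (5)
print(np.sum(lam ** 2).real <= np.sum(sig ** 2) + 1e-10 <= n + 1e-10)     # (12)
print(sig.min() <= 1 + 1e-10 and np.abs(lam).min() <= 1 + 1e-10)          # Lemma 4
```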

The following theorem [11] establishes a tight connection between the closeness of matrix $AN$ to the identity matrix and the closeness of $\sigma_n$ (and of $|\lambda_n|$) to the unity.

Theorem 5. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Then,

Remark 6. Theorem 5 states that the closer the smallest singular value $\sigma_n$ of matrix $AN$ is to the unity, the closer matrix $AN$ will be to the identity (that is, the smaller $\|AN - I\|_F$ will be), and conversely. The same happens with the smallest eigenvalue's modulus $|\lambda_n|$ of matrix $AN$. In other words, we get a good approximate inverse $N$ of $A$ when $\sigma_n$ (respectively, $|\lambda_n|$) is sufficiently close to $1$.

To finish this section, let us mention that, recently, lower and upper bounds on the normalized Frobenius condition number of the orthogonal projection $AN$ of the identity onto the subspace $AS$ have been derived in [12]. In addition, that work proposes a natural generalization (related to an arbitrary matrix subspace $S$ of $\mathbb{R}^{n\times n}$) of the normalized Frobenius condition number of the nonsingular matrix $A$.

3. Geometrical Properties

In this section, we present some new geometrical properties for matrix $AN$, $N$ being the optimal approximate inverse of matrix $A$ defined by (3). Our first lemma states some basic properties involving the cosine of the angle between matrix $AN$ and the identity, that is,

$\cos(AN, I) = \frac{\langle AN, I\rangle_F}{\|AN\|_F \, \|I\|_F} = \frac{\operatorname{tr}(AN)}{\sqrt{n}\,\|AN\|_F}. \qquad (15)$

Lemma 7. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Then,

$\cos(AN, I) = \frac{\|AN\|_F}{\sqrt{n}}, \qquad (16)$

$0 \le \cos(AN, I) \le 1, \qquad \|AN - I\|_F^{2} = n\left(1 - \cos^{2}(AN, I)\right). \qquad (17)$

Proof. First, using (15) and (4), we immediately obtain (16). As a direct consequence of (16), we derive that $\cos(AN, I)$ is always nonnegative (and, obviously, at most $1$). Finally, using (5) and (16), we get

$\|AN - I\|_F^{2} = n - \|AN\|_F^{2} = n\left(1 - \cos^{2}(AN, I)\right),$

and the proof is concluded.
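Identities (16) and (17) can be confirmed numerically with a few lines (again an illustrative script of ours, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)

basis = [np.diag(np.eye(n)[i]) for i in range(n)]          # diagonal subspace S
C = np.column_stack([(A @ M).reshape(-1) for M in basis])
c, *_ = np.linalg.lstsq(C, np.eye(n).reshape(-1), rcond=None)
AN = A @ sum(ci * Mi for ci, Mi in zip(c, basis))

cos = np.trace(AN) / (np.sqrt(n) * np.linalg.norm(AN, "fro"))     # definition (15)
print(np.isclose(cos, np.linalg.norm(AN, "fro") / np.sqrt(n)))    # (16)
print(np.isclose(np.linalg.norm(AN - np.eye(n), "fro") ** 2,
                 n * (1 - cos ** 2)))                             # (17)
```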

Remark 8. In [13], the authors consider an arbitrary approximate inverse $M$ of matrix $A$ and derive an equality that expresses $\|AM - I\|_F^{2}$ through the typical decomposition (valid in any inner product space) of the strong convergence $AM \to I$ into the convergence of the norms $\|AM\|_F \to \|I\|_F$ and the weak convergence $\langle AM, I\rangle_F \to \langle I, I\rangle_F$. Note that, for the special case that $M = N$ is the optimal approximate inverse defined by (3), formula (18) states that the strong convergence reduces just to the weak convergence and, indeed, just to the cosine $\cos(AN, I)$.

Remark 9. More precisely, formula (18) states that the closer $\cos(AN, I)$ is to the unity (i.e., the smaller the angle between $AN$ and the identity is), the smaller $\|AN - I\|_F$ will be, and conversely. This gives us a new measure of the quality (in the Frobenius sense) of the approximate inverse $N$ of matrix $A$, by comparing the minimum residual norm $\|AN - I\|_F$ with the cosine of the angle between $AN$ and the identity, instead of with $\operatorname{tr}(AN)$, $\|AN\|_F$ (Lemma 1), or $\sigma_n$, $|\lambda_n|$ (Theorem 5). So, for a fixed nonsingular matrix $A$ and for two different subspaces $S$ and $S'$ with optimal approximate inverses $N$ and $N'$, respectively, we have

$\cos(AN, I) \le \cos(AN', I) \iff \|AN - I\|_F \ge \|AN' - I\|_F.$

Obviously, the optimal theoretical situation corresponds to the case

$\cos(AN, I) = 1 \iff AN = I \iff A^{-1} \in S.$

Remark 10. Note that the ratio between and is independent of the order of matrix . Indeed, assuming that and using (16), we immediately obtain

The following lemma compares the trace and the Frobenius norm of the orthogonal projection $AN$.

Lemma 11. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Then,

Proof. Using (4), we immediately obtain the four leftmost equivalences. Using (16), we immediately obtain the two rightmost equivalences.

The next lemma provides us with a relationship between the Frobenius norms of the inverses of matrix $A$ and of its best approximate inverse $N$ in subspace $S$.

Lemma 12. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Then,

Proof. Using (4), we get and hence and the proof is concluded.

The following lemma compares the minimum residual norm $\|AN - I\|_F$ with the distance (with respect to the Frobenius norm) between the inverse of $A$ and the optimal approximate inverse $N$ of $A$ in any subspace $S$. First, note that for any two matrices $A, B \in \mathbb{R}^{n\times n}$ ($A$ nonsingular), from the submultiplicative property of the Frobenius norm, we immediately get

$\|AB - I\|_F = \left\| A\left(B - A^{-1}\right) \right\|_F \le \|A\|_F \, \left\| A^{-1} - B \right\|_F.$

However, for the special case that $B = N$ (the solution to the problem (3)), we also get the following inequality.

Lemma 13. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Then,

$\|AN - I\|_F^{2} \le \|A\|_F \, \left\| A^{-1} - N \right\|_F.$

Proof. Using the Cauchy-Schwarz inequality together with (4) and (5), we get

$\|AN - I\|_F^{2} = n - \operatorname{tr}(AN) = \operatorname{tr}\!\left( A\left(A^{-1} - N\right) \right) = \left\langle A^{T}, A^{-1} - N \right\rangle_F \le \|A\|_F \, \left\| A^{-1} - N \right\|_F.$

The following extension of the Cauchy-Schwarz inequality, in a real or complex inner product space $(H, \langle \cdot, \cdot \rangle)$, was obtained by Buzano [14]. For all $a, x, b \in H$, we have

$\left| \langle a, x \rangle \, \langle x, b \rangle \right| \le \frac{1}{2}\left( \|a\| \, \|b\| + \left| \langle a, b \rangle \right| \right) \|x\|^{2}. \qquad (32)$
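Inequality (32) holds in particular for $\mathbb{R}^{n\times n}$ equipped with the Frobenius inner product, which is the setting of the next lemmas; the minimal check below (ours) illustrates it on random matrices.

```python
import numpy as np

def fro(B, C):
    return np.sum(B * C)                     # Frobenius inner product <B, C>_F

rng = np.random.default_rng(4)
n = 5
a, x, b = (rng.standard_normal((n, n)) for _ in range(3))

lhs = abs(fro(a, x) * fro(x, b))
rhs = 0.5 * (np.linalg.norm(a, "fro") * np.linalg.norm(b, "fro")
             + abs(fro(a, b))) * np.linalg.norm(x, "fro") ** 2
print(lhs <= rhs)                            # Buzano's inequality (32)
```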

The next lemma provides us with lower and upper bounds on the inner product $\langle AN, B \rangle_F$, for any real matrix $B \in \mathbb{R}^{n\times n}$.

Lemma 14. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Then, for every $B \in \mathbb{R}^{n\times n}$, we have

$\left| \langle AN, B \rangle_F \right| \le \frac{1}{2}\left( \sqrt{n}\,\|B\|_F + \left| \operatorname{tr}(B) \right| \right).$

Proof. Using (32) for $a = I$, $x = AN$, $b = B$, and (4) (which gives $\langle I, AN \rangle_F = \operatorname{tr}(AN) = \|AN\|_F^{2} > 0$), we get

$\operatorname{tr}(AN)\,\left| \langle AN, B \rangle_F \right| = \left| \langle I, AN \rangle_F \, \langle AN, B \rangle_F \right| \le \frac{1}{2}\left( \sqrt{n}\,\|B\|_F + \left| \operatorname{tr}(B) \right| \right) \|AN\|_F^{2},$

and dividing by $\operatorname{tr}(AN) = \|AN\|_F^{2}$ concludes the proof.

The next lemma provides an upper bound on the arithmetic mean of the squares of the $n^{2}$ entries of the orthogonal projection $AN$. Incidentally, it also provides us with an upper bound on the arithmetic mean of the diagonal entries of the orthogonal projection $AN$. These upper bounds (valid for any matrix subspace $S$) do not involve the optimal approximate inverse $N$; thus they are independent of the subspace $S$ and depend only on matrix $A$.

Lemma 15. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular with $\operatorname{tr}(A) \ne 0$, and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Then,

Proof. Using (32) for ,  ,  and , and the Cauchy-Schwarz inequality for and (4), we get

Remark 16. Lemma 15 has the following interpretation in terms of the quality of the optimal approximate inverse $N$ of matrix $A$ in subspace $S$. The closer the ratio in Lemma 15 is to zero, the smaller $\operatorname{tr}(AN)$ will be, and thus, due to (5), the larger $\|AN - I\|_F$ will be, and this happens for any matrix subspace $S$.

Remark 17. Incidentally, from Lemma 15, we obtain the following inequality for any nonsingular matrix $A$. Consider any matrix subspace $S$ such that $A^{-1} \in S$. Then $N = A^{-1}$ and $AN = I$, and using Lemma 15, we get

4. Spectral Properties

In this section, we present some new spectral properties for matrix $AN$, $N$ being the optimal approximate inverse of matrix $A$ defined by (3). Mainly, we focus on the case that matrix $AN$ is symmetric and positive definite. This has been motivated by the following reason. When solving a large nonsymmetric linear system (1) by using Krylov methods, a possible strategy consists of searching for an adequate optimal preconditioner $N$ such that the preconditioned matrix $AN$ is symmetric positive definite [5]. This enables one to use the conjugate gradient method (CG-method), which is, in general, a computationally efficient method for solving the new preconditioned system [2, 15].
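For completeness, here is a minimal sketch (ours) of this strategy with SciPy's conjugate gradient solver, assuming the optimal $N$ has already been computed (e.g., by formula (6)) and that $AN$ is indeed symmetric positive definite:

```python
import numpy as np
from scipy.sparse.linalg import cg

def solve_right_preconditioned(A, N, b):
    """Solve A x = b through the right-preconditioned SPD system (AN) y = b, x = N y."""
    AN = A @ N
    y, info = cg(AN, b)              # CG requires AN to be symmetric positive definite
    if info != 0:
        raise RuntimeError("CG did not converge (info = %d)" % info)
    return N @ y
```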

Our starting point is Lemma 3, which has established that the sets of eigenvalues $\{\lambda_i\}_{i=1}^{n}$ and singular values $\{\sigma_i\}_{i=1}^{n}$ of any orthogonal projection $AN$ satisfy

$\sum_{i=1}^{n} \lambda_i^{2} \;\le\; \sum_{i=1}^{n} \sigma_i^{2} \;=\; \|AN\|_F^{2} \;=\; \operatorname{tr}(AN) \;=\; \sum_{i=1}^{n} \lambda_i \;\le\; n. \qquad (38)$

Let us particularize (38) for some special cases.

First, note that if $AN$ is normal (i.e., $\sigma_i = |\lambda_i|$ for all $i = 1, \dots, n$ [16]), then (38) becomes

$\sum_{i=1}^{n} \lambda_i^{2} \;\le\; \sum_{i=1}^{n} |\lambda_i|^{2} \;=\; \operatorname{tr}(AN) \;=\; \sum_{i=1}^{n} \lambda_i \;\le\; n. \qquad (39)$

In particular, if $AN$ is symmetric ($\lambda_i \in \mathbb{R}$ for all $i$), then (38) becomes

$\sum_{i=1}^{n} \lambda_i^{2} \;=\; \sum_{i=1}^{n} \sigma_i^{2} \;=\; \operatorname{tr}(AN) \;=\; \sum_{i=1}^{n} \lambda_i \;\le\; n. \qquad (40)$

In particular, if $AN$ is symmetric and positive definite, then $\lambda_i = \sigma_i > 0$ for all $i = 1, \dots, n$, and the relations between the spectral sums in (38) all hold with equality; that is,

$\sum_{i=1}^{n} \lambda_i^{2} \;=\; \sum_{i=1}^{n} \sigma_i^{2} \;=\; \operatorname{tr}(AN) \;=\; \sum_{i=1}^{n} \lambda_i \;\le\; n, \qquad \lambda_i = \sigma_i \ (i = 1, \dots, n). \qquad (41)$

The next lemma compares the traces of matrices $AN$ and $(AN)^{2}$.

Lemma 18. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Then:
(i) for any orthogonal projection $AN$, $\operatorname{tr}\!\left((AN)^{2}\right) \le \operatorname{tr}(AN)$;
(ii) for any symmetric orthogonal projection $AN$, $\operatorname{tr}\!\left((AN)^{2}\right) = \|AN\|_F^{2} = \operatorname{tr}(AN)$;
(iii) for any symmetric positive definite orthogonal projection $AN$,

Proof. (i) Using (38), we get
(ii) It suffices to use the obvious fact that and the following equalities taken from (40):
(iii) It suffices to use (43) and the fact (see, e.g., [17, 18]) that if $B$ and $C$ are symmetric positive definite matrices, then $\operatorname{tr}(BC) \le \operatorname{tr}(B)\operatorname{tr}(C)$.

The rest of the paper is devoted to obtaining new properties of the eigenvalues of the orthogonal projection $AN$ for the special case that this matrix is symmetric positive definite.

First, let us recall that the smallest singular value $\sigma_n$ and the smallest eigenvalue's modulus $|\lambda_n|$ of the orthogonal projection $AN$ are never greater than $1$ (see Lemma 4). The following theorem establishes the dual result for the largest eigenvalue $\lambda_1$ of matrix $AN$ (symmetric positive definite).

Theorem 19. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Suppose that matrix $AN$ is symmetric and positive definite. Then, the largest eigenvalue $\lambda_1$ of the orthogonal projection $AN$ of the identity onto the subspace $AS$ is never less than $1$. That is,

$\lambda_1 \ge 1.$

Proof. Using (41), we get

$\sum_{i=1}^{n} \lambda_i^{2} = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i, \qquad \text{that is,} \qquad \sum_{i=1}^{n} \lambda_i\left(1 - \lambda_i\right) = 0. \qquad (48)$

This implies that at least one summand in the rightmost sum in (48) must be less than or equal to zero. Suppose that such a summand is the $i$th one ($1 \le i \le n$). Since $AN$ is positive definite, then $\lambda_i > 0$, and thus $1 - \lambda_i \le 0$, that is, $\lambda_1 \ge \lambda_i \ge 1$, and the proof is concluded.

In Theorem 19, the assumption that matrix $AN$ is positive definite is essential for assuring that $\lambda_1 \ge 1$, as the following simple counterexample shows. Moreover, from Lemma 4 and Theorem 19, respectively, we have that the smallest and largest eigenvalues of $AN$ (symmetric positive definite) satisfy $\lambda_n \le 1$ and $\lambda_1 \ge 1$. Nothing can be asserted about the remaining eigenvalues of the symmetric positive definite matrix $AN$, which can be greater than, equal to, or less than the unity, as the same counterexample also shows.

Example 20. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular, let $I$ be the identity matrix of order $n$, and let $S$ be the subspace of all scalar matrices; that is, $S = \operatorname{span}\{I\} = \{\lambda I : \lambda \in \mathbb{R}\}$. Then the solution $N$ to the problem (3) for subspace $S$ can be immediately obtained by using formula (6) as follows:

$N = \frac{\operatorname{tr}(A)}{\|A\|_F^{2}}\, I,$

and then we get

$AN = \frac{\operatorname{tr}(A)}{\|A\|_F^{2}}\, A.$

Let us arrange the eigenvalues and singular values of matrix $AN$, as usual, in nonincreasing order, as shown in (11).

On one hand, for , we have and then Hence, is indefinite and .

On the other hand, for , we have (see matrix (52)) and then Hence, for positive definite, we have (depending on ) ,  , or .
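Since the specific parameterized matrices of Example 20 are not reproduced above, the following stand-in (with matrices chosen entirely by us) illustrates the same two phenomena for the subspace of scalar matrices: a symmetric but indefinite $AN$ whose largest eigenvalue stays below $1$, and symmetric positive definite projections with $\lambda_1 \ge 1$ while the remaining eigenvalues fall on either side of $1$.

```python
import numpy as np

def AN_scalar_subspace(A):
    """Orthogonal projection AN for S = {lambda * I}: by formula (6),
    N = tr(A) / ||A||_F^2 * I, hence AN = tr(A) / ||A||_F^2 * A."""
    return (np.trace(A) / np.linalg.norm(A, "fro") ** 2) * A

# (a) AN symmetric but indefinite: its largest eigenvalue may stay below 1.
A1 = np.diag([2.0, 1.0, -1.0])
print(np.linalg.eigvalsh(AN_scalar_subspace(A1)))   # approx. [-0.33, 0.33, 0.67]

# (b) AN symmetric positive definite: lambda_1 >= 1 (Theorem 19), while the
#     remaining eigenvalues may fall below 1 ...
A2 = np.diag([4.0, 3.0, 0.1])
print(np.linalg.eigvalsh(AN_scalar_subspace(A2)))   # approx. [0.03, 0.85, 1.14]
#     ... or above 1.
A3 = np.diag([1.1, 1.1, 0.3268])
print(np.linalg.eigvalsh(AN_scalar_subspace(A3)))   # approx. [0.33, 1.10, 1.10]
```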

The following corollary improves the lower bound zero on both $\operatorname{tr}(AN)$, given in (4), and $\cos(AN, I)$, given in (17).

Corollary 21. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Suppose that matrix $AN$ is symmetric and positive definite. Then,

$\operatorname{tr}(AN) = \|AN\|_F^{2} \ge 1, \qquad (57)$

$\cos(AN, I) \ge \frac{1}{\sqrt{n}}. \qquad (58)$

Proof. Denote by $\|\cdot\|_2$ the spectral norm. Using the well-known inequality $\|\cdot\|_2 \le \|\cdot\|_F$ [19], Theorem 19, and (4), we get

$\operatorname{tr}(AN) = \|AN\|_F^{2} \ge \|AN\|_2^{2} = \lambda_1^{2} \ge 1.$

Finally, (58) follows immediately from (57) and (16).

Let us mention that an upper bound on all the eigenvalues' moduli and on all the singular values of any orthogonal projection $AN$ can be immediately obtained from (38) and (4) as follows:

$|\lambda_i| \le \sigma_1 \le \|AN\|_F = \sqrt{\operatorname{tr}(AN)} \le \sqrt{n}, \qquad i = 1, \dots, n. \qquad (60)$

Our last theorem improves the upper bound given in (60) for the special case that the orthogonal projection $AN$ is symmetric positive definite.

Theorem 22. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n\times n}$. Let $N$ be the solution to the problem (3). Suppose that matrix $AN$ is symmetric and positive definite. Then, all the eigenvalues of matrix $AN$ satisfy

$\lambda_i \le \frac{1 + \sqrt{n}}{2}, \qquad i = 1, \dots, n.$

Proof. First, note that the assertion is obvious for the smallest eigenvalue, since $\lambda_n = \sigma_n \le 1$ for any orthogonal projection $AN$ (Lemma 4). For any eigenvalue $\lambda_i$ of $AN$, we use the fact that $x - x^{2} \le \frac{1}{4}$ for all $x \in \mathbb{R}$. Then, from (41), we get

$\lambda_i^{2} - \lambda_i = \sum_{j \ne i} \left( \lambda_j - \lambda_j^{2} \right) \le \frac{n-1}{4},$

and hence

$\lambda_i \le \frac{1 + \sqrt{1 + (n-1)}}{2} = \frac{1 + \sqrt{n}}{2},$

and the proof is concluded.

5. Conclusion

In this paper, we have considered the orthogonal projection $AN$ (in the Frobenius sense) of the identity matrix onto an arbitrary matrix subspace $AS$ ($A \in \mathbb{R}^{n\times n}$ nonsingular, $S \subseteq \mathbb{R}^{n\times n}$). Among other geometrical properties of matrix $AN$, we have established a strong relation between the quality of the approximation $AN \approx I$ and the cosine of the angle between $AN$ and the identity. Also, the distance between $AN$ and the identity has been related to a ratio that depends only on matrix $A$ (and is thus independent of the subspace $S$). The spectral analysis has provided lower and upper bounds on the largest eigenvalue of the symmetric positive definite orthogonal projections $AN$ of the identity.

Acknowledgments

The authors are grateful to the anonymous referee for valuable comments and suggestions, which have improved the earlier version of this paper. This work was partially supported by the “Ministerio de Economía y Competitividad” (Spanish Government), and FEDER, through Grant Contract CGL2011-29396-C03-01.