Abstract

Using the relation between the principal matrix square root, its inverse, and the geometric mean, we present a fast algorithm for computing the geometric mean of two Hermitian positive definite matrices. The algorithm is stable and has a high convergence order. Numerical experiments are included to support the proposed computational algorithm.

1. Introduction

This paper provides a fast algorithm for finding the geometric mean of two Hermitian positive definite matrices via the matrix sign function. It is known that the scalar arithmetic-geometric mean of two nonnegative numbers $a$ and $b$ is defined by starting with $a_0 = a$ and $b_0 = b$ and then iterating

$$a_{k+1} = \frac{a_k + b_k}{2}, \qquad b_{k+1} = \sqrt{a_k b_k}, \tag{1}$$

until the desired precision is reached. The scalar sequences $\{a_k\}$ and $\{b_k\}$ converge to each other. Note that in (1), $\sqrt{a_k b_k}$ is the geometric mean of the two positive numbers $a_k$ and $b_k$ per computing cycle. Although this iterative formula is quite easy and reliable for two scalars, its extension to general square nonsingular matrices is not an easy task. In this work, we are concerned with the matrix case of two square Hermitian positive definite matrices.
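As a short illustration of (1), the following Mathematica sketch (our addition; the helper name agm is hypothetical) iterates the scalar arithmetic-geometric mean until the two sequences agree to machine precision:

(* scalar AGM of a and b: iterate (1) until the two sequences meet *)
agm[a_?Positive, b_?Positive] :=
 First@FixedPoint[{(#[[1]] + #[[2]])/2, Sqrt[#[[1]] #[[2]]]} &,
   N[{a, b}], 100]
agm[1, 2]  (* ≈ 1.45679 *)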

A proper definition of the matrix geometric mean of two positive definite matrices $A$ and $B$ can be expressed by

$$A \# B = A\left(A^{-1}B\right)^{1/2}, \tag{2}$$

where, given a square matrix $M$ having no nonpositive real eigenvalues, $M^{1/2}$ denotes the unique solution of the quadratic matrix equation

$$X^2 = M \tag{3}$$

whose eigenvalues lie in the open right half-plane. The definition (2) was given in the seventies by Pusz and Woronowicz [1]. There are some other important definitions for computing the matrix geometric mean of two matrices; see, for example, the work of Lawson and Lim [2]. Note that when just two matrices are involved the theory is well developed, but the formulation for the matrix geometric mean of more than two matrices is considerably harder; see [3] for more.

A variant formulation of the scalar geometric mean can be expressed by

$$\sqrt{ab} = a^{1/2}\left(a^{-1/2}\, b\, a^{-1/2}\right)^{1/2} a^{1/2}. \tag{4}$$

As can be seen in (4), the computation of the scalar geometric mean is fully related to the square root of the positive scalar $a$ and its inverse. A similar and significant definition for the matrix geometric mean has been developed by Bhatia in [4] as follows:

$$A \# B = A^{1/2}\left(A^{-1/2} B A^{-1/2}\right)^{1/2} A^{1/2}, \tag{5}$$

for two Hermitian positive definite (HPD) matrices $A$ and $B$. We use the notation $A \# B$ to denote the geometric mean of two HPD matrices.

Note that the formulas (2) and (5) coincide. For more details, refer to [5].
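The agreement of (2) and (5) is easy to check numerically. The sketch below (our addition; mk, g1, g2 are hypothetical names) compares both formulas on a random HPD pair, using the built-in MatrixPower for principal fractional powers:

(* two random HPD matrices *)
n = 5; SeedRandom[1];
mk := Module[{m = RandomReal[{-1, 1}, {n, n}]},
   m.Transpose[m] + n IdentityMatrix[n]];
A = mk; B = mk;
g1 = A.MatrixPower[Inverse[A].B, 1/2];                    (* formula (2) *)
g2 = With[{As = MatrixPower[A, 1/2], Ais = MatrixPower[A, -1/2]},
   As.MatrixPower[Ais.B.Ais, 1/2].As];                    (* formula (5) *)
Norm[g1 - g2, Infinity]  (* of the order 10^-14 *)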

The rest of this paper is organized as follows. In Section 2 we combine the root-finding Newton's method with a Chebyshev-Halley-type method to devise a fast iterative formula for solving nonlinear equations. In Section 3 we provide the link between nonlinear equation solvers and the calculation of some special matrix functions. This will illustrate how the new algorithms can be constructed and implemented. An implementation of the proposed iterative formula in the symbolic software Mathematica [6], along with a discussion of the stability of the scheme, is provided there. Note that the idea of computing the geometric mean using the sign function can also be found in [7]. In Section 4 we present the numerical results and highlight the benefits of the proposed technique. Conclusions are drawn in Section 5.

2. Construction of a New Method

It is known that a common way to improve the convergence speed of iterative methods for solving nonlinear scalar smooth equations $f(x) = 0$ is to combine already developed schemes. Hence, let us first combine the well-known method of Newton with a special method of Chebyshev-Halley type [8] to derive a new two-step iterative scheme as follows:

$$y_k = x_k - \frac{f(x_k)}{f'(x_k)}, \qquad x_{k+1} = y_k - \left(1 + \frac{1}{2}\,\frac{L_f(y_k)}{1 - \frac{4}{5}L_f(y_k)}\right)\frac{f(y_k)}{f'(y_k)}, \tag{6}$$

wherein $L_f(y_k) = \frac{f(y_k)\,f''(y_k)}{f'(y_k)^2}$. For a background on nonlinear equations, one may consult [9, 10].

Theorem 1. Let $\alpha \in D$ be a simple zero of a sufficiently differentiable function $f : D \subseteq \mathbb{R} \to \mathbb{R}$ on an open interval $D$, which contains $x_0$ as an initial approximation of $\alpha$. Then, the iterative expression (6) without memory satisfies the error equation below:

$$e_{k+1} = c_2^3\left(\frac{2}{5}c_2^2 - c_3\right)e_k^6 + O\left(e_k^7\right), \tag{7}$$

wherein $e_k = x_k - \alpha$ and $c_j = \frac{f^{(j)}(\alpha)}{j!\,f'(\alpha)}$, $j \ge 2$.

Proof. The proof of this theorem is based on the Taylor series expansion of the iterative method (6) around the solution $\alpha$ in the $k$th iterate. To save space and not be distracted from the main topic, we exclude the proof here. The steps are similar to those taken in [11].
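The Taylor expansion behind Theorem 1 can be reproduced in a few lines of Mathematica. The following sketch (our addition) normalizes the zero to $\alpha = 0$ and $f'(\alpha) = 1$ without loss of generality and recovers the error equation (7) symbolically:

(* one step of (6) applied to a generic analytic f with a simple zero at 0 *)
f[t_] := t + c2 t^2 + c3 t^3 + c4 t^4 + c5 t^5 + c6 t^6;
Lf[t_] := f[t] f''[t]/f'[t]^2;
step[e_] := With[{y = e - f[e]/f'[e]},
   y - (1 + 1/2 Lf[y]/(1 - 4/5 Lf[y])) f[y]/f'[y]];
Series[step[e], {e, 0, 6}]
(* leading coefficient: c2^3 (2 c2^2/5 - c3), in agreement with (7) *)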

The iterative method (6) reaches sixth-order convergence using five functional evaluations ($f(x_k)$, $f'(x_k)$, $f(y_k)$, $f'(y_k)$, and $f''(y_k)$) and thus achieves the efficiency index $6^{1/5} \approx 1.431$, which is higher than that of Newton's method, namely $2^{1/2} \approx 1.414$. Furthermore, applying (6) to the polynomial equation $x^2 - 1 = 0$ yields global convergence (except for starting points lying on the imaginary axis). This global behavior, which can easily be observed by drawing the basins of attraction of (6) for $x^2 - 1 = 0$, is useful in practical matrix problems, since it allows us to deal with all kinds of HPD matrices with any spectra.
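A quick way to observe this global behavior is to color each starting point in a region of the complex plane by the root it converges to. The sketch below (our addition; h and basin are hypothetical names) does this for the rational map obtained by applying (6) to $x^2 - 1 = 0$ (compare with (9) in Section 3, which is the reciprocal of this map):

(* map obtained by applying (6) to x^2 - 1 = 0 *)
h[y_] := (7 + 148 y^2 + 330 y^4 + 148 y^6 + 7 y^8)/
   (48 y + 272 y^3 + 272 y^5 + 48 y^7);
basin[z0_] := Sign@Re@NestWhile[h, N@z0, Abs[#^2 - 1] > 10^-8 &, 1, 40];
DensityPlot[basin[x + I y], {x, -2, 2}, {y, -2, 2},
  PlotPoints -> 100, Exclusions -> None]

The resulting picture splits into the two half-planes $\operatorname{Re}(z) > 0$ and $\operatorname{Re}(z) < 0$, with the imaginary axis as the only exceptional set.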

Remark 2. Note that since the global behavior can be observed from the basins of attraction, we do not pursue this fact by theoretical analysis for (6).

A recent discussion of the relation between the matrix sign function and nonlinear equation solvers is given in [12, 13]. We note that all such extensions of scalar schemes, and the study of their orders, are of a symbolic computational nature. Such an extension to the matrix environment will also be done in the next section using symbolic computations.

3. Construction of an Algorithm for Geometric Mean

The relation between Section 2 and our main aim of computing the matrix geometric mean is not clear at first sight. To illustrate it, we recall that many of the important matrix functions, such as the matrix square root, that is, the solution of the matrix equation (3), and the matrix sign function, which is a particular solution of the following matrix equation:

$$X^2 = I, \tag{8}$$

can be calculated approximately by matrix iteration functions. Such fixed-point-type methods converge under special conditions to the desired matrix function and basically originate from root-finding methods [7].

On the other hand, the important definition (5) of the matrix geometric mean requires the computation of the matrix square root $A^{1/2}$ and its inverse $A^{-1/2}$. Hence, in order to design an efficient algorithm for (5), we wish to construct an iterative expression that computes both $A^{1/2}$ and $A^{-1/2}$ at the same time.

Toward this aim, we apply the new nonlinear equation solver (6) to the matrix equation (8). This application yields the following matrix iteration in its reciprocal form:

$$Y_{k+1} = \left(48Y_k + 272Y_k^3 + 272Y_k^5 + 48Y_k^7\right)\left(7I + 148Y_k^2 + 330Y_k^4 + 148Y_k^6 + 7Y_k^8\right)^{-1}, \tag{9}$$

with a proper $Y_0$, for finding the sign matrix.

Remark 3. The iteration (9) is not a member of the Padé family of iterations for the matrix sign introduced in [14].

In order to connect (9) with our aim, we recall an important identity [7]:

$$\operatorname{sign}\left(\begin{bmatrix} 0 & A \\ I & 0 \end{bmatrix}\right) = \begin{bmatrix} 0 & A^{1/2} \\ A^{-1/2} & 0 \end{bmatrix}, \tag{10}$$

which indicates an important relationship between the principal matrix square root and the matrix sign function.

The identity (10) has the advantage of delivering the principal inverse matrix square root along with the principal matrix square root at the same time. Let us first study the stability behavior of (9).
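To see (9) and (10) at work before the full algorithm, the following sketch (our addition) iterates (9) on the block matrix of (10) for a random HPD matrix and compares the extracted blocks against MatrixPower:

n = 4; SeedRandom[7];
m = RandomReal[{-1, 1}, {n, n}]; A = m.Transpose[m] + n IdentityMatrix[n];
Z = ArrayFlatten[{{0, A}, {IdentityMatrix[n], 0}}];  (* Y0 from (10) *)
Do[Z = (48 Z + 272 MatrixPower[Z, 3] + 272 MatrixPower[Z, 5] +
     48 MatrixPower[Z, 7]).Inverse[7 IdentityMatrix[2 n] +
     148 MatrixPower[Z, 2] + 330 MatrixPower[Z, 4] +
     148 MatrixPower[Z, 6] + 7 MatrixPower[Z, 8]], {6}];
{Norm[Z[[1 ;; n, n + 1 ;; 2 n]] - MatrixPower[A, 1/2], Infinity],
 Norm[Z[[n + 1 ;; 2 n, 1 ;; n]] - MatrixPower[A, -1/2], Infinity]}
(* both tiny: the (1,2) block is A^(1/2), the (2,1) block is A^(-1/2) *)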

Lemma 4. Let $A$ have no eigenvalues on the imaginary axis. Then, the sequence $\{Y_k\}$ generated by (9) using $Y_0 = A$ is asymptotically stable.

Proof. Let $\Delta_k$ be a numerical perturbation introduced at the $k$th iterate of (9), and set $\tilde{Y}_k = Y_k + \Delta_k$. We obtain

$$\tilde{Y}_{k+1} = \left(48\tilde{Y}_k + 272\tilde{Y}_k^3 + 272\tilde{Y}_k^5 + 48\tilde{Y}_k^7\right)\left(7I + 148\tilde{Y}_k^2 + 330\tilde{Y}_k^4 + 148\tilde{Y}_k^6 + 7\tilde{Y}_k^8\right)^{-1}. \tag{11}$$

All terms that are quadratic in the errors are removed from the analysis. This formal manipulation is meaningful if $\Delta_k$ is sufficiently small. We have, for instance,

$$\tilde{Y}_k^2 \approx Y_k^2 + Y_k\Delta_k + \Delta_k Y_k, \qquad \tilde{Y}_k^3 \approx Y_k^3 + Y_k^2\Delta_k + Y_k\Delta_k Y_k + \Delta_k Y_k^2, \tag{12}$$

and similarly for the higher powers. Note that the commutativity between $\Delta_k$ and $Y_k$ is not used throughout this lemma. To simplify, the following identity will be used (for any nonsingular matrix $B$ and a matrix $C$ of sufficiently small norm):

$$(B + C)^{-1} \approx B^{-1} - B^{-1}CB^{-1}. \tag{13}$$

Linearizing (11) around the limit by means of (12) and (13) and simplifying yields

$$\Delta_{k+1} \approx \frac{5}{2}\,\Delta_k + \frac{3}{2}\,S\Delta_k S - 2S\left(S\Delta_k + \Delta_k S\right), \tag{14}$$

where $S = \operatorname{sign}(A)$, $S^{-1} = S$, and, for large enough $k$, we assumed $Y_k \approx S$. After some algebraic manipulations and using $S^2 = I$, we conclude that

$$\Delta_{k+1} \approx \frac{1}{2}\left(\Delta_k - S\Delta_k S\right). \tag{15}$$

Since the linear operator on the right-hand side of (15) is idempotent, applying (15) recursively, we have

$$\Delta_{k+1} \approx \frac{1}{2}\left(\Delta_0 - S\Delta_0 S\right). \tag{16}$$

From (16), we can conclude that the perturbation at the $(k+1)$th iterate is bounded, since $\|\Delta_{k+1}\| \lesssim \frac{1}{2}\left(\|\Delta_0\| + \|S\|^2\|\Delta_0\|\right)$. This allows us to conclude that a perturbation introduced at a given step remains bounded at the subsequent steps. Therefore, the sequence $\{Y_k\}$ generated by (9) is asymptotically stable.

Note that Lemma 4 only shows asymptotic stability. In general, a fixed-point iteration $Y_{k+1} = g(Y_k)$ is stable in a neighborhood of a fixed point if the Fréchet derivative of $g$ there has bounded powers [7]. Furthermore, if $g$ defines any superlinearly convergent iteration for $\operatorname{sign}(A)$, then its Fréchet derivative at $S = \operatorname{sign}(A)$ is $L_g(S, E) = \frac{1}{2}(E - SES)$. Hence $L_g$ is idempotent ($L_g(S, L_g(S, E)) = L_g(S, E)$), has bounded powers, and the iteration is stable; thus all sign iterations, such as (9), are automatically stable.
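A quick numerical confirmation of this idempotency (our addition; Lop is a hypothetical name) for a random sign matrix $S$ and perturbation $E$:

n = 4; SeedRandom[3];
m = RandomReal[{-1, 1}, {n, n}];
S = m.DiagonalMatrix[{1, 1, -1, -1}].Inverse[m];  (* S^2 = I *)
E1 = RandomReal[{-1, 1}, {n, n}];
Lop[X_] := (X - S.X.S)/2;
Norm[Lop[Lop[E1]] - Lop[E1], Infinity]  (* ≈ 0: Lop is idempotent *)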

It is now easy to deduce the following convergence theorem for (9).

Theorem 5. Let $A$ have no eigenvalues on the imaginary axis. If $Y_0 = A$, then the iterative method (9) converges to $S = \operatorname{sign}(A)$.

Proof. The convergence of rational iterations of this type can be analyzed in terms of the convergence of the eigenvalues of the matrices $\{Y_k\}$. The reason for this is that if $Y_k$ has a Jordan decomposition $Y_k = TJ_kT^{-1}$, then $Y_{k+1} = T\,g(J_k)\,T^{-1}$, where $g$ denotes the rational map in (9). Let $A$ have a Jordan canonical form arranged as $T^{-1}AT = \Lambda = \operatorname{diag}(\Lambda_1, \Lambda_2)$, where $T$ is a nonsingular matrix and $\Lambda_1$, $\Lambda_2$ collect the eigenvalues with negative and positive real parts, respectively. It is also known that [7]

$$\operatorname{sign}(A) = T\begin{bmatrix} -I & 0 \\ 0 & I \end{bmatrix}T^{-1}. \tag{17}$$

If we define $D_k = T^{-1}Y_kT$, then from the method (9), we obtain

$$D_{k+1} = \left(48D_k + 272D_k^3 + 272D_k^5 + 48D_k^7\right)\left(7I + 148D_k^2 + 330D_k^4 + 148D_k^6 + 7D_k^8\right)^{-1}. \tag{18}$$

Notice that if $D_0$ is a diagonal matrix, then, based on an inductive proof, all successive $D_k$ are diagonal too. From (18), it is enough to prove that $\{D_k\}$ converges to $\operatorname{sign}(\Lambda)$, in order to ensure the convergence of the sequence generated by (9).
We can write (18) as $n$ uncoupled scalar iterations to solve $x^2 - 1 = 0$, given by

$$d_{k+1}^i = \frac{48d_k^i + 272\left(d_k^i\right)^3 + 272\left(d_k^i\right)^5 + 48\left(d_k^i\right)^7}{7 + 148\left(d_k^i\right)^2 + 330\left(d_k^i\right)^4 + 148\left(d_k^i\right)^6 + 7\left(d_k^i\right)^8}, \tag{19}$$

where $d_0^i = \lambda_i$ and $1 \le i \le n$. From (18) and (19), it is enough to study the convergence of $d_k^i$ to $s_i = \operatorname{sign}(\lambda_i)$, for all $i$.
From (19) and since the eigenvalues of $A$ are not pure imaginary, we have $s_i = \pm 1$ and $\left|\frac{\lambda_i - s_i}{\lambda_i + s_i}\right| < 1$. Thus, we attain

$$\frac{d_{k+1}^i - s_i}{d_{k+1}^i + s_i} = -\left(\frac{d_k^i - s_i}{d_k^i + s_i}\right)^6\,\frac{7\left(d_k^i\right)^2 - 6s_i d_k^i + 7}{7\left(d_k^i\right)^2 + 6s_i d_k^i + 7}. \tag{20}$$

Since $\left|\frac{d_0^i - s_i}{d_0^i + s_i}\right| < 1$, we have $\lim_{k\to\infty}\left|\frac{d_k^i - s_i}{d_k^i + s_i}\right| = 0$. This shows that $\{d_k^i\}$ is convergent. Now, it is easy to conclude that $\lim_{k\to\infty} d_k^i = s_i$. Finally, we have

$$\lim_{k\to\infty} Y_k = T\left(\lim_{k\to\infty} D_k\right)T^{-1} = T\operatorname{sign}(\Lambda)T^{-1} = \operatorname{sign}(A). \tag{21}$$

The proof is complete.

The iteration (9) requires one (block-diagonal) matrix inversion per computing step and delivers both $A^{1/2}$ and $A^{-1/2}$, which are exactly the quantities of interest in (5). The implementation of (9) for computing principal square roots requires some care in order to save effort. Since the intermediate matrices are all block-sparse (at least half of their entries are zero), one can simply use sparse-array techniques to save memory and time.
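The sparsity alluded to above comes from the block structure of (10): odd powers of $Y_0 = \left[\begin{smallmatrix} 0 & A \\ I & 0 \end{smallmatrix}\right]$ are block anti-diagonal while even powers are block diagonal, so the denominator of (9) can be inverted blockwise. A small check (our addition):

n = 3; A = RandomReal[{1, 2}, {n, n}]; A = A.Transpose[A];
Y0 = ArrayFlatten[{{0, A}, {IdentityMatrix[n], 0}}];
MatrixPower[Y0, 2] // MatrixForm  (* block diagonal: diag(A, A) *)
MatrixPower[Y0, 3] // MatrixForm  (* block anti-diagonal: [[0, A^2], [A, 0]] *)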

An implementation of (9) to compute $A \# B$ for two HPD matrices in the programming package Mathematica is given in Algorithm 1.

ClearAll["Global`*"]
(* FunM: evaluates the scalar function fun (in the symbol x) on the square
   matrix X through its Jordan decomposition. *)
FunM[fun_, X_]:= Module[{faux, dim, mataux, JordanD, sim, JordanF, eps,
  fdiag, diagQ, fauxD}, (dim = Length@X;
(* entry (i, j) of f(J) for an upper triangular Jordan factor J; the
   derivatives are evaluated at the diagonal entry, i.e., the eigenvalue *)
faux[xx_, i_, j_]:=
  Which[i <= j, 1/(j - i)! (D[fun, {x, j - i}]) /. x -> xx, True, 0];
mataux[Y_]:= Table[faux[Y[[i, i]], i, j], {i, 1, dim}, {j, 1, dim}];
JordanD = JordanDecomposition[X] // N; sim = JordanD[[1]];
JordanF = JordanD[[2]]; eps = 1*10^-10;
diagQ = Norm[JordanF - DiagonalMatrix[Diagonal[JordanF]]];
fauxD[xx_]:= (fun) /. x -> xx;
fdiag:= DiagonalMatrix[Map[fauxD, Diagonal[JordanF]]];
Which[diagQ < eps, sim.fdiag.Inverse[sim], True,
  sim.mataux[JordanF].Inverse[sim]])]
(* MGM: geometric mean of the HPD matrices A and B via the sign
   iteration (9) applied to the block matrix of (10). *)
MGM[A_, B_, maxIterations_, tolerance_]:= Module[{k = 0},
 {n, n} = Dimensions[A]; Id = SparseArray[{{i_, i_} -> 1.}, {n, n}];
A1 = SparseArray@ArrayFlatten[{{0, N@A}, {Id, 0}}]; Y[0] = A1;
R[0] = 1; Id2 = SparseArray[{{i_, i_} -> 1.}, {2 n, 2 n}];
 {Quiet@While[k < maxIterations && R[k] >= tolerance,
  Y2 = Y[k].Y[k]; Y3 = Y2.Y[k]; Y4 = Y3.Y[k];
  Y5 = Y4.Y[k]; Y6 = Y5.Y[k]; Y7 = Y6.Y[k]; Y8 = Y7.Y[k];
  (* denominator of (9); it is block diagonal, hence inverted blockwise *)
  l1 = SparseArray[7 Id2 + 148 Y2 + 330 Y4 + 148 Y6 + 7 Y8];
  l2 = SparseArray@ArrayFlatten[{{Inverse@l1[[1;; n, 1;; n]], 0},
    {0, Inverse@l1[[n + 1;; 2 n, n + 1;; 2 n]]}}];
  Y[k + 1] = SparseArray[(48 Y[k] + 272 Y3 + 272 Y5 + 48 Y7).l2];
  R[k + 1] = Norm[Y[k + 1] - Y[k], Infinity]/Norm[Y[k + 1], Infinity]; k++];
  (* A^(1/2) and A^(-1/2) are the off-diagonal blocks of the sign, cf. (10) *)
  AS = Y[k][[1;; n, n + 1;; 2 n]]; IAS = Y[k][[n + 1;; 2 n, 1;; n]];
  Mat = (IAS.B.IAS)};
AS.FunM[Sqrt[x], Mat].AS]

In Algorithm 1, the four-argument function MGM computes $A \# B$ by entering the four arguments: the matrix $A$, the matrix $B$, the maximum number of iterations that (9) is allowed to take, and the tolerance of the stopping criterion in the infinity norm, $\|Y_{k+1} - Y_k\|_\infty / \|Y_{k+1}\|_\infty$.

Note that for computing the principal matrix square root of $A^{-1/2}BA^{-1/2}$, we have used the Jordan canonical form approach, which is provided in general form for computing matrix functions in the code FunM.
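A typical call (our addition), cross-checked against the direct evaluation of (5) via MatrixPower, might look as follows:

n = 10; SeedRandom[11];
m = RandomReal[{-1, 1}, {n, n}]; A = m.Transpose[m] + n IdentityMatrix[n];
m = RandomReal[{-1, 1}, {n, n}]; B = m.Transpose[m] + n IdentityMatrix[n];
G = MGM[A, B, 50, 10^-10];
ref = With[{As = MatrixPower[A, 1/2], Ais = MatrixPower[A, -1/2]},
   As.MatrixPower[Ais.B.Ais, 1/2].As];
Norm[G - ref, Infinity]/Norm[ref, Infinity]  (* a small relative residual *)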

4. Experiments

We test the contributed method (9), denoted by PM, using Mathematica 8 in machine precision. Apart from this scheme, the iterative method known as the DB (Denman-Beavers) method [15], given by

$$X_{k+1} = \frac{1}{2}\left(X_k + Y_k^{-1}\right), \qquad X_0 = A,$$
$$Y_{k+1} = \frac{1}{2}\left(Y_k + X_k^{-1}\right), \qquad Y_0 = I, \tag{22}$$

is considered. This method generates the sequences $\{X_k\}$ and $\{Y_k\}$, which converge to $A^{1/2}$ and $A^{-1/2}$, respectively.
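For completeness, a compact realization of the coupled DB iteration (22) (our addition; db is a hypothetical helper name):

(* Denman-Beavers iteration: returns {A^(1/2), A^(-1/2)} *)
db[A_, tol_: 10^-12, kmax_: 60] :=
 Module[{X = N@A, Y = IdentityMatrix[Length@A], Xo, k = 0},
  While[k == 0 || (Norm[X - Xo, Infinity] > tol Norm[X, Infinity] && k < kmax),
   Xo = X; {X, Y} = {(X + Inverse[Y])/2, (Y + Inverse[X])/2}; k++];
  {X, Y}]

The simultaneous assignment uses the old $X_k$ and $Y_k$ on the right-hand side, as (22) requires; the convergence is quadratic.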

We remark that the first and the second substeps of (6) result in the quadratic Newton method (NM) and a cubic Chebyshev-Halley-type method (CHM) for the matrix sign [7], as follows:

$$X_{k+1} = \frac{1}{2}\left(X_k + X_k^{-1}\right), \tag{23}$$

$$X_{k+1} = \left(3I + 30X_k^2 + 7X_k^4\right)\left(16X_k + 24X_k^3\right)^{-1}. \tag{24}$$
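In our tests these take the following one-line matrix forms (our addition; the step names are hypothetical):

nmStep[X_] := (X + Inverse[X])/2;                                  (* (23) *)
chmStep[X_] := (3 IdentityMatrix[Length@X] + 30 X.X +
    7 X.X.X.X).Inverse[16 X + 24 X.X.X];                           (* (24) *)
pmStep[X_] := Module[{Id = IdentityMatrix[Length@X], X2 = X.X, X3, X5, X7},
  X3 = X2.X; X5 = X3.X2; X7 = X5.X2;
  (48 X + 272 X3 + 272 X5 + 48 X7).
   Inverse[7 Id + 148 X2 + 330 X2.X2 + 148 X3.X3 + 7 X3.X5]]       (* (9) *)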

Example 6. As the first example, we consider a pair of HPD matrices $A$ and $B$ taken from [16], for which the exact matrix geometric mean $A \# B$ is available in closed form.

The proposed approach converges to the solution matrix in 2 iterations, which shows very fast convergence.

Example 7. We now consider two HPD matrices $A$ and $B$ of larger size $n$ and compute $A \# B$.

We compare the behavior of the different methods, using the infinity norm for all the norms involved together with the stopping criterion of Section 3, and report the numerical results in Figure 1. As can be seen, the numerical results are in harmony with the theoretical aspects of Section 3 and show faster convergence for the proposed method (9) than for DB, NM, and CHM.

5. Conclusions

Based on the quadratic Newton scheme and the cubic method of Chebyshev-Halley type, we have developed an iterative method with sixth order of convergence for solving nonlinear scalar smooth equations. Its computational efficiency index is superior to that of NM. The method, which has global behavior for finding the matrix sign function, was then extended for computing the principal matrix square root and its inverse. This procedure was followed so as to build a fast algorithm for finding the matrix geometric mean of two HPD matrices. We have also studied the asymptotic stability of the proposed technique.

To illustrate the new technique, some numerical examples were presented. The computational results justify the robust and efficient convergence behavior of the proposed method. Similar numerical experiments, carried out for a number of problems of different types, confirmed these conclusions to a great extent.

We conclude the paper with the remark that many numerical applications require high precision in computations. The results of the numerical experiments show that high-order efficient methods such as (9), combined with multiple-precision arithmetic, are very useful, because they yield a clear reduction in the number of iterations.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors are grateful to anonymous reviewers for carefully reading the paper and helping to improve the presentation.