An Algorithm for Computing Geometric Mean of Two Hermitian Positive Definite Matrices via Matrix Sign

Soleymani, F.; Sharifi, M.; Shateyi, S.; Khaksar Haghani, F.

doi:https://doi.org/10.1155/2014/978629

Abstract and Applied Analysis

Special Issue: Iterative Fixed-Point Methods for Solving Nonlinear Problems: Dynamics and Applications

Research Article | Open Access

Volume 2014 | Article ID 978629 | https://doi.org/10.1155/2014/978629

F. Soleymani,¹ M. Sharifi,¹ S. Shateyi,² and F. Khaksar Haghani¹

Academic Editor: Alicia Cordero

Received 07 May 2014

Revised 12 Jul 2014

Accepted 23 Jul 2014

Published 06 Aug 2014

Abstract

Using the relation between a principal matrix square root and its inverse with the geometric mean, we present a fast algorithm for computing the geometric mean of two Hermitian positive definite matrices. The algorithm is stable and possesses a high convergence order. Some experiments are included to support the proposed computational algorithm.

1. Introduction

This paper aims to provide a fast algorithm for finding the geometric mean of two Hermitian positive definite matrices via the matrix sign function. It is known that the scalar arithmetic-geometric mean of two nonnegative numbers $a$ and $b$ is defined by starting with $a_0 = a$ and $b_0 = b$ and then iterating
$$a_{k+1} = \frac{a_k + b_k}{2}, \qquad b_{k+1} = \sqrt{a_k b_k}, \qquad k = 0, 1, 2, \ldots, \tag{1}$$
until $a_k$ and $b_k$ agree to the desired precision. The scalar sequences $\{a_k\}$ and $\{b_k\}$ converge to a common limit. Note that in (1), $b_{k+1}$ is the geometric mean of the two positive numbers $a_k$ and $b_k$ in each computing cycle. Although this iterative formula is easy and reliable for two scalars, its extension to general square nonsingular matrices is not an easy task. In this work, we are concerned with the matrix case of two square Hermitian positive definite matrices.
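As a small illustration of the scalar recurrence described above (arithmetic mean for one sequence, geometric mean for the other), the following sketch iterates until the two values agree. The function name `agm` and the tolerance defaults are illustrative choices, not part of the paper.

```python
import math


def agm(a: float, b: float, tol: float = 1e-15, max_iter: int = 100) -> float:
    """Arithmetic-geometric mean of two nonnegative numbers.

    Each cycle replaces (a, b) by their arithmetic and geometric means.
    """
    if a < 0 or b < 0:
        raise ValueError("a and b must be nonnegative")
    for _ in range(max_iter):
        # Stop once the two sequences agree to the requested relative precision.
        if abs(a - b) <= tol * max(a, b, 1.0):
            break
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return 0.5 * (a + b)


if __name__ == "__main__":
    print(agm(1.0, 2.0))
```

Both sequences are updated from the same pair of old values, which is why the tuple assignment is used instead of two sequential statements.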

A proper definition of the matrix geometric mean of two positive definite matrices $A$ and $B$ can be expressed by
$$A\,(A^{-1}B)^{1/2}, \tag{2}$$
where, given a square matrix $M$ having no nonpositive real eigenvalues, $M^{1/2}$ denotes the unique solution of the quadratic matrix equation
$$X^2 = M \tag{3}$$
whose eigenvalues lie in the right half plane. The definition (2) was given in the seventies by Pusz and Woronowicz [1]. There are other important definitions of the matrix geometric mean of two matrices; see, for example, the work of Lawson and Lim [2]. When just two matrices are involved the theory is well developed, but the formulation of the geometric mean of more than two matrices is considerably harder; see [3] for more.

A variant formulation of the scalar geometric mean of two positive numbers $a$ and $b$ can be expressed by
$$\sqrt{ab} = a^{1/2}\left(a^{-1/2}\, b\, a^{-1/2}\right)^{1/2} a^{1/2}. \tag{4}$$

As can be seen in (4), the computation of the scalar geometric mean is fully related to the square root of the positive scalar $a$ and its inverse. A similar and significant definition of the matrix geometric mean was developed and suggested by Bhatia in [4] as follows:
$$A \# B = A^{1/2}\left(A^{-1/2} B A^{-1/2}\right)^{1/2} A^{1/2}, \tag{5}$$
for two Hermitian positive definite (HPD) matrices $A$ and $B$. We use the notation $A \# B$ to denote the geometric mean of two HPD matrices.

Note that the formulas (2) and (5) define the same matrix. For more, refer to [5].
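The agreement of the two definitions can be checked numerically in the smallest nontrivial case. The sketch below is restricted to real 2x2 symmetric positive definite matrices so that it needs no external library; it assumes the two formulas are $A(A^{-1}B)^{1/2}$ and $A^{1/2}(A^{-1/2}BA^{-1/2})^{1/2}A^{1/2}$, and it uses the closed-form 2x2 principal square root that follows from the Cayley-Hamilton theorem. All function names are illustrative.

```python
import math


def matmul2(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2)]
            for i in range(2)]


def inv2(X):
    """Inverse of a nonsingular 2x2 matrix."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]


def sqrt2(X):
    """Principal square root of a 2x2 matrix with positive eigenvalues.

    Uses sqrt(X) = (X + d I) / t with d = sqrt(det X), t = sqrt(tr X + 2 d).
    """
    d = math.sqrt(X[0][0] * X[1][1] - X[0][1] * X[1][0])
    t = math.sqrt(X[0][0] + X[1][1] + 2.0 * d)
    return [[(X[0][0] + d) / t, X[0][1] / t],
            [X[1][0] / t, (X[1][1] + d) / t]]


def mean_nonsymmetric_form(A, B):
    """A (A^{-1} B)^{1/2}."""
    return matmul2(A, sqrt2(matmul2(inv2(A), B)))


def mean_symmetric_form(A, B):
    """A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    root_a = sqrt2(A)
    inv_root_a = inv2(root_a)
    middle = matmul2(matmul2(inv_root_a, B), inv_root_a)
    return matmul2(matmul2(root_a, sqrt2(middle)), root_a)


def max_diff(X, Y):
    """Largest entrywise absolute difference of two 2x2 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0]]
    B = [[4.0, -1.0], [-1.0, 2.0]]
    print(max_diff(mean_nonsymmetric_form(A, B), mean_symmetric_form(A, B)))
```

A further sanity check is that the result $G$ solves the Riccati equation $G A^{-1} G = B$, which characterizes the geometric mean among positive definite matrices.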

The rest of this paper is organized as follows. In Section 2 we combine Newton's root-finding method and a Chebyshev-Halley-type method to devise a fast iterative formula for solving nonlinear equations. In Section 3 we provide the link between nonlinear equation solvers and the calculation of some special matrix functions. This illustrates how the new algorithm can be constructed and implemented. An implementation of the proposed iterative formula in the symbolic software Mathematica [6], along with a discussion of the stability of the scheme, is provided therein. Note that the idea of computing the geometric mean using the sign function can also be found in [7]. In Section 4 we show the numerical results and highlight the benefit of the proposed technique. Conclusions are drawn in Section 5.

2. Construction of a New Method

It is known that a common way of improving the convergence speed of iterative methods for solving nonlinear scalar smooth equations is to combine already developed schemes. Hence, let us first combine the well-known method of Newton with a special Chebyshev-Halley-type scheme [8] to derive a new iterative scheme as follows:
$$y_k = x_k - \frac{f(x_k)}{f'(x_k)}, \qquad x_{k+1} = y_k - \left(1 + \frac{1}{2}\,\frac{L(y_k)}{1 - \tfrac{4}{5}L(y_k)}\right)\frac{f(y_k)}{f'(y_k)}, \tag{6}$$
wherein $L(y_k) = f''(y_k)\,f(y_k)/f'(y_k)^2$. For background on nonlinear equations, one may consult [9, 10].
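A scalar sketch of such a two-substep scheme is given below. It assumes that the second substep is the Chebyshev-Halley family member with parameter 4/5, applied at the Newton point; this choice is an assumption that reproduces the coefficients appearing in the Mathematica listing of Section 3. The function name and the stopping rule are illustrative.

```python
def newton_chebyshev_halley(f, df, d2f, x0, tol=1e-14, max_iter=50):
    """Newton substep followed by a Chebyshev-Halley-type substep.

    Assumption: the Chebyshev-Halley parameter is 4/5.
    Returns the approximate root and the number of full cycles used.
    """
    x = x0
    for k in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            return x, k
        # First substep: Newton's method.
        y = x - fx / df(x)
        # Second substep: Chebyshev-Halley-type correction evaluated at y.
        fy, dfy = f(y), df(y)
        L = d2f(y) * fy / dfy ** 2
        x = y - (1.0 + 0.5 * L / (1.0 - 0.8 * L)) * fy / dfy
    return x, max_iter


if __name__ == "__main__":
    root, cycles = newton_chebyshev_halley(
        lambda x: x ** 3 - 2.0, lambda x: 3.0 * x * x, lambda x: 6.0 * x, 1.5)
    print(root, cycles)
```

Each cycle uses five evaluations: the function and first derivative at the current point, and the function with its first two derivatives at the Newton point.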

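Theorem 1. Let $\alpha \in D$ be a simple zero of a sufficiently differentiable function $f: D \subseteq \mathbb{R} \to \mathbb{R}$ for an open interval $D$, which contains $x_0$ as an initial approximation of $\alpha$. Then, the iterative expression (6) without memory satisfies the error equation
$$e_{k+1} = c_2^3\left(\tfrac{2}{5}c_2^2 - c_3\right)e_k^6 + O(e_k^7), \tag{7}$$
wherein $c_j = f^{(j)}(\alpha)/(j!\,f'(\alpha))$ and $e_k = x_k - \alpha$.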

Proof. The proof of this theorem is based on the Taylor series expansion of the iterative method (6) around the solution at the $k$th iterate. To save space and to stay with the main topic, we omit the proof here. Its steps are similar to those taken in [11].

The iterative method (6) reaches sixth-order convergence using five functional evaluations and thus achieves the efficiency index $6^{1/5} \approx 1.431$, which is higher than that of Newton's method, that is, $2^{1/2} \approx 1.414$. Furthermore, applying (6) to the polynomial equation $g(x) = x^2 - 1 = 0$ yields global convergence (except for initial points lying on the imaginary axis). This global behavior, which can easily be observed by drawing the basins of attraction of (6) for $x^2 - 1 = 0$, is useful in practical matrix problems, since it allows us to deal with HPD matrices of any spectrum.
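The basin structure can be probed numerically without plotting. The sketch below applies the two-substep scheme to $x^2 - 1$ in complex arithmetic and reports the limit for a given start. It assumes the Chebyshev-Halley parameter 4/5 in the second substep and starting points off the imaginary axis; the helper names are illustrative.

```python
def step(x: complex) -> complex:
    """One full cycle of the two-substep scheme for f(x) = x^2 - 1.

    Assumptions: Chebyshev-Halley parameter 4/5, and Re(x) != 0.
    """
    y = x - (x * x - 1.0) / (2.0 * x)          # Newton substep
    fy, dfy = y * y - 1.0, 2.0 * y
    L = 2.0 * fy / dfy ** 2                     # f''(y) f(y) / f'(y)^2
    return y - (1.0 + 0.5 * L / (1.0 - 0.8 * L)) * fy / dfy


def limit_point(x0, max_iter: int = 60, tol: float = 1e-12) -> complex:
    """Iterate from x0 until x^2 - 1 is negligible; return the final iterate."""
    x = complex(x0)
    for _ in range(max_iter):
        x = step(x)
        if abs(x * x - 1.0) < tol:
            break
    return x


if __name__ == "__main__":
    for start in (0.1 + 5j, -3 + 0.01j, 2.0, -0.25 - 7j):
        print(start, limit_point(start))
```

Starts in the open right half plane end at $+1$ and starts in the open left half plane end at $-1$, so colouring a grid of starting points by the sign of the real part of the limit would reproduce the basin picture.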

Remark 2. Since the global behavior can be observed from the basins of attraction, we do not pursue a theoretical analysis of this fact for (6).

A recent discussion of the relation between the matrix sign function and nonlinear equation solvers is given in [12, 13]. All such extensions in the scalar case, and the study of their orders, are of a symbolic computational nature. The extension to the matrix environment is also carried out in the next section using symbolic computations.

3. Construction of an Algorithm for Geometric Mean

The relation between Section 2 and our main aim of computing the matrix geometric mean is not clear at first sight. To illustrate it, we recall that many important matrix functions, such as the matrix square root, that is, the solution of the matrix equation (3), and the matrix sign function, which is a solution of the matrix equation
$$X^2 = I, \tag{8}$$
can be calculated approximately by matrix iteration functions. Such fixed-point-type methods converge to the desired matrix function under special conditions and originate from root-finding methods [7].

On the other hand, the definition of the matrix geometric mean (5) requires the computation of the matrix square root of $A$ and its inverse. Hence, in order to design an efficient algorithm for (5), we wish to construct an iterative expression in which we compute both $A^{1/2}$ and $A^{-1/2}$ at the same time.

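Toward this aim, we apply the new nonlinear equation solver (6) to the matrix equation (8). This application yields the following matrix iteration in its reciprocal form:
$$X_{k+1} = \left(48X_k + 272X_k^3 + 272X_k^5 + 48X_k^7\right)\left(7I + 148X_k^2 + 330X_k^4 + 148X_k^6 + 7X_k^8\right)^{-1}, \tag{9}$$
with a proper initial matrix $X_0$ for finding the sign matrix.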
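The coefficients used in the Mathematica listing of this section (48, 272, 272, 48 and 7, 148, 330, 148, 7) can be traced back to the scalar scheme. As a worked sketch, assume that the second substep of the scalar method is the Chebyshev-Halley-type correction with parameter $4/5$ (an assumption, chosen because it reproduces exactly these coefficients) and apply the two substeps to $f(x) = x^2 - 1$:

```latex
\begin{align*}
y &= x - \frac{x^2 - 1}{2x} = \frac{x^2 + 1}{2x},
\qquad
L(y) = \frac{f''(y)\,f(y)}{f'(y)^2} = \frac{y^2 - 1}{2y^2}, \\[4pt]
z &= y - \left(1 + \frac{1}{2}\,\frac{L(y)}{1 - \tfrac{4}{5}L(y)}\right)\frac{y^2 - 1}{2y}
   = \frac{7y^4 + 30y^2 + 3}{8y\,(3y^2 + 2)} \\[4pt]
  &= \frac{7 + 148x^2 + 330x^4 + 148x^6 + 7x^8}{48x + 272x^3 + 272x^5 + 48x^7}.
\end{align*}
```

Since the two roots $\pm 1$ are their own reciprocals, the map $x \mapsto 1/z$ has the same fixed points; replacing $x$ by $X_k$ in $1/z$ gives the reciprocal form of the matrix iteration, in which only the even-power polynomial has to be inverted.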

Remark 3. The iteration (9) is not a member of the Padé family of iterations introduced in [14] for matrix sign.

In order to connect (9) with our aim, we recall an important identity [7]:
$$\operatorname{sign}\begin{pmatrix} 0 & A \\ I & 0 \end{pmatrix} = \begin{pmatrix} 0 & A^{1/2} \\ A^{-1/2} & 0 \end{pmatrix}, \tag{10}$$
which indicates an important relationship between the principal matrix square root and the matrix sign function.

The identity (10) has the advantage that the principal inverse matrix square root is computed along with the principal matrix square root at the same time. In what follows, let us first study the stability behavior of (9).

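Lemma 4. Let $A \in \mathbb{C}^{n \times n}$ have no eigenvalues on the imaginary axis. Then, the sequence $\{X_k\}_{k=0}^{\infty}$ generated by (9) using $X_0 = A$ is asymptotically stable.

Proof. Let $\Delta_k$ be a numerical perturbation introduced at the $k$th iterate of (9). We obtain
$$\tilde{X}_k = X_k + \Delta_k. \tag{11}$$
All terms that are quadratic in the errors are removed from our analysis. This formal manipulation is meaningful if $\Delta_k$ is sufficiently small. We have
$$\tilde{X}_{k+1} = \left(48\tilde{X}_k + 272\tilde{X}_k^3 + 272\tilde{X}_k^5 + 48\tilde{X}_k^7\right)\left(7I + 148\tilde{X}_k^2 + 330\tilde{X}_k^4 + 148\tilde{X}_k^6 + 7\tilde{X}_k^8\right)^{-1}. \tag{12}$$
Note that commutativity between $X_k$ and $\Delta_k$ is not used anywhere in this lemma. To simplify (12), the following identity is used (for any nonsingular matrix $B$ and any matrix $C$ of sufficiently small norm):
$$(B + C)^{-1} \approx B^{-1} - B^{-1} C B^{-1}. \tag{13}$$
Simplifying yields
$$\tilde{X}_{k+1} \approx \left(S + \tfrac{5}{2}\Delta_k + \tfrac{3}{2}S\Delta_k S\right)\left(I - 2S\Delta_k - 2\Delta_k S\right), \tag{14}$$
where $S^2 = I$, $S^{-1} = S$, and for large enough $k$ we assumed $X_k \approx S$. After some algebraic manipulations and using $\Delta_{k+1} = \tilde{X}_{k+1} - X_{k+1} \approx \tilde{X}_{k+1} - S$, we conclude that
$$\Delta_{k+1} \approx \tfrac{1}{2}\left(\Delta_k - S\Delta_k S\right). \tag{15}$$
Applying (15) recursively, and noting that the map $\Delta \mapsto \tfrac{1}{2}(\Delta - S\Delta S)$ reproduces itself when applied twice, we have
$$\|\Delta_{k+1}\| \le \tfrac{1}{2}\,\|\Delta_0 - S\Delta_0 S\|. \tag{16}$$
From (16), we can conclude that the perturbation at the iterate $k+1$ is bounded. Hence a perturbation introduced at a given step remains bounded at the subsequent steps. Therefore, the sequence generated by (9) is asymptotically stable.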

Note that Lemma 4 shows only asymptotic stability. In general, the fixed-point iteration $X_{k+1} = g(X_k)$ is stable in a neighborhood of a fixed point $X$ if the Fréchet derivative $L_g(X)$ has bounded powers [7]. Furthermore, if $g$ is any superlinearly convergent iteration for $\operatorname{sign}(A)$, then $L_g(S) = L_{\operatorname{sign}}(S)$, where $L_{\operatorname{sign}}(S)$ is the Fréchet derivative of the matrix sign function at $S$. Hence $L_g(S)$ is idempotent ($L_g(S)^2 = L_g(S)$) and the iteration is stable; thus all such sign iterations, including (9), are automatically stable.
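The idempotency can be seen on a concrete example. The sketch below uses the standard expression for the Fréchet derivative of the sign function at an involutory matrix $S$ (as given in [7]), $E \mapsto \tfrac{1}{2}(E - SES)$, and checks that applying it twice gives the same result as applying it once. The matrices are arbitrary illustrative choices with small integer entries, so the arithmetic is exact.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def frechet_sign(S, E):
    """Map E to (E - S E S) / 2 for a matrix S with S S = I."""
    SES = matmul(matmul(S, E), S)
    return [[0.5 * (e - t) for e, t in zip(row_e, row_t)]
            for row_e, row_t in zip(E, SES)]


# S is involutory: S S = I, with eigenvalues +1 and -1.
S = [[3.0, -4.0], [2.0, -3.0]]
# E plays the role of a perturbation of the iterate.
E = [[1.0, 2.0], [-3.0, 5.0]]

once = frechet_sign(S, E)
twice = frechet_sign(S, once)

if __name__ == "__main__":
    print(once)
    print(twice)
```

Because the second application changes nothing, a perturbation is not amplified from one step to the next near the fixed point, which is the bounded-powers condition mentioned above.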

It is now easy to deduce the following convergence theorem for (9).

Theorem 5. Let $A \in \mathbb{C}^{n \times n}$ have no eigenvalues on the imaginary axis. If $X_0 = A$, then the iterative method (9) converges to $S = \operatorname{sign}(A)$.

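Proof. The convergence of rational iterations can be analyzed in terms of the convergence of the eigenvalues of the matrices $X_k$. The reason for this is that if $X_0$ has a Jordan decomposition $X_0 = ZJZ^{-1}$, then $X_k = ZJ_kZ^{-1}$, where $J_0 = J$ and $J_k$ is generated by (9). Let $A$ have a Jordan canonical form arranged as $Z^{-1}AZ = \Lambda$, where $Z$ is a nonsingular matrix. It is also known that [7]
$$\operatorname{sign}(\Lambda) = Z^{-1}\operatorname{sign}(A)\,Z = \operatorname{diag}\left(\operatorname{sign}(\lambda_1), \ldots, \operatorname{sign}(\lambda_n)\right). \tag{17}$$
If we define $D_k = Z^{-1}X_kZ$, then from the method (9), we obtain
$$D_{k+1} = \left(48D_k + 272D_k^3 + 272D_k^5 + 48D_k^7\right)\left(7I + 148D_k^2 + 330D_k^4 + 148D_k^6 + 7D_k^8\right)^{-1}. \tag{18}$$
Notice that if $D_0$ is a diagonal matrix then, by induction, all successive $D_k$ are diagonal too. From (18), it is enough to prove that $\{D_k\}$ converges to $\operatorname{sign}(\Lambda)$ in order to ensure the convergence of the sequence generated by (9).
We can write (18) as $n$ uncoupled scalar iterations to solve $g(x) = x^2 - 1 = 0$, given by
$$d_{k+1}^{i} = \frac{48d_k^{i} + 272(d_k^{i})^3 + 272(d_k^{i})^5 + 48(d_k^{i})^7}{7 + 148(d_k^{i})^2 + 330(d_k^{i})^4 + 148(d_k^{i})^6 + 7(d_k^{i})^8}, \tag{19}$$
where $d_k^{i} = (D_k)_{i,i}$ and $1 \le i \le n$. From (18) and (19), it is enough to study the convergence of $\{d_k^{i}\}$ to $\operatorname{sign}(\lambda_i)$, for all $1 \le i \le n$.
From (19) and since the eigenvalues of $A$ are not pure imaginary, we have that $\operatorname{sign}(\lambda_i) = s_i = \pm 1$. Thus, we attain
$$\frac{d_{k+1}^{i} - s_i}{d_{k+1}^{i} + s_i} = -\left(\frac{d_k^{i} - s_i}{d_k^{i} + s_i}\right)^{6}\frac{7(d_k^{i})^2 - 6s_i d_k^{i} + 7}{7(d_k^{i})^2 + 6s_i d_k^{i} + 7}. \tag{20}$$
Since $\left|(d_0^{i} - s_i)/(d_0^{i} + s_i)\right| < 1$, we have $\lim_{k \to \infty}\left|(d_{k+1}^{i} - s_i)/(d_{k+1}^{i} + s_i)\right| = 0$ and $\lim_{k \to \infty}|d_k^{i}| = 1 = |s_i|$. This shows that $\{d_k^{i}\}$ is convergent. It is now easy to conclude that $\lim_{k \to \infty}D_k = \operatorname{sign}(\Lambda)$. Finally, we have
$$\lim_{k \to \infty}X_k = Z\left(\lim_{k \to \infty}D_k\right)Z^{-1} = Z\operatorname{sign}(\Lambda)Z^{-1} = \operatorname{sign}(A). \tag{21}$$
The proof is complete.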
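The contraction behind the last step can be made explicit. As a sketch, write $N(d) = 48d + 272d^3 + 272d^5 + 48d^7$ and $D(d) = 7 + 148d^2 + 330d^4 + 148d^6 + 7d^8$ for the numerator and denominator of one scalar component of (9), let $s = \pm 1$, and set $w_k = (d_k - s)/(d_k + s)$ (the symbols $N$, $D$, and $w_k$ are notation introduced here for illustration):

```latex
\begin{align*}
D(d) - s\,N(d) &= (d - s)^6\,(7d^2 - 6sd + 7), \qquad
D(d) + s\,N(d) = (d + s)^6\,(7d^2 + 6sd + 7), \\[4pt]
w_{k+1} &= -\,\frac{D(d_k) - s\,N(d_k)}{D(d_k) + s\,N(d_k)}
         = -\,w_k^{6}\,\frac{2 + 5w_k^{2}}{5 + 2w_k^{2}}, \\[4pt]
|5 + 2u|^2 - |2 + 5u|^2 &= 21\,(1 - |u|^2) > 0 \ \text{ for } |u| < 1
\quad\Longrightarrow\quad |w_{k+1}| \le |w_k|^{6}.
\end{align*}
```

The condition $|w_0| < 1$ holds exactly when $d_0$ and $s$ lie in the same open half plane, which is the case for $d_0 = \lambda_i$ and $s = \operatorname{sign}(\lambda_i)$. The bound $|w_{k+1}| \le |w_k|^6$ then shows both the convergence and the sixth-order rate.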

The iteration (9) requires one matrix inversion per computing step and obtains both $A^{1/2}$ and $A^{-1/2}$, which are of interest in (5). The implementation of (9) for computing principal square roots requires some care in order to save effort. Since the intermediate matrices are all sparse (at least half of the entries are zero), one can use sparse array techniques to save memory and time.

An implementation of (9) to compute $A \# B$ for two HPD matrices in the programming package Mathematica is given in Algorithm 1.

(* Algorithm 1: geometric mean A # B of two HPD matrices via iteration (9) *)
ClearAll["Global`*"]
(* FunM[fun, X]: matrix function fun(x) of X via the Jordan decomposition; *)
(* fun must be an expression in the global symbol x, e.g., Sqrt[x] *)
FunM[fun_, X_] := Module[{faux, dim, mataux, JordanD, sim, JordanF, eps,
   fdiag, diagQ, fauxD}, (dim = Length@X;
   faux[xx_, i_, j_] :=
    Which[i <= j, 1/Abs[i - j]! (D[fun, {x, Abs[i - j]}]) /. x -> xx, True, 0];
   mataux[Y_] := Table[faux[Y[[i, j]], i, j], {i, 1, dim}, {j, 1, dim}];
   (* JordanDecomposition returns {s, j} with X = s.j.Inverse[s] *)
   JordanD = JordanDecomposition[X] // N; sim = JordanD[[1]];
   JordanF = JordanD[[2]]; eps = 1*10^-10;
   (* diagQ measures how far the Jordan form is from being diagonal *)
   diagQ = Norm[JordanF - DiagonalMatrix[Diagonal[JordanF]]];
   fauxD[xx_] := (fun) /. x -> xx;
   fdiag := DiagonalMatrix[Map[fauxD, Diagonal[JordanF]]];
   Which[diagQ < eps, sim.fdiag.Inverse[sim], True,
    sim.mataux[JordanF].Inverse[sim]])]
(* MGM[A, B, maxIterations, tolerance]: matrix geometric mean A # B *)
MGM[A_, B_, maxIterations_, tolerance_] := Module[{k = 0},
  {n, n} = Dimensions[A]; Id = SparseArray[{{i_, i_} -> 1.}, {n, n}];
  (* starting block matrix {{0, A}, {I, 0}} of the identity (10) *)
  A1 = SparseArray@ArrayFlatten[{{0, N@A}, {Id, 0}}]; Y[0] = A1;
  R[0] = 1; Id2 = SparseArray[{{i_, i_} -> 1.}, {2 n, 2 n}];
  Quiet@While[k < maxIterations && R[k] >= tolerance,
    Y2 = Y[k].Y[k]; Y3 = Y2.Y[k]; Y4 = Y3.Y[k];
    Y5 = Y4.Y[k]; Y6 = Y5.Y[k]; Y7 = Y6.Y[k]; Y8 = Y7.Y[k];
    l1 = SparseArray[7 Id2 + 148 Y2 + 330 Y4 + 148 Y6 + 7 Y8];
    (* l1 is block diagonal, so it is inverted block by block *)
    l2 = SparseArray@ArrayFlatten[{{Inverse@l1[[1 ;; n, 1 ;; n]], 0},
        {0, Inverse@l1[[n + 1 ;; 2 n, n + 1 ;; 2 n]]}}];
    Y[k + 1] = SparseArray[(48 Y[k] + 272 Y3 + 272 Y5 + 48 Y7).l2];
    R[k + 1] = Norm[Y[k + 1] - Y[k], Infinity]/Norm[Y[k + 1], Infinity]; k++];
  (* upper-right block: A^(1/2); lower-left block: A^(-1/2) *)
  AS = Y[k][[1 ;; n, n + 1 ;; 2 n]]; IAS = Y[k][[n + 1 ;; 2 n, 1 ;; n]];
  Mat = Normal[IAS.B.IAS];
  AS.FunM[Sqrt[x], Mat].AS]

In Algorithm 1, the four-argument function MGM computes $A \# B$ from the following four arguments: the matrix $A$, the matrix $B$, the maximum number of iterations that (9) is allowed to take, and the tolerance of the stopping criterion in the infinity norm, $R_{k+1} = \|X_{k+1} - X_k\|_\infty / \|X_{k+1}\|_\infty$.

Note that for computing the principal matrix square root of $A^{-1/2} B A^{-1/2}$, we have used the Jordan canonical form approach, which is provided in general form for computing matrix functions in the code FunM.
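For readers without Mathematica, the overall procedure can be sketched in plain Python for small dense matrices. This is not a port of Algorithm 1: it uses nested lists instead of sparse arrays, inverts the full denominator rather than its diagonal blocks, and, as a swapped-in simplification, obtains the middle square root by running the same sign iteration a second time instead of the Jordan-form routine. All function names are illustrative.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def lincomb(coeffs, mats):
    """Entrywise linear combination sum_i coeffs[i] * mats[i]."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats))
             for j in range(cols)] for i in range(rows)]


def inverse(X):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(X)
    aug = [list(map(float, row)) + e for row, e in zip(X, identity(n))]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if aug[p][c] == 0.0:
            raise ZeroDivisionError("singular matrix")
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0.0:
                fac = aug[r][c]
                aug[r] = [vr - fac * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def norm_inf(X):
    return max(sum(abs(v) for v in row) for row in X)


def sign_iteration(X, tol=1e-13, max_iter=50):
    """Sixth-order sign iteration with the coefficients of Algorithm 1."""
    eye = identity(len(X))
    for _ in range(max_iter):
        X2 = matmul(X, X); X3 = matmul(X2, X); X4 = matmul(X2, X2)
        X5 = matmul(X4, X); X6 = matmul(X4, X2)
        X7 = matmul(X6, X); X8 = matmul(X4, X4)
        num = lincomb([48, 272, 272, 48], [X, X3, X5, X7])
        den = lincomb([7, 148, 330, 148, 7], [eye, X2, X4, X6, X8])
        X_new = matmul(num, inverse(den))
        done = norm_inf(lincomb([1, -1], [X_new, X])) <= tol * norm_inf(X_new)
        X = X_new
        if done:
            break
    return X


def sqrt_and_inverse_sqrt(A):
    """Read A^(1/2) and A^(-1/2) off the sign of the block matrix [[0, A], [I, 0]]."""
    n = len(A)
    Z = ([[0.0] * n + list(map(float, row)) for row in A]
         + [e + [0.0] * n for e in identity(n)])
    S = sign_iteration(Z)
    return [row[n:] for row in S[:n]], [row[:n] for row in S[n:]]


def geometric_mean(A, B):
    """A^(1/2) (A^(-1/2) B A^(-1/2))^(1/2) A^(1/2) for HPD inputs."""
    root_a, inv_root_a = sqrt_and_inverse_sqrt(A)
    middle = matmul(matmul(inv_root_a, B), inv_root_a)
    root_middle, _ = sqrt_and_inverse_sqrt(middle)
    return matmul(matmul(root_a, root_middle), root_a)


if __name__ == "__main__":
    print(geometric_mean([[2.0, 1.0], [1.0, 3.0]], [[4.0, -1.0], [-1.0, 2.0]]))
```

The block-by-block inversion in Algorithm 1 halves the size of the linear systems; the full inversion above is kept only to make the sketch shorter.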

4. Experiments

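We test the contributed method (9), denoted by PM, using Mathematica 8 in machine precision. Apart from this scheme, another iterative method, known as the DB method [15] and given by
$$Y_0 = A, \quad Z_0 = I, \qquad Y_{k+1} = \tfrac{1}{2}\left(Y_k + Z_k^{-1}\right), \quad Z_{k+1} = \tfrac{1}{2}\left(Z_k + Y_k^{-1}\right), \tag{22}$$
is considered. This method generates the sequences $\{Y_k\}$ and $\{Z_k\}$, which converge to $A^{1/2}$ and $A^{-1/2}$, respectively.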
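For comparison, a minimal sketch of the coupled Denman-Beavers recurrence is given below, in its standard form with $Y_0 = A$, $Z_0 = I$, $Y_{k+1} = \tfrac{1}{2}(Y_k + Z_k^{-1})$, and $Z_{k+1} = \tfrac{1}{2}(Z_k + Y_k^{-1})$. It is restricted to 2x2 matrices so that the inverse has a closed form; the function names and the stopping rule are illustrative.

```python
def matmul2(X, Y):
    return [[X[i][0] * Y[0][j] + X[i][1] * Y[1][j] for j in range(2)]
            for i in range(2)]


def inv2(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]


def average(X, Y):
    return [[0.5 * (X[i][j] + Y[i][j]) for j in range(2)] for i in range(2)]


def denman_beavers(A, tol=1e-14, max_iter=50):
    """Return approximations to A^(1/2) and A^(-1/2) for a 2x2 matrix A."""
    Y = [list(map(float, row)) for row in A]
    Z = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(max_iter):
        # Both updates use the iterates of the previous step.
        Y_new = average(Y, inv2(Z))
        Z_new = average(Z, inv2(Y))
        change = max(abs(Y_new[i][j] - Y[i][j])
                     for i in range(2) for j in range(2))
        scale = max(abs(v) for row in Y_new for v in row)
        Y, Z = Y_new, Z_new
        if change <= tol * scale:
            break
    return Y, Z


if __name__ == "__main__":
    print(denman_beavers([[2.0, 1.0], [1.0, 3.0]]))
```

This recurrence needs two inversions per step and converges quadratically, whereas the iteration (9) trades more matrix products per step for a higher order.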

We remark that the first and the second substeps of (6) result in the quadratic Newton’s method (NM) and cubic Chebyshev-Halley’s method (CHM) for matrix sign [7] in what follows:

Example 6. As the first example, we consider the matrix , , whereas its exact matrix geometric mean is given by [16]: .

The proposed approach converges to the solution matrix in 2 iterations, which shows very fast convergence.

Example 7. We now consider the two HPD matrices as follows: when and compute .

We compare the behavior of different methods and report the numerical results using for all norms involved with the stopping criterion in Figure 1. As can be seen, the numerical results are in harmony with the theoretical aspects of Section 3 and show faster convergence for the proposed method (9) than for DB, NM, and CHM.

5. Conclusions

Based on the quadratically convergent Newton's scheme and the cubically convergent method of Chebyshev-Halley type, we have developed an iterative method with sixth order of convergence for solving nonlinear scalar smooth equations. Its computational efficiency index is higher than that of NM. Then, the method, which shows global behavior when finding the matrix sign function, has been extended to compute the principal matrix square root and its inverse. This procedure was followed so as to build a fast algorithm for finding the matrix geometric mean of two HPD matrices. We have also studied the asymptotic stability of the proposed technique.

To illustrate the new technique, some numerical examples were presented. The computational results support the robust and efficient convergence behavior of the proposed method. Similar numerical experiments, carried out for a number of problems of different types, confirmed the above conclusions to a great extent.

We conclude the paper with the remark that many numerical applications require high precision in computations. The results of the numerical experiments show that efficient high-order methods such as (9), combined with multiple-precision arithmetic, are very useful, because they yield a clear reduction in the number of iterations.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors are grateful to anonymous reviewers for carefully reading the paper and helping to improve the presentation.

References

[1] W. Pusz and S. L. Woronowicz, “Functional calculus for sesquilinear forms and the purification map,” Reports on Mathematical Physics, vol. 8, no. 2, pp. 159–170, 1975.
[2] J. D. Lawson and Y. Lim, “The geometric mean, matrices, metrics, and more,” The American Mathematical Monthly, vol. 108, no. 9, pp. 797–812, 2001.
[3] D. A. Bini and B. Iannazzo, “A note on computing matrix geometric means,” Advances in Computational Mathematics, vol. 35, no. 2–4, pp. 175–192, 2011.
[4] R. Bhatia, Positive Definite Matrices, Princeton Series in Applied Mathematics, Princeton University Press, Princeton, NJ, USA, 2007.
[5] B. Iannazzo and B. Meini, “The palindromic cyclic reduction and related algorithms,” Calcolo, 2014.
[6] J. Hoste, Mathematica Demystified, McGraw-Hill, New York, NY, USA, 2009.
[7] N. J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, Pa, USA, 2008.
[8] J. M. Gutiérrez and M. A. Hernández, “A family of Chebyshev-Halley type methods in Banach spaces,” Bulletin of the Australian Mathematical Society, vol. 55, no. 1, pp. 113–130, 1997.
[9] R. F. Lin, H. M. Ren, Z. Šmarda, Q. B. Wu, Y. Khan, and J. L. Hu, “New families of third-order iterative methods for finding multiple roots,” Journal of Applied Mathematics, vol. 2014, Article ID 812072, 9 pages, 2014.
[10] F. Soleymani, “Some high-order iterative methods for finding all the real zeros,” Thai Journal of Mathematics, vol. 12, no. 2, pp. 313–327, 2014.
[11] F. Soleymani, S. Shateyi, and G. Özkum, “An iterative solver in the presence and absence of multiplicity for nonlinear equations,” The Scientific World Journal, vol. 2013, Article ID 837243, 9 pages, 2013.
[12] F. Soleymani, E. Tohidi, S. Shateyi, and F. K. Haghani, “Some matrix iterations for computing matrix sign function,” Journal of Applied Mathematics, vol. 2014, Article ID 425654, 9 pages, 2014.
[13] F. Soleymani, P. S. Stanimirović, S. Shateyi, and F. Khaksar Haghani, “Approximating the matrix sign function using a novel iterative method,” Abstract and Applied Analysis, vol. 2014, Article ID 105301, 9 pages, 2014.
[14] C. Kenney and A. J. Laub, “Rational iterative methods for the matrix sign function,” SIAM Journal on Matrix Analysis and Applications, vol. 12, no. 2, pp. 273–291, 1991.
[15] E. D. Denman and A. N. Beavers Jr., “The matrix sign function and computations in systems,” Applied Mathematics and Computation, vol. 2, no. 1, pp. 63–94, 1976.
[16] B. Iannazzo, “The geometric mean of two matrices from a computational viewpoint,” http://arxiv.org/abs/1201.0101.

Copyright

Copyright © 2014 F. Soleymani et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
