Research Article  Open Access
Some Matrix Iterations for Computing Matrix Sign Function
Abstract
Some iterative methods are introduced and demonstrated for finding the matrix sign function. It is analytically shown that the new schemes are asymptotically stable. Convergence analysis along with the error bounds of the main proposed method is established. Different numerical experiments are employed to compare the behavior of the new schemes with the existing matrix iterations of the same type.
1. Introduction
Recently, the theory of matrix functions becomes an active topic of research in the field of advanced numerical linear algebra (see, e.g., [1–4]). In fact, the most common matrix function is the matrix inverse or the MoorePenrose generalized inverse, routinely used in the scientific problems [5]. General matrix functions as well as the specific cases have been extensively discussed and developed in [6].
This paper is concerned with a special case known as matrix sign function, which is of clear importance in the theory of matrix functions [7]. Let us, as Higham considered in the fifth Chapter of [6], assume throughout this paper that the matrix has no eigenvalues on the imaginary axis. This assumption implies that the matrix sign function, can be uniquely defined, whereas is a nonsingular square matrix. In order to define , we remember the matrix sector function, for any positive integer , can be defined by
Choosing in the matrix sector function will yield in the matrix sign function as . This also clearly puts on show the importance and the relevance of this matrix function to the other important matrix functions such as matrix square root.
Bini et al. in [8] proved that the principal th root of the matrix is a multiple of the (2,1)block of the matrix sign function, , for the following block companion matrix
The matrix has the following properties.(1) ( is involutory).(2) is diagonalizable with eigenvalues .(3).(4)If is real, then is real.(5) and are projectors onto the invariant subspaces associated with the eigenvalues in the right halfplane and left halfplane, respectively.
Although has eigenvalues of , its norm can be arbitrarily large. Note that, for diagonalizable , eigenvectors of are eigenvectors of , with eigenvalues of and , respectively. For more, see [9].
There are some other definitions for in the literature based on the Jordan canonical form and the integral representation. As indicated in [6], one of the most useful and widely applicable methods for computing is the matrix iteration of Newton given by
In 1991, a fundamental family of matrix iterations for finding the matrix sign function was introduced in [10] using Padé approximants to and the following characterization: where . Let the Padé approximant to be , and . The iteration has been proved to be convergent to and −1 with order of convergence form . We remark that iterative methods of the type are fixedpoint type iterations, and if does not converge then its reciprocal; that is, converges to the sign matrix (e.g., forthcoming iterations (7) and (8)). Generally speaking, the iterations of Kenney and Laub (6), generated by the and Padé approximants, are globally convergent and their orders depend on . A discussion about such iterations was given in [11–13].
A lot of known methods could be extracted from the Padé family (6). For example, the wellknown Halley’s matrix iteration of order three can be deduced as follows: Another fourthorder method could be attained as follows:
Note that, for lower order methods such as (4), the convergence is slow; that is, initially convergence can be slow if . Hence, a scaling approach (a.k.a. norm scaling) to accelerate the beginning of this phase is necessary and can be done in what follows [6]:
Such an approach could be done to refine the initial matrix and to provide a much more robust initial matrix to arrive at the convergence phase rapidly.
The rest of this paper has been organized as follows. Section 2 gives the basic idea of obtaining other higher order solvers for computing , while a fourthorder family of methods has been introduced. Section 3 is devoted to find the best method of this family in terms of the lowest computational cost. An analysis will be given to show that the new matrix iteration is asymptotically stable with local quartic convergence. To find a method with fourth order and global convergence, we also give another matrix iteration therein. Convergence analysis along with the error bounds of the proposed method is established. Numerical studies will be included in Section 4 to compare the efficiency and the stability of the schemes for finding using different tests. Finally, a short conclusion of the study will be drawn in Section 5.
2. Basic Idea
The connection between the matrix iteration of Newton and Newton’s rootfinding method may not be clear at the first sight. Generally speaking, in the theory of matrix functions, many of the matrix functions could effectively be calculated by the existing iterative methods for finding the solution of nonlinear equations [14].
To illustrate further, apply Newton’s method on the following matrix equation: in which is the identity matrix of the appropriate size; it would yield in the matrix Newton’s iteration (4). Note that is one solution of (10). Note that, in the last decade, many efficient higherorder iterative methods have been developed for solving nonlinear equations [15], and some of them have been extended to solve nonlinear matrix equation; see, for example, [16, 17]. But our work is the first to discuss the application of high order root solvers for matrix sign function.
Let us consider the following fourthorder family of iterative methods [18]: wherein is a free parameter in . Applying the uniparametric family (11) on matrix equation (10) results in the following novel family of matrix iterations:
The free parameter plays an important role in the next section to derive the best possible matrix iteration out of (12).
3. Main Results
In order to reduce the computational complexity of (12), the parameter must be chosen as if the number of matrixmatrix products gets down along the number of matrix inversions.
Choosing will simplify the whole family into the following method with reasonable computational cost in contrast to its convergence order:
We now rewrite obtained iteration (13) as efficiently as possible to reduce the number of matrixmatrix multiplications in what follows: where and .
Definition 1 (stability [6]). Consider an iteration with a fixed point . Assume that is Fréchet differentiable at . The iteration is stable in a neighborhood of if the Fréchet derivative has bounded powers; that is, there exists a constant such that for all .
Now, we first investigate the stability of (14) for the matrix sign function in a neighborhood of the solution of (10). In fact, we analyze how a small perturbation at the th iterate is amplified or damped along the iterates. Note that a general way for assessing the stability of some matrix iterations has been studied by Iannazzo in [19]. The forthcoming approach follows these results of [19].
Lemma 2. The sequence generated by (14) is asymptotically stable.
Proof. Stability concerns behavior close to convergence and so is an asymptotic property. Let be the numerical perturbation introduced at the th iterate of (14). Next, one has
Here, we perform a firstorder error analysis; that is, we formally neglect quadratic terms such as . This formal manipulation is meaningful if is sufficiently small. We have
where the following identity has been used (for any nonsingular matrix and the matrix ):
Note that the commutativity between and is not used throughout this paper because it does not hold. Further simplifying yields to
where , , and for large enough we have , and , is close to zero (matrix) and can be neglected by choosing . After some algebraic manipulation and using , we conclude that
Applying (19) recursively, and after some algebraic manipulations, we have
From (20), we can conclude that the perturbation at the iterate is bounded. Therefore, the sequence generated by (14) is asymptotically stable.
Remark 3. If is a function of , then the iterates from (14) or (23) are all functions of and hence commute with .
The proposed matrix iteration requires one matrix inverse per iteration along four matrixmatrix products to achieve the convergence order four, while Halley’s method requires one matrix inversion and three matrixmatrix products to reach order three.
The basins of attraction for the two iterative methods of (4) and (14) to solve in the complex plane have been drawn in Figure 1. It shows that the new scheme has larger basins due to its higher convergence order. Note that the roots have been identified with two white points.
(a)
(b)
Iannazzo in [20] discussed that the matrix convergence is governed by scalar convergence. That is to say, the fourthorder convergence of new method (14) might not be global unlike Newton’s iteration (4). To illustrate further, if one chooses a matrix (in Figure 1(b)) with one eigenvalue with negative real part, but in a yellow petal, then the matrix iteration will not converge to . Therefore, it could be mentioned that scheme (14) converges with local fourth order. This restriction has encouraged us to propose another fourthorder method with global convergence in what follows and leave behind the previous iteration with interest in terms of theoretical analysis only.
Let us apply the following fourthorder nonlinear solver [16] on matrix equation (10): which reads the error equation wherein and , and then to obtain its corresponding matrix iteration, as follows where , , and . Figure 2 shows the basins of attraction for new method (23) and scheme (7), while both reveal global convergence. The higher order of convergence for (23) made its basins larger and lighter.
(a)
(b)
Remark 4. In this paper, we restrict the analyses to asymptotically small perturbations; that is, we use the differential error analysis.
Lemma 5. The sequence generated by (23) is asymptotically stable.
Proof. Using the same assumptions as in the Proof of Lemma 2 (perturbations are restricted to a neighborhood of the solution), we can write the following: Further simplifying and by considering the terminology of Lemma 2, we attain It shows that the main proposed iterative scheme (23) is asymptotically stable, since recursively one has . The proof is complete.
Remark 6. The analysis done does not take into account the influence of the rounding errors on the convergence. Sometimes these errors may lead to slower convergence or even to the failure of the method. We remark concerning this potential danger for problems solving in single precision arithmetic or in lower precision.
Theorem 7. Let have no pure imaginary eigenvalues. Then, for the proposed iterates in (23), is defined per stage.
Proof. We must show that which is obtained at each iteration is nonsingular, since the inverse of the matrix must be computed per computing step. Toward this goal, it is enough to show that the eigenvalues of the computed matrix at the end of each iteration are in the open halfplane.
Using the initial matrix and based on the fact that has no eigenvalues on the imaginary axis, the eigenvalues of the initial matrix are in the open halfplane. Let be the eigenvalue of the matrix in the th iterate. We have , where . Hence,
and consequently
Therefore, the eigenvalues of remain in their open halfplane under mapping (23). And is defined and is nonsingular for all .
Theorem 8. Let has no pure imaginary eigenvalues. Then, the proposed iterates of (23) converge to the sign matrix .
Proof. For our analysis, we assume that is diagonalizable; that is, there exists a nonsingular matrix such that
where are the eigenvalues of . Note that we know that [10]
for any nonsingular matrix . On the other hand, if we define , then we have from (23) that
Notice that if is a diagonal matrix, then all successive are diagonal too. From (30), it is enough to prove that converges to the sign of and then ensure the convergence of the sequence generated by (23) to .
Therefore, we can write (30) as uncoupled scalar iterations to solve , with , given by
where and . On the other hand, , for all . From (30) and (31), it is enough to study the convergence of to the sign of , for all .
From (31) and since the eigenvalues of are not pure imaginary, we have that . Thus, we attain
Since , we obtain
and . This shows that is convergent. Now, it could be easy to conclude that . Recalling , we have
and subsequently the convergence is established. The proof is complete.
Remark 9. It is clear that convergence will be slow if either or has eigenvalues close to the imaginary axis. Hence, it is better to first construct a robust seed by scaled method (9).
Theorem 10. Let have no pure imaginary eigenvalues. Then, new method (23) has fourth order to find the sign matrix .
Proof. Clearly, the are rational functions of and hence, like , commute with . On the other hand, we know that , , , and , . Choosing (for the sake of simplicity), we have Now, using a matrix operator norm from both sides of (35), we attain This reveals the fourth order of convergence for new method (23). The proof is ended.
4. Numerical Results
This section addresses issues related to the numerical precision of the computation of matrix sign function using Mathematica 8 builtin precision [21, 22]. The value of machine precision that produced the results included here is 15.96 digits, which corresponds to a 53digit binary double precision number with a mantissa [23].
For numerical comparisons in this section, we have used methods (4) denoted by “Newton,” (7) denoted by “Halley,” (8) denoted by “M4,” (14) denoted by “PM1,” and (23) denoted by “PM2.”
It must be noted that the NewtonSchulz iteration, which replaces the inverse of the matrix in each iteration of (4) by the Schulz inversefinder [6] and can be written as will not be considered in the numerical comparisons. Because due to the use of Schulz iteration instead of the matrix inversion, though the quadratic convergence of (4) will remain unchanged, it fully demands a good initial matrix and might be more risky to diverge in contrast to scheme (4). In fact, its convergence is guaranteed only if ; see Figure 3(a).
(a)
(b)
We report the running time using the command AbsoluteTiming for the elapsed CPU time (in seconds) in the experiments. The computer specifications are Microsoft Windows XP Intel(R), Pentium(R) 4, and CPU 3.20 GHz, with 4 GB of RAM.
Example 1. The aim of this example is to compare different methods for finding the matrix sign function of a randomly generated dense matrix as follows: n = 600; SeedRandom; A = RandomReal[{100, 100}, {n, n}].In this test example, the prescribed tolerance is and the maximum number of iterations is set to 100.
The results of comparisons in terms of the number of iterations and the computational time have been reported in Table 1, for various matrix iterations in finding the matrix sign function numerically. Note that whatever the eigenvalues of a matrix are closer to the imaginary axis, the speed of convergence for the different methods becomes slower and more risky to face with singular matrices , whose inverse could not be computed; see Figure 3(b).

Note that method (13) is not competitive in terms of computational cost and local convergence and hence we will remove it from further consideration. Unlike it, new method (23) has global convergence with asymptotical stability and could be considered as an alternative over the existing iterative methods for finding .
Example 2. In this example, 15 random dense matrices are considered to compare the behavior of different methods in what follows: n = 120; number = 15; SeedRandom; Table[A[l] = RandomReal[{5, 5},{n, n}];, {l, number}].For this test, the prescribed tolerance is and the maximum number of iterations is set to 100.
The results of comparisons are reported in Figure 4. New method (23) beats its competitors in terms of the number of iterations, while both (23) and (8) are the best iterations in terms of the computational time. Note that, in the last two examples, we have used double precision arithmetic in our calculations.
(a)
(b)
Although this example showed the robustness of new method (23), there is an approach to observe the order of convergence of different iteration methods numerically. To be more precise, the computational order of convergence for matrix iterations in finding the matrix sign function can be estimated by wherein are the last three approximations for finding in the convergence phase.
Example 3 (an academical test). Let us find the computational order of convergence for different methods when finding the matrix sign for the wellknown Wilson matrix as follows: In order to find the COC, we herein apply 64digit fixed point arithmetic in our calculations.
The convergence history along the COCs (in the infinity norm) using formula (38) for different methods is illustrated in Figure 5 and Table 2, applying the stopping termination . Results show that new method (23) is quite fast and its computational order of convergence for academical tests in high precision computing environment is around 4.

4.1. Scaling
Main proposed iteration (23) is quite fast and reliable due to the discussions in Sections 3 and 4. However, a way is open for speeding up its initial phase of convergence.
An effective way to enhance the initial speed of convergence is to scale the iterates prior to each iteration; that is, is replaced by . Such an approach can simply be done in what follows: where and .
5. Discussion
A function of a matrix can be defined and computed in several ways, such as Cauchy integral, polynomial interpolation, and Jordan canonical form. However, another approach is to use iteration methods for such computations. Several matrix functions can be computed by iteration , with an appropriate initial matrix where, for reasons of computational cost, is usually a polynomial or a rational function.
Under this motivation, in this paper we have introduced and demonstrated some fourthorder matrix methods for finding the matrix sign function. The proposed methods consist of one matrix inversion per cycle and are asymptotically stable. The consistency and efficiency of the contributed methods have also been tested numerically for finding the matrix sign functions to support the theoretical parts.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
The authors are grateful to the anonymous referees for their valuable comments and suggestions for improving the paper.
References
 S. Barrachina, P. Benner, and E. S. QuintanaOrtí, “Efficient algorithms for generalized algebraic Bernoulli equations based on the matrix sign function,” Numerical Algorithms, vol. 46, no. 4, pp. 351–368, 2007. View at: Publisher Site  Google Scholar  MathSciNet
 F. Filbir, “Computation of the structured stability radius via matrix sign function,” Systems and Control Letters, vol. 22, no. 5, pp. 341–349, 1994. View at: Publisher Site  Google Scholar
 N. Kyurkchiev and A. Iliev, “A refinement of some overrelaxation algorithms for solving a system of linear equations,” Serdica Journal of Computing, vol. 7, no. 3, pp. 245–256, 2013. View at: Google Scholar
 M. S. Misrikhanov and V. N. Ryabchenko, “A matrix sign function in problems of the analysis and design of linear systems,” Automation and Remote Control, vol. 69, pp. 198–222, 2008. View at: Google Scholar  MathSciNet
 S. Chountasis, V. N. Katsikis, and D. Pappas, “Applications of the MoorePenrose inverse in digital image restoration,” Mathematical Problems in Engineering, vol. 2009, Article ID 170724, 12 pages, 2009. View at: Publisher Site  Google Scholar  MathSciNet
 N. J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, Pa, USA, 2008. View at: Publisher Site  MathSciNet
 J. L. Howland, “The sign matrix and the separation of matrix eigenvalues,” Linear Algebra and its Applications, vol. 49, pp. 221–232, 1983. View at: Publisher Site  Google Scholar  MathSciNet
 D. A. Bini, N. J. Higham, and B. Meini, “Algorithms for the matrix $p$th root,” Numerical Algorithms, vol. 39, no. 4, pp. 349–378, 2005. View at: Publisher Site  Google Scholar  MathSciNet
 O. Gomilko, F. Greco, and K. Zietak, “A Pad\'e family of iterations for the matrix sign function and related problems,” Numerical Linear Algebra with Applications, vol. 19, no. 3, pp. 585–605, 2012. View at: Publisher Site  Google Scholar  MathSciNet
 C. Kenney and A. J. Laub, “Rational iterative methods for the matrix sign function,” SIAM Journal on Matrix Analysis and Applications, vol. 12, no. 2, pp. 273–291, 1991. View at: Publisher Site  Google Scholar  MathSciNet
 F. Greco, B. Iannazzo, and F. Poloni, “The Padé iterations for the matrix sign function and their reciprocals are optimal,” Linear Algebra and Its Applications, vol. 436, no. 3, pp. 472–477, 2012. View at: Publisher Site  Google Scholar  MathSciNet
 C. S. Kenney and A. J. Laub, “The matrix sign function,” Institute of Electrical and Electronics Engineers: Transactions on Automatic Control, vol. 40, no. 8, pp. 1330–1348, 1995. View at: Publisher Site  Google Scholar  MathSciNet
 M. Monsalve, “A secant method for the matrix sign function,” in Lecturas en Ciencias de la Computacin, RT 200903, Venezuela, 2009. View at: Google Scholar
 J. M. McNamee and V. Y. Pan, “Efficient polynomial rootrefiners: a survey and new record efficiency estimates,” Computers & Mathematics with Applications, vol. 63, no. 1, pp. 239–254, 2012. View at: Publisher Site  Google Scholar  MathSciNet
 F. Soleimani, F. Soleymani, and S. Shateyi, “Some iterative methods free from derivatives and their basins of attraction for nonlinear equations,” Discrete Dynamics in Nature and Society, vol. 2013, Article ID 301718, 10 pages, 2013. View at: Publisher Site  Google Scholar  MathSciNet
 F. K. Haghani and F. Soleymani, “On a fourthorder matrix method for computing polar decomposition,” Computational and Applied Mathematics, 2014. View at: Publisher Site  Google Scholar
 M. Monsalve and M. Raydan, “A new inversionfree method for a rational matrix equation,” Linear Algebra and Its Applications, vol. 433, no. 1, pp. 64–71, 2010. View at: Publisher Site  Google Scholar  MathSciNet
 F. Soleymani, S. K. Vanani, and A. Afghani, “A general threestep class of optimal iterations for nonlinear equations,” Mathematical Problems in Engineering, vol. 2011, Article ID 469512, 10 pages, 2011. View at: Publisher Site  Google Scholar  MathSciNet
 B. Iannazzo, Numerical solution of certain nonlinear matrix equations [Ph.D. thesis], Dipartimento di Matematica, Università di Pisa, 2007.
 B. Iannazzo, “A family of rational iterations and its application to the computation of the matrix $p$th root,” SIAM Journal on Matrix Analysis and Applications, vol. 30, no. 4, pp. 1445–1462, 2008. View at: Publisher Site  Google Scholar  MathSciNet
 http://reference.wolfram.com/mathematica/tutorial/LinearAlgebraInMathematicaOverview.html.
 M. Trott, The Mathematica GuideBook for Numerics, Springer, New York, NY, USA, 2006.
 H. D. Noble and R. A. Chipman, “Mueller matrix roots algorithm and computational considerations,” Optics Express, vol. 20, no. 1, pp. 17–31, 2012. View at: Publisher Site  Google Scholar
Copyright
Copyright © 2014 F. Soleymani et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.