Iterative FixedPoint Methods for Solving Nonlinear Problems: Dynamics and Applications
View this Special IssueResearch Article  Open Access
Approximating the Matrix Sign Function Using a Novel Iterative Method
Abstract
This study presents a matrix iterative method for finding the sign of a square complex matrix. It is shown that the sequence of iterates converges to the sign and has asymptotical stability, provided that the initial matrix is appropriately chosen. Some illustrations are presented to support the theory.
1. Introduction
The generic matrix function of a given matrix is defined formally by the integral representation where is an analytic function, , and is a closed curve which encircles all eigenvalues of (it should be contained in the domain of analyticity of ). The integral representation (1) is known as the Cauchy integral formula [1]. The integral of a matrix should be understood as the matrix whose entries are the integrals of the entries of . However, this mathematically appealing formula for computing the matrix functions is complicated and needs complex analysis to be fully understandable. Hence, several important other strategies for computing the matrix functions have been proposed and investigated, such as the Jordan canonical form and iterative methods for applied numerical problems (see, e.g., [1–3]).
In 1971, Roberts in [4] introduced the matrix sign function as a tool for model reduction and for solving Lyapunov and algebraic Riccati equations. He defined the sign function as a Cauchy integral and obtained the following integral representation of :
The matrix sign function is widely exploited in numerical linear algebra, especially in the computation of invariant subspaces and solutions of Riccati equations [5–7]. Note that the application of this function enables a matrix to be decomposed into two components whose spectra lie on opposite sides of the imaginary axis. The matrix sign function is a valuable tool for the numerical solution of Sylvester and Lyapunov matrix equations (see, e.g., [8]). An application of a generalization of the Newton iteration for the matrix sign function to the solution of the generalized algebraic Bernoulli equations was considered in [9].
Another application of this matrix function as a simple and direct method to derive some fundamental results in the theory of surface waves in anisotropic materials was presented in [10]. The authors of paper [11] investigated some practical iterations for matrix sector function which is a generalization of the matrix sign function.
Due to the applicability of the matrix sign function along with the difficulty of representation (2), stable iterative methods have become some viable choices.
The most important general family of matrix iterations for finding the matrix sign function was introduced in [12] using Padé approximants to and the following characterization: where and is less than 1 in magnitude. Let the Padé approximant to be and . The iteration has been proved to be convergent to and with the order of convergence for . Generally speaking, the iterations of Kenny and Laub (4), generated by the and Padé approximants, are globally convergent. A list of different iterations (in the scalar form) is given in Table 1.
Note that Iannazzo in [13] pointed out that these iterations can be obtained from the general König family (which goes back to Schröder [14, 15]) applied to the equation . For a recent method in this area, one may refer to [16].
A lot of known methods could be extracted from the Padé family (4). For example, the Newton’s iteration can be deduced as the reciprocal of the iteration corresponding to the case and in Table 1:
Choosing yields the following fifth order method:
Similarly, options and , or and , or and result in the following fifth order methods, respectively:
The remaining sections of this work are organized as follows. Section 2 presents the construction and the derivation of a new matrix iteration for finding using the following new nonlinear equation solver: where . (10) is a combination of Jarratt’s method [17] and a secant approach [18].
Note that (10) is a novel threestep iterative method (for the scalar case). Section 2 also studies the stability of the method and verifies its asymptotical stability. In Section 3, we discuss some other aspects of the new method, applicable in the implementation. Therein, we derive a new inversionfree method as well as a scaled method for finding . Section 4 is devoted to the numerical examples for illustrating the convergence behavior of the new method against the existing ones. We emphasize that our constructed solver possesses the same global convergence rate as (6), but numerical example will reveal that it has an equal or faster behavior for solving many randomly generated matrices. This would be a clear advantage of our solver in contrast to (6). Finally, concluding remarks will be drawn in Section 5.
2. A Novel Iterative Method
Throughout this work it is assumed that has no eigenvalues on the imaginary axis, so that is defined. Note that this assumption implies that is nonsingular.
As discussed in the fundamental article of Kenny and Laub in 1995 [19], the construction of new matrix methods for the matrix sign is related to the iterative methods for finding the solution of nonlinear scalar equations. Let us apply (10) to the following nonlinear matrix equation: in which is the identity matrix. In fact, the sign is a solution of the matrix equation (11). After application and simplification of its reciprocal, we obtain where .
Iannazzo in [20] mentioned that a matrix convergence is governed by the corresponding scalar convergence. Since the method (10) reads the following error equation: wherein and , therefore its corresponding matrix method (12) converges with fifth order of convergence. But the question is whether this convergence is local or global. To answer this question, we draw the basins of attraction for the new scheme along with the existing methods of various orders.
It is shown in Figures 1–3 that methods (5), (6), and (12) are globally convergent, while methods (7), (8), and (9) are locally convergent (if one chooses a matrix (in Figure 2 (left)) with one eigenvalue with negative real part, but in a green petal, then the matrix iteration will not converge to ).
The higher order convergence of (12) made its basins larger and lighter in contrast to (5). Hence, the new method could be of interest due to its global fifth order of convergence for finding .
Now, an important challenge, that must be proven, is to show the stability of the new method (12) for finding the matrix sign function. This will be done formally in what follows.
Definition 1 (stability, see [1]). Consider an iteration with a fixed point . Assume that is Fréchet differentiable at . The iteration is stable in a neighborhood of if the Fréchet derivative has bounded powers; that is, there exists a constant such that for all .
We investigate the stability of (12) for finding in a neighborhood of the solution of (11). In fact, we analyze how a small perturbation at the th iterate is amplified or damped along the iterates. Stability concerns behavior close to convergence and so is an asymptotic property.
Lemma 2. The sequence generated by (12) is asymptotically stable.
Proof. If is a function of , then the iterates from (12) are all functions of and hence commute with . Commutativity properties are frequently used when deriving a matrix iteration for finding .
In this study, we restrict the analysis to asymptotically small perturbations; that is, we use the differential error analysis.
Let be the numerical perturbation introduced at the th iterate of (12). Next, one has
Here, we perform a first order error analysis; that is, we formally use approximations , since , , is close to the zero (matrix). This formal manipulation is meaningful if is sufficiently small. We have
where the following identities are used (for any nonsingular matrix and the matrix ):
Note that after some algebraic manipulations and using , we have (assuming for enough large )
We used the following facts on the matrix sign function , and . We can now conclude that the perturbation at the iterate is bounded; that is,
Therefore, the sequence generated by (12) is asymptotically stable. This ends the proof.
Theorem 3. Let have no pure imaginary eigenvalues. Then, the matrix sequence defined by (12) converges to the matrix sign .
Proof . Let have a Jordan canonical form arranged as
where is a nonsingular matrix and , are square Jordan blocks corresponding to eigenvalues lying in (open lefthalf complex plane) and (open righthalf complex plane), respectively. Denote by and values lying on the main diagonals of blocks and , respectively.
Since is invertible, we know that is diagonalizable and [1]
Therefore,
On the other hand, if we define , then from the equation (12) we obtain
Notice that if is a diagonal matrix, then all successive are diagonal too. From (22), it is enough to prove that converges to , in order to ensure the convergence of the sequence generated by (12) to .
We can write (22) as uncoupled scalar iterations to solve , given by
where and . From (22) and (23), it is enough to study the convergence of to , for all .
From (23) and since the eigenvalues of are not pure imaginary, we have that . Thus, we attain
From the other point of view, since , we obtain
and . This shows that is convergent. Now, one concludes that .
Recalling , we have
and subsequently the convergence is established. The proof is complete.
Theorem 4. Consider the same conditions of Lemma 2 and Theorem 3. Then the proposed method (12) has fifth order of convergence to the sign matrix .
Proof . Clearly, are rational functions of and hence, like , commute with . On the other hand, we know that , , , and , . Using the replacement (for the sake of simplicity), we have Now, using any matrix norm from both sides of (27), we attain This reveals the fifth order of convergence for the new method (12). The proof is complete.
3. MultiplicationRich and Scaled Variants of the New Method
In general, reduction of a problem in numerical linear algebra to the matrix inversion problem is not an advisable technique. The iterations (5)–(9) as well as (12) require explicit computation of a matrix inverse in each iterative step. Since explicit usage of the inverse matrix is relatively rare in numerical analysis, there is a normal aspiration to approximate the inverse. Such discussions and variants for (12) will be given in the following subsections.
3.1. Solve a Matrix Equation Instead of the Matrix Inverse
One of the ways to avoid explicit usage of the inverse matrix is to solve corresponding system of linear matrix equations instead of the matrix inverse. This is done in Algorithm 1.

3.2. Use Approximation of the Matrix Inverse
Another tendency is to give matrix multiplicationrich iterations which retain the convergence rate of the method. For example, the inverse in (5) can be replaced by one step of Schulz method for the matrix inverse, which has the form . This replacement produces the NewtonSchulz iteration which is multiplicationrich and retains the quadratic convergence of Newton’s method. However, it is only locally convergent, with convergence guaranteed for (see [1]). We apply similar idea to (12). For the sake of simplicity, we use the notation . Replacing with in (12), we get the following iterative rule for computing the matrix sign function:
Formula (30) is multiplicationrich with convergence guaranteed for .
Note that inversefree algorithms are suitable for the implementation on vector and parallel computers. The iterative scheme (30) includes 9 matrix multiplications, while the complexity of (12) contains 6 matrix multiplications and one matrix inversion to achieve fifth convergence order.
3.3. A Hybrid Method
An efficient algorithm for finding the sign by avoiding the computation of matrix inverse is to use (12) or (33) in initial iterations, until is close enough to , and in a subsequent stage apply (30). Such a switching approach was proposed originally in [19]. A similar idea is exploited in Algorithm 2.

3.4. Scaling Method
For lower order methods, such as (5), the convergence is slow at the beginning. In fact, once the error is sufficiently small (in practice, less than, say, 0.5), successive errors decrease rapidly, each being approximately the square of the previous one. However, initial convergence can be slow if the iteration has a large eigenvalue, that is, in the case . Hence, a scaling approach to accelerate the beginning of this phase is necessary and can be done in what follows [7] for the Newton’s method: wherein
Such an approach could be done to refine the initial matrix and to provide a much more robust initial matrix to arrive at the convergence phase rapidly.
The new iteration (12) is quite fast and reliable due to the discussions in Sections 2 and 3. However, a way is open for speeding up its initial phase of convergence via the concept of scaling.
An effective way to enhance the initial speed of convergence is to scale the iterates prior to each iteration, that is, is replaced by . Such an idea can simply be done in what follows: where and . Algorithm 3 illustrates an efficient method based on (33) for finding .

4. Experimental Results
There are basically two general ways to terminate a matrix iteration for finding , that is, to stop when has relative error below a tolerance or to stop when some suitable residual (such as ) is below the tolerance. The relevant aim will in general be problem dependent. However, the stop termination is really important in matrix methods.
The considered stopping termination in this section would be the safe strategy introduced in [1] as follows:
For comparisons, we have used the matrix globally convergent methods (5), (6), and (12) using Mathematica 8 builtin precision, [21]. We used Mathematica function Inverse to compute the required matrix inverse. Implementation details of the function Inverse are based on efficient row reduction (Gaussian elimination) based on numerical approximation.
Example 5. In this test, we examine the behavior of different iterative methods for finding the matrix sign function of 10 randomly generated complex square matrices as follows: n = 70; number = 10; SeedRandom; Table[A[l] = RandomComplex[, , {n, n}];, {l, number}].
The results of comparisons in terms of the number of iterations have been reported in Table 2. We remark that whatever the eigenvalues of a matrix are closer to the imaginary axis, the speed of convergence for different methods becomes slower and more risky to face with singular matrices , whose inverses could not be computed.
In this example, we have used the stopping criterion (34) with , matrix norm one, and as the initial matrix.
Example 6. In order to confirm the acceleration via scaling, we have reran Example 5 with the norm scaling. The results are summarized in Table 3. The numerics reverify the effectiveness of the new method (12) in finding the matrix sign.
5. Summary
Interest in the sign function grew steadily starting from the 1970s and 1980s, initially among engineers and later among numerical analysts. Following such a trend, in this study we proposed a fifth order new iterative method for finding the matrix sign .
Applying the basins of attraction in the complex plane, based on the results from [20], we concluded that the introduced method (12) has global convergence. We then theoretically found that it is asymptotically stable. Some numerical experiments have been studied in order to show the faster convergence on the basis of a smaller number of iterations.
Several modifications of the introduced method have been established, such as a new inversionfree method, a composite method, and a scaled version.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The second author gratefully acknowledges support from the Research Project 174013 of the Serbian Ministry of Science. The authors are grateful to an anonymous reviewer for carefully reading their paper and helping to improve the presentation.
References
 N. J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, Pa, USA, 2008. View at: Publisher Site  MathSciNet
 J. L. Howland, “The sign matrix and the separation of matrix eigenvalues,” Linear Algebra and Its Applications, vol. 49, pp. 221–232, 1983. View at: Publisher Site  Google Scholar  MathSciNet
 J. LeyvaRamos, “A note on mode decoupling of linear timeinvariant systems using the generalized sign matrix,” Applied Mathematics and Computation, vol. 219, no. 22, pp. 10817–10821, 2013. View at: Publisher Site  Google Scholar  MathSciNet
 J. D. Roberts, “Linear model reduction and solution of the algebraic Riccati equation by use of the sign function,” International Journal of Control, vol. 32, no. 4, pp. 677–687, 1980. View at: Publisher Site  Google Scholar
 Z. Bai and J. Demmel, “Using the matrix sign function to compute invariant subspaces,” SIAM Journal on Matrix Analysis and Applications, vol. 19, no. 1, pp. 205–225, 1998. View at: Publisher Site  Google Scholar  Zentralblatt MATH  MathSciNet
 R. Byers, C. He, and V. Mehrmann, “The matrix sign function method and the computation of invariant subspaces,” SIAM Journal on Matrix Analysis and Applications, vol. 18, no. 3, pp. 615–632, 1997. View at: Publisher Site  Google Scholar  Zentralblatt MATH  MathSciNet
 C. S. Kenney, A. J. Laub, and P. M. Papadopoulos, “Matrixsign algorithms for Riccati equations,” IMA Journal of Mathematical Control and Information, vol. 9, no. 4, pp. 331–344, 1992. View at: Publisher Site  Google Scholar  MathSciNet
 P. Benner and E. S. QuintanaOrtí, “Solving stable generalized Lyapunov equations with the matrix sign function,” Numerical Algorithms, vol. 20, no. 1, pp. 75–100, 1999. View at: Publisher Site  Google Scholar  MathSciNet
 S. Barrachina, P. Benner, and E. S. QuintanaOrt, “Efficient algorithms for generalized algebraic Bernoulli equations based on the matrix sign function,” Numerical Algorithms, vol. 46, no. 4, pp. 351–368, 2007. View at: Publisher Site  Google Scholar  MathSciNet
 A. N. Norris, A. L. Shuvalov, and A. A. Kutsenko, “The matrix sign function for solving surface wave problems in homogeneous and laterally periodic elastic halfspaces,” Wave Motion, vol. 50, no. 8, pp. 1239–1250, 2013. View at: Publisher Site  Google Scholar  MathSciNet
 B. Laszkiewicz and K. Zietak, “Algorithms for the matrix sector function,” Electronic Transactions on Numerical Analysis, vol. 31, pp. 358–383, 2008. View at: Google Scholar  MathSciNet
 C. Kenney and A. J. Laub, “Rational iterative methods for the matrix sign function,” SIAM Journal on Matrix Analysis and Applications, vol. 12, no. 2, pp. 273–291, 1991. View at: Publisher Site  Google Scholar  Zentralblatt MATH  MathSciNet
 B. Iannazzo, Numerical solution of certain nonlinear matrix equations [Ph.D. thesis], Universita degli Studi di Pisa, Facolta di Scienze Matematiche, Fisiche e Naturali, Pisa, Italy, 2007.
 E. Schröder, “Ueber unendlich viele Algorithmen zur Auflösung der Gleichungen,” Mathematische Annalen, vol. 2, no. 2, pp. 317–365, 1870. View at: Publisher Site  Google Scholar  MathSciNet
 E. Schröder, “On infinitely many algorithms for solving equations,” Tech. Rep. TR92121, Department of Computer Science, University of Maryland, College Park, Md, USA, 1992. View at: Google Scholar
 F. Soleymani, E. Tohidi, S. Shateyi, and F. Khaksar Haghani, “Some matrix iterations for computing matrix sign function,” Journal of Applied Mathematics, vol. 2014, Article ID 425654, 9 pages, 2014. View at: Publisher Site  Google Scholar
 P. Jarratt, “Some fourth order multipoint iterative methods for solving equations,” Mathematics of Computation, vol. 20, no. 95, pp. 434–437, 1966. View at: Google Scholar
 A. Iliev and N. Kyurkchiev, Nontrivial Methods in Numerical Analysis: Selected Topics in Numerical Analysis, LAP Lambert, 2010.
 C. S. Kenney and A. J. Laub, “The matrix sign function,” IEEE Transactions on Automatic Control, vol. 40, no. 8, pp. 1330–1348, 1995. View at: Publisher Site  Google Scholar  MathSciNet
 B. Iannazzo, “A family of rational iterations and its application to the computation of the matrix pth root,” SIAM Journal on Matrix Analysis and Applications, vol. 30, no. 4, pp. 1445–1462, 2008. View at: Publisher Site  Google Scholar  MathSciNet
 M. Trott, The Mathematica GuideBook for Numerics, Springer, New York, NY, USA, 2006.
Copyright
Copyright © 2014 F. Soleymani et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.