Abstract

This work is concerned with the construction of a new globally convergent iterative method for finding the sign of a square matrix having no eigenvalues on the imaginary axis. Toward this goal, the new method is built by applying a new four-step nonlinear equation solver to a particular matrix equation. It is shown that the proposed scheme is globally convergent with eighth order of convergence. To illustrate the effectiveness of the theoretical results, several computational experiments are worked out.

1. Preliminaries

The sign function for the scalar case is expressed by
\[
\operatorname{sign}(z)=\begin{cases}
1, & \operatorname{Re}(z)>0,\\
-1, & \operatorname{Re}(z)<0,
\end{cases}\tag{1}
\]
wherein \(z\) is not located on the imaginary axis. Roberts in [1] first extended this definition to matrices; the resulting matrix sign function has several important applications in scientific computing, for example, see [2-4] and the references cited therein. For instance, the off-diagonal decay of the matrix sign function is a well-developed area of study in statistics and statistical physics [5].

To proceed formally, let us consider that \(A\in\mathbb{C}^{n\times n}\) is a square matrix possessing no eigenvalues on the imaginary axis. We consider a Jordan canonical form of \(A\) written such that
\[
A=TJT^{-1},\tag{2}
\]
with the Jordan blocks arranged as
\[
J=\begin{pmatrix} J_1 & 0\\ 0 & J_2\end{pmatrix},\tag{3}
\]
where the eigenvalues of \(J_1\in\mathbb{C}^{p\times p}\) are in the open left half-plane, while the eigenvalues of \(J_2\in\mathbb{C}^{q\times q}\) are in the open right half-plane. It is now possible to write the following [6]:
\[
\operatorname{sign}(A)=T\begin{pmatrix} -I_p & 0\\ 0 & I_q\end{pmatrix}T^{-1},\tag{4}
\]
wherein \(p+q=n\). Note that \(\operatorname{sign}(A)\) is definable once \(A\) is nonsingular with no purely imaginary eigenvalues.

This procedure relies on a clear application of the Jordan canonical form and of the matrix \(T\). Herein, neither of the matrices \(T\) and \(J\) is unique. However, it is possible to verify that \(\operatorname{sign}(A)\) as provided in (4) does not rely on the particular selection of \(T\) or \(J\).

Here, a simpler interpretation of the sign matrix in the Hermitian case (namely, when all eigenvalues are real) can be given by
\[
\operatorname{sign}(A)=Q\operatorname{diag}\big(\operatorname{sign}(\lambda_1),\ldots,\operatorname{sign}(\lambda_n)\big)Q^{*},\tag{5}
\]
wherein
\[
A=Q\operatorname{diag}(\lambda_1,\ldots,\lambda_n)Q^{*}\tag{6}
\]
is a diagonalization of the square matrix \(A\).
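The Hermitian-case formula above translates directly into a few lines of code. The following sketch (in Python/NumPy, not the Mathematica environment used later in the paper; the function name `sign_hermitian` is ours) computes the sign of a Hermitian matrix through its diagonalization:

```python
import numpy as np

def sign_hermitian(A):
    """Matrix sign of a Hermitian matrix A with nonzero (real) eigenvalues,
    via the diagonalization A = Q diag(lam) Q^*."""
    lam, Q = np.linalg.eigh(A)          # Hermitian eigendecomposition
    if np.any(lam == 0.0):
        raise ValueError("A must have no zero eigenvalues")
    return Q @ np.diag(np.sign(lam)) @ Q.conj().T

A = np.array([[2.0, 1.0],
              [1.0, -3.0]])            # one positive, one negative eigenvalue
S = sign_hermitian(A)                  # S is involutory and commutes with A
```

Since all eigenvalues of a Hermitian matrix are real, the only failure mode is a zero eigenvalue, which is exactly the imaginary-axis restriction in this case.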

The significance of calculating \(\operatorname{sign}(A)\) in (4) stems from the fact that the sign function plays a central role in the theory of matrix functions, especially for principal matrix roots and the polar decomposition; for more, one may refer to [7-9].

Bini et al. in [10] proved that the principal \(p\)-th root of the matrix \(A\) could be written as a multiple of the (2,1)-block of the sign matrix \(\operatorname{sign}(C)\), associated with the block companion matrix
\[
C=\begin{pmatrix}
0 & I & \cdots & 0\\
\vdots & & \ddots & \vdots\\
0 & 0 & \cdots & I\\
A & 0 & \cdots & 0
\end{pmatrix}\in\mathbb{C}^{pn\times pn}.\tag{7}
\]
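For \(p=2\), this idea reduces to the classical identity \(\operatorname{sign}\begin{pmatrix} 0 & A\\ I & 0\end{pmatrix}=\begin{pmatrix} 0 & A^{1/2}\\ A^{-1/2} & 0\end{pmatrix}\). A minimal NumPy sketch of this special case follows (our own illustration; the sign of the block matrix is computed here with the classical Newton iteration rather than the method of [10]):

```python
import numpy as np

def matrix_sign_newton(X, iters=60):
    # Newton iteration X <- (X + X^{-1})/2 for the matrix sign
    for _ in range(iters):
        X = 0.5 * (X + np.linalg.inv(X))
    return X

A = np.array([[4.0, 0.0],
              [1.0, 9.0]])                 # eigenvalues 4 and 9
n = A.shape[0]
C = np.block([[np.zeros((n, n)), A],
              [np.eye(n), np.zeros((n, n))]])
S = matrix_sign_newton(C)
sqrtA = S[:n, n:]                           # (1,2)-block: principal square root
invsqrtA = S[n:, :n]                        # (2,1)-block: its inverse
```

The (1,2)-block recovers the principal square root and the (2,1)-block its inverse; for a general \(p\), the full \(pn\times pn\) companion matrix is used instead.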

It is requisite to focus on the most general case, i.e., when the eigenvalues may be complex, rather than being restricted to a narrow range of specific matrices, such as in (5).

An important point is that although \(\operatorname{sign}(A)\) is a square root of the unit matrix, it is not equal to \(I\) or \(-I\) unless the spectrum of \(A\) lies entirely in the open right or left half-plane, respectively. Thus, the sign function is a nonprimary square root of \(I\).

Apart from (4), an efficient way to derive matrix iterations for some matrix functions is to apply zero-finding iterative methods to an operator equation, which here is a matrix equation. Toward this goal, it is necessary to tackle a nonlinear equation as comes next:
\[
F(X):=X^{2}-I=0,\tag{8}
\]
wherein \(I\) is the unit matrix, so as to propose matrix methods for \(\operatorname{sign}(A)\). The main motivation of this work is to extend the recently published results of the literature [11, 12] in this category by providing a useful novel method for calculating the sign matrix. Furthermore, the proposed procedure can be applied to the calculation of the polar decomposition, the principal matrix square root, and several other scientific computing problems.

After this brief introduction to the matrix sign function in Section 1, the remaining sections of this study are organized as follows. Section 2 briefly surveys the existing matrix iterations and their importance for computing \(\operatorname{sign}(A)\). In Section 3, it is discussed how we construct a new method having global convergence behavior and not belonging to the Padé family of iterations for computing \(\operatorname{sign}(A)\). It is also shown that the proposed scheme converges with a high order of convergence. Computational reports are furnished in Section 4 to illustrate the higher computational precision of the constructed solver. A final remark is given in Section 5 with some directions for future studies.

2. The Literature

In the current research work, iterative methods are the main focus for calculating \(\operatorname{sign}(A)\). As a matter of fact, such matrix iteration methods are Newton-type schemes that are in essence fixed-point schemes, providing a convergent matrix sequence for a suitable initial value.

The connection of matrix iterative expressions with the sign function is not that straightforward, but in practice such methods can be constructed by applying a suitable root-finding method to the nonlinear matrix equation (8). Note that \(S=\operatorname{sign}(A)\) is a solution of this equation (refer to [12] and the references cited therein).

Applying the classic Newton's method (NM) to (8) yields
\[
X_{k+1}=\frac{1}{2}\left(X_k+X_k^{-1}\right),\qquad k=0,1,2,\ldots.\tag{9}
\]
An inversion-free version of (9), called the Newton-Schulz iteration [6], is defined by
\[
X_{k+1}=\frac{1}{2}X_k\left(3I-X_k^{2}\right),\tag{10}
\]
obtained by applying the well-known Schulz inverse-finder in order to remove the computation of the inverse matrix per step.
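As a concrete illustration, Newton's method (9) can be sketched in a few lines (Python/NumPy, our own illustration; the residual of the matrix equation is used to detect convergence):

```python
import numpy as np

def sign_newton(A, tol=1e-12, maxit=100):
    """Newton's method (9): X_{k+1} = (X_k + X_k^{-1})/2, with X_0 = A."""
    n = A.shape[0]
    X = A.astype(complex)
    for k in range(1, maxit + 1):
        X = 0.5 * (X + np.linalg.inv(X))
        if np.linalg.norm(X @ X - np.eye(n)) < tol:   # residual of X^2 - I = 0
            break
    return X, k

A = np.array([[1.0, 4.0],
              [2.0, -5.0]])    # real eigenvalues of both signs, none imaginary
S, iters = sign_newton(A)
```

The limit commutes with \(A\) and squares to the identity, which is a cheap sanity check on any computed sign matrix.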

The Newton-Schulz scheme is a second-order convergent, inversion-free method for calculating the sign matrix, but it suffers from the drawback that, unlike Newton's method (9), its convergence is only local.
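The local nature of the Newton-Schulz iteration (10) is easy to observe numerically: a standard sufficient condition for its convergence is \(\|I-A^{2}\|<1\). The sketch below (Python/NumPy, our own illustration) applies (10) to a matrix satisfying this condition:

```python
import numpy as np

def sign_newton_schulz(A, iters=50):
    """Inversion-free Newton-Schulz iteration (10):
    X_{k+1} = X_k (3I - X_k^2) / 2, with X_0 = A."""
    I = np.eye(A.shape[0])
    X = A.copy()
    for _ in range(iters):
        X = 0.5 * X @ (3.0 * I - X @ X)
    return X

# Here ||I - A^2|| is about 0.21 < 1, so the iteration converges.
A = np.array([[1.1, 0.0],
              [0.0, -0.9]])
S = sign_newton_schulz(A)      # approaches diag(1, -1)
```

For a matrix violating the condition (e.g., one with an eigenvalue of large modulus), the same loop diverges, which is why (10) is excluded from the global-convergence comparisons later on.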

Analogously, the third-order convergent Halley's method (HM) [13] for calculating the sign matrix is defined by
\[
X_{k+1}=X_k\left(3I+X_k^{2}\right)\left(I+3X_k^{2}\right)^{-1}.\tag{11}
\]
It is noted that all the above-mentioned schemes are particular cases of the Padé family presented and discussed in [13, 14]. The Padé approximation belongs to a broader category of rational approximations. Coincidentally, the best uniform approximation of the sign function on a pair of symmetric but disjoint intervals can be expressed as a rational function.

Recently, a fourth-order iterative method, given as (12), was furnished in [15].

3. Construction of a New Matrix Method

Assume the following nonlinear equation:
\[
f(x)=0,\tag{13}
\]
wherein \(f\) is a scalar function. In what follows, let us first present a new scheme for nonlinear equation solving. The idea behind increasing the order (see, e.g., [16, 17]) is to consider several substeps, while the newly appearing first derivatives are approximated via secant-like approximations. Thus, we may write the four-step scheme (14), wherein the first-order divided difference operator is defined by
\[
f[x,y]=\frac{f(x)-f(y)}{x-y}.\tag{15}
\]
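To make the secant-like idea concrete, the following scalar sketch (Python, our own illustration, not the four-step scheme (14) itself) shows the divided difference \(f[x,y]=(f(x)-f(y))/(x-y)\) approaching \(f'(x)\), and a classical Steffensen-type derivative-free step built from it:

```python
def f(x):
    return x**2 - 1.0        # scalar analogue of the matrix equation X^2 - I = 0

def dd(f, x, y):
    """First-order divided difference f[x, y] = (f(x) - f(y)) / (x - y)."""
    return (f(x) - f(y)) / (x - y)

x = 1.7
# f[x, y] tends to f'(x) = 2x as y -> x:
slope = dd(f, x, x + 1e-6)
# A derivative-free (Steffensen-type) step replacing f'(x) by f[x, x + f(x)]:
x_new = x - f(x) / dd(f, x, x + f(x))    # moves toward the root x = 1
```

Composing several such substeps, each reusing already computed function values, is what raises the order without additional derivative evaluations.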

Here, the point is that the new method should not aim at the optimal convergence order (in the sense of Kung-Traub; see, e.g., [16]), since such schemes lose the global convergence behavior when applied to (13). Since the final objective is to propose a method for the matrix sign, we should take two things into account: global convergence behavior and novelty. In fact, the Kung-Traub optimality conjecture is not useful once we extend iterative methods for calculating the sign of a matrix, because optimality ruins the final structure of the matrix method.

One may also ask why method (14) has been selected and what the other ways of improving it are. To answer this, we recall that the simplest way to improve the performance of (14) is to add one Newton substep at the end of its fourth step, which is costly since it includes the computation of the first derivative per cycle. In a similar way as in (14), we can instead add a secant-like substep and increase the convergence order. Generally speaking, a family of iterations can be constructed in this way. On the other hand, since very high-order methods may not be useful in double precision arithmetic, namely, the practical environment in which most researchers work, we here only provide (14) and discuss its application and extension to the matrix sign.

Theorem 1. Assume that \(\alpha\) is a simple root of a sufficiently differentiable function \(f\), and that the initial value \(x_0\) is close enough to \(\alpha\). Accordingly, the iterative expression (14) satisfies an error equation of the form
\[
e_{k+1}=K\,e_k^{8}+O\!\left(e_k^{9}\right),\tag{16}
\]
where \(e_k=x_k-\alpha\), \(c_j=\frac{f^{(j)}(\alpha)}{j!\,f'(\alpha)}\), and \(K\) is a combination of the \(c_j\).

Proof. The convergence order of this iterative method is established via Taylor expansions; the computation is straightforward and is hence omitted.

Applying (14) to the matrix equation (8) yields a novel matrix scheme (17) for calculating (4) in its reciprocal form, wherein the intermediate matrices are given by (18) and the initial approximation is
\[
X_0=A.\tag{19}
\]

Applying (14) to the scalar equation \(x^{2}-1=0\) yields global convergence in the complex plane (excluding values located on the imaginary axis). The basins of attraction for (10) (locally convergent) and (9) (globally convergent) are portrayed in Figure 1. The global behavior of the proposed scheme, which is preserved in the matrix case, is shown in Figures 2-3.

To draw the basins of attraction, we consider a square region of the complex plane and allocate a color to each point according to the simple zero to which the new method (or an existing method used for comparison) converges. Subsequently, we mark a point as black once the scheme diverges. Herein, a residual-based stopping termination is considered for convergence. Using a different stopping criterion, the density plots of the basins of attraction for the new high-order method (17) alone are brought forward in Figures 2-3 on different domains.
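The classification behind such basin plots can be sketched without any graphics. The snippet below (Python/NumPy, our own illustration, using the globally convergent scalar Newton map rather than (17)) labels each grid point by the root it converges to, with 0 as the divergence marker that would be colored black:

```python
import numpy as np

def newton_limit(z, maxit=100, tol=1e-8):
    """Iterate z <- (z + 1/z)/2 and classify the limit: +1, -1, or 0
    (no convergence, e.g. for points on the imaginary axis)."""
    for _ in range(maxit):
        if z == 0:
            return 0
        z = 0.5 * (z + 1.0 / z)
        if abs(z - 1.0) < tol:
            return 1
        if abs(z + 1.0) < tol:
            return -1
    return 0

# Label a coarse grid over the square [-2, 2] x [-2, 2].
xs = np.linspace(-2.0, 2.0, 9)
labels = [[newton_limit(complex(a, b)) for a in xs] for b in xs]
```

For the Newton map, the label is simply the sign of the real part, so the two basins are the open half-planes and only the imaginary axis is black.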

3.1. Convergence Study

Now it is shown that the proposed scheme is convergent under standard conditions, namely, when the matrix has no pure imaginary eigenvalues, in the absence of rounding errors.

Theorem 2. Assume that \(A\in\mathbb{C}^{n\times n}\) possesses no eigenvalues on the imaginary axis. Then the iteration expressed by (17) converges to \(S=\operatorname{sign}(A)\), using the initial matrix (19).

Proof. Assume that \(g\) is the rational operator in accordance with (17). Since \(A\) has a Jordan canonical form, there exists a nonsingular matrix \(T\) so that
\[
A=TJT^{-1}.\tag{20}
\]
It is recalled that diagonalizable and nondiagonalizable matrices alike have a Jordan normal form (20), whereas \(J\) includes the Jordan blocks. So,
\[
\operatorname{sign}(A)=T\operatorname{sign}(J)T^{-1}.\tag{21}
\]
An eigenvalue \(\lambda\) of \(X_k\) is transferred into an eigenvalue \(g(\lambda)\) of \(X_{k+1}\) via the iterative expression (17). This relation among the eigenvalues shows that it is required to investigate how \(g\) transforms the complex plane into itself. It is recalled that \(g\) has the feature of sign preservation, namely,
\[
\operatorname{sign}(g(z))=\operatorname{sign}(z).\tag{22}
\]
Moreover, it should have global convergence; that is, the sequence defined by
\[
z_{k+1}=g(z_k),\qquad k=0,1,2,\ldots,\tag{23}
\]
with \(z_0=z\) converges to \(\operatorname{sign}(z)\) whenever \(z\) is not located on the imaginary axis. At this moment, assume that the square matrix \(A\) has a Jordan canonical form arranged as in [6, p. 107]:
\[
T^{-1}AT=\Lambda=\begin{pmatrix} C & 0\\ 0 & N\end{pmatrix},\tag{24}
\]
wherein \(T\) is nonsingular and \(C\), \(N\) are square Jordan blocks associated with eigenvalues located in \(\mathbb{C}^{-}\) and \(\mathbb{C}^{+}\), respectively. We denote by \(\lambda_1,\ldots,\lambda_p\) and \(\lambda_{p+1},\ldots,\lambda_n\) the eigenvalues located on the main diagonals of the blocks \(C\) and \(N\), respectively. Applying (24), it is possible to write
\[
\operatorname{sign}(\Lambda)=\operatorname{sign}\left(T^{-1}AT\right)=T^{-1}\operatorname{sign}(A)T.\tag{25}
\]
Taking (25) into account, it is easy to deduce
\[
\operatorname{sign}(A)=T\operatorname{sign}(\Lambda)T^{-1}=T\begin{pmatrix} -I_p & 0\\ 0 & I_{n-p}\end{pmatrix}T^{-1}.\tag{26}
\]
From \(X_0=A\), we define
\[
D_k=T^{-1}X_kT,\tag{27}
\]
in order to obtain a sequence convergent to \(\operatorname{sign}(\Lambda)\). Thence, from the scheme (17), we simply obtain
\[
D_{k+1}=g(D_k),\qquad D_0=\Lambda.\tag{28}
\]
When \(D_0\) is a diagonal matrix, it is possible to show by mathematical induction that all successive \(D_k\) are diagonal matrices. The other case, when \(D_0\) is not diagonal, will be handled in the remaining part of the proof.
By rearranging (28) as \(n\) uncoupled scalar iterations, we have
\[
d_{k+1}^{i}=g\!\left(d_k^{i}\right),\qquad 1\le i\le n,\tag{29}
\]
where \(d_k^{i}\) is the \(i\)-th diagonal entry of \(D_k\). Using (28) and (29), we should investigate the convergence of \(d_k^{i}\) to \(s_i=\operatorname{sign}(\lambda_i)\), for all \(1\le i\le n\). From (29), and because the eigenvalues of \(A\) are not pure imaginary, the error ratio \(\left|\left(d_{k+1}^{i}-s_i\right)/\left(d_{k+1}^{i}+s_i\right)\right|\) can be bounded in terms of \(\left|\left(d_{k}^{i}-s_i\right)/\left(d_{k}^{i}+s_i\right)\right|^{8}\) times a factor that remains bounded by choosing the appropriate initial matrix (19). Since
\[
\left|\frac{d_0^{i}-s_i}{d_0^{i}+s_i}\right|<1,\tag{33}
\]
we attain
\[
\lim_{k\to\infty}\left|\frac{d_k^{i}-s_i}{d_k^{i}+s_i}\right|=0,\tag{34}
\]
and, therefore,
\[
\lim_{k\to\infty}d_k^{i}=s_i.\tag{35}
\]
At this point, it is possible to derive that
\[
\lim_{k\to\infty}D_k=\operatorname{sign}(\Lambda).\tag{36}
\]
Note that if \(D_0\) is not diagonal, the relation among the iterates' eigenvalues must be dealt with for (17). The eigenvalues of \(X_k\) are transformed from the iteration \(k\) into the iteration \(k+1\) via
\[
\lambda_i^{k+1}=g\!\left(\lambda_i^{k}\right).\tag{37}
\]
The relation (37) manifests that the eigenvalues generally converge to \(s_i=\pm 1\), namely,
\[
\lim_{k\to\infty}\lambda_i^{k}=s_i.\tag{38}
\]
Ultimately, it is easy to write that
\[
\lim_{k\to\infty}X_k=T\left(\lim_{k\to\infty}D_k\right)T^{-1}=T\operatorname{sign}(\Lambda)T^{-1}=\operatorname{sign}(A).\tag{39}
\]
This ends the convergence proof of (17) for calculating \(\operatorname{sign}(A)\).

Now, by considering (18) and the facts that the iterates \(X_k\) are rational functions of \(A\) and hence, like \(A\) itself, commute with \(S=\operatorname{sign}(A)\), and that \(S^{2}=I\) and \(S^{-1}=S\), it can be shown that the new scheme satisfies an error inequality of the form
\[
\left\|X_{k+1}-S\right\|\le\theta\left\|X_k-S\right\|^{8},\tag{40}
\]
wherein \(\theta\) is a bounded factor depending on the iterates. Inequality (40) shows the eighth order of convergence.

A scaling technique to accelerate the initial phase of convergence is normally requisite, since the asymptotic convergence rate cannot be observed in the initial iterates. Such an idea was discussed fully in [18] for Newton's method. A robust procedure to improve the initial convergence speed is to scale the iterate before each iteration; i.e., \(X_k\) should be replaced by \(\mu_k X_k\).

If the scaling parameter (for Newton's method) is defined by [18]
\[
\mu_k=\left|\det\left(X_k\right)\right|^{-1/n},\tag{41}
\]
then the accelerated form of the proposed matrix iteration is defined by applying (17) to the scaled iterate \(\mu_k X_k\), with the intermediate matrices of (18) scaled accordingly.
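For illustration, determinantal scaling \(\mu_k=|\det(X_k)|^{-1/n}\) can be grafted onto the Newton iteration (9) as follows (Python/NumPy; our own sketch of the scaling idea, not the accelerated form of the proposed scheme):

```python
import numpy as np

def sign_newton_scaled(A, tol=1e-12, maxit=100):
    """Newton's method (9) with determinantal scaling
    mu_k = |det(X_k)|^(-1/n) applied before each step."""
    n = A.shape[0]
    I = np.eye(n)
    X = A.astype(complex)
    k = 0
    while np.linalg.norm(X @ X - I) > tol and k < maxit:
        mu = abs(np.linalg.det(X)) ** (-1.0 / n)   # scaling parameter
        X = 0.5 * (mu * X + np.linalg.inv(mu * X))
        k += 1
    return X, k

# A badly balanced spectrum, where scaling shortens the initial phase:
A = np.diag([1.0e4, 2.0e3, -1.0e-3, -5.0])
S, iters = sign_newton_scaled(A)
```

On a well-balanced spectrum the scaling has little effect, but for widely spread eigenvalues, as in the example, it removes the long initial phase in which Newton's method essentially halves the largest eigenvalue at each step.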

4. Experiments

Herein, several experiments are discussed for the calculation of the sign matrix. The direct application of the new formula for finding \(\operatorname{sign}(A)\) is given below, though the application to computing the polar decomposition or solving the Yang-Baxter matrix equation can be given similarly. The simulations are run on an office laptop with Windows 7 Ultimate, equipped with an Intel(R) Core(TM) i5-2430M CPU @ 2.40 GHz and 16.00 GB of RAM, on a 64-bit operating system. In addition, the simulations are done in Mathematica 11.0 [19].

Various schemes are compared with respect to the number of iterations and the elapsed CPU time. Only globally convergent schemes are included in the comparison. The compared matrix methods are NM, HM, ANM, and PM1 (i.e., (17)). We do not include comparisons with methods having local convergence behavior, such as the Newton-Schulz method (10), or with (computationally expensive) methods from different categories, such as the ones based on the computation of the Cauchy integral.

The stopping criterion for our simulations is defined by
\[
\left\|X_{k+1}^{2}-I\right\|\le\epsilon.\tag{46}
\]
The reason for choosing (46) lies in the fact that, at each iterate, the obtained approximation should satisfy the main matrix equation (8). Thus, this criterion is more trustworthy than Cauchy-like terminations when calculating the sign of a matrix.

Experiment 1. Here, we calculate the sign matrix of 20 randomly generated complex matrices (with uniform distributions, via the following piece of code in the Mathematica environment):

SeedRandom[123]; number = 20;
Table[A[l] = RandomComplex[{-5 - 5 I, 5 + 5 I},
   {50 l, 50 l}];, {l, number}];

Noting that here I denotes the imaginary unit, \(\sqrt{-1}\).

The numerical reports are provided in Tables 1-2 for various sizes of the input matrices, based on the required number of iterations and the elapsed CPU times. The results uphold the analytical discussions of Sections 2-3. They manifest a clear improvement in the number of iterations and the total elapsed CPU time when applying (17). As a matter of fact, the mean number of iterations and the mean CPU times listed in the last rows of each table indicate that the scheme (17) has the best performance in general.

It is pointed out that the calculation of \(X_{k+1}^{2}\) per iteration for the stopping termination introduces one extra matrix product for NM, while HM and the presented scheme already compute this matrix in the middle of each cycle.

Experiment 2. The aim of this test is to check the convergence history of different methods for a randomly generated complex matrix as follows:

SeedRandom[1];
A = RandomComplex[{-100 - 100 I, 100 + 100 I},
   {1000, 1000}];

using a stopping criterion of the form (46).

The results for Experiment 2 are given in Figure 4, showing a stable and consistent behavior of the proposed scheme for finding the matrix sign function.

The numerical reports and evidence in Section 4 show a clear improvement in the mean CPU time. This was the main target of this paper: to propose an efficient method.

5. Discussion

In various fields of numerical linear algebra and scientific computing, the theory and computation of matrix functions are very useful. In modern numerical linear algebra, the sign function underlies an effective way of resolving topical problems of control theory. The function of a matrix can be defined in several ways, of which the following three are generally the most useful: the Jordan canonical form, polynomial interpolation, and the Cauchy integral.

In this research work, we have focused on iterative methods for this purpose. Hence, a high-order nonlinear equation solver has been employed to construct a novel scheme for calculating the sign of a matrix having no pure imaginary eigenvalues.

It was shown via attraction basins in the complex plane that the convergence is global and that the rate of convergence is eight. Finally, some numerical experiments in double precision arithmetic were performed to manifest the superiority and applicability of (17). Future studies may extend the discussed matrix iterations to the calculation of polar decompositions based on the application of the new schemes.

Data Availability

All the data used for comparison of different methods in this article have been generated using random generators, via the programming package Mathematica. The data can be generated in this way. Moreover, interested readers may contact the corresponding authors if they need any of such programming codes for further studies.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research project was supported by a grant from the “Research Center of the Female Scientific and Medical Colleges”, Deanship of Scientific Research, King Saud University.