Abstract

This paper introduces a novel neural-network-based approach for extracting some eigenpairs of real normal matrices of order $n$. Based on the proposed algorithm, the eigenvalues that have the largest or smallest modulus, real parts, or absolute values of imaginary parts can be extracted, as well as the corresponding eigenvectors. Although the ordinary differential equation on which the algorithm is built is only $n$-dimensional, it succeeds in extracting $n$-dimensional complex eigenvectors, which are in fact $2n$-dimensional real vectors. Moreover, we show that extracting eigenpairs of general real matrices can be reduced to extracting those of real normal matrices by employing the norm-reducing technique. Numerical experiments verify the computational capability of the proposed algorithm.

1. Introduction

The problem of extracting special eigenpairs of real matrices has attracted much attention both in theory [1–4] and in many engineering fields such as real-time signal processing [5–8] and principal or minor component analysis [9–12]. For example, we may wish to get eigenvectors, and the corresponding eigenvalues, that have the largest or smallest modulus, the largest or smallest real parts, or the largest or smallest imaginary parts in absolute value. The two most popular methods for this problem are the power method and the Rayleigh quotient method, in their direct forms or in the context of inverse iteration [13]. Recently, many neural-network-based methods have also been proposed to solve this problem [14–23]. However, most of those methods focus on computing eigenpairs of real symmetric matrices. Two ordinary differential equations (ODEs), (1) and (2), were proposed in [19] and [23], respectively, where $A$ is a real symmetric matrix. Both (1) and (2) efficiently compute the largest eigenvalue of $A$, as well as the corresponding eigenvector. In addition, they can compute the smallest eigenvalue of $A$ and the corresponding eigenvector by simply replacing $A$ with $-A$, as in (3).
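For concreteness, a minimal numerical sketch of this family of flows is given below, assuming the representative right-hand side $dx/dt = (x^{T}x)Ax - (x^{T}Ax)x$; the function name eigen_flow, the step size, and the iteration count are illustrative choices rather than the exact networks of [19, 23]. Euler integration recovers the largest eigenvalue through the Rayleigh quotient, and replacing $A$ with $-A$ recovers the smallest.

```python
import numpy as np

def eigen_flow(A, x0, step=1e-2, iters=20_000):
    """Euler-integrate the representative flow dx/dt = (x'x)Ax - (x'Ax)x.

    For a real symmetric A and a generic start x0, x(t) converges to an
    eigenvector of the largest eigenvalue. The state is renormalized each
    step only to control Euler drift; the exact flow preserves the norm.
    """
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(iters):
        Ax = A @ x
        x = x + step * ((x @ x) * Ax - (x @ Ax) * x)
        x = x / np.linalg.norm(x)
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                      # random symmetric test matrix

x = eigen_flow(A, rng.standard_normal(5))
print(x @ A @ x, np.linalg.eigvalsh(A).max())    # Rayleigh quotient vs. truth

y = eigen_flow(-A, rng.standard_normal(5))       # A -> -A, as in (3)
print(y @ A @ y, np.linalg.eigvalsh(A).min())
```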

An ODE, (4), for solving the generalized eigenvalue problem was proposed in [22], where $A$ and $B$ are two real symmetric matrices and $B$ can take a fairly general form. In particular, if $B$ is the identity matrix, (4) can be used to solve the standard eigenvalue problem, as (1) and (2) do.

References [16–18] extended these neural-network-based approaches to the case of real antisymmetric or other special real matrices of order $n$, where the proposed neural networks are described by $2n$-dimensional ODEs, since eigenvectors in those cases may be $n$-dimensional complex vectors, that is, $2n$-dimensional real vectors.

In this paper, we propose an approach for extracting six types of eigenvalues of $n$-by-$n$ real normal matrices, and the corresponding eigenvectors, based on (2) or (3). Although eigenvectors of real normal matrices may be $n$-dimensional complex vectors, the computation of our proposed method can be carried out in $n$-dimensional real vector space, which considerably reduces the scale of the networks. We then show that any real matrix can be made arbitrarily close to a normal matrix by a series of similarity transformations, based on which our proposed algorithm can be extended to the case of arbitrary real matrices.

2. Main Results

Let $i$ be the imaginary unit, $\bar{z}$ the conjugate of a complex number $z$, and $\operatorname{diag}(D_{1}, \ldots, D_{m})$ a block diagonal matrix, where $D_{k}$, $k = 1, \ldots, m$, is a square matrix at the $k$th diagonal block. Unless otherwise stated, $A$ is a real normal matrix of order $n$ throughout this paper.

Lemma 1 (see [13]). $A$ is a real normal matrix of order $n$ if and only if there exists an orthogonal matrix $P$ such that
$$P^{T}AP = \operatorname{diag}\left(\lambda_{1}, \ldots, \lambda_{l}, B_{1}, \ldots, B_{q}\right), \qquad (5)$$
where
$$B_{j} = \begin{pmatrix} \alpha_{j} & \beta_{j} \\ -\beta_{j} & \alpha_{j} \end{pmatrix}, \quad j = 1, \ldots, q, \quad l + 2q = n. \qquad (6)$$
Here, $\lambda_{k}$, $k = 1, \ldots, l$, are real eigenvalues of $A$ corresponding to the real eigenvectors $P_{k}$ (the $k$th column of $P$), and $\alpha_{j} \pm i\beta_{j}$, $j = 1, \ldots, q$, are pairs of complex eigenvalues of $A$ corresponding to the pairs of complex eigenvectors $P_{l+2j-1} \pm iP_{l+2j}$.

For simplicity, let $S = (A - A^{T})(A^{T} - A)/4$, $H = (A + A^{T})/2$, and $G = AA^{T}$. Based on (5) and (6), it is straightforward to verify
$$P^{T}SP = \operatorname{diag}\left(0, \ldots, 0, \beta_{1}^{2}I_{2}, \ldots, \beta_{q}^{2}I_{2}\right), \qquad (7)$$
$$P^{T}HP = \operatorname{diag}\left(\lambda_{1}, \ldots, \lambda_{l}, \alpha_{1}I_{2}, \ldots, \alpha_{q}I_{2}\right), \qquad (8)$$
$$P^{T}GP = \operatorname{diag}\left(\lambda_{1}^{2}, \ldots, \lambda_{l}^{2}, \left(\alpha_{1}^{2} + \beta_{1}^{2}\right)I_{2}, \ldots, \left(\alpha_{q}^{2} + \beta_{q}^{2}\right)I_{2}\right). \qquad (9)$$

Then, the following six conditions and two lemmas are presented, which will be used extensively in the sequel.
(C1) Let . Then for all , .
(C2) Let . Then for all , .
(C3) Let . Then for all , .
(C4) Let . Then for all , .
(C5) Let . Then for all , .
(C6) Let . Then for all , .

Lemma 2 (Theorem 4 in [23]). Assume that the nonzero initial value $x(0)$ is not orthogonal to the eigensubspace corresponding to the largest eigenvalue of $A$. Then, the solution of (2) starting from $x(0)$ converges to an eigenvector corresponding to the largest eigenvalue of $A$, which equals $\lim_{t \to \infty} x(t)^{T}Ax(t)/x(t)^{T}x(t)$.

Lemma 3 (Theorem 5 in [23]). Assume that the nonzero initial value $x(0)$ is not orthogonal to the eigensubspace corresponding to the smallest eigenvalue of $A$. Then, the solution of (3) starting from $x(0)$ converges to an eigenvector corresponding to the smallest eigenvalue of $A$, which equals $\lim_{t \to \infty} x(t)^{T}Ax(t)/x(t)^{T}x(t)$.

Remark 4. If we randomly choose $x(0)$, the projection of $x(0)$ on the eigensubspace corresponding to the largest or smallest eigenvalue of $A$ will be nonzero with high probability. Hence, (2) and (3) almost always work well with a randomly generated $x(0)$.

2.1. Computing the Eigenvalues with the Largest or Smallest Imaginary Parts in Absolute Value, as well as the Corresponding Eigenvectors

Without loss of generality, in this subsection we assume that $\beta_{1} \ge \beta_{2} \ge \cdots \ge \beta_{q} \ge 0$ in (7). Note that $\beta_{1}$ is then the largest absolute value of the imaginary parts of the eigenvalues of $A$.

Based on (7), we know that 0 (if any) is the eigenvalue of $S$ corresponding to the eigenvectors $P_{k}$, $k = 1, \ldots, l$, and that $\beta_{j}^{2}$, $j = 1, \ldots, q$, is the eigenvalue of $S$ corresponding to the eigenvectors $P_{l+2j-1}$ or $P_{l+2j}$. Replacing $A$ with the symmetric matrix $S$ in (2) yields (10); based on Lemma 2, we know that $y = \lim_{t \to \infty} x(t)$ is an eigenvector of $S$ corresponding to $\beta_{1}^{2}$ (thus, $\beta_{1}$ has been obtained), where $x(t)$ is a solution of (10). If $\beta_{1} = 0$, the $\beta_{j}$'s are all zero, meaning that $A$ is symmetric. In this case, we can directly use (2) or (3) to get the largest or smallest eigenvalue of $A$ and the corresponding eigenvector, respectively. Hence, we assume $\beta_{1} \neq 0$.
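A numerical sketch of this step is given below. It assumes the surrogate $S = (A - A^{T})(A^{T} - A)/4$ from (7) and reuses the representative eigen_flow from the Section 1 sketch (both are assumptions, not the paper's exact network); the test matrix is built directly from the canonical form (5)-(6), so the true answer is known.

```python
import numpy as np

def eigen_flow(A, x0, step=1e-2, iters=20_000):
    # Representative flow from the Section 1 sketch (an assumption).
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(iters):
        Ax = A @ x
        x = x + step * ((x @ x) * Ax - (x @ Ax) * x)
        x = x / np.linalg.norm(x)      # guard against Euler norm drift
    return x

rng = np.random.default_rng(1)
# Random 5x5 real normal matrix from the canonical form (5)-(6):
# one real eigenvalue 1.5 and conjugate pairs 0.5 +/- 2i, -1 +/- 0.7i.
P, _ = np.linalg.qr(rng.standard_normal((5, 5)))
D = np.zeros((5, 5))
D[0, 0] = 1.5
D[1:3, 1:3] = [[0.5, 2.0], [-2.0, 0.5]]
D[3:5, 3:5] = [[-1.0, 0.7], [-0.7, -1.0]]
A = P @ D @ P.T

S = (A - A.T) @ (A.T - A) / 4      # symmetric PSD; eigenvalues 0 and beta_j**2
y = eigen_flow(S, rng.standard_normal(5))
beta1 = np.sqrt(y @ S @ y)         # y has unit norm, so this is the Rayleigh quotient
print(beta1)                       # ~2.0, the largest |imaginary part|
```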

The following lemma introduces an approach for computing $\beta_{1}$ under condition (C1).

Lemma 5. Assume that (C1) holds. Let $\bar{\lambda} = \lim_{t \to \infty} x(t)^{T}Sx(t)/x(t)^{T}x(t)$, where $x(t)$ is a solution of (10). Then, $\bar{\lambda} = \beta_{1}^{2}$, so $\beta_{1} = \sqrt{\bar{\lambda}}$. In addition, the limit $y = \lim_{t \to \infty} x(t)$ exists.

Proof. Assume that ,  ,  , are the largest eigenvalues of , that is; . Therefore,
Based on Lemma 2 and (7), we know that $y$ should be a linear combination of $P_{l+1}$ and $P_{l+2}$. Let
In addition, by (8) we have
And by (9), we have
Because $P$ is an orthogonal matrix and the inequality required by (C1) holds for all indices, it is straightforward to verify the claim, thus proving the lemma.

Remark 6. Based on Lemma 5, if the conclusion of the lemma fails to hold, then (C1) surely does not hold, which can be used to check whether (C1) holds or not.
The following lemma introduces an approach for computing a pair of conjugate eigenvectors of $A$ corresponding to the eigenvalues with the largest imaginary parts in absolute value, under condition (C1).

Lemma 7. Assume that (C1) holds. Given any nonzero $x(0)$, let $\beta_{1}^{2}$ be the largest eigenvalue of $S$ and $y$ the corresponding eigenvector obtained by (10), and let $w = (A - A^{T})y/(2\beta_{1})$ and $\bar{\alpha} = y^{T}Ay/y^{T}y$. Then, $\bar{\alpha} \pm i\beta_{1}$ are two eigenvalues of $A$ corresponding to the eigenvectors $y \mp iw$, respectively.

Proof. Let $y$ and $w$ take the forms (11) and (12), respectively. Based on (13), we can write
Following the decomposition of as (5) and the definition of , we have where
Substituting (12) into (17), we get
Based on (13) and (19), it is straightforward to verify
In addition, since is the eigenvalue of corresponding to the eigenvector , we have . Hence, . Since , we have
By (20) and (21), we have
By (16) and (17), we have
Then, it is straightforward to verify the claim, thus proving the lemma.
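Continuing the sketch above, the eigenvector pair can be checked numerically. The formulas $w = (A - A^{T})y/(2\beta_{1})$ and $\bar{\alpha} = y^{T}Ay/y^{T}y$ follow the reading of Lemma 7 given here, so the snippet verifies the eigenpair residual explicitly rather than taking it on faith.

```python
import numpy as np

def eigen_flow(A, x0, step=1e-2, iters=20_000):
    # Representative flow from the Section 1 sketch (an assumption).
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(iters):
        Ax = A @ x
        x = x + step * ((x @ x) * Ax - (x @ Ax) * x)
        x = x / np.linalg.norm(x)
    return x

rng = np.random.default_rng(1)
P, _ = np.linalg.qr(rng.standard_normal((5, 5)))
D = np.zeros((5, 5))
D[0, 0] = 1.5
D[1:3, 1:3] = [[0.5, 2.0], [-2.0, 0.5]]    # pair 0.5 +/- 2i
D[3:5, 3:5] = [[-1.0, 0.7], [-0.7, -1.0]]  # pair -1 +/- 0.7i
A = P @ D @ P.T

S = (A - A.T) @ (A.T - A) / 4
y = eigen_flow(S, rng.standard_normal(5))  # eigenvector of S for beta1**2
beta1 = np.sqrt(y @ S @ y)
w = (A - A.T) @ y / (2 * beta1)            # companion direction (assumed form)
alpha = y @ A @ y                          # shared real part (y has unit norm)
v = y - 1j * w                             # candidate eigenvector for alpha + i*beta1
print(np.linalg.norm(A @ v - (alpha + 1j * beta1) * v))   # residual ~ 0
```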

To get the smallest absolute value of the imaginary parts, we can first get the smallest eigenvalue of $S$ by replacing $A$ with $S$ in (3), which yields (25).

Let $\bar{\lambda} = \lim_{t \to \infty} x(t)^{T}Sx(t)/x(t)^{T}x(t)$, where $x(t)$ is a solution of (25). From Lemma 3, we know that $\bar{\lambda}$ is the smallest eigenvalue of $S$. Then, the following lemma, similar to Lemma 5, can be used to compute $\beta_{\min}$, the smallest absolute value of the imaginary parts of the eigenvalues of $A$.

Lemma 8. Assume that (C2) holds. Then, $\bar{\lambda} = \beta_{\min}^{2}$, so $\beta_{\min} = \sqrt{\bar{\lambda}}$. In addition, the limit $y = \lim_{t \to \infty} x(t)$ exists.

Proof. The proof is almost the same as that of Lemma 5.

Note that $\beta_{\min}$ may be zero; that is, $A$ has real eigenvalues. In this case, we have the following lemma.

Lemma 9. Assume that (C2) holds and 0 is the smallest eigenvalue of $S$ corresponding to the eigenvector $y$. Then, $\bar{\alpha} = y^{T}Ay/y^{T}y$ is an eigenvalue of $A$ corresponding to the eigenvector $y$, where $y = \lim_{t \to \infty} x(t)$.

Proof. Following the conditions, we have . Hence, ; that is,
Note that ,  , because 0 is the smallest eigenvalue of . Based on the definition of , we have
Applying Lemma 3 to (25), we know that $y$ should be a linear combination of $P_{1}, \ldots, P_{l}$. Let
By (8), we have
Therefore,
Then, by (26) and (30), it is straightforward to verify
thus proving the lemma.

In the case of $\beta_{\min} \neq 0$, that is, when all the eigenvalues of $A$ are complex numbers, we have the following lemma, similar to Lemma 7.

Lemma 10. Assume that (C2) holds. Given any nonzero $x(0)$, let $\beta_{\min}^{2}$ be the smallest eigenvalue of $S$ and $y$ the corresponding eigenvector obtained by (25), and let $w = (A - A^{T})y/(2\beta_{\min})$ and $\bar{\alpha} = y^{T}Ay/y^{T}y$. Then, $\bar{\alpha} \pm i\beta_{\min}$ are two eigenvalues of $A$ corresponding to the eigenvectors $y \mp iw$, respectively.

Proof. The proof is almost the same as that of Lemma 7.

Remark 11. Among the following four real normal matrices, only the first three meet (C1); the last one does not. Similarly, only the last three meet (C2); the first one does not.

2.2. Computing the Eigenvalues with the Largest or Smallest Real Parts, as well as the Corresponding Eigenvectors

As shown in (8), $\lambda_{1}, \ldots, \lambda_{l}$ and $\alpha_{1}, \ldots, \alpha_{q}$ are the eigenvalues of the symmetric matrix $H$. In this subsection, we assume that $\lambda_{1} \ge \cdots \ge \lambda_{l}$ and $\alpha_{1} \ge \cdots \ge \alpha_{q}$, which can be achieved by reordering them and the corresponding columns of $P$.

Replacing $A$ with $H$ in (2), we get (33).

Without loss of generality, assume that $\alpha_{1}$ is the largest real part of the eigenvalues of $A$ (it may be $\lambda_{1}$; however, Lemmas 12 and 13 are unchanged in that case). Applying Lemma 2 to (33), we can get $\alpha_{1}$, the largest eigenvalue of $H$, and the corresponding eigenvector $y = \lim_{t \to \infty} x(t)$, where $x(t)$ is a solution of (33).
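A sketch of this step follows, assuming the symmetric part $H = (A + A^{T})/2$ from (8) and the same representative flow as before; the largest and smallest real parts are read off as Rayleigh quotients.

```python
import numpy as np

def eigen_flow(A, x0, step=1e-2, iters=20_000):
    # Representative flow (an assumption), as in the earlier sketches.
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(iters):
        Ax = A @ x
        x = x + step * ((x @ x) * Ax - (x @ Ax) * x)
        x = x / np.linalg.norm(x)
    return x

rng = np.random.default_rng(1)
P, _ = np.linalg.qr(rng.standard_normal((5, 5)))
D = np.zeros((5, 5))
D[0, 0] = 1.5
D[1:3, 1:3] = [[0.5, 2.0], [-2.0, 0.5]]
D[3:5, 3:5] = [[-1.0, 0.7], [-0.7, -1.0]]
A = P @ D @ P.T                    # spectrum: 1.5, 0.5 +/- 2i, -1 +/- 0.7i

H = (A + A.T) / 2                  # symmetric part; eigenvalues are the real parts
y = eigen_flow(H, rng.standard_normal(5))
print(y @ H @ y)                   # ~1.5, the largest real part

z = eigen_flow(-H, rng.standard_normal(5))   # H -> -H, i.e., using (3)
print(z @ H @ z)                   # ~-1.0, the smallest real part
```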

The following lemma introduces an approach for computing $\alpha_{1}$ under condition (C3).

Lemma 12. Assume that (C3) holds. Let $\bar{\alpha} = \lim_{t \to \infty} x(t)^{T}Hx(t)/x(t)^{T}x(t)$, where $x(t)$ is a solution of (33). Then, $\bar{\alpha} = \alpha_{1}$.

Proof. By Lemma 2, we know that $y$ should be a linear combination of the columns of $P$ associated with the largest eigenvalue of $H$. Let
Then, based on (9), we get
Because $P$ is an orthogonal matrix and the inequalities required by (C3) hold for all indices, we obtain the claim, thus proving the lemma.

The following lemma introduces an approach for computing a pair of conjugate eigenvectors of $A$ corresponding to the eigenvalues with the largest real parts, under condition (C3).

Lemma 13. Assume that (C3) holds. Given any nonzero $x(0)$, let $\alpha_{1}$ be the largest eigenvalue of $H$ and $y$ the corresponding eigenvector obtained by (33), and let $\bar{\beta} = \sqrt{y^{T}Sy/y^{T}y}$. If $\bar{\beta} = 0$, $\alpha_{1}$ is an eigenvalue of $A$ corresponding to the eigenvector $y$. If $\bar{\beta} \neq 0$, let $w = (A - A^{T})y/(2\bar{\beta})$. Then, $\alpha_{1} \pm i\bar{\beta}$ are two eigenvalues of $A$ corresponding to the eigenvectors $y \mp iw$, respectively.

Proof. Combining the proofs of Lemmas 9 and 7, we can prove this lemma.

Replacing $A$ with $H$ in (3), we get (37).

Without loss of generality, assume that $\alpha_{q}$ is the smallest real part of the eigenvalues of $A$ (it may be $\lambda_{l}$; however, Lemma 14 is unchanged in that case). Applying Lemma 3 to (37), we can obtain $\alpha_{q}$, the smallest eigenvalue of $H$, as well as the corresponding eigenvector, denoted by $y$. Then, we have the following lemma.

Lemma 14. Assume that (C4) holds. Given any nonzero $x(0)$, let $\alpha_{q}$ be the smallest eigenvalue of $H$ and $y$ the corresponding eigenvector obtained by (37), and let $\bar{\beta} = \sqrt{y^{T}Sy/y^{T}y}$. If $\bar{\beta} = 0$, $\alpha_{q}$ is an eigenvalue of $A$ corresponding to the eigenvector $y$. If $\bar{\beta} \neq 0$, let $w = (A - A^{T})y/(2\bar{\beta})$. Then, $\alpha_{q} \pm i\bar{\beta}$ are two eigenvalues of $A$ corresponding to the eigenvectors $y \mp iw$, respectively.

Proof. Combining the proofs of Lemmas 12, 9, and 7, we can prove this lemma.

Remark 15. Among the following four real normal matrices, only the first three meet (C3); the last one does not. Similarly, only the last three meet (C4); the first one does not.

2.3. Computing the Eigenvalues with the Largest or Smallest Modulus, as well as the Corresponding Eigenvectors

Reorder the eigenvalues of the symmetric matrix $G$ in (9) and the corresponding columns of $P$ such that $\lambda_{1}^{2} \ge \cdots \ge \lambda_{l}^{2}$ and $\alpha_{1}^{2} + \beta_{1}^{2} \ge \cdots \ge \alpha_{q}^{2} + \beta_{q}^{2}$. Without loss of generality, assume that $\alpha_{1} \pm i\beta_{1}$ are the eigenvalues of $A$ that have the largest modulus (it may be $\lambda_{1}$; however, Lemma 16 is unchanged in that case), and that $\alpha_{q} \pm i\beta_{q}$ are the eigenvalues of $A$ that have the smallest modulus (it may be $\lambda_{l}$; however, Lemma 17 is unchanged in that case).

Replacing $A$ with $G$ in (2), we get (39).

Applying Lemma 2 to (39), we can obtain $\alpha_{1}^{2} + \beta_{1}^{2}$, the largest eigenvalue of $G$, and the corresponding eigenvector $y = \lim_{t \to \infty} x(t)$, where $x(t)$ is the solution of (39). Then, we have the following lemma.

Lemma 16. Assume that (C5) holds. Given any nonzero $x(0)$, let $\alpha_{1}^{2} + \beta_{1}^{2}$ be the largest eigenvalue of $G$ and $y$ the corresponding eigenvector obtained by (39). Then, the largest modulus of the eigenvalues of $A$ is $\sqrt{\alpha_{1}^{2} + \beta_{1}^{2}}$. Let $\bar{\beta} = \sqrt{y^{T}Sy/y^{T}y}$ and $\bar{\alpha} = y^{T}Ay/y^{T}y$. If $\bar{\beta} = 0$, $\bar{\alpha}$ is an eigenvalue of $A$ corresponding to the eigenvector $y$. If $\bar{\beta} \neq 0$, let $w = (A - A^{T})y/(2\bar{\beta})$. Then, $\bar{\alpha} \pm i\bar{\beta}$ are two eigenvalues of $A$ corresponding to the eigenvectors $y \mp iw$, respectively.

Proof. Combining the proofs of Lemmas 5, 7, and 9, we can prove this lemma.
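An end-to-end sketch of the largest-modulus computation is given below, under the same assumptions as the earlier snippets (representative flow; surrogates $G = AA^{T}$ and $S$; eigenvector formulas as read from Lemma 16).

```python
import numpy as np

def eigen_flow(A, x0, step=1e-2, iters=20_000):
    # Representative flow (an assumption), as in the earlier sketches.
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(iters):
        Ax = A @ x
        x = x + step * ((x @ x) * Ax - (x @ Ax) * x)
        x = x / np.linalg.norm(x)
    return x

rng = np.random.default_rng(1)
P, _ = np.linalg.qr(rng.standard_normal((5, 5)))
D = np.zeros((5, 5))
D[0, 0] = 1.5
D[1:3, 1:3] = [[0.5, 2.0], [-2.0, 0.5]]
D[3:5, 3:5] = [[-1.0, 0.7], [-0.7, -1.0]]
A = P @ D @ P.T                   # moduli: 1.5, |0.5+2i| ~ 2.062, |-1+0.7i| ~ 1.221

G = A @ A.T                       # symmetric PSD; eigenvalues are squared moduli
y = eigen_flow(G, rng.standard_normal(5))
r = np.sqrt(y @ G @ y)            # ~2.062, the largest modulus
S = (A - A.T) @ (A.T - A) / 4
beta = np.sqrt(y @ S @ y)         # imaginary part of that eigenvalue, ~2
alpha = y @ A @ y                 # real part, ~0.5
w = (A - A.T) @ y / (2 * beta)
v = y - 1j * w                    # candidate eigenvector for alpha + i*beta
print(r, np.linalg.norm(A @ v - (alpha + 1j * beta) * v))   # residual ~ 0
```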

Replacing $A$ with $G$ in (3), we get (40).

Applying Lemma 3 to (40), we can obtain $\alpha_{q}^{2} + \beta_{q}^{2}$, the smallest eigenvalue of $G$, and the corresponding eigenvector, denoted by $y$. Then, we have the following lemma.

Lemma 17. Assume that (C6) holds. Given any nonzero $x(0)$, let $\alpha_{q}^{2} + \beta_{q}^{2}$ be the smallest eigenvalue of $G$ and $y$ the corresponding eigenvector obtained by (40). Then, the smallest modulus of the eigenvalues of $A$ is $\sqrt{\alpha_{q}^{2} + \beta_{q}^{2}}$. Let $\bar{\beta} = \sqrt{y^{T}Sy/y^{T}y}$ and $\bar{\alpha} = y^{T}Ay/y^{T}y$. If $\bar{\beta} = 0$, $\bar{\alpha}$ is an eigenvalue of $A$ corresponding to the eigenvector $y$. If $\bar{\beta} \neq 0$, let $w = (A - A^{T})y/(2\bar{\beta})$. Then, $\bar{\alpha} \pm i\bar{\beta}$ are two eigenvalues of $A$ corresponding to the eigenvectors $y \mp iw$, respectively.

Proof. Combining the proofs of Lemmas 5, 7, and 9, we can prove this lemma.

Remark 18. Among the following four real normal matrices, only the first three meet (C5); the last one does not. Similarly, only the last three meet (C6); the first one does not.
However, there exist some specially constructed real normal matrices that meet none of (C1) to (C6), for example, , where
Nevertheless, a randomly generated real normal matrix meets (C1) to (C6) with high probability.

2.4. Extension to Arbitrary Real Matrices

In this subsection, $A$ is an arbitrary real matrix of order $n$. Let $\|A\|_{F}$ be the Frobenius norm of $A$, and let $\lambda_{k}$, $k = 1, \ldots, n$, be the eigenvalues of $A$. Denote the set of all complex nonsingular matrices of order $n$ by $\mathcal{T}$.

By the Schur inequality [13], we know that $\sum_{k=1}^{n} |\lambda_{k}|^{2} \le \|A\|_{F}^{2}$, with equality if and only if $A$ is a normal matrix. Since the spectrum of $A$ does not change under a similarity transformation, for any $T \in \mathcal{T}$ the inequality $\sum_{k=1}^{n} |\lambda_{k}|^{2} \le \|T^{-1}AT\|_{F}^{2}$ also holds, with equality if and only if $T^{-1}AT$ is a normal matrix. In addition, [24] proved (45).

Based on (45), we seek a sequence of similarity transformations of the form (46) such that (47) holds, where the limiting matrix is a normal matrix with the same eigenvalues as $A$. This technique, termed norm reducing, was proposed in [25–27]. Moreover, following the idea presented in [26], it is easy to see that when $A$ is real, the limiting matrix can be chosen to be a real normal matrix.

In short, any real matrix $A$ can be brought arbitrarily close to a real normal matrix $\tilde{A} = T^{-1}AT$ by a similarity transformation. Typical approaches for constructing $T$ can be found in [26]. Note that if $\lambda$ is an eigenvalue of $\tilde{A}$ corresponding to the eigenvector $v$, then $\lambda$ is an eigenvalue of $A$ corresponding to the eigenvector $Tv$. Hence, our proposed algorithm can be extended to extract eigenpairs of arbitrary real matrices by employing the norm-reducing technique.
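The departure from normality invoked here is easy to monitor. A minimal check is sketched below; it assumes nothing about the construction of the transformations in [26] and uses numpy's dense eigensolver only as a reference oracle for the spectrum.

```python
import numpy as np

def normality_gap(A):
    # Schur-inequality gap ||A||_F^2 - sum_i |lambda_i|^2; zero iff A is normal.
    return np.linalg.norm(A, 'fro')**2 - np.sum(np.abs(np.linalg.eigvals(A))**2)

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
print(normality_gap(A))             # > 0 for a generic (nonnormal) matrix

# Any similarity transform T^{-1} A T preserves the spectrum, so a
# norm-reducing sequence can only shrink the Frobenius norm toward
# the fixed lower bound sum_i |lambda_i|^2.
T = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
B = np.linalg.inv(T) @ A @ T
same = np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                   np.sort_complex(np.linalg.eigvals(B)))
print(same, normality_gap(B))
```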

Without loss of generality, we use the following random matrix as an example to describe the norm-reducing technique:

The Frobenius norm of the matrix and its four eigenvalues are as given above; the Schur inequality is strict here, so the matrix is a nonnormal real matrix. According to the approach presented in [26], we can construct the similarity transformations iteratively, and we break the iteration once the stopping condition is satisfied. After 129 iterations, we obtain an approximate normal matrix, for which the nonnormality measure is very close to zero; the matrix can therefore be regarded as a normal matrix in practical applications. The corresponding accumulated transformation, given below, satisfies the required relationship. We can also see that the eigenvalues of the transformed matrix are just the eigenvalues of the original nonnormal matrix. The transient behavior of the nonnormality measure is presented in Figure 1, where the horizontal axis is the iteration count.

In the following, we use random matrices to verify the average performance of the norm-reducing technique. Let the average measure denote the mean, over a large number of nonnormal matrices in a statistical sense, of the nonnormality measure at the $k$th iteration; the measure of a matrix is zero if and only if the matrix is normal at that iteration. The dynamic behavior of this average measure is shown in Figure 2, from which we can see that, for most nonnormal matrices, the measure is very close to zero after 350 iterations.

3. Neural Implementation Description

In this paper, we mainly focus on the classical neural network differential equation (2), where the symmetric matrix whose eigenvalues and eigenvectors are to be computed supplies the connection weights between the neurons, and the column vector $x$ denotes the states of the neurons in the neural network dynamic system. The schematic diagram of the network is presented in Figure 3, from which we can see that it is a recurrent neural network, since the input is just the output of the system.

In practical applications, we only need a nonzero column vector $x_{0}$ to start the neural network system, using the update rule $x_{k+1} = x_{k} + h\,f(x_{k})$, where $k$ denotes the $k$th iteration and $h$ is a small time step. The iteration stops once $\|x_{k+1} - x_{k}\| < \varepsilon$, where $\varepsilon$ is a small error tolerance that can be set in advance. If $\|x_{k+1} - x_{k}\| < \varepsilon$, we can regard $x_{k+1} \approx x_{k}$, that is, $\dot{x} \approx 0$; then, according to the theory in [23], the converged state is the eigenvector corresponding to the eigenvalue of largest modulus, which can be read off from the converged state.
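A discrete sketch of this update rule follows. The right-hand side $f(x) = (x^{T}x)Ax - (x^{T}Ax)x$ is again the representative flow assumed in the Section 1 sketch, and run_network, h, and eps are illustrative names for the step size and stopping tolerance described above.

```python
import numpy as np

def run_network(A_sym, x0, h=1e-2, eps=1e-10, max_iter=200_000):
    # x_{k+1} = x_k + h * f(x_k), stopping once ||x_{k+1} - x_k|| < eps.
    # The state is renormalized each step to control Euler norm drift.
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(max_iter):
        Ax = A_sym @ x
        x_new = x + h * ((x @ x) * Ax - (x @ Ax) * x)
        x_new = x_new / np.linalg.norm(x_new)
        if np.linalg.norm(x_new - x) < eps:
            x = x_new
            break
        x = x_new
    return x @ A_sym @ x, x        # eigenvalue estimate and eigenvector

rng = np.random.default_rng(3)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2
lam, x = run_network(A, rng.standard_normal(6))
print(lam, np.linalg.eigvalsh(A).max())
```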

4. Examples and Discussion

Three experiments are presented to verify our results, followed by a fourth on the extension to arbitrary real matrices. The following real normal matrix (randomly generated) was used in the first three experiments:

Using the eig function in Matlab, we got the four eigenvalues of the matrix, as well as the corresponding eigenvectors, as follows:

We can see that all of (C1) to (C6) hold except (C2). For simplicity, we denote this matrix by $A$ in the sequel.

Example 19 (for Section 2.1). We used (10) with the following initial condition (randomly generated) to get $\beta_{1}$, the largest absolute value of the imaginary parts of the eigenvalues, to which the computed estimate should converge. From Lemma 5, this convergence should hold. By Lemma 7, $w$ and $y$ should converge (up to sign) to the imaginary and real parts of an eigenvector corresponding to $\bar{\alpha} + i\beta_{1}$, respectively. The transient behaviors of the above four variables are shown in Figures 4, 5, and 6, respectively. After convergence, we saw
Thus, the estimated complex vector is an eigenvector of $A$ corresponding to $\bar{\alpha} + i\beta_{1}$.
Although we can use (25) to get the smallest absolute value of the imaginary parts (it is zero in this experiment), neither the corresponding real part nor the eigenvector can be obtained from Lemma 8 or 9, since (C2) does not hold.

Example 20 (for Section 2.2). We used (33) with the same initial condition as (55) to get $\alpha_{1}$, the largest real part of the eigenvalues, to which the computed estimate should converge. Based on Lemma 12, this should hold. The transient behaviors of these two variables are shown in Figure 7. Since $\bar{\beta} \neq 0$, from Lemma 13, $w$ and $y$ should converge (up to sign) to the imaginary and real parts of an eigenvector corresponding to $\alpha_{1} + i\bar{\beta}$, as shown in Figures 8 and 9. After convergence, we saw
Hence, the estimated complex vector is an eigenvector of $A$ corresponding to $\alpha_{1} + i\bar{\beta}$.
Based on (37), we got $\alpha_{q}$, the smallest real part of the eigenvalues. After convergence, we saw that the computed quantities were equal to the true values, just as expected from Lemma 14.

Example 21 (for Section 2.3). Based on (39) and Lemma 16, we can get $\alpha_{1}^{2} + \beta_{1}^{2}$ and one corresponding eigenvector again, because $\sqrt{\alpha_{1}^{2} + \beta_{1}^{2}}$ is the largest modulus of the eigenvalues of $A$. In addition, we used (40) with the same initial condition as (55) to get the smallest modulus of the eigenvalues. By Lemma 17, the convergence claimed there should hold. The transient behaviors of these two variables are shown in Figure 10. After convergence, we saw that $\bar{\beta} = 0$. Hence, from Lemma 17, the converged state should be a constant multiple of the corresponding eigenvector, which is shown in Figure 11. As expected, it was equal to a constant multiple of the true eigenvector.

Example 22 (for extension to arbitrary real matrices). Following Section 2.1, we present an experiment on an arbitrary real matrix to verify the effectiveness of the norm-reducing technique of Section 2.4. Consider the nonnormal matrix displayed here: its Frobenius norm strictly exceeds the bound given by its eigenvalues, so the matrix is nonnormal. According to Section 2.4, the eigenpair problem of a nonnormal real matrix can be converted into that of a corresponding normal matrix, which can be calculated by the norm-reducing technique, with the corresponding transformation satisfying the stated relationship. To solve the eigenpairs of the original matrix, we first calculate the eigenpairs of the normal matrix. Direct calculation shows that the eigenvalues of the normal matrix are just the eigenvalues of the original nonnormal matrix, with the corresponding eigenvectors as follows:
The eigenvectors of the nonnormal matrix corresponding to the eigenvalues with the largest imaginary parts in absolute value are
According to the results above, the eigenvalues with the largest imaginary parts in absolute value form a conjugate pair; let $\bar{\alpha} \pm i\beta_{1}$ denote them.
We used (10) with the following initial condition (randomly generated) to get $\beta_{1}$, the largest absolute value of the imaginary parts, to which the computed estimate should converge. From Lemma 5, this convergence should hold. By Lemma 7, $w$ and $y$ should converge (up to sign) to the imaginary and real parts of an eigenvector corresponding to $\bar{\alpha} + i\beta_{1}$, respectively. The transient behaviors of the above four variables are shown in Figures 12, 13, and 14, respectively.
After convergence, we saw
Thus, the estimated complex vector is an eigenvector of the normal matrix corresponding to the eigenvalue with the largest imaginary part in absolute value. According to the theory in Section 2.4, the estimated complex eigenvector of the nonnormal matrix corresponding to this eigenvalue is then obtained by applying the transformation, from which we can see that the estimated complex vector is indeed an eigenvector of the nonnormal matrix corresponding to the same eigenvalue.

5. Conclusion

This paper introduces a neural-network-based approach for computing eigenvectors of real normal matrices and the corresponding eigenvalues that have the largest or smallest modulus, the largest or smallest real part, or the largest or smallest imaginary part in absolute value. All the computation can be carried out in $n$-dimensional real vector space, although the eigenpairs may be complex, which considerably reduces the scale of the networks. We also show how to extend this method to general real matrices by employing the norm-reducing technique proposed in the literature. Four simulation examples verified the validity of the proposed algorithm.