Abstract

This brief proposes a general nonlinear recurrent neural network framework for solving the generalized linear matrix equation (GLME) online with a global convergence property. If the linear activation function is utilized, the neural state matrix of the nonlinear recurrent neural network globally and exponentially converges to the unique theoretical solution of the GLME. In addition, compared with the case of the linear activation function, two specific types of nonlinear activation functions are proposed for the general nonlinear recurrent neural network model to achieve superior convergence. Illustrative examples demonstrate the efficacy of the general nonlinear recurrent neural network model and its superior convergence when activated by the aforementioned nonlinear activation functions.

1. Introduction

Solving the generalized linear matrix equation (GLME) and its variants is an important issue that is widely encountered in scientific and engineering areas (e.g., feedback control system design [1] and smart antenna array processing [2]). The well-known Lyapunov equation and Sylvester equation can be regarded as the main special cases of the GLME with reduced numbers of coefficient and variable matrices, and they have drawn widespread interest from researchers and engineers in the past decades [3–7]. Without loss of generality, in this brief, the GLME problem is formulated in the following form:
$$A X B + C X D = E, \tag{1}$$
where $A, C \in \mathbb{R}^{m \times m}$, $B, D \in \mathbb{R}^{n \times n}$, and $E \in \mathbb{R}^{m \times n}$ denote coefficient matrices and $X \in \mathbb{R}^{m \times n}$ denotes the unknown matrix to be obtained. Usually, it is complicated to analyze under what circumstances (1) admits a solution by traditional numerical means. To guarantee that GLME (1) is solvable with a unique theoretical solution, coefficient matrices $A$, $C$ and $B$, $D$ can be practically configured with their eigenvalues all being positive or all being negative simultaneously. In many cases, the number of solutions of (1) can be multiple or even zero, depending on how matrices $A$, $B$, $C$, and $D$ combine with the unknown matrix $X$. Many conventional serial approaches may not be efficient enough to solve the GLME online due to their inherent drawbacks, and parallel computational approaches appear preferable [8–13].
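As a quick illustration of the problem statement, the following minimal sketch checks unique solvability and solves a small GLME of the form (1) through vectorization, using the identity $\operatorname{vec}(AXB) = (B^{\mathrm T} \otimes A)\operatorname{vec}(X)$. The matrices here are arbitrary stand-ins, not taken from this brief.

```python
import numpy as np

def glme_unique_solution(A, B, C, D, E):
    """Solve A X B + C X D = E by vectorization (illustrative check only).

    vec(A X B + C X D) = (B^T kron A + D^T kron C) vec(X), with vec stacking
    columns, so a unique solution exists iff the Kronecker matrix M is nonsingular.
    """
    M = np.kron(B.T, A) + np.kron(D.T, C)
    if np.linalg.matrix_rank(M) < M.shape[0]:
        raise ValueError("GLME has no unique solution (M is singular)")
    x = np.linalg.solve(M, E.flatten(order="F"))   # vec(E): column-major stack
    return x.reshape(E.shape, order="F")           # un-vectorize back to X

# Small illustrative instance (coefficients are stand-ins, not from the brief)
rng = np.random.default_rng(0)
A, C = np.eye(3) + 0.1 * rng.standard_normal((3, 3)), np.eye(3)
B, D = np.eye(2), np.eye(2) + 0.1 * rng.standard_normal((2, 2))
X_true = rng.standard_normal((3, 2))
E = A @ X_true @ B + C @ X_true @ D
print(np.allclose(glme_unique_solution(A, B, C, D, E), X_true))   # expected: True
```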

Regarded as another promising approach to parallel computation, dynamic neural networks based on analog solvers have been exploited comprehensively in computational intelligence fields [12, 14–16]. Different from many conventional numerical methods, approaches based on dynamic neural networks are more readily realizable on specific parallel and distributed software and/or hardware architectures [17, 18]. This could greatly broaden the utility of current neural networks across application domains that demand high-performance computing. Recurrent neural networks, a basic type of dynamic neural network analogous to natural transient and steady-state processes, have been applied to online parallel computing tasks with large-scale analog/digital circuit prototypes [19].

Our main contribution in this brief is to develop a general recurrent neural network framework for solving GLME (1). Since nonlinear phenomena occur frequently in neural network hardware implementations [19], the proposed general nonlinear framework may be more suitable for analog-based computation. The neural state of the general recurrent neural network can globally converge to the theoretical solutions. If the general recurrent neural network is activated by the linear function, exponential convergence can be achieved. On the other hand, certain nonlinear forms of such a general neural network may obtain more accurate solutions and faster convergence than its linear model, so we propose two specific nonlinear activation functions for the general recurrent neural network model to achieve superior performance in solving GLME (1).

2. General Recurrent Neural Network Solver

In this section, we present and analyze the general recurrent neural network model for solving GLME (1). If the model is activated by the linear function, the state matrix $X(t)$ of the general recurrent neural network globally and exponentially converges to the unique theoretical solution $X^*$. By exploiting specific nonlinear odd monotonically increasing activation functions, superior convergence is expected to be achieved. In the ensuing subsections, we discuss the convergence properties of the general nonlinear recurrent neural network model together with its linear form.

2.1. General Nonlinear Neural Network Model

In this brief, the general nonlinear recurrent neural network model proposed to solve GLME (1) is as follows:
$$\dot{X}(t) = -\gamma \left[ A^{\mathrm T} \mathcal{F}\bigl(A X(t) B + C X(t) D - E\bigr) B^{\mathrm T} + C^{\mathrm T} \mathcal{F}\bigl(A X(t) B + C X(t) D - E\bigr) D^{\mathrm T} \right], \tag{2}$$
where $\gamma > 0$ is a design parameter, operator $\mathcal{F}(\cdot)$ denotes a nonlinear activation function array with each scalar-valued mapping unit $f(\cdot)$ being a monotonically increasing odd activation function, and superscript ${}^{\mathrm T}$ denotes the transpose of a matrix/vector. Recurrent neural network model (2) can be regarded as the extended nonlinear version of the recurrent neural network in [16] and of the ensuing linear model. For the general nonlinear recurrent neural network model (2), we have the following theorem.
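The sketch below implements the right-hand side of model (2) as reconstructed above; `tanh` is used only as a placeholder for a monotonically increasing odd activation unit, and a simple forward-Euler step shows how the state matrix would be propagated in discrete time. All names and values are illustrative, not taken from this brief.

```python
import numpy as np

def model2_rhs(X, A, B, C, D, E, gamma=1.0, activate=np.tanh):
    """Right-hand side of nonlinear model (2) as reconstructed above.

    The activation array applies a monotonically increasing odd scalar function
    (tanh here, purely as a placeholder) elementwise to the residual matrix.
    """
    R = activate(A @ X @ B + C @ X @ D - E)           # activated residual
    return -gamma * (A.T @ R @ B.T + C.T @ R @ D.T)   # steepest-descent direction

def euler_step(X, dt, *glme):
    """One crude forward-Euler step of length dt, a discrete-time analogue of (2)."""
    return X + dt * model2_rhs(X, *glme)
```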

Theorem 1. The neural state matrix $X(t)$ of general nonlinear recurrent neural network model (2), starting from any initial value $X(0)$, can globally converge to the theoretical solution(s) $X^*$ of GLME (1).

Proof. Firstly, we define the difference between the neural state and the theoretical solution as $\tilde{X}(t) = X(t) - X^*$. Accordingly, by substituting $X(t) = \tilde{X}(t) + X^*$ into neural network model (2), it can be equivalently transformed into
$$\dot{\tilde{X}}(t) = -\gamma \left[ A^{\mathrm T} \mathcal{F}\bigl(A \tilde{X}(t) B + C \tilde{X}(t) D\bigr) B^{\mathrm T} + C^{\mathrm T} \mathcal{F}\bigl(A \tilde{X}(t) B + C \tilde{X}(t) D\bigr) D^{\mathrm T} \right]. \tag{3}$$
Next, the corresponding Lyapunov-function candidate is defined as follows:
$$V(t) = \frac{1}{2}\bigl\|\tilde{X}(t)\bigr\|_F^2 = \frac{1}{2}\bigl\|\operatorname{vec}\bigl(\tilde{X}(t)\bigr)\bigr\|_2^2, \tag{4}$$
where operators $\|\cdot\|_F$, $\|\cdot\|_2$, and $\otimes$, respectively, denote the Frobenius norm of a matrix, the two-norm of a vector, and the Kronecker product between matrices, and $\operatorname{vec}(\cdot)$ generates a new column vector obtained by stacking all column vectors of its matrix argument together.
The time derivative of $V(t)$ is
$$\dot{V}(t) = \operatorname{vec}\bigl(\tilde{X}(t)\bigr)^{\mathrm T} \operatorname{vec}\bigl(\dot{\tilde{X}}(t)\bigr). \tag{5}$$
Considering the vectorization equality based on (2), and letting $M = B^{\mathrm T} \otimes A + D^{\mathrm T} \otimes C$ and $y = M \operatorname{vec}\bigl(\tilde{X}(t)\bigr)$, we can further derive (5) as
$$\dot{V}(t) = -\gamma \operatorname{vec}\bigl(\tilde{X}(t)\bigr)^{\mathrm T} M^{\mathrm T} \mathcal{F}\bigl(M \operatorname{vec}\bigl(\tilde{X}(t)\bigr)\bigr) = -\gamma \sum_{i} y_i f(y_i). \tag{6}$$
For nonlinear activation function array $\mathcal{F}(\cdot)$, each individual scalar-valued entry $f(\cdot)$ is odd and monotonically increasing, which guarantees
$$y_i f(y_i) \begin{cases} > 0, & \text{if } y_i \neq 0, \\ = 0, & \text{if } y_i = 0. \end{cases} \tag{7}$$
Thus, $\dot{V}(t) \leq 0$, which implies that $\tilde{X}(t)$ globally converges to the zero matrix according to Lyapunov theory [20]; that is, state matrix $X(t)$ of (2) globally converges to the theoretical solution(s) $X^*$ of GLME (1). This completes the proof.

According to Theorem 1, the general nonlinear neural network model (2) can be activated by a variety of odd monotonically increasing functions to solve GLME (1) with existent theoretical solutions (unique or multiple), which broadly enlarges the utility of (2) by allowing a family of models to be generated. As nonlinear elements are frequently encountered in analog/digital circuit prototypes of neural networks [19, 21], involving nonlinear activation functions can be beneficial to potential design and implementation. On the other hand, faster convergence is indeed required for solving GLME (1) when the linear model might not satisfy increasing computational requirements. It is expected that the nonlinear neural network model (2) can attain convergence superior to that of (10) if proper activation functions are exploited. Before introducing the superior nonlinear-function-activated models, we address the linear model of the general nonlinear recurrent neural network and discuss its convergence property.

2.2. Linear Neural Network Model

To solve GLME (1), we first define a scalar-valued error function associated with (1):
$$\varepsilon(t) = \frac{1}{2}\bigl\|A X(t) B + C X(t) D - E\bigr\|_F^2, \tag{8}$$
where operator $\|\cdot\|_F$ denotes the Frobenius norm. In order to drive error function $\varepsilon(t)$ to zero as time $t$ increases, the gradient-descent method is adopted:
$$\dot{X}(t) = -\gamma \frac{\partial \varepsilon(t)}{\partial X(t)}, \tag{9}$$
where design parameter $\gamma > 0$ scales the convergence rate.

According to preliminaries on matrix-differential theory [22], (9) is further expanded to the following dynamic form:
$$\dot{X}(t) = -\gamma \left[ A^{\mathrm T} \bigl(A X(t) B + C X(t) D - E\bigr) B^{\mathrm T} + C^{\mathrm T} \bigl(A X(t) B + C X(t) D - E\bigr) D^{\mathrm T} \right]. \tag{10}$$
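To see that (10) is indeed the gradient flow of error function (8), the following sketch compares the analytic gradient $A^{\mathrm T}(AXB+CXD-E)B^{\mathrm T} + C^{\mathrm T}(AXB+CXD-E)D^{\mathrm T}$ with a central finite-difference gradient of (8) on random stand-in data (the matrices are illustrative, not from this brief).

```python
import numpy as np

def residual(X, A, B, C, D, E):
    return A @ X @ B + C @ X @ D - E

def error_fn(X, A, B, C, D, E):
    return 0.5 * np.linalg.norm(residual(X, A, B, C, D, E), "fro") ** 2

def analytic_grad(X, A, B, C, D, E):
    R = residual(X, A, B, C, D, E)
    return A.T @ R @ B.T + C.T @ R @ D.T   # gradient of error_fn w.r.t. X

# Finite-difference check on random data (stand-in matrices, not from the brief)
rng = np.random.default_rng(1)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((2, 2))
C, D = rng.standard_normal((3, 3)), rng.standard_normal((2, 2))
X, E = rng.standard_normal((3, 2)), rng.standard_normal((3, 2))
G = analytic_grad(X, A, B, C, D, E)
G_fd = np.zeros_like(X)
h = 1e-6
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        Xp = X.copy(); Xp[i, j] += h
        Xm = X.copy(); Xm[i, j] -= h
        G_fd[i, j] = (error_fn(Xp, A, B, C, D, E) - error_fn(Xm, A, B, C, D, E)) / (2 * h)
print(np.allclose(G, G_fd, atol=1e-5))   # expected: True
```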

For the linear model (10), we have the following theorem.

Theorem 2. If the linear neural network model (10) is employed to solve GLME (1), starting from any initial condition $X(0)$, the state matrix $X(t)$ of (10) globally and exponentially converges to the unique theoretical solution $X^*$.

Proof. Using the transformation between $X(t)$ and $\tilde{X}(t) = X(t) - X^*$ with initial condition $\tilde{X}(0) = X(0) - X^*$, dynamic equation (10) is equivalently derived as
$$\dot{\tilde{X}}(t) = -\gamma \left[ A^{\mathrm T} \bigl(A \tilde{X}(t) B + C \tilde{X}(t) D\bigr) B^{\mathrm T} + C^{\mathrm T} \bigl(A \tilde{X}(t) B + C \tilde{X}(t) D\bigr) D^{\mathrm T} \right]. \tag{11}$$
With $M = B^{\mathrm T} \otimes A + D^{\mathrm T} \otimes C$ considered, (11) can be simplified in vectorized form as
$$\frac{\mathrm d \operatorname{vec}\bigl(\tilde{X}(t)\bigr)}{\mathrm d t} = -\gamma M^{\mathrm T} M \operatorname{vec}\bigl(\tilde{X}(t)\bigr). \tag{12}$$
Similarly, we define the following Lyapunov-function candidate:
$$V(t) = \frac{1}{2}\bigl\|\operatorname{vec}\bigl(\tilde{X}(t)\bigr)\bigr\|_2^2, \tag{13}$$
with its time derivative being
$$\dot{V}(t) = -\gamma \operatorname{vec}\bigl(\tilde{X}(t)\bigr)^{\mathrm T} M^{\mathrm T} M \operatorname{vec}\bigl(\tilde{X}(t)\bigr). \tag{14}$$
There exists a positive scalar $\lambda > 0$ [23], being the minimum eigenvalue of $M^{\mathrm T} M$, satisfying
$$\operatorname{vec}\bigl(\tilde{X}(t)\bigr)^{\mathrm T} M^{\mathrm T} M \operatorname{vec}\bigl(\tilde{X}(t)\bigr) \geq \lambda \bigl\|\operatorname{vec}\bigl(\tilde{X}(t)\bigr)\bigr\|_2^2, \tag{15}$$
if the unique-solution condition of GLME (1) holds. Thus, we have
$$\dot{V}(t) \leq -\gamma \lambda \bigl\|\operatorname{vec}\bigl(\tilde{X}(t)\bigr)\bigr\|_2^2 = -2\gamma\lambda V(t); \tag{16}$$
that is,
$$V(t) \leq V(0)\exp(-2\gamma\lambda t), \tag{17}$$
which can be further equivalently rewritten as
$$\bigl\|X(t) - X^*\bigr\|_F \leq \bigl\|X(0) - X^*\bigr\|_F \exp(-\gamma\lambda t). \tag{18}$$
By Lyapunov theory [20], (14) and (18) indicate that state matrix $X(t)$ of (10) globally and exponentially converges to the unique theoretical solution $X^*$ of GLME (1). The proof is thus complete.

It is worth noting that if GLME (1) has multiple theoretical solutions $X^*$, scalar $\lambda$ equals zero. In this situation, the linear model (10) can at least guarantee global convergence but not with an explicit exponential convergence rate.

3. Superior Convergence with Specific Nonlinear Activation Functions

According to Theorem 1, an odd monotonically increasing activation function is able to guarantee global convergence of the general recurrent neural network (2). If the linear activation function is adopted, the general recurrent neural network model reduces to the linear model (10), which possesses the global exponential convergence property. In order to achieve convergence superior to the global exponential convergence of the linear model (10), specific types of nonlinear activation functions should be chosen properly. Owing to the aforementioned considerations, two types of nonlinear activation functions, the power-sum and hyperbolic sine functions, are proposed to activate the general recurrent neural network model (2). Figure 1 shows the curves of the three aforementioned activation functions used in (2). Correspondingly, we have the following theorems on the convergence properties of the two neural network models.
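For concreteness, the sketch below defines the three activation units with parameterizations that are common in this literature and that match the forms assumed in Theorems 3 and 4 below (the brief's exact parameter values are not reproduced here). It also prints $u\,f(u)$, the quantity that appears in the Lyapunov derivative, to show that both nonlinear units dominate the linear one elementwise, which is what yields the larger vanishing rate.

```python
import numpy as np

def linear_act(u):
    return u

def power_sum_act(u, N=3):
    """Power-sum activation: sum_{k=1..N} u^(2k-1) (assumed parameterization, N >= 2)."""
    return sum(u ** (2 * k - 1) for k in range(1, N + 1))

def hyperbolic_sine_act(u, xi=2.0):
    """Hyperbolic sine activation with coefficient xi >= 1: (e^{xi*u} - e^{-xi*u}) / 2."""
    return np.sinh(xi * u)

# u * f(u) drives the Lyapunov derivative; larger values mean faster error vanishing
# than the linear case u * u.
u = np.linspace(-2.0, 2.0, 5)
for f in (linear_act, power_sum_act, hyperbolic_sine_act):
    print(f.__name__, np.round(u * f(u), 3))
```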

Theorem 3. If the general recurrent neural network (2) is activated by the power-sum function $f(u) = \sum_{k=1}^{N} u^{2k-1}$ with integer $N \geq 2$, the state matrix $X(t)$ of (2) can globally and superiorly converge to the unique theoretical solution $X^*$, as compared with linear model (10).

Proof. In order to prove the convergence property of (2) activated by the power-sum function in this situation, we define the following Lyapunov-function candidate:
$$V(t) = \frac{1}{2}\bigl\|\operatorname{vec}\bigl(\tilde{X}(t)\bigr)\bigr\|_2^2, \tag{19}$$
with its time derivative being
$$\dot{V}(t) = -\gamma \sum_{i} y_i \sum_{k=1}^{N} y_i^{2k-1} = -\gamma \sum_{i} \sum_{k=1}^{N} y_i^{2k} \leq -\gamma \sum_{i} y_i^{2}, \tag{20}$$
where $y = M \operatorname{vec}\bigl(\tilde{X}(t)\bigr)$ and $y_i$ denotes the $i$th element of vector $y$. This implies that, when power-sum functions are used, (2) possesses global convergence of $\tilde{X}(t)$ to the zero matrix, with a larger Lyapunov-function vanishing rate (i.e., faster convergence) than in the situation of (10). The proof is thus complete.

Theorem 4. If the general recurrent neural network (2) is activated by the hyperbolic sine function $f(u) = \bigl(\exp(\xi u) - \exp(-\xi u)\bigr)/2$ with coefficient $\xi \geq 1$, the state matrix $X(t)$ of (2) can globally and superiorly converge to the unique theoretical solution $X^*$, as compared with the linear model (10).

Proof. Similarly, the following Lyapunov function is defined to investigate convergence:
$$V(t) = \frac{1}{2}\bigl\|\operatorname{vec}\bigl(\tilde{X}(t)\bigr)\bigr\|_2^2, \tag{21}$$
and its time derivative is
$$\dot{V}(t) = -\gamma \sum_{i} y_i \sinh(\xi y_i) \leq -\gamma \xi \sum_{i} y_i^{2} \leq -\gamma \sum_{i} y_i^{2}, \tag{22}$$
which indicates that, when the hyperbolic sine activation function is employed, the nonlinear recurrent neural network model (2) possesses global convergence with its state error $\tilde{X}(t)$ approaching the zero matrix, and with a larger Lyapunov-function vanishing rate (i.e., faster convergence) than in the situation of linear model (10). This completes the proof.

4. Illustrative Examples

In this section, three examples are presented to illustrate the efficacy of the general nonlinear recurrent neural network (2), with its specific models under different types of activation functions (linear, power-sum, and hyperbolic sine), for solving GLME (1) online.

Example 1. Let us consider GLME (24), an instance of (1) with specific coefficient matrices. GLME (24) has a unique theoretical solution $X^*$, since the eigenvalues of coefficient matrices $A$, $B$, $C$, and $D$ are all positive. We employ the general recurrent neural network model (2), with the design parameter set appropriately, activated by the linear function, the power-sum function, and the hyperbolic sine function, respectively.

From Figure 2, we can observe that the solution errors decline to almost zero at around 0.02 s and that faster convergence to the solution is achieved with the power-sum and hyperbolic sine activation functions used in (2). These results demonstrate the effectiveness of the general recurrent neural network model (2) for solving GLME (24).
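Since the coefficient matrices of Example 1 are not reproduced here, the following sketch integrates model (2) on stand-in positive-definite coefficients with SciPy's `solve_ivp` and compares how quickly the residual error falls under the three activation functions. The design parameter, activation parameters, and time horizon are illustrative choices, so only the qualitative ordering (nonlinear faster than linear) should be expected to match Figure 2.

```python
import numpy as np
from scipy.integrate import solve_ivp

def make_spd(n, rng):
    """Random symmetric positive-definite matrix (illustrative stand-in)."""
    Q = rng.standard_normal((n, n))
    return Q @ Q.T / n + np.eye(n)

rng = np.random.default_rng(2)
m = n = 2
A, C = make_spd(m, rng), make_spd(m, rng)
B, D = make_spd(n, rng), make_spd(n, rng)
X_star = rng.standard_normal((m, n))              # hypothetical theoretical solution
E = A @ X_star @ B + C @ X_star @ D               # right-hand side consistent with X_star
gamma = 10.0                                      # illustrative design parameter

def rhs(t, x, activate):
    X = x.reshape(m, n)
    R = activate(A @ X @ B + C @ X @ D - E)       # activated GLME residual
    return (-gamma * (A.T @ R @ B.T + C.T @ R @ D.T)).ravel()

activations = {
    "linear": lambda u: u,
    "power sum": lambda u: u + u**3 + u**5,
    "hyperbolic sine": lambda u: np.sinh(u),
}
for name, f in activations.items():
    sol = solve_ivp(rhs, (0.0, 1.0), np.zeros(m * n), args=(f,),
                    max_step=1e-3, rtol=1e-6, atol=1e-10)
    res = [np.linalg.norm(A @ x.reshape(m, n) @ B + C @ x.reshape(m, n) @ D - E, "fro")
           for x in sol.y.T]
    t_hit = next((t for t, r in zip(sol.t, res) if r < 1e-6), None)
    print(f"{name:16s} residual first below 1e-6 at t ≈ {t_hit}")
```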

Example 2. Let us consider GLME (27), an instance of (1) with multiple theoretical solutions $X^*$.

We use linear model (10) with design parameter $\gamma$ to solve GLME (27). The trajectories of the entries of state matrix $X(t)$ are shown in Figure 3. From Figure 3, we can see that, starting from two different initial matrices $X(0)$, the state matrices of linear model (10) converge to two different trajectories (i.e., two different theoretical solutions $X^*$). This indicates that, when multiple theoretical solutions exist for GLME (27), the choice of the initial value $X(0)$ greatly impacts the steady-state result of the recurrent neural network (2) and determines which solution of GLME (27) is reached. Correspondingly, the residual errors synthesized by (10) always diminish to zero within finite time from twenty different initial values, as illustrated by Figure 4.
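A small sketch of the multiple-solution behavior discussed above, using a discrete-time analogue of the vectorized linear dynamics (12) on a deliberately rank-deficient stand-in system: each update lies in the row space of $M$, so the component of the initial value in the null space of $M$ is preserved, and different initial values settle at different solutions while the residual still vanishes. The system here is illustrative and not Example 2's data.

```python
import numpy as np

rng = np.random.default_rng(3)
# Rank-deficient stand-in system M x = e, mimicking a GLME with multiple
# theoretical solutions (e lies in the range of M, so the system is consistent).
M = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 4))   # rank 2 < 4
M /= np.linalg.norm(M, 2)                                        # scale for a safe step size
e = M @ rng.standard_normal(4)

def gradient_flow(x0, eta=1.0, steps=50_000):
    """Discrete-time analogue of linear dynamics (12): x_dot = -gamma * M^T (M x - e)."""
    x = x0.copy()
    for _ in range(steps):
        x -= eta * M.T @ (M @ x - e)
    return x

x_a = gradient_flow(np.zeros(4))
x_b = gradient_flow(10.0 * rng.standard_normal(4))
print("residual norms:", np.linalg.norm(M @ x_a - e), np.linalg.norm(M @ x_b - e))
print("limits differ :", not np.allclose(x_a, x_b))   # expected: True (initial value matters)
```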

Example 3. Let us consider GLME (29), an instance of (1) of larger dimension, whose coefficient matrices are all randomly generated positive-definite matrices with entries falling within a given interval. We exploit the nonlinear neural network models (2) activated by the power-sum and hyperbolic sine functions, as well as the linear model (10), to solve GLME (29) with design parameter $\gamma$. From Table 1, we can observe that the general recurrent neural network models (2) activated by the power-sum and hyperbolic sine activation functions exhibit a faster error-diminishing speed than the linear model (10), with all of their residual errors reaching a very small level within 1 s. From the computational results of the three examples above, we can see that the proposed general nonlinear recurrent neural network solves the GLME (1) problem well.
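Example 3 describes its coefficients as randomly generated positive-definite matrices with entries in a given interval. One possible construction (an assumption, since the brief's exact recipe, interval, and dimension are not reproduced here) draws a symmetric matrix with bounded off-diagonal entries and then makes it strictly diagonally dominant, which guarantees positive definiteness.

```python
import numpy as np

def random_pd_matrix(n, low, high, rng):
    """Random symmetric positive-definite matrix with off-diagonal entries in [low, high].

    Symmetry plus strict diagonal dominance with a positive diagonal guarantees
    positive definiteness (illustrative construction, not the brief's exact recipe).
    """
    S = rng.uniform(low, high, size=(n, n))
    S = 0.5 * (S + S.T)                          # symmetrize
    np.fill_diagonal(S, np.abs(S).sum(axis=1))   # make strictly diagonally dominant
    return S

rng = np.random.default_rng(4)
n = 10                                            # illustrative dimension
A, B, C, D = (random_pd_matrix(n, -1.0, 1.0, rng) for _ in range(4))
print(min(np.linalg.eigvalsh(M).min() for M in (A, B, C, D)) > 0)   # expected: True
```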

5. Conclusion

In this brief, we have presented a general recurrent neural network model for solving the GLME. The general nonlinear recurrent neural network model possesses the global convergence property in finding solutions of the GLME. By using the specifically proposed nonlinear activation functions, superior convergence can be achieved compared with the linear model, which has an exponential convergence rate. Illustrative results demonstrate the effectiveness and superiority of the nonlinear recurrent neural network models for the solution of the GLME.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (NSFC) under Grant no. 61603078 and the Fundamental Research Funds for the Central Universities at the University of Electronic Science and Technology of China (UESTC) under Grant no. ZYGX2015KYQD044.