Abstract

This paper discusses a new approach for determining the coefficients of complex-valued autoregressive (CAR) and complex-valued autoregressive moving average (CARMA) models using a complex-valued neural network (CVNN) technique. The CAR and complex-valued moving average (CMA) coefficients, which together constitute a CARMA model, are computed simultaneously from the adaptive weights and the coefficients of the linear activation functions in a two-layered CVNN. The performance of the proposed technique has been evaluated using simulated complex-valued data (CVD) with three different types of activation functions. The results show that the proposed method can accurately determine the model coefficients provided that the network is properly trained. Furthermore, application of the developed CVNN-based technique to MRI K-space reconstruction results in images with improved resolution.

1. Introduction

Parametric modeling techniques have been applied in almost all fields of endeavor, including but not limited to biomedical signal processing [19], digital image processing [10–12], the building and built environment industry [13], nuclear plants [14], and communication [15, 16]. In these fields, parametric modeling has been used to determine an unknown system from knowledge of its input and output data (system modeling and identification), to predict future values from past output values (linear prediction), to filter signals (signal filtering), or to find the frequency content or response of a system (spectral estimation). The widely used parametric modeling techniques include autoregressive (AR), moving average (MA), and autoregressive moving average (ARMA) models. Mathematically, an ARMA model represents the input-output data of a system by a difference equation of the form

$$y(n) = -\sum_{k=1}^{p} a_k\, y(n-k) + \sum_{k=0}^{q} b_k\, x(n-k), \qquad (1)$$

where $x(n)$ and $y(n)$ are the system input and output, $a_k$ and $b_k$ are the model coefficients, and $p$ and $q$ are the real-valued model orders for the AR and MA parts, respectively [17–23]. Processes with spectral poles or narrow peaks are preferably modeled with the AR technique, whereas MA models are suitable for processes with spectral zeros or narrow valleys, and ARMA models are suitable for processes with both narrow peaks and valleys [18, 19, 24].
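To make the difference equation concrete, the following minimal sketch filters an input sequence through an ARMA model. It assumes the common sign convention y(n) = -sum_k a_k y(n-k) + sum_k b_k x(n-k); the function name `arma_filter` and the example coefficients are illustrative and are not taken from the paper.

```python
def arma_filter(x, a, b):
    """Filter x through y(n) = -sum_{k=1..p} a[k-1]*y(n-k) + sum_{k=0..q} b[k]*x(n-k).

    Assumed convention: a holds a_1..a_p (AR part), b holds b_0..b_q (MA part).
    Samples before n = 0 are taken as zero. Works for real or complex values.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(1, len(a) + 1):      # AR part: feedback from past outputs
            if n - k >= 0:
                acc -= a[k - 1] * y[n - k]
        for k in range(len(b)):             # MA part: weighted current and past inputs
            if n - k >= 0:
                acc += b[k] * x[n - k]
        y.append(acc)
    return y


impulse = [1.0, 0.0, 0.0, 0.0]

# Pure AR model, y(n) = 0.5*y(n-1) + x(n): the impulse response decays forever
h_ar = arma_filter(impulse, a=[-0.5], b=[1.0])
print(h_ar)  # [1.0, 0.5, 0.25, 0.125]

# Pure MA model, y(n) = x(n) + 0.5*x(n-1): the impulse response is finite
h_ma = arma_filter(impulse, a=[], b=[1.0, 0.5])
print(h_ma)  # [1.0, 0.5, 0.0, 0.0]
```

The same function accepts complex coefficients and complex data unchanged, which is the situation the CARMA model addresses.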

In recent times, the introduction of complex-valued neural networks (CVNNs) has widened the scope and applications of artificial neural networks (ANNs) [25–34]. The inevitability of dealing with complex-valued data (CVD) has shown the indispensability of this emerging mathematical paradigm, especially in adaptive signal processing, radar systems, digital signal processing, magnetic resonance imaging (MRI) reconstruction, digital communications systems, speech processing, remote sensing, optoelectronics, quantum neural devices and systems, spatiotemporal analysis of physiological neural systems, biomedical signal processing, and artificial neural information processing, to mention a few.

This evolving paradigm has gained much attention not only because there are situations where CVNNs are inevitably required or are more effective than their counterpart, the real-valued neural network (RVNN), but also because of their usefulness, which is rooted in the fundamental theorem of algebra [25–27]. Another reason for the increasing popularity of this field is that it treats a CVD sample as an ordered pair rather than as multidimensional data [29]. Treating CVD as multidimensional data discards the correlation between the real and imaginary components and often increases computational complexity [33–35]. Furthermore, the CVNN technique reduces ineffective degrees of freedom in learning, thus achieving better generalization characteristics than the RVNN technique [29].

If the coefficients in (1) are complex numbers and either the input or the output is CVD, then the ARMA model is referred to as a complex-valued autoregressive moving average (CARMA) model.

Despite the success of parametric models, especially AR and ARMA models, in various areas of application [6, 7, 10–12], they have two main drawbacks, namely, the difficulty of estimating the model order and the complexity associated with determining the model coefficients.

The accuracy of the resulting model parameters depends strongly on the method used to determine them. Inaccurate parameters often lead to artifacts, erroneous peaks and valleys, outrageous predicted values, and system instability, among other problems [2, 11, 18].

Several methods have been suggested for determining ARMA model coefficients. These can be broadly divided into two groups, namely, optimal and suboptimal techniques [25]. An optimal technique determines the AR and MA coefficients simultaneously, while a suboptimal technique normally involves a two-step procedure: first the AR coefficients are determined, and then the MA coefficients are estimated from the residual error or from the estimated AR coefficients. The nonlinearity of the ARMA model equations illustrates the difficulty of estimating the model coefficients even when the autocorrelation sequence is exactly known. The optimal approach involves significant computation and sometimes fails to converge or converges to the wrong solution [2]. It also produces poor resolution for short data lengths; as such it is rarely used, and the suboptimal technique is often preferred.

To overcome the computational problems associated with the optimal technique, a suboptimal method that takes advantage of the linear relationship between the estimated autocorrelation matrix and the AR coefficients has been suggested. One of the most successful suboptimal autocorrelation-based approaches is the modified Yule-Walker (MYW) method. In this approach, the AR coefficients are computed first, followed by the MA coefficients [2, 3, 5, 15, 16, 21]. The autocorrelation, covariance, and least squares approaches are among the known methods of computing the AR coefficients. The autocorrelation technique has been shown to yield AR spectra with the least resolution among these methods. The decrease in resolution is due to the inherent windowing in the data matrix, and the accuracy of the coefficients depends heavily on the accuracy of the estimated autocorrelation values. The covariance matrix solution produces AR parameters whose resulting spectra have more false peaks and greater perturbations of spectral peaks from their correct frequency locations than other approaches. Spectral line splitting, the placement of two or more closely spaced peaks in the spectrum where only one should be present, has been observed in the forward prediction least squares approach [2, 18, 21].
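As an illustration of the linear relationship between the autocorrelation sequence and the AR coefficients, the sketch below solves the plain Yule-Walker equations with the Levinson-Durbin recursion. This is the basic autocorrelation method, not the modified Yule-Walker variant; the function names and the convention x(n) = -sum_k a_k x(n-k) + e(n) are assumptions made for this example.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation estimate r(m) = (1/N) * sum_n x(n) * conj(x(n-m))."""
    n_samples = len(x)
    return [
        sum(x[n] * x[n - m].conjugate() for n in range(m, n_samples)) / n_samples
        for m in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for a_1..a_p given r(0)..r(p).

    Assumed model convention: x(n) = -sum_k a_k x(n-k) + e(n).
    Returns (a, err), where err is the final prediction error power.
    """
    a = []
    err = abs(r[0])
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k - 1] * r[m - k] for k in range(1, m))
        kappa = -acc / err                      # reflection coefficient of order m
        a = [a[k - 1] + kappa * a[m - k - 1].conjugate() for k in range(1, m)] + [kappa]
        err *= 1.0 - abs(kappa) ** 2            # error power shrinks with each order
    return a, err


# Exact autocorrelation of a first-order AR process with a_1 = -0.5
coeffs, err_power = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, err_power)
```

Fitting a second-order model to first-order data returns a second coefficient of zero, which is the behaviour the order selection criteria discussed in the next paragraph rely on.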

The second problem associated with the parametric modeling approach is model order estimation. Because the optimal model order is not known a priori, the traditional approach has been to evaluate various model orders against some error criterion. Several model order determination techniques have been suggested in the literature, among which are the final prediction error (FPE), the Akaike information criterion (AIC), the minimum description length (MDL), and the Hannan and Quinn (HNQ) criterion [2, 5, 18, 36–42].
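The sketch below shows how such criteria are typically used to pick an order. It uses commonly quoted textbook forms of FPE, AIC, and MDL, which may differ in detail from the variants in the cited references; the error variances are hypothetical numbers chosen only for illustration.

```python
import math


def fpe(n_samples, order, err_var):
    """Final prediction error (common textbook form)."""
    return err_var * (n_samples + order + 1) / (n_samples - order - 1)


def aic(n_samples, order, err_var):
    """Akaike information criterion (common textbook form)."""
    return n_samples * math.log(err_var) + 2 * order


def mdl(n_samples, order, err_var):
    """Minimum description length (common textbook form)."""
    return n_samples * math.log(err_var) + order * math.log(n_samples)


def select_order(n_samples, err_vars, criterion):
    """Return the order p that minimizes the criterion; err_vars[p] belongs to order p."""
    scores = [criterion(n_samples, p, v) for p, v in enumerate(err_vars)]
    return min(range(len(scores)), key=scores.__getitem__)


# Hypothetical prediction error variances for orders 0..4 with 100 samples:
# the error drops sharply up to order 2 and barely improves afterwards.
err_vars = [1.0, 0.5, 0.1, 0.099, 0.0985]
best_aic = select_order(100, err_vars, aic)
print(best_aic)  # 2
```

Each criterion trades the drop in error variance against a penalty that grows with the order, so small improvements beyond the true order are rejected.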

An optimal technique for determining ARMA model coefficients using an RVNN approach has been reported in [6, 7]. Although this method accurately estimates ARMA model coefficients, it is only applicable to real-valued data (RVD) and as such cannot be used for CVD. In this paper, a CVNN-based method for estimating CAR and CARMA model coefficients is presented. The approach simultaneously estimates the model coefficients from the weights and the coefficients of the adaptive split complex (SC) activation functions in the hidden and output layers of a properly trained CVNN.

The remainder of this paper is organized as follows. The CVNN-based parametric model for CAR and CARMA is developed in Section 2, the performance of the developed technique is analyzed in Section 3, and the conclusion is given in Section 4.

2. Development of CVNN-Based Parametric Model

ANN (i.e., RVNN and CVNN) is a global search technique that emulates the biological neurons of the human body. It is a general mathematical computing paradigm that models the operations of biological neural systems, with the unique characteristics of massively parallel distributed structures and a high capability for learning and generalization [30, 43]. A typical ANN consists of interconnected simple processing elements called nodes. These nodes are arranged in layers and joined by synaptic weights to form structures; the most popular arrangement is the feedforward neural network [6, 7, 43].

A typical ANN structure, shown in Figure 1, consists of one or two types of processing layers, namely, the hidden layer and the output layer. The first interface, which is passive in nature, is where data are fed to the network; it consists of the input nodes. The output layer is the interface where processed data leave the network, and between these two lie the hidden layers. Each layer except the input layer consists of one or more processing units. A processing unit is made up of an adder and an activation function. The adder sums the weighted input values and computes the input to the activation function. The activation function maps its input to a new output range; examples include sigmoid, tanh, linear, cubic, and radial functions. A processing unit in the hidden layer has weights on both sides (connecting it to the input nodes on the left and to the output layer on the right), whereas a unit in the output layer has weights on its left-hand side only. These weights are altered during the training process so that the inputs produce an output close to the desired value [30]. Weights connect nodes in two consecutive layers, but there are no connections between nodes in the same layer.
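The adder-plus-activation structure described above can be sketched in a few lines. The function names, weights, and the two-input network below are made up for illustration; they are not the network of Figure 1.

```python
import math


def processing_unit(inputs, weights, bias, activation=math.tanh):
    """One node: an adder followed by an activation function."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias   # adder
    return activation(net)                                     # activation


def layer(inputs, weight_rows, biases, activation=math.tanh):
    """One layer: every unit sees all inputs; units in a layer are not connected."""
    return [processing_unit(inputs, w, b, activation)
            for w, b in zip(weight_rows, biases)]


def linear(v):
    """Linear activation: passes the adder output through unchanged."""
    return v


# Two input nodes -> two hidden units (tanh) -> one output unit (linear)
hidden = layer([1.0, 2.0], [[0.5, -0.25], [0.1, 0.2]], [0.0, 0.0])
output = layer(hidden, [[1.0, 1.0]], [0.0], activation=linear)
print(hidden, output)
```

Training would adjust the entries of `weight_rows` and `biases`; the forward pass itself stays as shown.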

The combination of large sets of connection weights and nonlinear activation functions makes ANN an ideal tool for estimation, classification, and prediction in some linear and nonlinear systems [6, 7]. In recent times, RVNN has been extended to the complex-number domain, where the inputs, synaptic weights, activation functions, and error propagation are made to process CVD. Such networks are normally referred to as CVNNs.

2.1. Choice of Activation Function for CVNN-Based Parametric Model

CVNN shares similar characteristics and properties with RVNN; the only difference is the nature of the data being processed. As with RVNN, CVNN training can be broadly divided into two classes, namely, supervised and unsupervised learning. In supervised learning, the network is trained by presenting sets of input and target output data to the system, and the weights are successively adjusted based on these pairs. The learning error is back propagated for weight adjustment, and the most popular method of error optimization is based on minimizing the output mean squared error (MSE) [44–50]. In unsupervised learning, the network is presented with only the input data and adjusts itself to produce an output. In this work, a two-layer CVNN approach for weight and coefficient update for CVD is proposed. The network leverages the nonlinearity of the activation function and the back propagation algorithm to determine the linear and nonlinear coefficient values of the CAR and CARMA models.

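The major challenge militating against the use of complex activation functions in CVNN is the boundedness and differentiability of the intended activation function. By Liouville's theorem, a function that is both bounded and differentiable everywhere in the complex plane must be constant, so a useful activation function cannot satisfy both requirements fully. To overcome this problem, two major approaches have been reported, namely, the fully complex (FC) and split complex (SC) activation functions [51–55].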

The FC approach uses an activation function that can satisfy the conflicting requirements of boundedness and differentiability of a complex function [51, 52]. The simplified version of the FC approach has been shown to be equivalent to the complex conjugate form of SC, provided the activation function used is an elementary transcendental function (ETF). Introducing an adaptive normalized learning rate into the algorithm of [51], obtained by minimizing the instantaneous output error of the FC back propagation (FC-BP) algorithm, results in improved performance [56].

Furthermore, to overcome the unboundedness problem associated with complex activation functions, Georgiou and Koutsougeras [54] identified the most desirable properties of complex activation functions and thereafter suggested a practically realizable, fully normalized complex activation function. The proposed activation function can process sinusoids at the same frequency (phasors) but is less efficient in learning nonlinear phase variation between the input and target signals.

The second approach, which avoids the unboundedness in CVNN activation functions and has been proved to be a special case of the FC approach, is the SC approach. In the SC approach, two real-valued activation functions (RVAFs) are used to process the in-phase and quadrature components of the input signal [57–62]. This method of processing CVD has been shown to reduce the information redundancy associated with the hidden neurons of an FC approach; moreover, its hardware implementation is easier than that of the FC approach [62, 63].
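The difference between the two approaches can be seen numerically. The sketch below compares a split complex tanh, which applies a real tanh to each component, with the fully complex tanh from Python's `cmath` module; the helper names are chosen for this example only.

```python
import cmath
import math


def split_tanh(z):
    """Split complex activation: a real tanh on the in-phase and quadrature parts."""
    return complex(math.tanh(z.real), math.tanh(z.imag))


def fully_complex_tanh(z):
    """Fully complex tanh: analytic, but singular at z = j*(pi/2 + n*pi)."""
    return cmath.tanh(z)


# A point on the imaginary axis just below the singularity at j*pi/2
near_pole = complex(0.0, math.pi / 2 - 1e-6)
print(abs(split_tanh(near_pole)))          # bounded: never exceeds sqrt(2)
print(abs(fully_complex_tanh(near_pole)))  # very large near the singularity
```

The split form is bounded everywhere because each component lies in (-1, 1), at the cost of no longer being differentiable as a complex function.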

2.2. Development of CVNN-Based Parametric Modeling Equations

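The general form of a CARMA model, shown in Figure 2, is given by

$$y(n) = -\sum_{k=1}^{p} a_k\, y(n-k) + \sum_{k=0}^{q} b_k\, x(n-k), \qquad (2)$$

where $x(n)$ and $y(n)$ are the complex-valued input and output, $p$ and $q$ are the real-valued model orders, and $a_k$ and $b_k$ are the complex model coefficients for the CAR and CMA parts, respectively.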

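Writing each complex quantity in terms of its real and imaginary parts, for example $a_k = a_k^{R} + j a_k^{I}$ and $y(n) = y_R(n) + j y_I(n)$, then decomposing and rearranging (2), gives the LHRP of the CARMA model, that is, the real part of the output, as

$$y_R(n) = -\sum_{k=1}^{p}\left[a_k^{R}\, y_R(n-k) - a_k^{I}\, y_I(n-k)\right] + \sum_{k=0}^{q}\left[b_k^{R}\, x_R(n-k) - b_k^{I}\, x_I(n-k)\right], \qquad (3)$$

and the LHIP, the imaginary part of the output, is expressed as

$$y_I(n) = -\sum_{k=1}^{p}\left[a_k^{R}\, y_I(n-k) + a_k^{I}\, y_R(n-k)\right] + \sum_{k=0}^{q}\left[b_k^{R}\, x_I(n-k) + b_k^{I}\, x_R(n-k)\right]. \qquad (4)$$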

2.3. CARMA and CVNN-Based CARMA Equivalent

A two-layer CVNN for estimating CARMA model coefficients using split complex-valued weights and adaptive activation functions is proposed here. The basic assumption guiding this proposal is that either the real part or the imaginary part of the output can be used independently to obtain the CARMA coefficients from a split-weight CVNN. Furthermore, it has been shown that a two-layer network with one hidden layer and one output layer is sufficient for accurate approximation and function representation using ANN [6, 7, 43], which justifies the use of a two-layer network. Figure 3 shows the network diagram of the proposed technique, and the analysis of this approach using the imaginary part of the output only is hereby presented where are the split weights connecting input node to hidden node for the CMA part, is the model order for the CMA, is the split weights connecting hidden node to output node for the CMA part, and are the split weights connecting input node to hidden node , is the number of input nodes, is the split weights connecting hidden node to output node, is the bias of the hidden node , is the bias of output node, is the adaptive coefficient of the output node, is the adaptive coefficient of the hidden node , and is the number of neurons in the hidden layer. Comparing (5) and (3) gives

Similarly, comparing (5) and (4) gives

Thus, the CARMA model coefficients can be obtained from the split weights and coefficients of a CVNN-based CARMA model.

Furthermore, neglecting the CMA part in (2) leads to an all-pole system, the CAR model equation given by (8). Further neglecting all effects associated with the complex-valued white noise input in (2) gives the complex linear prediction coefficients (CLPC) model equation, (9).
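The linear prediction idea can be illustrated with a deliberately reduced stand-in for the proposed network: a single complex neuron with a linear activation, trained with a normalized complex LMS rule. This is not the two-layer split-weight CVNN of Figure 3. The function name, step size, test frequencies, and the convention x(n) = -sum_k a_k x(n-k) are assumptions made for this sketch.

```python
import cmath


def train_linear_predictor(x, order, step=1.0, eps=1e-12):
    """Fit x(n) ~ sum_k w[k-1]*x(n-k) with a normalized complex LMS rule.

    Returns CAR coefficients under the assumed convention
    x(n) = -sum_k a_k x(n-k), that is a_k = -w_k.
    """
    w = [0j] * order
    for n in range(order, len(x)):
        past = [x[n - k] for k in range(1, order + 1)]
        err = x[n] - sum(wk * pk for wk, pk in zip(w, past))
        power = sum(abs(pk) ** 2 for pk in past) + eps
        # Gradient step on |err|^2 with respect to the complex weights
        w = [wk + step * err * pk.conjugate() / power for wk, pk in zip(w, past)]
    return [-wk for wk in w]


# Noise-free sum of two complex sinusoids (frequencies chosen for illustration)
w1, w2 = 2 * cmath.pi * 0.2, 2 * cmath.pi * 0.3
data = [cmath.exp(1j * w1 * n) + cmath.exp(1j * w2 * n) for n in range(2000)]
a_est = train_linear_predictor(data, order=2)

# Exact second-order coefficients: (1 - e^{j w1} z^-1)(1 - e^{j w2} z^-1)
a_true = [-(cmath.exp(1j * w1) + cmath.exp(1j * w2)), cmath.exp(1j * (w1 + w2))]
print(a_est, a_true)
```

Because the data are noise free and exactly predictable by a second-order model, the trained weights settle on the true coefficients; with noisy data the step size would have to be reduced and the estimates would only be approximate.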

2.4. Power Spectra Density (PSD)

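The PSD associated with the rational ARMA or CARMA model transfer function is given as

$$P(z) = \sigma^{2}\,\frac{B(z)\,B^{*}(1/z^{*})}{A(z)\,A^{*}(1/z^{*})}, \qquad (10)$$

where $B(z)$ and $A(z)$ are the $z$-transforms of the CMA/MA and CAR/AR parts, respectively, and $\sigma^{2}$ is the variance of the driving white noise input of the system [25]. Evaluating (10) on the unit circle, $z = e^{j2\pi f}$, gives

$$P(f) = \sigma^{2}\,\frac{\left|B(e^{j2\pi f})\right|^{2}}{\left|A(e^{j2\pi f})\right|^{2}}. \qquad (11)$$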

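Similarly, the PSD of an AR/CAR system is given as

$$P(z) = \frac{\sigma^{2}}{A(z)\,A^{*}(1/z^{*})}. \qquad (12)$$

Evaluating (12) on the unit circle gives

$$P(f) = \frac{\sigma^{2}}{\left|A(e^{j2\pi f})\right|^{2}}. \qquad (13)$$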
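A minimal numerical sketch of evaluating such a rational PSD on the unit circle is given below. It assumes the polynomial forms A(z) = 1 + sum_k a_k z^-k and B(z) = sum_k b_k z^-k, with frequency expressed in cycles per sample; the function names and test coefficients are illustrative.

```python
import cmath


def carma_psd(a, b, noise_var, freq):
    """PSD at normalized frequency freq (cycles/sample) on the unit circle.

    Assumed polynomials: A(z) = 1 + sum_k a[k-1] z^-k and B(z) = sum_k b[k] z^-k.
    """
    z_inv = cmath.exp(-2j * cmath.pi * freq)
    a_val = 1 + sum(ak * z_inv ** k for k, ak in enumerate(a, start=1))
    b_val = sum(bk * z_inv ** k for k, bk in enumerate(b))
    return noise_var * abs(b_val) ** 2 / abs(a_val) ** 2


def car_psd(a, noise_var, freq):
    """All-pole special case: B(z) = 1."""
    return carma_psd(a, [1.0], noise_var, freq)


# A pole at z = 0.5 gives a peak at f = 0
print(car_psd([-0.5], 1.0, 0.0))                # 4.0
# A zero at z = -1 gives a valley at f = 0.5
print(carma_psd([], [1.0, 1.0], 1.0, 0.5))      # close to zero
```

Sweeping `freq` over [-0.5, 0.5) and plotting the result produces the kind of spectrum in which poles appear as peaks and zeros as valleys.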

3. Performance Evaluation of CVNN-Based Technique

3.1. Simulated Data

The performance analysis of the proposed modeling approach on CVD using CVNN-based CAR and CARMA model is investigated in this section.

(1) Computation of PSD of Complex Sinusoidal Data
Consider a CVD, , given by where an additive complex white Gaussian noise with variance has been added to the generated complex sinusoids. The SNR of the data is computed as where and represent the real and imaginary noise components respectively. Furthermore, the coefficient mean-squared-error measure (CMSEM) is given as where and are the actual and estimated model coefficients, respectively, is the model order and is a measure of coefficient error power.
The PSD plot obtained is shown in Figure 4, and it is observed that the two frequency peaks are distinct and correctly located in the plot. As expected, the peaks occur at and . Thus, the CVNN-based CAR model parameter technique is satisfactory for the analysis of CVD.
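The two figures of merit used in this section can be computed as in the sketch below. The exact expressions are assumptions for illustration: the SNR is taken as the ratio of average signal power to average noise power in decibels, and the coefficient error measure as the squared coefficient error averaged over the model order. The sample values are made up.

```python
import math


def snr_db(signal, noise):
    """SNR in dB from average signal power and average noise power.

    For complex noise, |w|^2 = w_real^2 + w_imag^2 accounts for both components.
    """
    p_signal = sum(abs(s) ** 2 for s in signal) / len(signal)
    p_noise = sum(abs(w) ** 2 for w in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def coefficient_mse(actual, estimated):
    """Squared magnitude of the coefficient error, averaged over the model order."""
    return sum(abs(a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual)


# Hypothetical unit-amplitude signal with noise at one tenth of its amplitude
signal = [1, 1j, -1, -1j]
noise = [0.1, 0.1j, -0.1, -0.1j]
print(snr_db(signal, noise))

# Hypothetical actual and estimated coefficients of a second-order model
print(coefficient_mse([1 + 1j, 2], [1 + 1j, 2 + 0.5j]))  # 0.125
```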

(2) Mixed CAR Process
Consider a second-order CAR of the form where , which is viewed as a mixed system with real- and complex-valued coefficients [39] with as the system complex input noise with given as

Results and Observation
The estimated CVNN-based CAR model coefficients are shown in Tables 1 and 2. The observed MSE results show that the CAR model coefficients obtained using LHIP are closer to the actual values than those obtained using the LHRP approach. In Table 1, the TANH function has been used in estimating the CAR coefficients. In Table 2, all the activation functions produce accurate results; however, the variance of the results obtained with the TANH activation function is smaller than that of the two other activation functions. As compared to the complex sinusoid, LHIP produces a lower MSE value than the LHRP approach. The PSD plot of the results obtained is shown in Figure 5.

(3) CARMA Process
Consider a CARMA model described by the difference equation where both the CAR and CMA parts are of order . Tables 3 and 4 show the results obtained for the CVNN-based CARMA model coefficients compared with the values obtained using the least squares (LS) approach. As expected, the estimated CARMA model coefficients obtained from the LHIP are closer to the actual values.

3.2. Experimental Data: Magnetic Resonance Imaging

MRI is used to produce images of the internal sections of the human body [10–12, 42, 64, 65]. The Discrete Fourier Transform (DFT) is commonly used as an MRI reconstruction technique in the field of biomedical image reconstruction. Despite its popularity and acceptance, this technique suffers from the Gibbs effect, the introduction of artifacts, and a decrease in spatial resolution [33, 35]. One of the alternative methods for MRI reconstruction with improved resolution is the use of parametric modeling. The Transient Error Reconstruction Algorithm (TERA) and its variants model the data as a deterministic ARMA model with a finite number of steps; its block diagram is shown in Figure 6 [9–12].

3.2.1. MRI Reconstruction Method Using CVNN-Based CARMA Technique

Detailed information regarding TERA and its variants for MRI reconstruction is contained in [10–12]. The steps involved in TERA-based MRI reconstruction are as follows:

(i) split each row or column of the MRI K-space data into Hermitian or anti-Hermitian series to account for data symmetry in the K-space data, that is
(ii) each series is modeled as the output of an IIR filter by estimating the transfer function from the generated finite data set.
    (1) The ARMA model can be regarded as a cascade of an MA and an AR filter.
    (2) The unit impulse sequence produces the data series as the output of the filter .
    (3) The component is modeled as the output of a pth order AR model excited by . Thus,
(iii) estimate the AR and MA coefficients of the Hermitian and anti-Hermitian series.
(iv) obtain the IDFT of the original image using
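Step (i) above can be sketched with the standard decomposition of a sequence into its Hermitian part, (x(k) + x*(-k))/2, and its anti-Hermitian part, (x(k) - x*(-k))/2. Whether TERA uses exactly this normalization is an assumption; the function name and sample values are hypothetical.

```python
def hermitian_split(row):
    """Split a row sampled at k = -M..M (odd length, k = 0 in the middle).

    Returns (herm, anti) with herm(k) = conj(herm(-k)), anti(k) = -conj(anti(-k)),
    and herm(k) + anti(k) = row(k).
    """
    mirrored = [v.conjugate() for v in reversed(row)]   # x*(-k)
    herm = [(v + m) / 2 for v, m in zip(row, mirrored)]
    anti = [(v - m) / 2 for v, m in zip(row, mirrored)]
    return herm, anti


row = [1 + 2j, 3 + 0j, 5 - 1j]      # hypothetical samples at k = -1, 0, 1
herm, anti = hermitian_split(row)
print(herm)  # [(3+1.5j), (3+0j), (3-1.5j)]
print(anti)  # [(-2+0.5j), 0j, (2+0.5j)]
```

Because each part is fully determined by its samples at k >= 0, only half of each series needs to be modeled, which is the symmetry the step refers to.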

3.2.2. MRI Reconstruction Method Using CVNN-Based CAR Technique

Suppose the CVNN-based CAR model is represented as where denotes the model, are the known K-space data, is the CAR model order, and is the extrapolated K-space data. The training data are fed to the CVNN-based CAR model, and the output is the estimated or predicted value. Once training of the CVNN-based CAR model is complete or the signal decays to zero, the model coefficients can be estimated from the SC synaptic weights and activation functions. Furthermore, for the K-space data , the row data can be split into the positive and the negative parts. The positive part consists of data in the range while the negative part consists of data in the range , from the positive and negative parts. Using the first data points that is to train the CVNN-based CAR model, the remaining data points can be predicted accurately using the proposed technique. Similarly, for the negative part, using the first data points, where , the remaining data points can be predicted using the proposed technique.
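Once CAR coefficients are available, extrapolation reduces to running the prediction recursion forward. The sketch below assumes the convention x(n) = -sum_k a_k x(n-k) and uses a single decaying complex exponential as stand-in data; the coefficients are taken as given rather than estimated by a CVNN, and the function name is hypothetical.

```python
import cmath


def extrapolate(known, a, n_extra):
    """Extend `known` by n_extra samples using x(n) = -sum_k a[k-1] * x(n-k).

    Requires len(known) >= len(a) so that every past sample needed exists.
    """
    x = list(known)
    for _ in range(n_extra):
        x.append(-sum(ak * x[-k] for k, ak in enumerate(a, start=1)))
    return x


# A decaying complex exponential x(n) = pole**n obeys a first-order CAR model
pole = 0.9 * cmath.exp(0.3j)
full = extrapolate([1 + 0j], a=[-pole], n_extra=5)
print(abs(full[5] - pole ** 5))  # numerically zero
```

For K-space data the same recursion would be applied separately to the positive and negative parts of each row, each started from its own set of known samples.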

Figure 7 shows a typical extrapolated result for one row of the K-space data using the proposed CVNN-based CAR technique, while Figure 8 shows the image resulting from the use of the CVNN-based modeling techniques.

4. Conclusion

A new method of obtaining CARMA, CAR, and CLPC coefficients from a CVNN with split adaptive linear activation functions for CVD has been developed in this paper. The results obtained from the evaluation of LHRP and LHIP show that either technique is appropriate for determining the model coefficients of a properly trained network. Similarly, it was observed that the TANH function and CLAF give better results than LAF. Images with improved resolution compared with the FFT technique have been obtained using the proposed technique, though with a far longer completion time than the FFT technique. Work is ongoing to reduce the computation time of the algorithm so that it is comparable with that of the FFT technique. Other areas of application of this work aside from MRI reconstruction include nonlinear signal modeling and prediction, shape modeling and identification, crack modeling and prediction for automated building systems, and seismic signal processing.

Acknowledgments

The authors would like to express their appreciation to Professor K. Chon for his support and encouragement. Great appreciation goes to the reviewers of this paper for their constructive reviews and suggestions for improving this paper. This work is supported by Malaysia E-Science Grant 01-01-08-SF0083.