Abstract

This paper deals with the analysis of a third-order tensor composed of fourth-order output cumulants, used for the blind identification of a second-order Volterra-Hammerstein series. It is demonstrated that this nonlinear identification problem can be converted into a multivariable linear system of equations. The system may be solved using several methods. Simulation results with the Iterative Alternating Least Squares (IALS) algorithm show good performance for different signal-to-noise ratio (SNR) levels. Convergence issues are addressed through an invertibility analysis of the involved matrices. Comparison results with other existing algorithms are carried out to show the efficiency of the proposed algorithm.

1. Introduction

Nonlinear system modeling based on real-world input/output measurements is widely used in many applications. Choosing an appropriate model and determining its parameters from the input/output data call for a suitable and efficient identification method [1–7].

Hammerstein models are a special class of second-order Volterra systems in which the second-order homogeneous Volterra kernel is diagonal [8]. These systems have been successfully used to model nonlinear systems in a number of practical applications in areas such as chemical processes, biological processes, signal processing, and communications [9–12]. For example, in digital communication systems the channels are usually impaired by nonlinear intersymbol interference (ISI); channel identification allows the ISI effects to be compensated at the receiver.

In [13], a penalty transformation method is developed. A penalty function is formed by equations relating the unknown parameters of the model to the autocorrelations of the signal. This function is then included in the cost function, yielding an augmented Lagrangian function. It has been demonstrated that this approach gives good identification results for nonlinear systems. However, it remains sensitive to additive Gaussian noise because the 2nd-order moment is used as a constraint. The authors of [7] overcame this sensitivity by using 4th-order cumulants as a constraint instead of 2nd-order moments, in order to smooth out the additive Gaussian noise, but their approach, based on a simplex-genetic algorithm, is slow and computationally complex.

The main drawback of identification with Volterra series lies in the parametric complexity and the need to estimate a very large number of parameters. In many cases, the Volterra series identification problem can be greatly simplified using a tensor formulation [10–12, 14].

The authors of [10] used a parallel factor (PARAFAC) decomposition of the kernels to derive Volterra-PARAFAC models, yielding an important reduction of parametric complexity for Volterra kernels of order higher than two. They proved that these models are equivalent to a set of parallel Wiener models. They then proposed three adaptive algorithms for identifying Volterra-PARAFAC models with complex-valued input/output signals, namely, the extended complex Kalman filter, the complex least mean square (CLMS) algorithm, and the normalized CLMS algorithm.

In this paper, the algorithm derived in [14] is extended to the blind identification of a general second-order Volterra-Hammerstein system. The main idea is to develop a general expression for the slices of a cubic tensor along each direction and then express the tensor slices in an unfolded representation. The elements of the three-dimensional tensor are the fourth-order output cumulants. This yields an Iterative Alternating Least Squares (IALS) algorithm that improves on the original Volterra filters in terms of implementation and computational complexity. A convergence analysis based on an invertibility study of the involved matrices shows that the proposed IALS algorithm converges to optimal solutions in the least mean squares sense. Furthermore, some simulation results and comparisons with existing algorithms are provided.

The present work is organized as follows. In Section 2, a brief study of the three-dimensional tensor is presented. In Section 3, the model under study and the related output cumulants are introduced, whereas in Section 4 the decomposition analysis of the cumulant tensor is developed. In Sections 5 to 8, we give, respectively, the proposed blind identification algorithm, the convergence study, and some simulation results; finally, the main conclusions are drawn.

2. Three-Dimensional Tensor and Different Slice Expressions

A three-dimensional tensor can be expressed as
\[ \mathcal{T} = \sum_{p=1}^{M}\sum_{q=1}^{M}\sum_{r=1}^{M} t_{pqr}\, e_p \circ e_q \circ e_r, \]
where $t_{pqr}$ is the tensor value at position $(p,q,r)$ of the cube of dimension $M \times M \times M$, $e_p$ denotes the $p$th canonical basis vector of dimension $M$, and the symbol $\circ$ stands for the outer product (Figure 1).

A cubic tensor can always be sliced along three possible directions (horizontal, vertical, and frontal), as depicted in Figure 2. In each case, this yields $M$ matrices of dimension $M \times M$.

The expression of the $p$th slice in the horizontal direction is given by
\[ T_{p\cdot\cdot} = \sum_{q=1}^{M}\sum_{r=1}^{M} t_{pqr}\, e_q e_r^{T}. \]

In the same manner, the matrix expressions along the vertical and frontal directions are, respectively,
\[ T_{\cdot q\cdot} = \sum_{p=1}^{M}\sum_{r=1}^{M} t_{pqr}\, e_p e_r^{T}, \qquad T_{\cdot\cdot r} = \sum_{p=1}^{M}\sum_{q=1}^{M} t_{pqr}\, e_p e_q^{T}. \]

It is important to express the tensor slices in an unfolded representation, obtained by stacking up the 2D matrices. Hence, three unfolded representations of the tensor are obtained. For the horizontal, vertical, and frontal directions we get, respectively,

We note that each unfolded matrix is an $M^2 \times M$ one.
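The slicing and stacking described above can be sketched numerically as follows; the slice ordering used here is one plausible convention and may differ from the exact one adopted in the paper's equations.

```python
import numpy as np

# Build a small cubic tensor and form its three unfolded representations
# by stacking slices; the slice ordering is an illustrative convention.
M = 3
T = np.arange(M**3, dtype=float).reshape(M, M, M)  # T[p, q, r]

H_slices = [T[p, :, :] for p in range(M)]  # horizontal: fix the 1st index
V_slices = [T[:, q, :] for q in range(M)]  # vertical:   fix the 2nd index
F_slices = [T[:, :, r] for r in range(M)]  # frontal:    fix the 3rd index

# Each unfolded representation is an M^2 x M matrix.
T_h, T_v, T_f = (np.vstack(s) for s in (H_slices, V_slices, F_slices))
```

Each unfolding keeps all $M^3$ tensor entries; only the way they are arranged into a matrix changes.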

3. Nonlinear System Model and Output Cumulants Analysis

We focus on the identification of a second-order Volterra-Hammerstein model with finite memory, as given in [14]:
\[ y(n) = \sum_{i=0}^{N} a_i\, x(n-i) + \sum_{i=0}^{N} b_i\, x^{2}(n-i), \]
where $x(n)$ is the input of the system, assumed to be a stationary zero-mean white Gaussian random process with variance $\sigma_x^2$, and $N$ stands for the model order.

The Hammerstein coefficient vectors are defined by $a = [a_0, a_1, \ldots, a_N]^T$ and $b = [b_0, b_1, \ldots, b_N]^T$.
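For concreteness, the diagonal Hammerstein structure above can be simulated as below; the kernel values and signal length are illustrative, not taken from the paper.

```python
import numpy as np

# Sketch of a second-order Volterra-Hammerstein output, assuming the
# diagonal form y(n) = sum_i a[i] x(n-i) + sum_i b[i] x(n-i)^2.
# The coefficient values are hypothetical.
rng = np.random.default_rng(0)
a = np.array([1.0, 0.5])        # linear kernel
b = np.array([0.3, -0.2])       # quadratic (diagonal) kernel
x = rng.standard_normal(1000)   # zero-mean white Gaussian input

y = (np.convolve(x, a, mode="full")[: len(x)]
     + np.convolve(x**2, b, mode="full")[: len(x)])
```

The linear part is a convolution with $a$, while the quadratic (diagonal) part is a convolution of the squared input with $b$.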

As evoked in the Introduction, identification algorithms based on the computation of 2nd-order output cumulants are sensitive to additive Gaussian noise, because the 2nd-order cumulants of the latter are in general different from zero. Since the 4th-order cumulants of additive Gaussian noise are null, it is interesting to use the 4th-order output cumulants to derive identification algorithms. However, this introduces another problem, namely computational complexity. In this paper, we overcome this shortcoming by using a tensor analysis.

To determine the kernels of this model, we generate the fourth-order output cumulants. For this purpose, we use the standard properties of cumulants and the Leonov-Shiryaev formula for manipulating products of random variables.

The fourth-order output cumulant is given by [15]. It is easy to verify that the terms involved vanish for all lags outside the considered range.

All the nonzero terms are obtained for a restricted set of lags. Such a choice allows us to construct maximally redundant information, in which the fourth-order cumulants are taken for time lags within that range.
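A sample estimator of such fourth-order cumulants for a zero-mean stationary sequence can be sketched as follows (nonnegative lags assumed); it follows the standard moment-to-cumulant expansion rather than the paper's exact estimator.

```python
import numpy as np

# 4th-order cumulant estimate c4y(t1, t2, t3) of a zero-mean stationary
# sequence, via the Leonov-Shiryaev moment-to-cumulant formula:
# c4 = E[y0 y1 y2 y3] - E[y0 y1]E[y2 y3]
#      - E[y0 y2]E[y1 y3] - E[y0 y3]E[y1 y2]
def cum4(y, t1, t2, t3):
    n = len(y) - max(t1, t2, t3)           # usable sample length
    y0, y1, y2, y3 = y[:n], y[t1:t1 + n], y[t2:t2 + n], y[t3:t3 + n]
    m = lambda u, v: np.mean(u * v)        # 2nd-order moment estimate
    return (np.mean(y0 * y1 * y2 * y3)
            - m(y0, y1) * m(y2, y3)
            - m(y0, y2) * m(y1, y3)
            - m(y0, y3) * m(y1, y2))
```

For a Gaussian sequence all fourth-order cumulants vanish, so `cum4` applied to white Gaussian noise returns values close to zero; this is precisely the noise-rejection property exploited here.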

In the sequel we shall present an analysis of a 3rd-order tensor composed of the 4th-order output cumulants.

4. Formulation and Analysis of a Cumulant Cubic Tensor

Let us define the three-dimensional tensor in which the element at position $(i, j, k)$ corresponds to a fourth-order output cumulant evaluated at the associated time lags. Thus,

The corresponding cumulant is given by (3.4). It follows that

Then, the expression of the tensor is given by

Developing expression (4.3) yields the following, where

This notation leads us to define two channel matrices as follows, where the operator builds a special Hankel matrix from its vector argument, as shown above.
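A hedged sketch of such a Hankel-building operator is given below; the exact dimensions and padding convention used in (4.5) may differ.

```python
import numpy as np

# H(v, rows, cols): Hankel matrix whose anti-diagonals are constant and
# taken from v, zero-padded past the end of v (one plausible convention).
def hankel_op(v, rows, cols):
    H = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            if i + j < len(v):
                H[i, j] = v[i + j]
    return H
```

For example, `hankel_op([1, 2, 3], 2, 2)` gives `[[1, 2], [2, 3]]`, and every entry satisfies `H[i, j] == H[i + 1, j - 1]`.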

Let us now compute the different slices of the proposed tensor.

4.1. Horizontal Slices Expressions

From (2.2) and (4.3), we get

which can be written as follows, where the diagonal operator forms a diagonal matrix from the $i$th row of its argument.

It can easily be demonstrated that

It follows that

The expression of the unfolded tensor representation is given by

To develop this expression, we need the following property.

Property 1. Let $A$ and $B$ be two matrices of appropriate dimensions; then the following identity holds, where $\odot$ stands for the Khatri-Rao product.

It then follows that

5. Blind Identification System with Cumulant Tensor

To estimate the Volterra-Hammerstein kernels while avoiding this computation, we use the following Khatri-Rao property to build an Iterative Alternating Least Squares (IALS) procedure.

Property 2. If the matrices $A$ and $B$ and the vector $x$ are such that $C = A\,D(x)\,B^{T}$, with $D(x)$ the diagonal matrix built from $x$, then it holds that $\mathrm{vec}(C) = (B \odot A)\,x$, where $\mathrm{vec}(\cdot)$ stands for the vectorizing operator.
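This vec/Khatri-Rao identity, in the standard form $C = A\,D(x)\,B^{T} \Rightarrow \mathrm{vec}(C) = (B \odot A)\,x$ (assumed here to match the paper's statement), can be checked numerically:

```python
import numpy as np

# Numerical check of the vec/Khatri-Rao identity:
# vec(A @ diag(x) @ B.T) == khatri_rao(B, A) @ x   (column-major vec).
def khatri_rao(B, A):
    # column-wise Kronecker product of B and A
    return np.column_stack(
        [np.kron(B[:, r], A[:, r]) for r in range(A.shape[1])])

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((5, 3))
x = rng.standard_normal(3)

lhs = (A @ np.diag(x) @ B.T).flatten(order="F")  # vec(C), column stacking
rhs = khatri_rao(B, A) @ x
assert np.allclose(lhs, rhs)
```

The identity holds because $A\,D(x)\,B^{T} = \sum_r x_r\, a_r b_r^{T}$ and $\mathrm{vec}(a_r b_r^{T}) = b_r \otimes a_r$.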

Applying this property to (4.13), it is straightforward to write

Let the following matrices and vector be defined by

The problem of the blind nonlinear identification will be expressed as

This system can be solved using several methods. We propose to solve it using the Iterative Alternating Least Squares (IALS) algorithm.

6. Cost Functions and Iterative Alternating Least Square Algorithm

To apply the IALS algorithm, we alternately suppose that one of the two parameter vectors is constant. We then get two cost functions to be minimized. Assuming the first vector is constant, the first cost function is expressed by

For the second cost function, we assume instead that the other vector is constant; thus

Applying the least squares method to these two functions leads to the following solutions, where the symbol # denotes the Moore-Penrose pseudoinverse of the corresponding matrix.

Finally, the different steps of the proposed IALS algorithm are summarized in Algorithm 1.

Initialize the parameter vectors as random variables (initial estimates).
For ,
 (i) build Hankel matrices using (4.5), for ,
               ,
 (ii) compute matrices estimate and as
,
,
 (iii) minimize cost functions (6.1) and (6.2) so that
               ,
               ,
 (iv) reiterate until parametric error convergence
               .

The hat notation stands for the estimate of the corresponding parameter.
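The alternating pseudoinverse updates of steps (ii)–(iv) can be illustrated on a toy bilinear problem; this is a simplified stand-in for the cumulant system of Section 5, not the paper's exact update.

```python
import numpy as np

# Toy alternating least squares: fit X ≈ a b^T by fixing one vector and
# solving for the other in the least squares sense, as in the IALS steps.
rng = np.random.default_rng(2)
a_true = rng.standard_normal(4)
b_true = rng.standard_normal(5)
X = np.outer(a_true, b_true)            # noise-free bilinear "data"

b = rng.standard_normal(5)              # random initialization
for _ in range(20):
    a = X @ b / (b @ b)                 # LS update of a with b fixed
    b = X.T @ a / (a @ a)               # LS update of b with a fixed
```

The product `np.outer(a, b)` converges to `X`; the individual factors are recovered only up to the usual scale ambiguity of bilinear models.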

7. Convergence Analysis

Equation (6.3) shows that the ALS algorithm converges to optimal solutions if and only if the Moore-Penrose pseudoinverses in (6.3) yield the unique least squares solutions, which requires the corresponding matrices to be full column rank [14, 16]. We start by noting that, due to the Hankel structure and assumption (3.1), each of the channel matrices is full rank. Then

Let us now find the rank of the matrices obtained from the Khatri-Rao product (5.1). We make use of the following definition and property, which define the k-rank of a matrix and bound the k-rank of the Khatri-Rao product of two matrices [17].

Definition 7.1. The k-rank of a matrix $A$ (denoted by $k_A$) is equal to $k$ if and only if every set of $k$ columns of $A$ is linearly independent. Note that $k_A \le \mathrm{rank}(A)$.

In other words, the k-rank of a matrix is the largest integer $k$ for which every set of $k$ columns of the matrix is linearly independent.

Property 3. Consider the Khatri-Rao product $A \odot B$, where $A$ is $I \times F$ and $B$ is $J \times F$. If neither $A$ nor $B$ contains a zero column, then $k_{A \odot B} \ge \min(k_A + k_B - 1, F)$.

It follows that the k-rank bound holds for our matrices, which is equivalent to full column rank here.

From the definition of the Khatri-Rao product and the structure of the Hankel matrices, we conclude that

Consequently, each such matrix is full rank, whatever the values taken by the lags in the considered set.
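Property 3 can be checked numerically for generic factors; the standard Khatri-Rao k-rank bound $k_{A \odot B} \ge \min(k_A + k_B - 1, F)$ [17] then guarantees the full column rank used in the convergence argument.

```python
import numpy as np

# Generic 4 x 6 and 3 x 6 factors have k-ranks 4 and 3; since
# 4 + 3 - 1 >= 6, their Khatri-Rao product (12 x 6) is full column rank.
def khatri_rao(A, B):
    return np.column_stack(
        [np.kron(A[:, r], B[:, r]) for r in range(A.shape[1])])

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 6))
B = rng.standard_normal((3, 6))
P = khatri_rao(A, B)

assert P.shape == (12, 6)
assert np.linalg.matrix_rank(P) == 6     # full column rank
```

Note that neither factor alone has full column rank (each has only 4 or 3 rows); the Khatri-Rao product restores it.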

Let us now determine the rank of the stacked matrices. For this purpose, we study their structure. Let $0$ be the zero column vector of the appropriate dimension.

Then the matrix will have the following form:

where each column vector is constituted by products of the model kernels arising from the computation of the Khatri-Rao matrix product. We have seen that this matrix is full rank. The sum of the different matrices keeps the same form, whatever the system order and the lag values. Consequently, the stacked matrices are full rank and their pseudoinverses exist. We conclude that the IALS algorithm converges to an optimal solution in the least mean squares sense.

8. Simulation Results

In this section, simulation results are given to illustrate the performance of the proposed algorithm. Two Volterra-Hammerstein systems to be identified are considered:

The input sequence is assumed to be a stationary, zero-mean, white Gaussian noise with variance $\sigma_x^2$. The noise signal is also assumed to be a white Gaussian sequence, independent of the input. The parameter estimation was performed for two different signal-to-noise ratio (SNR) levels: 20 dB and 3 dB.

The SNR is computed with the following expression:
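The SNR can be implemented as the power ratio between the noise-free output and the additive noise, which is the standard definition assumed here for the expression above.

```python
import numpy as np

# SNR in dB as the ratio of signal power to noise power (assumed to
# match the paper's definition).
def snr_db(signal, noise):
    return 10.0 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2))
```

Equal-power signal and noise give 0 dB, and multiplying the signal amplitude by 10 adds 20 dB.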

Fourth-order cumulants were estimated from output sequences of two different lengths, assuming perfect knowledge of the system model. To reduce the dependency on the particular realization, the parameters were averaged over 500 Monte-Carlo runs. For each simulation, we give curves representing the variation of the estimates along the Monte-Carlo runs, and we summarize the results in tables.

System 1
Figures 3 and 4 show the estimates of the different kernels of the proposed model, obtained with the IALS algorithm for the first considered data length and for different SNR levels (3 dB and 20 dB).
The mean and the standard deviation of the estimated kernels against the true ones are shown in Table 1.
Likewise, Figures 5 and 6 show the estimates of the different kernels of System 1 for the second considered data length and for SNR levels equal to 3 dB and 20 dB, while Table 2 shows the mean and the standard deviation of the estimated kernels against the true ones.
From these results, we observe that the proposed IALS algorithm performs well, generating good estimates over a large range of SNR values (from 20 dB down to 3 dB). We also note that the standard deviation is relatively large but decreases as the number of output observations grows.

System 2
Figures 7 and 8 show the estimates of the different kernels of the second proposed model for the first considered data length and for different SNR levels (20 dB and 3 dB).
The mean and the standard deviation of the estimated kernels against the true ones are shown in Table 3.
Figures 9 and 10 show the estimates of the different kernels of System 2 for the second considered data length and for different SNR levels (20 dB and 3 dB), while Table 4 shows the mean and the standard deviation of the estimated kernels against the true ones.
From these results, we also note that the proposed algorithm provides good estimates for this system. The number of observations affects the spread of the standard deviation values: for large observation lengths, this spread becomes small. The method provides good estimates even for low SNR levels. Furthermore, the larger the number of Monte-Carlo runs, the smaller the standard deviations.

8.1. Comparison with Existing Methods

The performance of the proposed algorithm was compared with two earlier works: the algorithm proposed in [9] (denoted BIL, for blind identification with linearization) and the Lagrange Programming Neural Network (LPNN) proposed in [13].

(i) In [14], the problem of blind identification was converted into a linear multivariable form using a Kronecker product of the output cumulants. This is described by equations relating the output cumulant sequence of a given order to the intensity (zero-lag cumulant) of a vector formed by the different powers of the input and to the kernel vector. Several important scenarios were discussed and successfully resolved. Here, we are interested in the case of a Gaussian input with known input statistics. Despite the efficiency of that method, the resulting algorithms are in general cumbersome, especially for high series orders (as confirmed by the authors). For more details, see [14].

(ii) In [13], the authors determine the different Volterra kernels and the variance of the input from the autocorrelation estimates and the third-order moment estimates of the system output, using the Lagrange Programming Neural Network (LPNN). As the LPNN is essentially designed for general nonlinear programming, they expressed the identification problem in terms of the autocorrelation function of the real process, its third-order moment sequence, and the vector formed by the unknown parameters of the Volterra model together with the unknown variance of the driving noise. The Lagrangian function is then written accordingly.

To improve the convergence and the precision of the algorithm, the authors extended the preceding function by defining an Augmented Lagrangian Function with a penalty parameter sequence, so that the back-propagation algorithm can be established using the Lagrange multipliers.

The performance of the newly proposed algorithm was compared with these two algorithms. Each of them was used to identify the two models presented above (8.1), for the case of Gaussian excitation, for the given number of output samples, and for the two proposed SNR levels of 3 dB and 20 dB.

Figures 11 and 12 show a comparison of the standard deviations given by each algorithm. We note that these results may vary considerably depending on the number of output observations. They show that the newly proposed algorithm performs well. For a small number of unknown parameters, all algorithms give in general the same STD values, and these values decrease as the SNR increases. We note furthermore that the BIL algorithm is considerably more complex to implement than the LPNN and the new one. For a large number of unknown parameters, the BIL algorithm becomes computationally very complex, as does the LPNN, while the new algorithm keeps its simplicity and provides good parameter estimates with very competitive STD values.

9. Conclusion

In this paper, a new approach to the blind identification of a second-order Hammerstein-Volterra system is developed. Thanks to a matrix analysis of a cubic tensor composed of the fourth-order output cumulants, the nonlinear identification problem is reduced to a multivariable linear system. This system is solved using the Iterative Alternating Least Squares algorithm. A convergence analysis shows that the involved matrices are full rank, which means that the IALS algorithm converges to optimal solutions in the least mean squares sense. Simulation results on two different systems show the good performance of the proposed algorithm. It is also noted that the estimates improve with the number of output observations, even for small SNR values. Comparison results with two algorithms show that the newly proposed algorithm performs well, especially for a large number of unknown parameters. Extending the proposed algorithm to more input classes and to more general Volterra-Hammerstein systems remains an open problem and is the subject of ongoing work.

Acknowledgment

The authors would like to thank Prof. Gerard Favier, Laboratory Director (I3S Laboratory–CNRS, University of Nice Sophia Antipolis, France) for his support, advice, and corrections to carry out this work.