Extended Stochastic Gradient Identification Method for Hammerstein Model Based on Approximate Least Absolute Deviation

Xu, Bao-chang; Lin, Zhong-hua; Zhang, Ying-Dan; Xiao, Yu-yue

doi:https://doi.org/10.1155/2016/9548428

Mathematical Problems in Engineering


Research Article | Open Access

Volume 2016 | Article ID 9548428 | https://doi.org/10.1155/2016/9548428


Bao-chang Xu,¹ Zhong-hua Lin,¹ Ying-Dan Zhang,² and Yu-yue Xiao¹

Academic Editor: Mitsuhiro Okayasu

Received 24 Feb 2016

Revised 19 May 2016

Accepted 19 May 2016

Published 13 Jun 2016

Abstract

To identify the parameters of a nonlinear Hammerstein model whose data are contaminated by colored noise and peak noise, the least absolute deviation (LAD) is selected as the objective function, which avoids the large squared residuals that arise when the identification data are disturbed by impulse noise obeying a symmetric alpha-stable (SαS) distribution. However, LAD is not differentiable, and most algorithms require differentiability. To improve robustness and to resolve the nondifferentiability, an approximate least absolute deviation (ALAD) objective function is established by introducing a deterministic function that replaces the absolute value under certain conditions. The proposed method is derived from the ALAD criterion and the extended stochastic gradient method. Because the objective function is differentiable, we obtain a recursive identification algorithm that is simpler to compute than LAD. The convergence of the proposed identification method is proved by Lyapunov stability theory, and simulation experiments show that the proposed method has higher accuracy and stronger robustness than the least squares (LS) method in identifying a Hammerstein model with colored noise and impulse noise. The impact of impulse noise is restrained effectively.

1. Introduction

In recent years, the Hammerstein model has drawn a lot of attention because of its block-oriented nonlinear (BONL) structure. It can be used to describe a variety of nonlinear systems arising in nonlinear filtering, actuator saturation, signal analysis, and chemical processes in biological systems. Several kinds of methods exist for identifying the Hammerstein model with white noise, such as the iterative method [1, 2], the overparameterization method [3], the support vector machine (SVM) method [4], the subspace method [5–7], the blind method [8], the frequency domain identification method [9], and the artificial intelligence method [10–12]. In industrial process control, however, the noise usually includes colored noise and peak noise, so identification methods must be considered for the case in which the data are contaminated by both. Chang and Luus proposed a noniterative method for the Hammerstein model with colored noise [13], but it cannot be used for online identification. Ding and Chen proposed an extended stochastic gradient method based on least squares for the nonlinear Hammerstein ARMAX system and proved its convergence by the martingale convergence theorem [2]. In most studies, the LS criterion is taken as the objective function for identifying the Hammerstein model. The LS method performs well when the stochastic noise is normally distributed [14]. However, it does not work as well as the LAD method in certain conditions; for example, if the noise does not obey the normal distribution, the statistical performance of LAD estimation is better than that of LS estimation [15]. Because of the squared term in the LS objective function, outliers in the measured data have a large influence on the identification results [16].

To compensate for the effect of impulse noise and outliers on the identification accuracy, the LAD criterion, which replaces the squared terms with absolute deviations, is chosen as the objective function. Because the LAD criterion involves only the first power of the deviation, it is less sensitive to impulse noise and outliers and greatly improves robustness. However, the LAD objective function is not differentiable, so a nonsmooth optimization problem must be solved [17], which complicates the computation. The proposed method replaces the absolute deviation in LAD with a differentiable function, yielding the ALAD objective function. This paper derives the identification algorithm for the Hammerstein model from the ALAD objective function and the extended stochastic gradient method. To improve the identification accuracy and convergence rate, we add an inertial term to the proposed method. The convergence of the algorithm is proved by Lyapunov stability theory. Simulation experiments show that the proposed method effectively suppresses the influence of impulse noise and outliers. Compared with the LS method, the ALAD method is more robust and more accurate when impulse noise is present.

The rest of this paper is organized as follows. Section 2 describes the identification problem of nonlinear Hammerstein systems with colored noise. Section 3 derives the proposed identification algorithm from ALAD criterion. The convergence of the proposed method is also discussed in Section 3. Section 4 offers an illustrative example and compares the proposed algorithm with existing LS methods. Some concluding remarks are provided in Section 5.

2. Hammerstein Model with Colored Noise

As shown in Figure 1, the Hammerstein model is a series connection of a memoryless static nonlinear block and a linear dynamic block. The input signal drives the static nonlinear block, and the linear dynamic block produces the output signal.

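In Figure 1, $u(k)$ is the input and $x(k)$ is the output of the nonlinear part, which is also the input of the linear part. $y(k)$ is the true output, and $z(k)$ is the measured output, which is disturbed by the colored noise $e(k)$. $v(k)$ is an additive white noise with zero mean. The colored noise $e(k)$ is the output of a linear block driven by the white noise $v(k)$. $N(z^{-1})$ is the “noise model” that determines the properties of $e(k)$, and $G(z^{-1})$ is the transfer function of the linear part. Assume that the nonlinear part $f(\cdot)$ can be represented by a polynomial in the input with known order $p$:

$$x(k) = f(u(k)) = c_1 u(k) + c_2 u^2(k) + \cdots + c_p u^p(k). \tag{1}$$

The linear block of the Hammerstein model is described by the CARMA/ARMAX model (Controlled Autoregressive Moving Average/Autoregressive Moving Average eXogenous), with the transfer function $G(z^{-1}) = B(z^{-1})/A(z^{-1})$ and $N(z^{-1}) = D(z^{-1})/A(z^{-1})$. Therefore, the relationship between input and output can be written as

$$z(k) = \frac{B(z^{-1})}{A(z^{-1})}\,x(k) + \frac{D(z^{-1})}{A(z^{-1})}\,v(k). \tag{2}$$

From (1) and (2), it is obtained that

$$A(z^{-1})\,z(k) = B(z^{-1}) \sum_{i=1}^{p} c_i u^i(k) + D(z^{-1})\,v(k), \tag{3}$$

where $A(z^{-1})$, $B(z^{-1})$, and $D(z^{-1})$ are polynomials in the shift operator $z^{-1}$:

$$\begin{aligned}
A(z^{-1}) &= 1 + a_1 z^{-1} + \cdots + a_{n_a} z^{-n_a},\\
B(z^{-1}) &= b_1 z^{-1} + \cdots + b_{n_b} z^{-n_b},\\
D(z^{-1}) &= 1 + d_1 z^{-1} + \cdots + d_{n_d} z^{-n_d},
\end{aligned} \tag{4}$$

where $a_i$, $b_i$, $c_i$, and $d_i$ ($i = 1, 2, \ldots$) are the parameters to be identified.

From (3), it is obtained that

$$z(k) = -\sum_{i=1}^{n_a} a_i z(k-i) + \sum_{j=1}^{n_b}\sum_{i=1}^{p} b_j c_i u^i(k-j) + \sum_{i=1}^{n_d} d_i v(k-i) + v(k). \tag{5}$$

Define

$$\theta = \left[a^{T},\; c_1 b^{T},\; c_2 b^{T},\; \ldots,\; c_p b^{T},\; d^{T}\right]^{T},$$

where $a$, $b$, and $d$ are the parameter vectors of the linear section and $c$ is the parameter vector of the nonlinear section of the Hammerstein model:

$$a = [a_1, \ldots, a_{n_a}]^{T},\quad b = [b_1, \ldots, b_{n_b}]^{T},\quad c = [c_1, \ldots, c_p]^{T},\quad d = [d_1, \ldots, d_{n_d}]^{T}.$$

We define

$$\varphi(k) = \left[-z(k-1), \ldots, -z(k-n_a),\; u(k-1), \ldots, u(k-n_b),\; \ldots,\; u^p(k-1), \ldots, u^p(k-n_b),\; v(k-1), \ldots, v(k-n_d)\right]^{T}.$$

Then, (5) can be rewritten as

$$z(k) = \varphi^{T}(k)\,\theta + v(k). \tag{10}$$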
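The rewriting of the difference equation as a linear regression can be checked numerically. The following is a minimal sketch, assuming the ARMAX form $A(z^{-1})z(k) = B(z^{-1})f(u(k)) + D(z^{-1})v(k)$ with a polynomial nonlinearity and zero initial conditions; the function names `simulate_hammerstein`, `regressor`, and `theta_vector`, and the ordering of the stacked parameter vector, are assumptions of this sketch rather than notation taken from the paper.

```python
def simulate_hammerstein(a, b, c, d, u, v):
    """Simulate A(z^-1) z(k) = B(z^-1) f(u(k)) + D(z^-1) v(k).

    a, b, d hold a_1.., b_1.., d_1..; c holds c_1..c_p of the polynomial f.
    Samples before k = 0 are taken as zero (an assumption of this sketch).
    """
    # Static nonlinear block: x(k) = c_1 u(k) + ... + c_p u(k)^p
    x = [sum(ci * uk ** (i + 1) for i, ci in enumerate(c)) for uk in u]
    z = []
    for k in range(len(u)):
        zk = v[k]
        zk -= sum(ai * z[k - i] for i, ai in enumerate(a, 1) if k - i >= 0)
        zk += sum(bj * x[k - j] for j, bj in enumerate(b, 1) if k - j >= 0)
        zk += sum(di * v[k - i] for i, di in enumerate(d, 1) if k - i >= 0)
        z.append(zk)
    return z


def regressor(z, u, v, k, na, nb, p, nd):
    """Information vector phi(k) matching theta = [a, c_1*b, ..., c_p*b, d]."""
    def past(seq, i):
        return seq[k - i] if k - i >= 0 else 0.0

    phi = [-past(z, i) for i in range(1, na + 1)]
    for power in range(1, p + 1):
        phi += [past(u, j) ** power for j in range(1, nb + 1)]
    phi += [past(v, i) for i in range(1, nd + 1)]
    return phi


def theta_vector(a, b, c, d):
    """Stack the overparameterized vector [a, c_1*b, ..., c_p*b, d]."""
    return list(a) + [ci * bj for ci in c for bj in b] + list(d)
```

With these helpers, `sum(p * t for p, t in zip(regressor(...), theta_vector(...))) + v[k]` should reproduce `z[k]` at every step, which is the regression form used by the identification algorithm. The overparameterized vector has $n_a + p\,n_b + n_d$ entries, more than the number of physical parameters; this is why a separate parameter separation step is needed afterwards.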

3. The Identification Method Based on ALAD Criterion and Extended Stochastic Gradient Method

3.1. Approximate Least Absolute Deviation

As is known, the absolute deviation is not differentiable. To overcome this shortcoming, we choose a differentiable function to approximate the LAD criterion. This requirement can be met by the following logarithmic function:

$$h(x) = \sigma\left[\ln\left(1 + e^{-x/\sigma}\right) + \ln\left(1 + e^{x/\sigma}\right)\right]. \tag{11}$$

In (11), $\sigma$ is an adjustable parameter, and the nonlinear function $h(x)$ depends on $\sigma$. When $\sigma$ is small enough, $h(x) \approx |x|$; that is, the function approximates the absolute value function. The curve of the logarithmic function for a fixed $\sigma$ is shown in Figure 2.

The curve shows that, when $\sigma$ is small enough, the nonlinear function $h(x)$ approximates the absolute value function effectively. Because $h(x)$ is differentiable, the identification problem based on $h(x)$ can be solved by standard optimization methods.
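How closely a smooth logarithmic function can track the absolute value is easy to check numerically. The sketch below assumes the function takes the form $h(x) = \sigma[\ln(1+e^{-x/\sigma}) + \ln(1+e^{x/\sigma})]$; that specific form, and the names `alad` and `alad_grad`, are assumptions of this sketch. Under that assumption the derivative simplifies to $\tanh(x/(2\sigma))$, which is bounded by 1 in magnitude.

```python
import math


def alad(x, sigma=0.01):
    """Assumed ALAD function sigma*[ln(1+e^(-x/sigma)) + ln(1+e^(x/sigma))].

    Evaluated through the identity ln(1+e^t) + ln(1+e^-t) = t + 2*ln(1+e^-t)
    for t = |x|/sigma >= 0, which avoids overflow for large |x|/sigma.
    """
    t = abs(x) / sigma
    return sigma * (t + 2.0 * math.log1p(math.exp(-t)))


def alad_grad(x, sigma=0.01):
    """Derivative of the assumed function: (e^(x/sigma)-1)/(e^(x/sigma)+1)."""
    return math.tanh(x / (2.0 * sigma))
```

For example, with `sigma=0.01` the value `alad(3.0)` differs from 3 only by rounding error, while at the origin the function equals `2 * sigma * ln 2` instead of 0, so the kink of the absolute value is rounded off over a region whose width scales with `sigma`. The bounded derivative is what limits the influence of a single large residual on a gradient step.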

3.2. Algorithm Derivation

In (10), the input $u(k)$ and the output $z(k)$ are measurement data. Because the structure of the nonlinear function $f(\cdot)$ is known, the powers of the input that appear in $\varphi(k)$ can be computed. However, the terms $v(k-i)$ in $\varphi(k)$ are white noise and cannot be measured, which makes it impossible to solve the identification problem directly. Based on the idea that the noise can be replaced by its estimated value, we use $\hat{v}(k-i)$, the noise residual at step $k-i$, to replace $v(k-i)$ at step $k$, so $\varphi(k)$ can be replaced by its estimated value $\hat{\varphi}(k)$, as given in (12). Then, the identification of $\theta$ in (10) is to find the estimate $\hat{\theta}$ that minimizes the ALAD criterion function (13), which is built from the input data $u(k)$ and the output data $z(k)$.

Minimizing (13) is a typical nonlinear optimization problem. To reduce the computational complexity, we use the stochastic gradient method to derive the recursive equation of $\hat{\theta}$ [18, 19]. The formula of the stochastic gradient method is

$$\hat{\theta}(k) = \hat{\theta}(k-1) - \lambda(k)\,\nabla J\big(\hat{\theta}(k-1)\big), \tag{14}$$

where $\nabla$ represents the gradient and $\lambda$ is the step size. The formula above shows that $\hat{\theta}$ is corrected along the negative gradient direction until the extreme value of $J$ is obtained.
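A single correction along the negative gradient can be sketched as follows. This is an illustration under stated assumptions, not the paper's recursion: the per-sample criterion is taken as $J = h(z - \varphi^{T}\theta)$ with the derivative of the smooth absolute value approximated by $\tanh(e/(2\sigma))$, and the step size `lam` is a user-supplied constant in place of the step-size rules (17) and (24). The function names are hypothetical. A least-squares step is included for comparison.

```python
import math


def alad_sg_step(theta, phi, z, lam, sigma=0.01):
    """One step theta <- theta - lam * grad J for J = h(z - phi^T theta)."""
    e = z - sum(p * t for p, t in zip(phi, theta))  # output error
    w = math.tanh(e / (2.0 * sigma))                # assumed h'(e), |w| <= 1
    # grad J = -h'(e) * phi, so the descent step adds lam * h'(e) * phi.
    return [t + lam * w * p for t, p in zip(theta, phi)]


def ls_sg_step(theta, phi, z, lam):
    """Same step for the least-squares criterion J = e^2 / 2, for comparison."""
    e = z - sum(p * t for p, t in zip(phi, theta))
    return [t + lam * e * p for t, p in zip(theta, phi)]
```

Starting from `theta = [0.0, 0.0]` with `phi = [1.0, 2.0]` and `lam = 0.1`, an ordinary sample `z = 1.0` and an outlier `z = 1000.0` produce essentially the same ALAD correction, because the weight saturates at 1, whereas the least-squares correction grows in proportion to the error. This is the mechanism behind the robustness to impulse noise discussed in the text.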

First, we compute the gradient of $J$, which gives (15); then, from (15) and (13), we obtain (16).

From (16), we can obtain the derivative of $J$ with respect to the step size $\lambda$, which leads to (17).

Equation (17) is based on an exact line search, which always makes two adjacent gradient search directions orthogonal and usually results in a zigzag phenomenon. When (17) is used to compute the optimal $\lambda$, the convergence rate is slow because the convergence order is only linear, and the parameter estimates may fluctuate. To address this shortcoming, an improved stochastic gradient method is applied to obtain $\lambda$, in which $\lambda$ is based on gradient information from a chosen number of steps before the current iteration step, as expressed in (19). From (14), (15), and (19), we obtain (21) and then (22).

The update is computed by the recursive formula (24).

Equation (24) is equivalent to adding an inertial term to the recursive equation of $\hat{\theta}$. The weights of the inertial term are related to the iterative output errors at two different iteration steps.

In conclusion, we can obtain the extended stochastic gradient identification method based on approximate least absolute deviation (ALADESG):

3.3. Analysis of Convergence

Rewrite (22) as (26), where the auxiliary quantity is given by (27). The true values of the process parameters are denoted by $\theta$.

According to (26), we have (28). Subtracting $\theta$ from both sides of (28) yields the recursion for the parameter estimation error.

A Lyapunov function is then constructed whose increment along this recursion is a quadratic form in a real symmetric matrix. When the inequality in (34) is fulfilled, this matrix is negative semidefinite, and (28) is globally asymptotically stable at the equilibrium point.

According to (27), we obtain (35).

The second term of (35) can be written as (36). If, during the iterative process, we can make sure that (38) holds, then we get (39). Therefore, in order to guarantee the convergence of the proposed algorithm, (38) must be added as a constraint whenever (24) is used for iteration.

3.4. Separation of Parameters by Average Method

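It is noteworthy that, in the Hammerstein model shown in Figure 1, for any nonzero finite constant $\kappa$, the pair $\kappa f(\cdot)$ and $G(z^{-1})/\kappa$ yields the same input and output measurements as the pair $f(\cdot)$ and $G(z^{-1})$; thus, the nonlinear function and the transfer function are not unique. In other words, no identification scheme can distinguish between $(f(\cdot), G(z^{-1}))$ and $(\kappa f(\cdot), G(z^{-1})/\kappa)$. Therefore, without loss of generality, the gain of either $f(\cdot)$ or $G(z^{-1})$ has to be fixed. There are several ways to normalize these gains [20, 21]. We adopt the following method.

Assumption 1. The first coefficient of the function $f(\cdot)$ equals 1 [22]; that is to say, in (1), $c_1 = 1$.
Based on the relationship between the parameters, $c_i$ can be computed using

$$\hat{c}_i = \frac{\widehat{c_i b_j}}{\hat{b}_j}, \quad j = 1, \ldots, n_b, \quad i = 2, \ldots, p, \tag{40}$$

where $\widehat{c_i b_j}$ is the entry of $\hat{\theta}$ that estimates the product $c_i b_j$ and $\hat{b}_j$ is the entry that estimates $b_j$. From (40), we can see that there are $n_b$ values of $\hat{c}_i$, each of which is an estimate of $c_i$. To obtain a more accurate estimate, we use the average of these values:

$$\hat{c}_i = \frac{1}{n_b} \sum_{j=1}^{n_b} \frac{\widehat{c_i b_j}}{\hat{b}_j}. \tag{41}$$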
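The averaging step can be sketched in a few lines. The sketch assumes that the estimated vector is stacked as $[a, c_1 b, \ldots, c_p b, d]$ with $c_1 = 1$, so that the first block of length $n_b$ after the $a$ entries estimates $b$ itself; the function name `separate_parameters` and this ordering are assumptions of the sketch.

```python
def separate_parameters(theta_hat, na, nb, p):
    """Split theta_hat = [a, c_1*b, ..., c_p*b, d], assuming c_1 = 1.

    Each c_i (i >= 2) is estimated as the average of the nb ratios
    (estimate of c_i*b_j) / (estimate of b_j). The ratios are undefined
    if an estimated b_j is exactly zero; this sketch does not guard that.
    """
    a_hat = theta_hat[:na]
    b_hat = theta_hat[na:na + nb]  # c_1 * b equals b when c_1 = 1
    c_hat = [1.0]
    for i in range(2, p + 1):
        block = theta_hat[na + (i - 1) * nb: na + i * nb]  # estimates of c_i*b
        ratios = [cb / bj for cb, bj in zip(block, b_hat)]
        c_hat.append(sum(ratios) / nb)
    d_hat = theta_hat[na + p * nb:]
    return a_hat, b_hat, c_hat, d_hat
```

Averaging the $n_b$ ratios uses every redundant entry of the overparameterized vector instead of a single one, which reduces the variance of the recovered coefficients when the individual entries are noisy.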

4. Simulation Results and Discussions

Consider a Hammerstein model of the form (3) with specified polynomials $A(z^{-1})$, $B(z^{-1})$, and $D(z^{-1})$, whose nonlinear static part is described by a polynomial of the form (1). The input sequence $u(k)$ is an uncorrelated, continuous stochastic sequence with zero mean and unit variance. The noise sequence $v(k)$ is a white noise sequence with zero mean and variance $\sigma_v^2$. The input and output data for the identification procedure are obtained from this model. We then apply the proposed algorithm (25) to estimate the parameters of this system in different situations. The identification process and the analysis of the results are presented below. The simulation experiments are performed in Matlab. The value of $\sigma$ is taken as 0.01.

The relative error of the parameter estimates is used as the evaluation criterion of the algorithm and is defined as follows:
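A common way to compute such a relative error is the ratio of Euclidean norms $\|\hat{\theta} - \theta\| / \|\theta\|$. The sketch below assumes this definition; the exact criterion used in the experiments may differ, and the function name `relative_error` is hypothetical.

```python
import math


def relative_error(theta_hat, theta_true):
    """Assumed criterion: ||theta_hat - theta_true|| / ||theta_true||."""
    num = math.sqrt(sum((h - t) ** 2 for h, t in zip(theta_hat, theta_true)))
    den = math.sqrt(sum(t ** 2 for t in theta_true))
    return num / den
```

Evaluating this quantity after every recursion step gives a single curve that summarizes how all parameter estimates approach their true values, which is how relative error curves for recursive identification methods are usually drawn.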

In order to verify the performance of the proposed algorithm, we designed two simulation cases.

Case 1. Only colored noise exists in the measured data. This case is used to examine the impact of the inertial term on the identification results of the ALADESG method.

Case 2. Both colored noise and impulse noise exist in the measured data. This case is used to compare the identification performance of the ALADESG method with that of the LSESG method when the measured data are contaminated by impulse noise. The robustness of the ALADESG method is also verified when impulse noise with different amplitudes is added.

The impulse noise in the experiment follows the SαS distribution [23]. The probability density function of the standard SαS distribution is given by a series expansion, where $\Gamma(\cdot)$ is the gamma function and $\alpha$ is the characteristic exponent. The smaller $\alpha$ is, the more likely a random variable following the SαS distribution is to produce large-amplitude samples, and the stronger the impulses are.

A random variable that obeys the standard SαS distribution can be generated by the following steps [24]:
(1) Generate a random variable $V$ uniformly distributed on $(-\pi/2, \pi/2)$.
(2) Generate a random variable $W$ exponentially distributed with mean 1.
(3) Obtain the variable $X$, which obeys the standard SαS distribution, by the following formula:

$$X = \frac{\sin(\alpha V)}{(\cos V)^{1/\alpha}} \left[\frac{\cos\big((1-\alpha)V\big)}{W}\right]^{(1-\alpha)/\alpha},$$

where $V$ and $W$ are independent.
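The three steps above can be sketched with the Python standard library. The transform used in step (3) is the Chambers-Mallows-Stuck formula for the symmetric case; using it here, as well as the function name `sas_sample` and the optional `rng` argument, is an assumption of this sketch.

```python
import math
import random


def sas_sample(alpha, rng=random):
    """Draw one standard symmetric alpha-stable variate, 0 < alpha <= 2.

    rng is the random module or a random.Random instance.
    """
    v = rng.uniform(-math.pi / 2, math.pi / 2)  # step (1)
    w = rng.expovariate(1.0)                    # step (2): exponential, mean 1
    # Step (3): symmetric Chambers-Mallows-Stuck transform.
    return (math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))
```

A reproducible impulse noise sequence can be drawn with `rng = random.Random(0)` and `[sas_sample(1.5, rng) for _ in range(1000)]`. For `alpha = 2` the transform reduces to $2\sin(V)\sqrt{W}$, a Gaussian variable with variance 2, and for `alpha = 1` it reduces to $\tan V$, a Cauchy variable, which provides two convenient sanity checks.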

4.1. The Effect of Inertial Term on the Identification Results

(1) First, no inertial term is used; namely, $\lambda$ is computed by (17) with the current gradient information only. The curve of the relative error is shown in Figure 3.

According to the theoretical analysis, using (17) to compute the optimal search step size has several drawbacks, such as a linear convergence order, a slow convergence rate, and fluctuating parameter estimates. As shown in Figure 3, the relative error of the Hammerstein model parameters fluctuates severely, and the estimated parameters do not converge to the true values. That is to say, the method cannot provide reliable parameter estimates for the Hammerstein model in this setting.

(2) To improve the accuracy and convergence rate and to enhance the reliability of the identification results, we add an inertial term to the method, namely, we apply (24) and (38) to the parameter identification. The results are shown in Table 1.

When the inertial term is included, the deviations between the estimated values and the true values of the model parameters are small, and the identification accuracy is much better than without it. This shows that the inertial term coming from the improved stochastic gradient method enhances the identification accuracy effectively and ensures the convergence of the ALADESG method. The relative error curve is shown in Figure 4.

Comparing Figure 3 with Figure 4 shows that the relative error curve in Figure 4 is much smoother, the estimated parameters converge to those true values steadily, and the relative error is smaller than the situation as shown in Figure 3. Simulation results in Figures 3 and 4 show that the inertial term can greatly improve the identification accuracy and significantly enhance the reliability of the parameter estimation.

4.2. Comparison of ALADESG and LSESG

4.2.1. Considering the Measured Data with Colored Noise Only

According to the same idea in Section 3.2, we can get the iterative equations of LSESG also containing a similar inertial term. Then the proposed ALADESG method was compared with the LSESG method when the measured data are contaminated with colored noise. The simulation result is shown in Figure 5.

As shown in Figure 5, the identification accuracies of the ALADESG method and the LSESG method are both acceptable when the measured data are only disturbed by colored noise, but the identification accuracy of the LSESG method is higher and the convergence rate is faster. So when the measured data is contaminated with colored noise only, the identification performance of the LSESG method is better than the ALADESG method.

4.2.2. The Measured Data with Impulse Noise Subject to SαS Distribution

The measured data are contaminated with both colored noise and impulse noise. Let the characteristic exponent $\alpha$ of the impulse noise be 1.5. The time-domain waveform of the impulse noise is shown in Figure 6. The measured data are then identified by the ALADESG method and the LSESG method, respectively. The relative errors are shown in Figure 7.

Figure 7 shows that the identification accuracy of the ALADESG method is higher than that of the LSESG method when the measured data are disturbed by colored noise and impulse noise with $\alpha = 1.5$. From Figures 6 and 7, we notice that the relative error of the LSESG method fluctuates severely at the moment a large impulse is added, and the larger the amplitude of the impulse, the more severe the fluctuation, which prevents the parameter estimates from converging steadily. In the LSESG method, the objective function is the square of the error, so the influence of impulse noise is amplified, especially when the impulse is large. In contrast, the ALADESG method is based on the ALAD criterion, which restrains the influence of impulse noise and enhances robustness: the parameter estimates converge to the true values quickly and steadily, and the identification accuracy is also improved. The simulation results indicate that the ALADESG method has a better identification performance than the LSESG method when impulse noise is added.

4.2.3. The Influence of Different Pulse Amplitude on the Identification Results of ALADESG Method

In this part, we conduct three simulations with three different values of the characteristic exponent $\alpha$. The identification results of the ALADESG method are shown in Table 2, and the relative errors are shown in Figure 8. In addition, the identification results of the ALADESG method are compared with those of the LSESG method for two of these values of $\alpha$; the corresponding relative error curves are shown in Figures 9 and 10, respectively.

From Table 2 and Figure 8, the identification accuracy of the ALADESG method becomes worse when $\alpha$ decreases, but the decrease in accuracy is small. In contrast, from Figures 7, 9, and 10, the identification accuracy of the LSESG method becomes worse when $\alpha$ is decreased, and the decrease in accuracy is very obvious. In particular, for the smaller values of $\alpha$, the LSESG method does not converge, which suggests that it cannot identify the parameters of the Hammerstein model in this situation, whereas the ALADESG method still achieves accurate and stable identification results. Therefore, it can be concluded that the method based on the ALAD criterion overcomes the impact of impulse noise very well and has better robustness to impulse noise.

4.3. Identification Results Analysis

According to the identification results shown in Sections 4.1 and 4.2, the following conclusions are summarized.
(1) When the measured data are contaminated by colored noise, the Hammerstein model can be identified effectively by the proposed method.
(2) By applying the inertial term, which contains gradient information from past steps, the identification accuracy and convergence rate are improved significantly.
(3) The identification performance of the LSESG method is better than that of the ALADESG method when only colored noise exists.
(4) The ALADESG method has better accuracy and a faster convergence rate than the LSESG method when both colored noise and impulse noise exist in the measured data. Particularly when the amplitude of the impulse noise is very large (that is, when $\alpha$ is small), the ALADESG method still obtains an acceptable identification result, which shows the adaptability of the method to more serious noise. The robustness of the method is also confirmed by the simulation results.

5. Conclusions

In this paper, we extend the application of the LAD technique to the field of nonlinear identification. A new algorithm for the Hammerstein model is proposed based on the ALAD criterion and an improved stochastic gradient search algorithm, and its convergence is proved by Lyapunov stability theory. By applying the ALAD criterion, the proposed method overcomes the influence of impulse noise and colored noise on the identification results and improves the robustness to a large extent. The proposed algorithm is easy to implement. The simulation results show that, when impulse noise is present, the proposed algorithm is more robust and converges faster than the LSESG method.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was supported by Important National Science & Technology Specific Projects of China (no. 2011ZX05027-005).

References

[1] K. S. Narendra and P. G. Gallman, “An iterative method for the identification of nonlinear systems using a Hammerstein model,” IEEE Transactions on Automatic Control, vol. 11, no. 3, pp. 546–550, 1966.
[2] F. Ding and T. Chen, “Identification of Hammerstein nonlinear ARMAX systems,” Automatica, vol. 41, no. 9, pp. 1479–1489, 2005.
[3] F. Ding, Y. Shi, and T. Chen, “Auxiliary model-based least-squares identification methods for Hammerstein output-error systems,” Systems & Control Letters, vol. 56, no. 5, pp. 373–380, 2007.
[4] I. Goethals, K. Pelckmans, J. A. Suykens, and B. De Moor, “Identification of MIMO Hammerstein models using least squares support vector machines,” Automatica, vol. 41, no. 7, pp. 1263–1272, 2005.
[5] L. Bako, G. Mercère, S. Lecoeuche, and M. Lovera, “Recursive subspace identification of Hammerstein models based on least squares support vector machines,” IET Control Theory & Applications, vol. 3, no. 9, pp. 1209–1216, 2009.
[6] Z. Liao, Z. Zhu, S. Liang, C. Peng, and Y. Wang, “Subspace identification for fractional order Hammerstein systems based on instrumental variables,” International Journal of Control, Automation and Systems, vol. 10, no. 5, pp. 947–953, 2012.
[7] M. Verhaegen and D. Westwick, “Identifying MIMO Hammerstein systems in the context of subspace model identification methods,” International Journal of Control, vol. 63, no. 2, pp. 331–349, 1996.
[8] L. Vanbeylen, R. Pintelon, and J. Schoukens, “Blind maximum likelihood identification of Hammerstein systems,” Automatica, vol. 44, no. 12, pp. 3139–3146, 2008.
[9] E.-W. Bai, “Frequency domain identification of Wiener models,” Automatica, vol. 44, no. 5, pp. 1451–1455, 2008.
[10] X.-Y. Li and Z.-G. Han, “Identification approach for nonlinear systems based on particle swarm optimization,” Control and Decision, vol. 26, no. 11, pp. 1627–1631, 2011.
[11] X. Xiaoping, Q. Fucai, and W. Feng, “Identification method for Hammerstein models based on hybrid particle swarm optimization,” Journal of Engineering Mathematics, vol. 27, no. 1, pp. 47–52, 2010.
[12] X. Xiaoping, Q. Fucai, W. Feng, and L. Hongyan, “Identification method for Hammerstein models based on advanced particle swarm optimization algorithm,” Computer Engineering, vol. 34, no. 14, pp. 200–203, 2008.
[13] F. Chang and R. Luus, “A noniterative method for identification using Hammerstein model,” IEEE Transactions on Automatic Control, vol. 16, no. 5, pp. 464–468, 1971.
[14] C. Z. Fang and D. Y. Xiao, Process Identification, Tsinghua University Press, 1988.
[15] W. Xiang and Z.-H. Chen, “New identification method of nonlinear systems based on Hammerstein models,” Control Theory and Applications, vol. 24, no. 1, pp. 143–147, 2007.
[16] B.-C. Xu and X.-L. Liu, “Identification algorithm based on the approximate least absolute deviation criteria,” International Journal of Automation and Computing, vol. 9, no. 5, pp. 501–505, 2012.
[17] D. Jian and X. Kaigui, “Research of the non-linear regress models based on the least absolute criteria,” Journal of Chongqing Normal University (Natural Science Edition), vol. 18, no. 4, pp. 71–74, 2001.
[18] Y. Liu, Y. Xiao, and X. Zhao, “Multi-innovation stochastic gradient algorithm for multiple-input single-output systems using the auxiliary model,” Applied Mathematics and Computation, vol. 215, no. 4, pp. 1477–1483, 2009.
[19] D. Feng, L. Jingfan, and X. Yongsong, “Parameter estimation for a class of nonlinear systems,” Control Engineering of China, vol. 18, no. 3, pp. 373–376, 2011.
[20] F. Ding, X. P. Liu, and G. Liu, “Identification methods for Hammerstein nonlinear systems,” Digital Signal Processing, vol. 21, no. 2, pp. 215–238, 2011.
[21] F. Wei and D. Feng, “Three methods for Hammerstein models' parameter separation,” Science Technology and Engineering, vol. 8, no. 6, pp. 1586–1589, 2008.
[22] E.-W. Bai, “A blind approach to the Hammerstein-Wiener model identification,” Automatica, vol. 38, no. 6, pp. 967–979, 2002.
[23] M. J. Coates and E. E. Kuruoğlu, “Time-frequency-based detection in impulsive noise environments using α-stable noise models,” Signal Processing, vol. 82, no. 12, pp. 1917–1925, 2002.
[24] L. Devroye, Non-Uniform Random Variate Generation, Springer, New York, NY, USA, 1986.

Copyright

Copyright © 2016 Bao-chang Xu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
