Abstract

The Hammerstein model has been widely applied to identify nonlinear systems. In this paper, a Hammerstein-type neural network (HTNN) is derived to formulate the well-known Hammerstein model. The HTNN consists of a nonlinear static gain in cascade with a linear dynamic part. First, the Lipschitz criterion for order determination is derived. Second, the backpropagation algorithm for updating the network weights is presented, and a stability analysis is given. Finally, simulation results show that the HTNN identification approach achieves satisfactory identification performance.

1. Introduction

Identification of nonlinear dynamic systems has been one of the most active research areas in engineering. Like the well-known Wiener model, the Hammerstein model is widely used for modeling various processes [1, 2]; it comprises a static nonlinear block preceding a dynamic linear one [3]. This type of model is called a block-oriented nonlinear model [4]. Unlike black-box models, the block-oriented model is regarded as a gray-box model with a clear physical interpretation, whose steady-state part describes the gain of the system [5]. The major drawback of the classic identification techniques is that they impose many constraints on nonlinear systems with time-varying parameters and uncertainties.

It has been proven that neural networks (NN) can approximate any smooth nonlinear function well [2, 3]. During the last decade, they have been successfully and efficiently applied to identify and control nonlinear systems owing to their adaptability, self-organization, and fast parallel processing of real-time information. However, for general neural networks, the model structure has no relation to the physical nature of the process and the model parameters have no physical interpretation, similar to black-box models [5]. Therefore, many works have focused on integrating NN with Wiener or Hammerstein models. As a result, the model structure and parameters of the NN acquire a clear physical interpretation, while the well-developed classic identification methodologies gain adaptive abilities.

There are two ways to integrate NN with Wiener or Hammerstein models. One is to use a NN to formulate the static nonlinearity of the Wiener or Hammerstein model. Chen et al. [6] used a simple linear model to represent the dynamic part and a neural network to represent the nonlinear static part. In [2], the dynamic linear part was represented by Laguerre filters and the nonlinear static part was described by a neural network. Tötterman and Toivonen [7] used support vector regression to identify nonlinear Wiener systems; the linear block was expanded in terms of Laguerre or Kautz filters, and the static nonlinear block was determined using support vector machine regression. Saha et al. [8] developed a Wiener-type nonlinear black-box model for capturing the dynamics of open-loop stable MIMO nonlinear systems, where quadratic polynomials and NN were used to construct the nonlinear output map; the models were then used in nonlinear model predictive control [8]. Al-Duwaish et al. [9] used a hybrid model consisting of a linear autoregressive moving average (ARMA) model and a NN to represent the dynamic linear block and the static nonlinear element of the Wiener model, respectively. The other is to formulate the Wiener or Hammerstein model entirely by a multilayer NN; that is, the dynamic linear block and the static nonlinear block are both represented by the NN. Janczak [4] designed a NN to formulate the Hammerstein model, composed of one hidden layer with nonlinear nodes and one linear output node. Wu et al. [10] proposed a Hammerstein neural network compensator to identify a dynamic process. Peng and Dubay [11] proposed a Wiener-type neural network to identify nonlinear dynamic processes. However, most of these investigations did not address system order determination or stability analysis.

In this paper, a multilayer neural network is used to formulate the traditional Hammerstein model; it is called the Hammerstein-type neural network (HTNN). The HTNN formulates the Hammerstein model as a nonlinear static block in cascade with a linear dynamic block, whose weights correspond to the parameters of the Hammerstein model. An identification methodology based on the HTNN is then presented and applied to nonlinear SISO systems. To determine the order and weights of the HTNN, the Lipschitz criterion and the backpropagation (BP) algorithm are derived, respectively. Furthermore, a stability analysis is given. Finally, the proposed identification method is tested on several nonlinear plants.

In [2, 6–9], neural networks were used to formulate the static nonlinearities of Wiener or Hammerstein models, whereas in our design, a multilayer neural network is used to formulate the Hammerstein model entirely; that is, the dynamic linear block and the static nonlinear block of the Hammerstein model are both represented by the neural network. In this way, the parameters of the Hammerstein model can be obtained by training the HTNN with an adequate training algorithm. Moreover, the model order determination and a convergence analysis are also provided.

The rest of this paper is organized as follows. In Section 2, the Hammerstein model is described. The design of the HTNN for identification is given in Section 3, where the Lipschitz criterion is presented for order determination and the learning algorithm is analyzed. The convergence analysis of the HTNN is given in Section 4. The simulation results are given in Section 5. The conclusions are drawn in Section 6.

2. Hammerstein Model

Many industrial processes can be described by Wiener or Hammerstein models [1]. As shown in Figure 1, a general Hammerstein model consists of a nonlinear static block and a linear dynamic block. The nonlinear static block is given by
$$x(k) = f(u(k)), \tag{1}$$
where $u(k)$ is the input variable, $f(\cdot)$ represents the nonlinear component of the Hammerstein model, and $x(k)$ is a nonmeasured intermediate variable that does not necessarily have a physical meaning.

The linear dynamic block can be described as
$$y(k) = \frac{B(q^{-1})}{A(q^{-1})}\, x(k), \tag{2}$$
with
$$A(q^{-1}) = 1 + a_1 q^{-1} + \cdots + a_{n_a} q^{-n_a}, \qquad B(q^{-1}) = b_1 q^{-1} + \cdots + b_{n_b} q^{-n_b}, \tag{3}$$
where $y(k)$ is the output variable, $q^{-1}$ is the unit delay operator, and $n_a$ and $n_b$ are the orders of the linear dynamics, with generally $n_b \le n_a$.
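To make the block structure concrete, the sketch below simulates a Hammerstein model of the form (1)–(3) in Python. The polynomial nonlinearity, the coefficient values, and the orders are illustrative assumptions rather than values taken from this paper.

```python
# A minimal sketch of a Hammerstein model of the form (1)-(3): a static
# polynomial nonlinearity followed by a linear ARX-type dynamic block.
# The polynomial coefficients c, the linear parameters a and b, and the
# orders are illustrative assumptions, not values from the paper.
import numpy as np

def hammerstein_sim(u, c, a, b):
    # Static nonlinear block (1): x(k) = f(u(k)), here a polynomial in u(k).
    x = sum(ci * u ** (i + 1) for i, ci in enumerate(c))
    # Linear dynamic block (2)-(3): y(k) = -sum_i a_i y(k-i) + sum_j b_j x(k-j).
    y = np.zeros_like(u, dtype=float)
    for k in range(len(u)):
        y[k] = (-sum(a[i] * y[k - 1 - i] for i in range(len(a)) if k - 1 - i >= 0)
                + sum(b[j] * x[k - 1 - j] for j in range(len(b)) if k - 1 - j >= 0))
    return y

u = np.random.uniform(-1.0, 1.0, 1000)                # i.i.d. uniform excitation
y = hammerstein_sim(u, c=[1.0, 0.5], a=[-0.3], b=[1.0])
```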

There are many methods to identify the parameters of Wiener or Hammerstein models, such as the least squares method and its recursive version [12], maximum likelihood methods [13], correlation methods [14], linear optimization methods [15], and nonlinear optimization methods [2]. In this study, a multilayer neural network is designed to formulate the Hammerstein model. Therefore, the parameters of the Hammerstein model can be obtained directly by training the multilayer neural network with the BP algorithm.

3. System Identification Using Hammerstein-Type Neural Network

A general neural network can be regarded as a black-box model [5]. In this section, a Hammerstein-type neural network (HTNN) is designed to formulate the Hammerstein model entirely. To determine the minimal model order, an order determination method based on Lipschitz quotients [16] is utilized, and for weight updating, the BP algorithm is presented.

3.1. Hammerstein-Type Neural Network

As shown in Figure 2, a multilayer neural network called Hammerstein-type neural network is designed, which consists of a nonlinear static element and a single linear node with two tapped delay lines forming the model of the linear dynamic element.

In Figure 2, $u(k)$ is the input variable, $\hat y(k)$ is the output variable, and $x(k)$ is a nonmeasured intermediate variable that does not necessarily have a physical meaning. $n_a$ and $n_b$ are the orders of the two tapped delay lines, $n_c$ means the order of the polynomial expansion of the input $u(k)$, the parameters $w_i$, $a_i$, and $b_j$ mean the weights of the HTNN, and $q^{-1}$ is the unit delay operator. Given an input signal $u(k)$, initial conditions of $x(k)$ and $\hat y(k)$ ($k \le 0$), and initial values of $w_i$, $a_i$, and $b_j$, we can obtain the HTNN output $\hat y(k)$ ($k = 1, 2, \ldots$).

As shown in Figure 2, a polynomial function is used in the nonlinear static block. The output of the hidden layer can be expressed as
$$x(k) = \sum_{i=1}^{n_c} w_i\, u^i(k), \tag{4}$$
and the output can be expressed as
$$\hat y(k) = \sum_{j=1}^{n_b} b_j\, x(k-j) - \sum_{i=1}^{n_a} a_i\, \hat y(k-i), \tag{5}$$
where $w_i$ for $i = 1, \ldots, n_c$, $a_i$ for $i = 1, \ldots, n_a$, and $b_j$ for $j = 1, \ldots, n_b$ are the weights of the HTNN, which are associated with the parameters of the Hammerstein model in (1) and (2). As a result, the parameters of the Hammerstein model can be expressed directly as the weights of the dynamic neural network. Then, the identified Hammerstein model can be obtained by training the HTNN with an adequate training algorithm.
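As an illustration, the following sketch implements the HTNN forward pass of (4) and (5); the function name and the weight containers are hypothetical.

```python
# A sketch of the HTNN forward pass in (4)-(5). The weight vectors w, a, b
# play the role of the HTNN weights; names and values are hypothetical.
import numpy as np

def htnn_forward(u, w, a, b):
    """Compute x(k) = sum_i w_i u^i(k) and
    y_hat(k) = sum_j b_j x(k-j) - sum_i a_i y_hat(k-i)."""
    N = len(u)
    x = np.zeros(N)
    y_hat = np.zeros(N)
    for k in range(N):
        x[k] = sum(w[i] * u[k] ** (i + 1) for i in range(len(w)))   # eq. (4)
        y_hat[k] = (sum(b[j] * x[k - 1 - j]                          # eq. (5)
                        for j in range(len(b)) if k - 1 - j >= 0)
                    - sum(a[i] * y_hat[k - 1 - i]
                          for i in range(len(a)) if k - 1 - i >= 0))
    return x, y_hat
```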

3.2. Model Order Determination

Usually, before training a neural network, it is important to determine how many neurons the network should contain. In the HTNN, the structure of the linear dynamic part is determined by $n_a$ and $n_b$. As the most popular statistical model selection criterion, the Akaike information criterion (AIC) [17] is a common method for determining the model order. However, the AIC usually entails a complex optimization process. The Lipschitz quotient criterion, proposed by He and Asada [16], can instead be used to determine the model order directly from the input-output data. Peng and Dubay [11] introduced the Lipschitz quotient criterion to determine the model order in a Wiener-type neural network.

Consider a general nonlinear SISO dynamic system, which can be described as
$$y(k) = f\bigl(y(k-1), \ldots, y(k-n), u(k-1), \ldots, u(k-m)\bigr), \tag{6}$$
where $y$ and $u$ are the output and input variables of the nonlinear dynamic system, $n$ and $m$ are the true orders of the output and input, respectively, and $f(\cdot)$ is a nonlinear function assumed to be continuous and smooth.

Rewriting (6) in a compact form gives
$$y = f(x_1, x_2, \ldots, x_N), \tag{7}$$
where $N = n + m$ is the number of input variables; let $x = [x_1, x_2, \ldots, x_N]^T$. To reconstruct the nonlinear function $f(\cdot)$ from the input-output data pairs $\{x(i), y(i)\}$, $i = 1, \ldots, P$, where $P$ is the number of data sets used for the model order determination, define the Lipschitz quotient as
$$q_{ij} = \frac{|y(i) - y(j)|}{\|x(i) - x(j)\|}, \qquad i \ne j, \tag{8}$$
where $\|x(i) - x(j)\|$ is the distance between two points in the input space and $|y(i) - y(j)|$ is the difference between $y(i)$ and $y(j)$. For data points with a small distance between them, the Lipschitz quotient can be rewritten as
$$q_{ij}^{(N)} = \frac{|\delta y|}{\sqrt{\delta x_1^2 + \cdots + \delta x_N^2}}, \tag{9}$$
where $\delta y = y(i) - y(j)$ and $\delta x_k = x_k(i) - x_k(j)$ for $k = 1, \ldots, N$, and the superscript $(N)$ in $q_{ij}^{(N)}$ represents the number of input variables in (7). According to the investigation of He and Asada [16], the values of $q_{ij}^{(N)}$ can be used as an indicator when one or more input variables are missing or when one or more redundant input variables are included. For instance, if one of the input variables is missing from the input set, the Lipschitz quotient $q_{ij}^{(N-1)}$ will be considerably larger than $q_{ij}^{(N)}$ or even unbounded. On the contrary, when a redundant input variable is included and the Lipschitz quotients $q_{ij}^{(N)}$ and $q_{ij}^{(N+1)}$ are calculated, there will be only a slight, insignificant difference between them.

To avoid the effect of measurement noise, the following index [16] is utilized to determine an appropriate order:
$$\bar q^{(N)} = \left( \prod_{k=1}^{p} \sqrt{N}\, q^{(N)}(k) \right)^{1/p}, \tag{10}$$
where $q^{(N)}(k)$ is the $k$th largest Lipschitz quotient among all $q_{ij}^{(N)}$ with the input variables $x_1, \ldots, x_N$, and the parameter $p$ is a positive number, usually selected to be $0.01P \le p \le 0.02P$. For testing purposes, the stop criterion can be defined as [18]
$$\left| \frac{\bar q^{(N+1)} - \bar q^{(N)}}{\bar q^{(N)}} \right| < \varepsilon, \tag{11}$$
where $\varepsilon$ is a prespecified threshold.
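A minimal sketch of this order-determination procedure, following (8)–(10), is given below; the regressor construction and the choice of $p$ as roughly 1–2% of the number of data points are assumptions based on [16].

```python
# A sketch of the Lipschitz order-determination criterion of (8)-(10),
# following He and Asada [16]. The regressor layout and the default
# choice of p are assumptions.
import numpy as np

def lipschitz_index(y, u, n, m, p=None):
    """Return the index q_bar^(N) of (10) for candidate orders (n, m).
    y, u: 1-D numpy arrays of output and input samples."""
    start = max(n, m)
    # Regressors x(k) = [y(k-1)..y(k-n), u(k-1)..u(k-m)] and targets y(k).
    X = np.array([np.r_[y[k - n:k][::-1], u[k - m:k][::-1]]
                  for k in range(start, len(y))])
    Y = y[start:]
    P, N = X.shape
    if p is None:
        p = max(1, int(0.01 * P))              # roughly 1% of the data points
    # All pairwise Lipschitz quotients q_ij of (8).
    quotients = []
    for i in range(P):
        for j in range(i + 1, P):
            d = np.linalg.norm(X[i] - X[j])
            if d > 0:
                quotients.append(abs(Y[i] - Y[j]) / d)
    q_largest = np.sort(quotients)[-p:]        # the p largest quotients
    # Geometric-mean index of (10).
    return np.exp(np.mean(np.log(np.sqrt(N) * q_largest)))
```

Sweeping the candidate orders $(n, m)$ and stopping once the index no longer changes appreciably, in the spirit of (11), yields the model order.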

It should be noted that the number of nodes in the nonlinear layer, $n_c$, is chosen manually. This is because, from empirical experience, a small value such as $n_c = 3$ can be chosen for most cases [11].

3.3. Learning Algorithm

During the neural network learning procedure, the weights are updated by the BP algorithm. An error function can be defined as
$$E(k) = \frac{1}{2} e^2(k), \qquad e(k) = y(k) - \hat y(k), \tag{12}$$
where $e(k)$ is the identification error and $y(k)$ and $\hat y(k)$ are the actual system output and the neural network output, respectively. Let $W$ be the weights, which consist of the parameters $w_i$, $a_i$, and $b_j$. Applying the steepest descent method, the optimization target is to minimize the error function (12) with respect to the weights of the network. Consider
$$\frac{\partial E(k)}{\partial W} = -e(k)\, \frac{\partial \hat y(k)}{\partial W}, \tag{13}$$
where $W$ represents the element $w_i$, $a_i$, or $b_j$.

The general update rule is expressed as
$$W(k+1) = W(k) - \eta\, \frac{\partial E(k)}{\partial W} = W(k) + \eta\, e(k)\, \frac{\partial \hat y(k)}{\partial W}, \tag{14}$$
where $\eta > 0$ is the training rate.

Considering that the output $\hat y(k)$ in (5) depends on the weights $a_i$ and $b_j$, the partial derivatives of the HTNN output with respect to the linear dynamic block weights and the intermediate variables can be calculated as
$$\frac{\partial \hat y(k)}{\partial a_i} = -\hat y(k-i), \qquad \frac{\partial \hat y(k)}{\partial b_j} = x(k-j), \tag{15}$$
$$\frac{\partial \hat y(k)}{\partial x(k-j)} = b_j. \tag{16}$$

The partial derivatives of the intermediate variables with respect to the weights $w_i$ can be calculated from (4) as
$$\frac{\partial x(k)}{\partial w_i} = u^i(k), \tag{17}$$
$$\frac{\partial x(k-j)}{\partial w_i} = u^i(k-j). \tag{18}$$

According to (16)–(18), the partial derivative of the HTNN output with respect to the weights $w_i$ follows from the chain rule as
$$\frac{\partial \hat y(k)}{\partial w_i} = \sum_{j=1}^{n_b} b_j\, u^i(k-j). \tag{19}$$

According to (14), the update laws of $w_i$, $a_i$, and $b_j$ can be calculated as
$$w_i(k+1) = w_i(k) + \eta\, e(k)\, \frac{\partial \hat y(k)}{\partial w_i}, \tag{20}$$
$$a_i(k+1) = a_i(k) + \eta\, e(k)\, \frac{\partial \hat y(k)}{\partial a_i}, \tag{21}$$
$$b_j(k+1) = b_j(k) + \eta\, e(k)\, \frac{\partial \hat y(k)}{\partial b_j}, \tag{22}$$
where the partial derivatives in (20), (21), and (22) are given in (19) and (15). From the above analysis, the structure of the system identification using the HTNN is shown in Figure 3.
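For illustration, the sketch below performs one BP training epoch over the HTNN weights using the gradients (15)–(19) and the update laws (20)–(22); the sample-by-sample (online) update scheme and the static treatment of the delayed outputs are assumptions.

```python
# A sketch of one online BP epoch for the HTNN, applying the update laws
# (20)-(22) with the gradients of (15) and (19). Treating the delayed
# outputs y_hat(k-i) as constants during differentiation is an assumption.
import numpy as np

def htnn_train_epoch(u, y, w, a, b, eta):
    na, nb, nc = len(a), len(b), len(w)
    x = np.zeros(len(u))
    y_hat = np.zeros(len(u))
    for k in range(len(u)):
        x[k] = sum(w[i] * u[k] ** (i + 1) for i in range(nc))          # eq. (4)
        y_hat[k] = (sum(b[j] * x[k - 1 - j]                             # eq. (5)
                        for j in range(nb) if k - 1 - j >= 0)
                    - sum(a[i] * y_hat[k - 1 - i]
                          for i in range(na) if k - 1 - i >= 0))
        e = y[k] - y_hat[k]                                             # eq. (12)
        for i in range(nc):                                             # eq. (20) via (19)
            grad_w = sum(b[j] * u[k - 1 - j] ** (i + 1)
                         for j in range(nb) if k - 1 - j >= 0)
            w[i] += eta * e * grad_w
        for i in range(na):                                             # eq. (21) via (15)
            if k - 1 - i >= 0:
                a[i] += eta * e * (-y_hat[k - 1 - i])
        for j in range(nb):                                             # eq. (22) via (15)
            if k - 1 - j >= 0:
                b[j] += eta * e * x[k - 1 - j]
    return w, a, b
```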

4. Convergence Analysis

In the training procedure of the HTNN, a proper choice of the training rate $\eta$ is required in the update rules (20)–(22). Too small an $\eta$ guarantees convergence but slows training, while too large an $\eta$ leads to instability. In this section, an approach for selecting $\eta$ properly is developed.

A discrete-type Lyapunov function can be defined as [19]
$$V(k) = \frac{1}{2} e^2(k). \tag{23}$$

Therefore, the time change of the Lyapunov function can be obtained as
$$\Delta V(k) = V(k+1) - V(k) = \frac{1}{2}\left[e^2(k+1) - e^2(k)\right]. \tag{24}$$

The error difference can be represented as
$$e(k+1) = e(k) + \Delta e(k) \approx e(k) + \left[\frac{\partial e(k)}{\partial W}\right]^{T} \Delta W. \tag{25}$$

From the update rule of (14), we have
$$\Delta W = \eta\, e(k)\, \frac{\partial \hat y(k)}{\partial W} = -\eta\, e(k)\, \frac{\partial e(k)}{\partial W}. \tag{26}$$

Theorem 1. Assume that $\eta$ is the learning rate for the weights of the HTNN and $g_{\max}$ is defined as $g_{\max} = \max_k \|g(k)\|$, where $g(k) = \partial \hat y(k)/\partial W$ and $\|\cdot\|$ is the Euclidean norm. Then, convergence is guaranteed if $\eta$ satisfies
$$0 < \eta < \frac{2}{g_{\max}^2}. \tag{27}$$

Proof. According to (23)–(25), $\Delta V(k)$ can be represented as
$$\Delta V(k) = \Delta e(k)\left[e(k) + \frac{1}{2}\Delta e(k)\right]. \tag{28}$$
Substituting (25) and (26) into (28) yields
$$\Delta V(k) = -\eta\, e^2(k)\, \|g(k)\|^2 \left[1 - \frac{1}{2}\eta\, \|g(k)\|^2\right]. \tag{29}$$
Define $\lambda = \eta\, \|g(k)\|^2 \bigl[1 - \frac{1}{2}\eta\, \|g(k)\|^2\bigr]$; we obtain
$$\Delta V(k) = -\lambda\, e^2(k). \tag{30}$$
Since $V(k) > 0$ for all time $k$, the convergence of the training algorithm requires $\Delta V(k) < 0$, that is, $\lambda > 0$. Referring to (30), this implies that (27) is satisfied. It should be pointed out that optimal convergence via the maximum learning rate corresponds to $\eta = 1/g_{\max}^2$, which is half of the upper limit in (27).
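As a small numerical illustration of Theorem 1, the following sketch computes the admissible learning-rate bound $2/g_{\max}^2$ from a set of output-gradient vectors; the example gradient values are hypothetical.

```python
# A sketch of the learning-rate bound of Theorem 1: 0 < eta < 2 / g_max^2,
# where g(k) = d y_hat(k) / d W. The gradient values below are hypothetical.
import numpy as np

def max_learning_rate(gradients):
    """gradients: iterable of gradient vectors g(k)."""
    g_max = max(np.linalg.norm(g) for g in gradients)
    return 2.0 / g_max ** 2                    # upper bound of (27)

grads = [np.array([0.5, -1.2, 0.8]), np.array([1.5, 0.3, -0.4])]
eta_bound = max_learning_rate(grads)
eta = 0.5 * eta_bound                          # optimal rate: half the bound
```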

5. Simulation Examples

In this section, three examples are utilized to illustrate the HTNN identifier. The first is a Hammerstein process, which is utilized to demonstrate the capability of the HTNN to fit the Hammerstein model. The second and the third are a nonlinear dynamic system and an industrial process, which are utilized to demonstrate that the HTNN can identify nonlinear dynamic systems and real processes, respectively.

The implementation procedure of the HTNN for nonlinear system identification is itemized as follows.

Step 1. Select the input-output variables $u(k)$ and $y(k)$ and the structure orders ($n_a$ and $n_b$) of the HTNN according to Section 3.2.

Step 2. Select the order $n_c$ of the polynomial function; note that $n_c$ is a small positive integer (see Section 3.2).

Step 3. Model the nonlinear system using the HTNN as in (4) and (5), and train the HTNN on the training datasets according to (20)–(22) to obtain the weights $w_i$, $a_i$, and $b_j$.

Step 4. Calculate the error function $E(k)$ in (12). If $E(k)$ is less than a limiting value within a given number of training iterations, the model can be accepted; otherwise, go to Step 2 and reselect the value of the integer $n_c$.

Example 1. Nonlinear Hammerstein model identification is as follows.
The identified plant is the Hammerstein model given in (31).
A total of 1000 data pairs, generated from an i.i.d. uniform input sequence, were used to train the HTNN. In the order determination procedure, the Lipschitz quotients are computed while the candidate orders are increased step by step: up to the order (2,2) the corresponding Lipschitz quotients decrease significantly, whereas for larger candidate orders they show no significant difference, so the stop criterion (11) is satisfied for the chosen threshold $\varepsilon$. Figure 4 shows the values of the Lipschitz quotients for the different orders; the best order of the system is (2,2); that is, $n_a = 2$ and $n_b = 2$. From (31), the true order of the system is also (2,2).
In this case, the number of neurons in the nonlinear static block is chosen as $n_c = 3$, so the HTNN has 7 tunable parameters in total. According to the convergence analysis, the learning rate $\eta$ is chosen within the bound given in (27). After the weights $w_i$, $a_i$, and $b_j$ in (20)–(22) are initialized, a testing input signal is used to verify the identification performance of the HTNN. Figure 5 illustrates the outputs of the plant and the HTNN, and a low mean square error (MSE) is achieved with the 7 tunable parameters of the proposed HTNN.

Example 2. Nonlinear dynamic system identification is as follows.
The identified plant is a nonlinear dynamic process formulated in discrete form, taken from [20].
A total of 500 data pairs, generated by a sinusoidal input, are utilized to train the network.
Similarly to Example 1, the Lipschitz quotients are computed for increasing candidate orders. The stop criterion (11) ends at the order (3,1), which implies that the best order of the system is (3,1). Figure 6 shows the values of the Lipschitz quotients for the different orders.
In this case, the orders of the linear dynamic block are chosen as $n_a = 3$ and $n_b = 1$, and the number of neurons in the nonlinear static block is chosen as in Example 1. According to the convergence analysis, the learning rate $\eta$ is chosen within the bound given in (27). A testing input signal is used to verify the identification performance of the HTNN.

The proposed HTNN was compared to several neural-network-based identification methods: the Controllable-Canonical-Form-Based Recurrent Neurofuzzy Network (CReNN) [21] and the Dynamic Fuzzy Neural Network (DFNN) [22]. The DFNN model is a fuzzy model with three rules, in which the two-dimensional input space is partitioned into three clusters and a Gaussian membership function is assigned to each cluster. In addition, a standard Hammerstein model (SHM) of the form (1) and (2), identified using the least squares (LS) algorithm, was applied to the nonlinear system for comparison. As shown in Table 1, the proposed HTNN has the smallest number of parameters together with a lower MSE value. Figure 7 shows the outputs of the plant, the standard Hammerstein model with the least squares algorithm (SHM-LS), and the HTNN identifier. It can be seen that the SHM-LS cannot identify the complex nonlinear system well.

Example 3. Identification of an industrial process is as follows.
The proposed HTNN was also used to identify a typical industrial process, a continuous stirred tank reactor (CSTR) in which an irreversible first-order reaction takes place. The system is described by the dimensionless mass and energy balances given in [23, 24], where the description and value of each variable are given in Table 2. The two state variables of the model are the reactant concentration $C_A$ and the reactor temperature $T$. The control objective is to control the reactant concentration $C_A$ through the manipulation of the coolant temperature $T_c$; note that the reactor temperature $T$ is not controlled in this simulation. Therefore, the output variable and the manipulated variable are given by $y = C_A$ and $u = T_c$, respectively. For practicability, the coolant temperature $T_c$ is constrained to a bounded range.

The above model is used to generate a series of input-output time-series data, with the sampling time of the process measurements set to 0.1 min. From Figure 8, the stop criterion ends at the order (2,1), indicating that the best order of the system is (2,1); that is, $n_a = 2$ and $n_b = 1$. The number of neurons in the nonlinear static block is chosen as in Example 1. According to the convergence analysis, the learning rate $\eta$ is chosen within the bound given in (27), and the initial parameters are the same as in Example 1. The nonlinear part of the HTNN is sensitive to data between $-1$ and $1$; however, the raw coolant temperature values lie outside this interval. Therefore, it is necessary to normalize the input signals to this interval.
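A sketch of such a normalization is given below, assuming the standard min-max scaling to $[-1, 1]$; the exact formula and input range used in the paper are not reproduced here, so both are assumptions.

```python
# A sketch of the input normalization described above, assuming standard
# min-max scaling to [-1, 1]; the input range below is hypothetical.
import numpy as np

def normalize(u, u_min, u_max):
    """Scale raw inputs (e.g., the coolant temperature) to [-1, 1]."""
    return 2.0 * (u - u_min) / (u_max - u_min) - 1.0

u_raw = np.random.uniform(80.0, 100.0, 500)   # hypothetical constrained range
u_bar = normalize(u_raw, 80.0, 100.0)
```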

The normalized data are then used to train the HTNN, and the training results are shown in Figure 9. The models were then tested on a set of data produced from an input with random amplitude, and a common multilayer neural network with 5 hidden neurons [25] was also developed for comparison. As shown in Figure 9, the HTNN gives a good fit to these data.

6. Conclusions

In this paper, a Hammerstein-type neural network that formulates the Hammerstein model was developed for identifying nonlinear SISO systems, where the network weights correspond to the parameters of the Hammerstein model. The Lipschitz quotient criterion was used to determine the model order, the backpropagation training algorithm was used to update the network weights, and the stability of the training was analyzed. The HTNN was tested on several nonlinear systems to demonstrate its identification performance.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

The authors would like to acknowledge the funding received from the National Natural Science Foundation of China (no. 61005068), Open Foundation of State Key Laboratory of Robotics of China (no. 2013O09), Specialized Research Fund for the Doctoral Program of Higher Education of China (no. 20124101120001), and Key Project for Science and Technology of the Education Department of Henan Province (no. 14A413009) to conduct this research investigation.