Abstract

A new strategy for internal model control (IMC) is proposed using a regression algorithm based on a quasilinear model with extreme learning machine (QL-ELM). For chemical processes with nonlinearity, the learning procedures of the internal model and the inverse model are derived. The proposed QL-ELM is constructed as a linear ARX model with a complicated nonlinear coefficient, and it shows good approximation ability and fast convergence. The coefficients are separated into two parts: the linear part is determined by recursive least squares (RLS), while the nonlinear part is identified through an extreme learning machine. The parameters of the linear part and the output weights of the ELM are estimated iteratively. The proposed internal model control is applied to a continuous stirred tank reactor (CSTR) process. The effectiveness and accuracy of the proposed method are verified through numerical results.

1. Introduction

Internal model control designs the control strategy on the basis of a mathematical model of the process. Because of its good dynamic and static performance, simple structure, and strong robustness, internal model control plays an increasingly significant role in the control area [1, 2]. Two crucial problems in the inverse system approach are identification of the plant model and determination of the controller settings. For a complex nonlinear system, it is difficult to obtain an accurate internal model and its inverse model. In recent years, much effort has been devoted to nonlinear system modeling based on artificial neural networks (NNs) and support vector machines (SVMs) [3–5]. The trained inverse model is widely used as a nonlinear controller [6]. However, the dynamic gradient method suffers from long training time, small weight updates, and a high probability of training failure. SVM methods based on standard optimization often suffer from parameter adjustment difficulties. Moreover, updating the internal model and the inverse model from error information can also degrade the control performance [7].

To deal with the above problems, the extreme learning machine (ELM) proposed by Huang et al. [8–10] shows great advantages. Its simplified neural network structure makes learning fast, and a smaller training error can be obtained via a canonical equation. ELM requires low computational effort and offers high generalization ability. Therefore, ELM has been successfully applied in many areas, such as classification of EEG signals and protein sequences [11], building regression models [12, 13], and fault diagnosis [14, 15].

For nonlinear modeling, the key point is to find a suitable model structure. The Volterra model is an important nonlinear system model [16]. It provides an elaborate mathematical description for a great many nonlinear systems. Recently, some researchers presented a proof of the inverse theory for IMC based on the Volterra model [17]. However, its application is limited by the high complexity of identifying the kernel functions.

In recent years, some block-oriented models have been proposed and applied widely, such as the Wiener model and the Hammerstein model [18–21], which consist of a static nonlinear function and a linear dynamic subsystem. Both have simple structures and can be used to identify some highly nonlinear processes, such as the pH neutralization process [22] and the fermentation process [23]. However, it is sometimes difficult to separate the system concerned into a linear dynamic block and a memoryless nonlinear one. Another class of methods, based on local linearization of the structure and combining nonlinear nonparametric models with conventional statistical models, has achieved good results. McLoone et al. [24] proposed an off-line hybrid training algorithm for feed-forward neural networks. Peng et al. [25, 26] proposed hybrid pseudolinear RBF-AR and RBF-ARX models. A cascaded structure of the ARX-NN model was proposed by Hu et al. [27]. The idea behind these two classes of methods is to separate the linear and nonlinear identification so as to facilitate the inverse computation.

However, these models are highly nonlinear and difficult to analyze theoretically, and they do not exploit useful linearity properties. It is well known that a simple structure such as the ARX model has many advantages in modeling. Firstly, its linear properties significantly simplify parameter estimation. Secondly, it is convenient for deriving a regression predictor. Furthermore, a linear structure is also convenient for control design and for deriving the control law. A good representation can not only approximate the nonlinear function accurately but also simplify the identification process. Hu et al. [28, 29] proposed a quasilinear ARX model, which uses a linear structure for nonlinear process mapping. From a macro standpoint, the model can be seen as a linear structure with redundancy in its regression ability. Its complex coefficients reflect the nonlinearity of the system. The model has great flexibility to deal with system nonlinearity.

Inspired by this kind of quasilinear ARX model and by the idea of separate identification, this paper proposes a class of quasilinear ELM models that can be separated into a linear part and a nonlinear kernel part. Such a model can not only identify ordinary nonlinear systems but also simplify the identification process by separating the model complexity. In this paper, a novel internal model control based on the quasilinear ELM (QL-ELM) structure is proposed for a CSTR system. The parameters of the nonlinear part are estimated by ELM, which increases the flexibility of the model, while the linear parameters are estimated by the RLS method. A recursive algorithm is used to estimate the parameters of both parts. Moreover, QL-ELM is used to set up the internal and inverse models of the nonlinear CSTR system. Through the inverse model, the control action is obtained to achieve fixed-point control and tracking control of the concentration. Owing to the high modeling accuracy and the low need for human intervention, the closed-loop control is more stable and has a smaller steady-state deviation. Simulation results demonstrate the dynamic performance and tracking ability of the proposed QL-ELM based IMC strategy.

This paper is organized in six sections. Following the introduction, the traditional extreme learning machine is described in Section 2. In Section 3 the QL-ELM algorithm is presented. IMC with QL-ELM is described in Section 4. To show the applicability of the proposed method, simulation results for the CSTR are presented in Section 5. Finally, the conclusion is presented in Section 6.

2. Extreme Learning Machine: Basic Principles

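For $N$ samples $(\mathbf{x}_i, \mathbf{t}_i)$ with $n$ input nodes and $m$ output nodes, $\mathbf{x}_i \in \mathbb{R}^n$ and $\mathbf{t}_i \in \mathbb{R}^m$, the single-hidden-layer feed-forward neural network (SLFN) with $\tilde{N}$ hidden nodes and activation function $g(\cdot)$ can be expressed as

$$\sum_{j=1}^{\tilde{N}} \boldsymbol{\beta}_j \, g(\mathbf{w}_j \cdot \mathbf{x}_i + b_j) = \mathbf{t}_i, \quad i = 1, \ldots, N, \tag{1}$$

where $\boldsymbol{\beta}_j$ is the vector of weights between the $j$th hidden node and the output nodes, $\mathbf{w}_j$ is the vector of weights between the input nodes and the $j$th hidden node, and $b_j$ is the bias of the $j$th hidden node. ELM can reach high regression accuracy with a wide range of activation functions. Unlike other traditional implementations, the input weights and biases are randomly chosen in the extreme learning machine. The output of the hidden layer is written as a matrix $\mathbf{H}$, and (1) can be rewritten as

$$\mathbf{H}\boldsymbol{\beta} = \mathbf{T}, \tag{2}$$

where

$$\mathbf{H} = \begin{bmatrix} g(\mathbf{w}_1 \cdot \mathbf{x}_1 + b_1) & \cdots & g(\mathbf{w}_{\tilde{N}} \cdot \mathbf{x}_1 + b_{\tilde{N}}) \\ \vdots & \ddots & \vdots \\ g(\mathbf{w}_1 \cdot \mathbf{x}_N + b_1) & \cdots & g(\mathbf{w}_{\tilde{N}} \cdot \mathbf{x}_N + b_{\tilde{N}}) \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \boldsymbol{\beta}_1^T \\ \vdots \\ \boldsymbol{\beta}_{\tilde{N}}^T \end{bmatrix}, \quad \mathbf{T} = \begin{bmatrix} \mathbf{t}_1^T \\ \vdots \\ \mathbf{t}_N^T \end{bmatrix}. \tag{3}$$

According to the theorems proposed in [8, 9], the input weights and the hidden layer biases are randomly generated without further tuning. The main idea of ELM is that the training problem is simplified to finding a least-squares solution. According to the Moore-Penrose generalized inverse theory, the output weights can be calculated by using the following equation:

$$\hat{\boldsymbol{\beta}} = \mathbf{H}^{\dagger}\mathbf{T}, \tag{4}$$

where $\mathbf{H}^{\dagger}$ is the Moore-Penrose generalized inverse of $\mathbf{H}$.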

This is the minimum-norm solution among all the least-squares solutions. The one-step algorithm can yield good generalization performance and learns much faster than traditional iterative learning algorithms. It also avoids local optima.
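The one-step training described in this section can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the sigmoid activation, the uniform range of the random weights, the function names `elm_fit` and `elm_predict`, and the toy sine data are all assumptions made for the example.

```python
import numpy as np

def elm_fit(X, T, n_hidden, rng):
    """Train a basic ELM: random hidden layer, least-squares output weights."""
    n_features = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))  # input weights, never tuned
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden biases, never tuned
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                             # minimum-norm least-squares solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Evaluate the trained network on new inputs."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy regression problem (hypothetical data): fit sin(3x) on [-1, 1].
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
T = np.sin(3.0 * X)
W, b, beta = elm_fit(X, T, n_hidden=30, rng=rng)
train_rmse = float(np.sqrt(np.mean((elm_predict(X, W, b, beta) - T) ** 2)))
```

Only `beta` is learned; `W` and `b` stay at their random initial values, which is what removes the iterative gradient step.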

3. The Quasilinear ELM Model Treatment

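A quasilinear ELM model can be seen as an SLFN embedded in the coefficients of a linear model. Its key feature is that it combines good approximation ability with easy-to-use properties. Consider a nonlinear SISO system described by

$$y(t) = g(\varphi(t)) + e(t), \tag{5}$$

where $\varphi(t) = [y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u)]^T$ is the regression vector, $n_y$ and $n_u$ are the orders of the system, $y(t) \in \mathbb{R}$ and $u(t) \in \mathbb{R}$ are the output and input, and $e(t)$ is zero-mean stochastic noise.

Assume that $g(\cdot)$ is continuous and differentiable in a small region around $\varphi(t) = 0$. By using the Taylor expansion [30], $y(t)$ can be further expanded as

$$y(t) = g(0) + g'(0)\varphi(t) + \frac{1}{2}\varphi^T(t)\, g''(0)\, \varphi(t) + \cdots + e(t). \tag{6}$$

Set $y_0 = g(0)$ and $\theta(\varphi(t)) = \left(g'(0) + \frac{1}{2}\varphi^T(t)\, g''(0) + \cdots\right)^T$. Factoring out $\varphi(t)$ from (6), the following can be obtained:

$$y(t) = y_0 + \varphi^T(t)\, \theta(\varphi(t)) + e(t), \tag{7}$$

where $\theta(\varphi(t)) = [a_{1,t}, \ldots, a_{n_y,t}, b_{1,t}, \ldots, b_{n_u,t}]^T$. It can be seen as the coefficient vector of the nonlinear function $g(\cdot)$.

The quasilinear model has the linear structure of an ARX model with a functional coefficient $\theta(\varphi(t))$. It can be separated into a nonlinear part and a linear part described as

$$\theta(\varphi(t)) = \theta^L + \theta^N(\varphi(t)),$$

where $\theta^L$ is a constant parameter vector and $\theta^N(\varphi(t))$ is a nonlinear function of the regression vector.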
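To make the regression vector concrete, the sketch below builds it from recorded input and output sequences. It assumes the usual ARX ordering (past outputs followed by past inputs, each delayed by at least one step); the function name `build_regressors` and the short sequences are hypothetical.

```python
import numpy as np

def build_regressors(y, u, n_y, n_u):
    """Stack phi(t) = [y(t-1..t-n_y), u(t-1..t-n_u)] row by row, with targets y(t)."""
    start = max(n_y, n_u)              # first t for which all delayed samples exist
    rows = []
    for t in range(start, len(y)):
        past_y = [y[t - i] for i in range(1, n_y + 1)]
        past_u = [u[t - i] for i in range(1, n_u + 1)]
        rows.append(past_y + past_u)
    return np.array(rows), np.array(y[start:])

# Hypothetical short records of output y and input u.
y = [0.0, 0.1, 0.2, 0.3, 0.4]
u = [1.0, 2.0, 3.0, 4.0, 5.0]
Phi, target = build_regressors(y, u, n_y=2, n_u=1)
```

Each row of `Phi` is one regression vector and the matching entry of `target` is the output it should predict, which is the sample layout used by both the linear and the nonlinear part of the model.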

For a nearly linear system, the nonlinear part supplements the nonlinear features, so good regression results can be achieved. For a nonlinear system, the nonlinear network, acting as an interpolated coefficient, can be used to expand the regression space. Equation (10) can be seen as a linear form with a nonlinear coefficient $\theta(\varphi(t))$, whose identification is a function approximation problem from a multidimensional input space into a one-dimensional scalar space. Using ELM to estimate the parameters of the nonlinear part is more convenient and concise. Replacing the nonlinear part $\theta^N(\varphi(t))$ by an ELM, the model in (7) can be rewritten as

$$y(t) = y_0 + \varphi^T(t)\left(\theta^L + \sum_{j=1}^{\tilde{N}} \boldsymbol{\beta}_j \, g(\mathbf{w}_j \cdot \varphi(t) + b_j)\right) + e(t),$$

where $\theta^L$ is the parameter vector of the linear part; then the quasilinear ELM model can be further expressed as

$$y(t) = y_0 + \varphi^T(t)\,\theta^L + \varphi^T(t)\, W_2\, \Gamma(W_1 \varphi(t) + B) + e(t),$$

where $\Gamma(\cdot)$ applies the activation function to each hidden node. The whole identification process based on QL-ELM is described in Figure 1, where $n_u$ and $n_y$ are the orders of the input and output, $W_1$ and $W_2$ are the weight matrices of the input and output layers, $B$ is the bias vector of the hidden nodes, and $\theta^L$ is the parameter of the linear part, which can also be seen as the bias vector of the output nodes. The parameters of the two submodels are updated at each iteration, with the ultimate goal of minimizing the error between the output of the actual process and that of the QL-ELM model. Writing $z_L(t) = \varphi^T(t)\theta^L$ for the output of the linear part and $z_N(t) = \varphi^T(t) W_2 \Gamma(W_1\varphi(t) + B)$ for the output of the nonlinear part, the deviation between $y(t)$ and $z_L(t)$ is used to update the nonlinear part through ELM learning, and the deviation between $y(t)$ and $z_N(t)$ is used to update the linear part through recursive least squares.

The whole process is to make the error between actual output and model output minimized. In this paper a hierarchical iterative algorithm is considered for quasilinear model.

For the linear part, recursive least squares (RLS) is used at every iteration to minimize the sum of squared residuals, which avoids the problems of local optima and overfitting.

For the nonlinear part, the input-layer weights and biases are fixed and the training error is minimized through ELM learning. The output-layer weights are then calculated with the Moore-Penrose generalized inverse of the hidden-layer output matrix, as in Section 2. From the viewpoint of the nonlinear part, the contribution of the linear parameters can also be regarded as noise, and the ELM method is fast and able to suppress such interference. Because the linear form of the model disperses the complexity of the nonlinear process, the computation of the nonlinearity estimation is simplified at every iteration. Fewer hidden nodes are therefore required, which helps to avoid overfitting to some extent. Using the QL-ELM model to identify a reversible system can improve the identification accuracy and system performance.
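The recursive update of the linear part can be sketched with the textbook RLS recursion. The exact recursion used in the paper is not reproduced in this text, so the form below (with an optional forgetting factor `lam`), the function name `rls_update`, and the noise-free toy data are assumptions.

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive-least-squares step for the linear model y ~ phi @ theta."""
    Pphi = P @ phi
    gain = Pphi / (lam + phi @ Pphi)            # Kalman-style gain vector
    theta = theta + gain * (y - phi @ theta)    # correct by the prediction error
    P = (P - np.outer(gain, Pphi)) / lam        # update the covariance-like matrix
    return theta, P

# Hypothetical noise-free data generated by a known parameter vector.
rng = np.random.default_rng(1)
true_theta = np.array([0.5, -0.3, 1.2])
theta = np.zeros(3)
P = 1e4 * np.eye(3)                             # large initial P: weak prior on theta
for _ in range(200):
    phi = rng.normal(size=3)
    y = phi @ true_theta
    theta, P = rls_update(theta, P, phi, y)
```

In the QL-ELM setting, `y` would be replaced by the deviation between the process output and the current output of the nonlinear part.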

4. IMC with QL-ELM

In nonlinear IMC, the nonlinear model and its inverse play an important role. In this study, QL-ELM is employed as both the internal model and the inverse model controller. The basic structure of the QL-ELM based IMC is shown in the block diagram of Figure 2. The unknown nonlinear discrete control system has four parts: the nonlinear plant, the QL-ELM inverse model controller, the QL-ELM internal model, and a filter. In particular, the additional filter not only improves the physical realizability of the controller but also improves the robustness of the system. It can effectively mitigate problems caused by model mismatch [31].
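The signal flow of the IMC loop can be illustrated with a deliberately simple stand-in. In the sketch below a first-order linear plant and a slightly mismatched linear model replace the CSTR and the QL-ELM models, and the filter is assumed to be a first-order low-pass with coefficient `alpha`; all numerical values and the function name `simulate_imc` are hypothetical.

```python
def simulate_imc(r, n_steps, alpha=0.7):
    """Run a toy IMC loop: filter -> inverse-model controller -> plant and internal model."""
    a_p, b_p = 0.8, 0.55   # toy plant (hypothetical), gain deliberately mismatched
    a_m, b_m = 0.8, 0.50   # toy internal model (hypothetical)
    y = y_m = f = 0.0
    outputs = []
    for _ in range(n_steps):
        e = r - (y - y_m)                   # set-point minus plant/model mismatch
        f = alpha * f + (1.0 - alpha) * e   # first-order filter (assumed form)
        u = (f - a_m * y_m) / b_m           # inverse model: input that drives the model to f
        y = a_p * y + b_p * u               # plant response
        y_m = a_m * y_m + b_m * u           # internal model response
        outputs.append(y)
    return outputs

outputs = simulate_imc(r=1.0, n_steps=200)
```

Because the filter has unit steady-state gain and the controller inverts the model exactly, the feedback of the plant/model mismatch removes the steady-state offset in this toy case despite the gain error.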

4.1. Establishment of Internal Model

For a nonlinear plant described bywhere and is the order of output and input vector. The input vector and output vector are used as samples to set up the internal model via QL-ELM. , is the training set. The learning of QL-ELM is implemented by the following steps.

Step 1 (initialization). Choose the order of the regression vector , . Set to zero and the number of nodes in hidden layer and nonlinear parameters , , to some small values randomly. The number of iteration is set to .

Step 2. Update the linear part using deviation and estimate using (9).

Step 3. Update the nonlinear part using and estimate using (10).

Step 4. Turn to Step 2, and set until the training error reaches minimum.
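Steps 1 to 4 can be sketched as an alternating least-squares loop. This is an illustration under several assumptions that are not taken from the paper: the offset term is set to zero, the activation is a sigmoid, the nonlinear coefficient is parameterized so that the nonlinear output is linear in the ELM output weights, and batch least squares stands in for the recursive update of the linear part. The names `nonlinear_features` and `fit_ql_elm` and the toy data are hypothetical.

```python
import numpy as np

def nonlinear_features(Phi, W1, B):
    """Row t is kron(h(phi_t), phi_t), so row @ beta equals phi_t^T theta_N(phi_t)."""
    H = 1.0 / (1.0 + np.exp(-(Phi @ W1 + B)))        # sigmoid hidden outputs (assumed)
    return (H[:, :, None] * Phi[:, None, :]).reshape(len(Phi), -1)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def fit_ql_elm(Phi, y, n_hidden, n_iter, rng):
    n = Phi.shape[1]
    W1 = rng.uniform(-1.0, 1.0, size=(n, n_hidden))  # Step 1: random, then kept fixed
    B = rng.uniform(-1.0, 1.0, size=n_hidden)
    G = nonlinear_features(Phi, W1, B)
    G_pinv, Phi_pinv = np.linalg.pinv(G), np.linalg.pinv(Phi)
    theta = np.zeros(n)                              # Step 1: linear part starts at zero
    beta = np.zeros(G.shape[1])
    history = []
    for _ in range(n_iter):                          # Step 4: repeat Steps 2 and 3
        theta = Phi_pinv @ (y - G @ beta)            # Step 2: linear part fits y - z_N
        beta = G_pinv @ (y - Phi @ theta)            # Step 3: ELM weights fit y - z_L
        history.append(rmse(Phi @ theta + G @ beta, y))
    return theta, beta, history

# Hypothetical data: a linear term plus a state-dependent (quasilinear) term.
rng = np.random.default_rng(2)
Phi = rng.uniform(-1.0, 1.0, size=(300, 2))
y = 0.6 * Phi[:, 0] + Phi[:, 1] * np.tanh(2.0 * Phi[:, 1])
theta, beta, history = fit_ql_elm(Phi, y, n_hidden=10, n_iter=10, rng=rng)
linear_rmse = rmse(Phi @ (np.linalg.pinv(Phi) @ y), y)
```

Each half-step solves a least-squares problem with the other part held fixed, so the training error recorded in `history` cannot increase from one iteration to the next; comparing its last entry with `linear_rmse` shows what the nonlinear coefficient adds over a plain ARX fit.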

4.2. Establishment of Inverse Model

The controller of IMC is the inverse model of the nonlinear process which is equivalent to finding the inverse of the system at given frequencies. Therefore the reversibility of the model must be considered in advance.

Theorem 1. For the above nonlinear system, if the is monotone function to , the system is reversible. Or for any given two inputs , if holds, the system is reversible [7].

Assume a SISO nonlinear process is described in (13); the inverse model is established aswhere is nonlinear function of inverse model. According to Theorem 1 the process of (14) is monotone and reversible. Training the QL-ELM can establish the inverse model of system. The input and output vectors are and , respectively. Because the value of is unknown, the output of filter replaces the value in the above formula. Training process of inverse model is the same as the internal model.

5. Numerical Results

A typical representative of nonlinear system in chemical processes is CSTR system. The system has multiple equilibrium points (stable and unstable ones). Its dynamical behavior exhibits some complex features depending on system parameters. In this study, the dynamic behavior is described by the following differential equations [32, 33]:where and represent the dimensionless reactant concentration and reactor temperature, respectively; and denote the system disturbances. The control action is the cooling jacket temperature. The model parameters are shown in Table 1. The model has three equilibrium points, where and are stable points and is unstable point. The reactant concentration is chosen as the controlled variable. The resulting control problem is nonlinear. Therefore, training of the models has to be restricted to a region where inverse mapping is unique to ensure the reversibility.

The initial condition is set as , and . The fourth-order Runge-Kutta algorithm is used to calculate this model with the integral step size . The number of hidden nodes is 80. In the simulation, modelling error caused by the lack of the training sample will lead to the residual of the control system. Therefore, some steady-state data around the stable point are added as training samples. Results of model based on QL-ELM are compared with those of ELM, SVM, and QL-SVM methods. In detail, the number of hidden nodes in QL-ELM is reduced by 40. The optimal parameters of SVM and QL-SVM with RBF kernel are selected using the cross-validation. For the SVM method, the scale parameters in internal model are set as the penalty factor , the variance in RBF kernel function , and the epsilon in loss function of SVR . In the inverse model the parameters are , , and . For the QL-SVM method the parameters in internal model and inverse model are , , and , , , respectively.
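A fourth-order Runge-Kutta simulation of a dimensionless CSTR can be sketched as follows. The model equations and parameter values of the paper are not reproduced in this text, so the sketch assumes a widely used dimensionless benchmark form with Damköhler number 0.072, activation-energy parameter 20, heat-of-reaction parameter 8, and heat-transfer coefficient 0.3; the step size, initial state, and function names are likewise assumptions.

```python
import numpy as np

# Assumed benchmark parameters (NOT taken from the paper's Table 1).
DA, GAMMA, B_HEAT, BETA = 0.072, 20.0, 8.0, 0.3

def cstr_rhs(x, u, d=(0.0, 0.0)):
    """Right-hand side: x[0] is concentration, x[1] is temperature, u is jacket temperature."""
    x1, x2 = x
    rate = DA * (1.0 - x1) * np.exp(x2 / (1.0 + x2 / GAMMA))
    dx1 = -x1 + rate + d[0]
    dx2 = -x2 + B_HEAT * rate + BETA * (u - x2) + d[1]
    return np.array([dx1, dx2])

def rk4_step(x, u, h):
    """Advance the state by one step of size h with the classical RK4 scheme."""
    k1 = cstr_rhs(x, u)
    k2 = cstr_rhs(x + 0.5 * h * k1, u)
    k3 = cstr_rhs(x + 0.5 * h * k2, u)
    k4 = cstr_rhs(x + h * k3, u)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Open-loop run with zero jacket temperature from an assumed initial state.
x = np.array([0.1, 0.5])
for _ in range(5000):
    x = rk4_step(x, u=0.0, h=0.01)
```

With these assumed values and zero input, the state settles at the low-conversion stable equilibrium; input and output records produced this way, with a varying input, are the kind of samples used to train the internal and inverse models.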

For the internal model, the order of internal model is set as ; 2000 groups of samples are chosen as the training data and 500 groups of samples are chosen as test data.

For the inverse model ; ; 2000 groups of data are chosen for inverse model training so as to get the controller, and the remaining 500 groups of data are chosen for the inverse model test.

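The modelling performance is measured by the root mean square error (RMSE), defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2},$$

where $y_i$ is the actual output, $\hat{y}_i$ is the model output, and $N$ is the number of samples.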
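The indicator is straightforward to compute; the helper below is a minimal sketch using the standard definition of RMSE, with a hypothetical function name and made-up numbers.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted outputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up example: only the third prediction is off, by 2.
value = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```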

Identification results of the internal model and inverse model with QL-ELM and the corresponding errors are shown in Figures 3 and 7. The results and modeling errors of the comparative ELM method are shown in Figures 4 and 8, respectively. Validation results of the SVM and QL-SVM based internal models are shown in Figure 5(a), and their errors are shown in Figures 5(b) and 5(c). An enlarged detail of Figure 5(a) is shown in Figure 6. Similarly, validation results of the SVM and QL-SVM based inverse models are shown in Figure 9(a), their errors are shown in Figures 9(b) and 9(c), and an enlarged detail of Figure 9(a) is shown in Figure 10. In both internal model and inverse model identification, the errors of the proposed method are smaller than those of the other methods. The quasilinear methods achieve better generalization performance than their standard counterparts, and the ELM method reduces the regression error. Combining both advantages, the proposed QL-ELM method provides high precision and better generalization. To quantify the comparison, the performance indicators of the different identification methods are listed in Table 2. Regarding time performance, the time reported for SVM is that of a single run with the optimal parameters and excludes the time spent on parameter adjustment. QL-ELM provides higher precision and lower time consumption than the other methods.

5.1. Control under Set-Point Change

The set-point is given as follows: for , for , for , and for . The initial condition of nonlinear CSTR process is designed as , respectively. Besides, the first order filter is , where . Simulation results by IMC based on QL-ELM, ELM, and SVM are displayed in Figure 11. It can be easily seen that the proposed method has faster regulation time, smaller overshoot, and smaller steady-state error. Therefore, it is not difficult to conclude that the proposed method is superior to the ELM and SVM method in this control case.

5.2. Tracking Control for a Sinusoidal Wave Input

In this case, the nonlinear system output response to track a desired sinusoidal function is . The first order filter is , where . All the parameters settings for CSTR system are the same as Section 5.1. Comparisons of output for tracking control and error of output by the proposed method and ELM method are displayed in Figure 12. It is clearly seen that the system output can track the desired sinusoidal function perfectly. It can be concluded that the proposed method gives better tracking performance.

Although the QL-ELM model shows better approximation performance in all the above experiments, the proposed method still lacks versatility in modelling extremely strongly nonlinear processes. This limitation stems from its essentially linear structure.

6. Conclusion

A nonlinear control strategy using QL-ELM based IMC is proposed and tested on the concentration control of a SISO CSTR process. The internal model and the inverse model controller are learned using the QL-ELM regression algorithm to improve the modeling accuracy. The proposed QL-ELM method offers great flexibility and simplicity via a quasilinear ARX model with ELM coefficients. It can not only approximate continuous nonlinear functions but also separate the nonlinear complexity, requiring fewer hidden neurons in the ELM. Taking advantage of the high accuracy of QL-ELM modeling as well as the additional training samples around stable points, the steady-state error is reduced in the IMC system. Simulation results show the superiority of the proposed method in tracking ability and stability for nonlinear process control.

Conflict of Interests

The authors declare that they have no conflict of interests regarding the publication of this paper.

Acknowledgment

This research was supported by the National Natural Science Foundation of China (Grant nos. 61273132, 61104098, and 61473024).