Special Issue: Extreme Learning Machine on High Dimensional and Large Data Applications
Research Article | Open Access
Dazi Li, Qianwen Xie, Qibing Jin, "Quasilinear Extreme Learning Machine Model Based Internal Model Control for Nonlinear Process", Mathematical Problems in Engineering, vol. 2015, Article ID 181389, 9 pages, 2015. https://doi.org/10.1155/2015/181389
Quasilinear Extreme Learning Machine Model Based Internal Model Control for Nonlinear Process
A new strategy for internal model control (IMC) is proposed using a regression algorithm based on a quasilinear model with extreme learning machine (QL-ELM). Aimed at chemical processes with nonlinearity, the learning process of the internal model and the inverse model is derived. The proposed QL-ELM is constructed as a linear ARX model with a complicated nonlinear coefficient. It shows good approximation ability and fast convergence. The complicated coefficients are separated into two parts. The linear part is determined by recursive least squares (RLS), while the nonlinear part is identified through the extreme learning machine. The parameters of the linear part and the output weights of the ELM are estimated iteratively. The proposed internal model control is applied to a continuous stirred tank reactor (CSTR) process. The effectiveness and accuracy of the proposed method are verified through numerical results.
1. Introduction
Internal model control (IMC) designs the control strategy on the basis of a mathematical model of the process. Because of its good dynamic and static performance, simple structure, and strong robustness, IMC plays an increasingly significant role in the control area [1, 2]. Two crucial problems in the inverse system approach are identification of the plant model and determination of the controller settings. For a complex nonlinear system, it is difficult to obtain an accurate internal model and its inverse model. In recent years, much effort has been devoted to nonlinear system modeling based on artificial neural networks (NNs) and support vector machines (SVMs) [3–5]. The trained inverse model is widely used directly as a nonlinear controller [6]. However, the dynamic gradient method suffers from long training time, small weight updates, and a high probability of training failure. SVM methods based on standard optimization often suffer from parameter adjustment difficulties. Moreover, updating on the basis of the errors in the internal model and the inverse model also degrades the control performance [7].
To deal with the above problems, the extreme learning machine (ELM) proposed by Huang et al. [8–10] shows great advantages. Its simplified neural network structure makes learning fast, and a small training error can be obtained via a canonical equation. The advantages of ELM are its low computational effort and high generalization ability. Therefore, ELM has been successfully applied in many areas, such as classification of EEG signals and protein sequences [11], regression modeling [12, 13], and fault diagnosis [14, 15].
For nonlinear modeling, the key point is to find a suitable model structure. The Volterra model is an important kind of nonlinear system model [16]. It provides an elaborate mathematical description for a great many nonlinear systems. Recently, some researchers proposed a proof of the inverse theory for IMC based on the Volterra model [17]. However, the obvious shortcoming that limits its application is the high complexity of identifying the kernel functions.
In recent years, block-oriented models such as the Wiener model and the Hammerstein model [18–21], which consist of a static nonlinear function and a linear dynamic subsystem, have been proposed and widely applied. Both have simple structures and can be used to identify some highly nonlinear processes, such as the pH neutralization process [22] and the fermentation process [23]. However, it is sometimes difficult to separate the system concerned into a linear dynamic block and a memoryless nonlinear one. Another class of methods, based on local linearization of the structure and combining nonlinear nonparametric models with conventional statistical models, has achieved good results. McLoone et al. [24] proposed an off-line hybrid training algorithm for feed-forward neural networks. Peng et al. [25, 26] proposed hybrid pseudolinear RBF-AR and RBF-ARX models. A cascaded structure of the ARX-NN model was proposed by Hu et al. [27]. The idea of these two classes of methods is to separate the linear and nonlinear identification, so as to facilitate the inverse computation.
However, these models show highly nonlinear characteristics, which are difficult to analyze in theory, and they do not exploit useful linearity properties. It is well known that a simple structure, such as the ARX model, has many advantages in modeling. Firstly, its linear properties significantly simplify the parameter estimation. Secondly, it is convenient for deriving a regression predictor. Furthermore, a linear structure is also convenient for control design and for deriving the control law. A good representation can not only approximate the nonlinear function accurately but also simplify the identification process. Hu et al. [28, 29] proposed a quasilinear ARX model, which has a linear structure, for nonlinear process mapping. From a macro standpoint, the model can be seen as a linear structure with redundancy in its regression ability. Its complex coefficients reflect the nonlinearity of the system. The model has great flexibility in dealing with the system nonlinearity.
Inspired by this kind of quasilinear ARX model and by the idea of separate identification, this paper proposes a class of quasilinear ELM models that can be separated into a linear part and a nonlinear kernel part. Such a model can not only identify ordinary nonlinear systems but also simplify the identification process by separating the model complexity. In this paper, a novel internal model control based on the quasilinear ELM (QL-ELM) structure is proposed for a CSTR system. The parameters of the nonlinear part are estimated by ELM, which increases the flexibility of the model. The linear parameters are estimated by the RLS method. A recursive algorithm is used to estimate the parameters in both parts. Moreover, QL-ELM is used to set up the internal model and the inverse model of the nonlinear CSTR system. Through the inverse model, the control action is obtained to achieve fixed-point control and tracking control of the concentration. Owing to the high modeling accuracy and low need for human intervention, the closed-loop control system is more stable and has a smaller steady-state deviation. Simulation results demonstrate the dynamic performance and tracking ability of the proposed QL-ELM based IMC strategy.
This paper is organized in six sections. Following the introduction, the traditional extreme learning machine is described in Section 2. In Section 3 the QL-ELM algorithm is presented. IMC with QL-ELM is described in Section 4. To show the applicability of the proposed method, simulation results for the CSTR are presented in Section 5. Finally, the conclusion is presented in Section 6.
2. Extreme Learning Machine: Basic Principles
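Given $N$ training samples $(\mathbf{x}_j, \mathbf{t}_j)$ with $\mathbf{x}_j \in \mathbb{R}^{n}$ and $\mathbf{t}_j \in \mathbb{R}^{m}$, that is, for $n$ input nodes and $m$ output nodes, the single-hidden layer feed-forward neural network (SLFN) with $L$ hidden nodes and activation function $g(\cdot)$ can be expressed as
$$\sum_{i=1}^{L} \boldsymbol{\beta}_i\, g(\mathbf{w}_i \cdot \mathbf{x}_j + b_i) = \mathbf{o}_j, \quad j = 1, \ldots, N, \tag{1}$$
where $\boldsymbol{\beta}_i$ is the vector of weights between the $i$th hidden node and the output nodes, $\mathbf{w}_i$ is the vector of weights between the input nodes and the $i$th hidden node, and $b_i$ is the bias of the $i$th hidden node. ELM can achieve high regression accuracy with a wide range of activation functions. Unlike other traditional implementations, the input weights and biases are randomly chosen in the extreme learning machine. When the network outputs $\mathbf{o}_j$ are required to match the targets $\mathbf{t}_j$, the output of the hidden layer is written as a matrix $\mathbf{H}$, and (1) can be rewritten as
$$\mathbf{H}\boldsymbol{\beta} = \mathbf{T}, \tag{2}$$
where
$$\mathbf{H} = \begin{bmatrix} g(\mathbf{w}_1 \cdot \mathbf{x}_1 + b_1) & \cdots & g(\mathbf{w}_L \cdot \mathbf{x}_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(\mathbf{w}_1 \cdot \mathbf{x}_N + b_1) & \cdots & g(\mathbf{w}_L \cdot \mathbf{x}_N + b_L) \end{bmatrix}_{N \times L}, \quad \boldsymbol{\beta} = \begin{bmatrix} \boldsymbol{\beta}_1^T \\ \vdots \\ \boldsymbol{\beta}_L^T \end{bmatrix}_{L \times m}, \quad \mathbf{T} = \begin{bmatrix} \mathbf{t}_1^T \\ \vdots \\ \mathbf{t}_N^T \end{bmatrix}_{N \times m}. \tag{3}$$
With the theorems proposed in [8, 9], the input weights and the hidden layer biases are randomly generated without further tuning. The main idea of ELM is that the training problem is thereby simplified to finding a least squares solution. According to the Moore-Penrose generalized inverse theory, the output weights can be calculated by using the following equation:
$$\hat{\boldsymbol{\beta}} = \mathbf{H}^{\dagger}\mathbf{T}, \tag{4}$$
where $\mathbf{H}^{\dagger}$ is the Moore-Penrose generalized inverse of $\mathbf{H}$.
This is the smallest norm solution among all the least squares solutions. The one-step algorithm can produce good generalization performance and learns much faster than traditional gradient-based learning algorithms. It also avoids local optima.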
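The training procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the sigmoid activation, the uniform range of the random weights, the toy sine data, and the function names `elm_fit` and `elm_predict` are all hypothetical choices made for the example.

```python
import numpy as np

def elm_fit(X, T, n_hidden, rng):
    """Train a basic ELM: random hidden layer, least-squares output weights."""
    n_features = X.shape[1]
    # Input weights and hidden biases are drawn once and never tuned.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Hidden-layer output matrix H (assumed sigmoid activation).
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Output weights from the Moore-Penrose generalized inverse of H.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Evaluate the trained network on new inputs."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy regression problem (assumed for illustration): fit sin(x) on [-3, 3].
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = elm_fit(X, T, n_hidden=30, rng=rng)
train_rmse = float(np.sqrt(np.mean((elm_predict(X, W, b, beta) - T) ** 2)))
```

The only fitted quantity is `beta`; everything else is fixed after the random draw, which is why training reduces to a single linear least-squares solve.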
3. The Quasilinear ELM Model Treatment
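A quasilinear ELM model can be seen as an SLFN embedded in the coefficients of a linear model. Its key feature is that it combines good approximation ability with easy-to-use properties. Consider a nonlinear SISO system described by
$$y(t) = g(\varphi(t)) + e(t), \tag{5}$$
where $\varphi(t) = [y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u)]^T$ is the regression vector, $n_y$ and $n_u$ are the orders of the system, $y(t)$ and $u(t)$ are the system output and input, and $e(t)$ is a zero-mean stochastic noise.
Assume that $g(\cdot)$ is continuous and differentiable in a small region around $\varphi(t) = 0$. By Taylor expansion, (5) can be further expanded as
$$y(t) = g(0) + g'(0)\varphi(t) + \frac{1}{2}\varphi^T(t)\, g''(0)\, \varphi(t) + \cdots + e(t). \tag{6}$$
Set $y_0 = g(0)$; $\theta(\varphi(t)) = \left(g'(0) + \frac{1}{2}\varphi^T(t)\, g''(0) + \cdots\right)^T$. Factoring out the common factor $\varphi(t)$ from (6), the following can be obtained:
$$y(t) = y_0 + \varphi^T(t)\,\theta(\varphi(t)) + e(t), \tag{7}$$
where $\theta(\varphi(t))$ collects the coefficients of the regression vector. It can be seen as a coefficient that is a nonlinear function of $\varphi(t)$.
The quasilinear model has the linear structure of an ARX model with a functional coefficient $\theta(\varphi(t))$. The coefficient can be separated into a linear part and a nonlinear part described as
$$\theta(\varphi(t)) = \theta + \theta_N(\varphi(t)), \tag{8}$$
where $\theta$ is a constant parameter vector and $\theta_N(\cdot)$ is a nonlinear function of the regression vector.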
For a nearly linear system, the nonlinear part supplements the linear part with the nonlinear features, so good regression results can be achieved. For a strongly nonlinear system, the nonlinear network acts as an interpolated coefficient that expands the regression space. Equation (10) can be seen as a linear form with a nonlinear coefficient, whose identification is actually a problem of function approximation from the multidimensional input space of the regression vector $\varphi(t)$ into a one-dimensional scalar space. Using ELM to estimate the parameters of the nonlinear part is convenient and concise. Replacing the nonlinear part of the coefficient, $\theta_N(\varphi(t))$, by an ELM, the model in (7) can be rewritten as
$$y(t) = y_0 + \varphi^T(t)\big(\theta + \theta_N(\varphi(t))\big) + e(t), \tag{9}$$
where $\theta_N(\varphi(t)) = W_2\,\Gamma(W_1\varphi(t) + B)$; then the quasilinear ELM model can be further expressed as
$$y(t) = y_0 + \varphi^T(t)\,\theta + \varphi^T(t)\,W_2\,\Gamma(W_1\varphi(t) + B) + e(t), \tag{10}$$
where $\Gamma(\cdot)$ is the activation function of the hidden nodes. The whole identification process based on QL-ELM is described in Figure 1, where $n_u$ and $n_y$ are the orders of the input and output, $W_1$ and $W_2$ are the weight matrices of the input and output layers, $B$ is the bias vector of the hidden nodes, and $\theta$ is the parameter of the linear part, which can also be seen as the bias vector of the output nodes. The parameters of the two submodels are updated in each iteration until the error between the output of the actual system and the output of the QL-ELM model is minimized. The deviation between $y(t)$ and the output of the linear part is used to update the nonlinear part through ELM learning. The deviation between $y(t)$ and the output of the nonlinear part is used to update the linear part through recursive least squares.
The whole process is to make the error between actual output and model output minimized. In this paper a hierarchical iterative algorithm is considered for quasilinear model.
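For the linear part, at every iteration the following RLS recursion is used to minimize the sum of squared residuals, which avoids the problems of local optima and overfitting:
$$\hat{\theta}(t) = \hat{\theta}(t-1) + K(t)\big[\tilde{y}(t) - \varphi^T(t)\hat{\theta}(t-1)\big],$$
$$K(t) = \frac{P(t-1)\varphi(t)}{1 + \varphi^T(t)P(t-1)\varphi(t)}, \qquad P(t) = P(t-1) - K(t)\varphi^T(t)P(t-1),$$
where $\tilde{y}(t)$ is the deviation between $y(t)$ and the output of the nonlinear part, $K(t)$ is the gain vector, and $P(t)$ is the covariance matrix.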
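A minimal sketch of one recursive least squares update is given below. It uses the textbook RLS recursion with an optional forgetting factor; the function name `rls_update`, the forgetting factor `lam`, the initial covariance, and the synthetic linear data are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=1.0):
    """One RLS step; lam is a forgetting factor (1.0 means no forgetting)."""
    P_phi = P @ phi
    gain = P_phi / (lam + phi @ P_phi)          # gain vector
    theta = theta + gain * (y - phi @ theta)    # correct by the prediction error
    P = (P - np.outer(gain, P_phi)) / lam       # covariance update (P is symmetric)
    return theta, P

# Synthetic check: recover known linear parameters from noisy samples.
rng = np.random.default_rng(0)
true_theta = np.array([0.5, -0.3, 1.2])
Phi = rng.normal(size=(500, 3))
y = Phi @ true_theta + 0.01 * rng.normal(size=500)

theta = np.zeros(3)
P = 1e4 * np.eye(3)                             # large initial covariance
for phi_t, y_t in zip(Phi, y):
    theta, P = rls_update(theta, P, phi_t, y_t)
```

In the quasilinear setting the target `y` passed to the update would be the deviation between the measured output and the output of the nonlinear part, rather than the raw output.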
For the nonlinear part, the weights of the input layer and the biases are fixed, and the training error is minimized through ELM learning. The weights of the output layer are then calculated by the Moore-Penrose generalized inverse of the hidden-layer output matrix. From the viewpoint of the nonlinear part, the linear parameter can also be regarded as noise, and the ELM method offers interference suppression and speed. Because the linear form of the model disperses the complexity of the nonlinear process, the computation of the nonlinearity estimation is simplified at every iteration. This means that fewer hidden nodes are required, which helps to avoid overfitting to some extent. Using the QL-ELM model to identify a reversible system can improve the identification accuracy and system performance.
4. IMC with QL-ELM
In nonlinear IMC, the nonlinear model and its inverse play an important role. In this study, QL-ELM is employed as both the internal model and the inverse model controller. The basic structure of QL-ELM based IMC is shown in the block diagram of Figure 2. The unknown nonlinear discrete control system consists of four parts: the nonlinear plant, the QL-ELM inverse model controller, the QL-ELM internal model, and a filter. In particular, the additional filter not only makes the controller physically realizable but also improves the robustness of the system. It can effectively mitigate problems caused by model mismatch.
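The signal flow of the IMC structure can be illustrated with a deliberately simple stand-in. In the sketch below the plant and the internal model are hypothetical first-order linear systems with mismatched parameters, the controller is the exact inverse of the model, and the filter is an assumed first-order low-pass with unit steady-state gain. None of these numbers come from the paper; the point is only to show how the plant/model mismatch is fed back.

```python
import numpy as np

# Hypothetical first-order plant and a deliberately mismatched internal model.
a_p, b_p = 0.80, 0.50      # plant:  y(k+1)  = a_p*y(k)  + b_p*u(k)
a_m, b_m = 0.75, 0.55      # model:  ym(k+1) = a_m*ym(k) + b_m*u(k)
alpha = 0.7                # filter pole (assumed first-order filter, unit DC gain)

n_steps, r = 200, 1.0      # constant set-point
y, ym, v = 0.0, 0.0, 0.0
outputs = []
for k in range(n_steps):
    e = r - (y - ym)                     # set-point corrected by plant/model mismatch
    v = alpha * v + (1.0 - alpha) * e    # filter output, target for ym(k+1)
    u = (v - a_m * ym) / b_m             # inverse model: solve ym(k+1) = v for u(k)
    y = a_p * y + b_p * u                # plant response
    ym = a_m * ym + b_m * u              # internal model response
    outputs.append(y)

steady_state_error = abs(r - outputs[-1])
```

Even though the model parameters differ from the plant parameters, the feedback of the mismatch signal `y - ym` through a filter with unit steady-state gain drives the output to the set-point in this example.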
4.1. Establishment of Internal Model
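For a nonlinear plant described by
$$y(k+1) = f\big(y(k), \ldots, y(k-n+1), u(k), \ldots, u(k-m+1)\big), \tag{13}$$
where $n$ and $m$ are the orders of the output and input vectors, the input vector $[y(k), \ldots, y(k-n+1), u(k), \ldots, u(k-m+1)]$ and the output $y(k+1)$ are used as training samples to set up the internal model via QL-ELM. The learning of QL-ELM is implemented by the following steps.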
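Step 1 (initialization). Choose the orders of the regression vector. Set the linear parameter $\theta$ to zero, choose the number of nodes in the hidden layer, and set the nonlinear parameters $W_1$, $W_2$, and $B$ to small random values. Set the iteration counter to zero.
Step 2. Update the linear part using the deviation between the actual output and the output of the nonlinear part, and estimate $\theta$ using (9).
Step 3. Update the nonlinear part using the deviation between the actual output and the output of the linear part, and estimate $W_2$ using (10).
Step 4. Increase the iteration counter by one and return to Step 2, until the training error reaches its minimum.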
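The four steps above can be sketched as an alternating least-squares loop. This is an illustrative reading of the procedure, not the authors' code, and it rests on several assumptions: the constant term of the model is taken as zero, the data come from a hypothetical toy plant rather than the CSTR, the activation is a sigmoid, and the linear part is fitted by batch least squares (`np.linalg.lstsq`) instead of a sample-by-sample RLS recursion. With fixed random input weights, the nonlinear term is linear in the output weights, so both updates are ordinary least-squares problems.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical SISO plant used only to generate data (not the CSTR of the paper).
N = 600
u = rng.uniform(-1.0, 1.0, size=N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.6 * y[t - 1] + 0.3 * u[t - 1] + 0.2 * np.tanh(y[t - 1]) * u[t - 1]

# Step 1: regression vector phi(t) = [y(t-1), u(t-1)], theta = 0, random W1 and B.
Phi = np.column_stack([y[:-1], u[:-1]])
target = y[1:]
n_reg, n_hidden = Phi.shape[1], 10
W1 = rng.uniform(-1.0, 1.0, size=(n_reg, n_hidden))
B = rng.uniform(-1.0, 1.0, size=n_hidden)
Hid = 1.0 / (1.0 + np.exp(-(Phi @ W1 + B)))      # hidden outputs, one row per sample
# phi^T W2 Gamma(.) is linear in W2: the regressors are all products phi_k * Gamma_j.
Z = (Phi[:, :, None] * Hid[:, None, :]).reshape(len(Phi), -1)
Z_pinv = np.linalg.pinv(Z, rcond=1e-8)

theta = np.zeros(n_reg)
w2 = np.zeros(n_reg * n_hidden)                  # flattened output weights W2
history = []
for k in range(5):
    # Step 2: linear part from the deviation between y and the nonlinear part.
    theta, *_ = np.linalg.lstsq(Phi, target - Z @ w2, rcond=None)
    if k == 0:
        rmse_linear = float(np.sqrt(np.mean((target - Phi @ theta) ** 2)))
    # Step 3: nonlinear part from the deviation between y and the linear part.
    w2 = Z_pinv @ (target - Phi @ theta)
    residual = target - Phi @ theta - Z @ w2
    history.append(float(np.sqrt(np.mean(residual ** 2))))   # Step 4: repeat
```

Here `rmse_linear` is the error of the purely linear ARX fit and `history` records the training RMSE after each pass, so the two can be compared to see what the nonlinear coefficient contributes.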
4.2. Establishment of Inverse Model
The controller of IMC is the inverse model of the nonlinear process which is equivalent to finding the inverse of the system at given frequencies. Therefore the reversibility of the model must be considered in advance.
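Theorem 1. For the above nonlinear system, if $f(\cdot)$ is a monotone function of $u(k)$, the system is reversible. Alternatively, if for any two given inputs $u_1(k) \neq u_2(k)$ the relation $f(\ldots, u_1(k), \ldots) \neq f(\ldots, u_2(k), \ldots)$ holds, the system is reversible.
Assume a SISO nonlinear process with output order $n$ and input order $m$ is described by (13); the inverse model is established as
$$u(k) = f^{-1}\big(y(k+1), y(k), \ldots, y(k-n+1), u(k-1), \ldots, u(k-m+1)\big), \tag{14}$$
where $f^{-1}(\cdot)$ is the nonlinear function of the inverse model. According to Theorem 1, the process of (14) is monotone and reversible. Training the QL-ELM establishes the inverse model of the system. The input and output vectors are $[y(k+1), y(k), \ldots, y(k-n+1), u(k-1), \ldots, u(k-m+1)]$ and $u(k)$, respectively. Because the value of $y(k+1)$ is unknown at control time, the output of the filter replaces $y(k+1)$ in the above formula. The training process of the inverse model is the same as that of the internal model.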
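As a small illustration of how training samples for an inverse model of this kind can be assembled from recorded input and output sequences, consider the sketch below. The helper name `inverse_model_dataset`, the ordering of the regressors, and the toy sequences are assumptions made for the example.

```python
import numpy as np

def inverse_model_dataset(y, u, n, m):
    """Rows: [y(k+1), y(k), ..., y(k-n+1), u(k-1), ..., u(k-m+1)]; targets: u(k)."""
    start = max(n - 1, m - 1)                      # first k with a full history
    rows, targets = [], []
    for k in range(start, len(y) - 1):
        past_y = [y[k - i] for i in range(n)]      # y(k), ..., y(k-n+1)
        past_u = [u[k - i] for i in range(1, m)]   # u(k-1), ..., u(k-m+1)
        rows.append([y[k + 1]] + past_y + past_u)
        targets.append(u[k])
    return np.array(rows), np.array(targets)

# Toy sequences chosen so that each entry is easy to trace by eye.
y_seq = np.arange(10.0)
u_seq = 100.0 + np.arange(10.0)
X_inv, t_inv = inverse_model_dataset(y_seq, u_seq, n=2, m=2)
```

During training the first column holds the measured next output; in closed loop that entry would be replaced by the filter output, as described above.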
5. Numerical Results
A typical representative of nonlinear system in chemical processes is CSTR system. The system has multiple equilibrium points (stable and unstable ones). Its dynamical behavior exhibits some complex features depending on system parameters. In this study, the dynamic behavior is described by the following differential equations [32, 33]:where and represent the dimensionless reactant concentration and reactor temperature, respectively; and denote the system disturbances. The control action is the cooling jacket temperature. The model parameters are shown in Table 1. The model has three equilibrium points, where and are stable points and is unstable point. The reactant concentration is chosen as the controlled variable. The resulting control problem is nonlinear. Therefore, training of the models has to be restricted to a region where inverse mapping is unique to ensure the reversibility.
The initial condition is set as , and . The fourth-order Runge-Kutta algorithm is used to calculate this model with the integral step size . The number of hidden nodes is 80. In the simulation, modelling error caused by the lack of the training sample will lead to the residual of the control system. Therefore, some steady-state data around the stable point are added as training samples. Results of model based on QL-ELM are compared with those of ELM, SVM, and QL-SVM methods. In detail, the number of hidden nodes in QL-ELM is reduced by 40. The optimal parameters of SVM and QL-SVM with RBF kernel are selected using the cross-validation. For the SVM method, the scale parameters in internal model are set as the penalty factor , the variance in RBF kernel function , and the epsilon in loss function of SVR . In the inverse model the parameters are , , and . For the QL-SVM method the parameters in internal model and inverse model are , , and , , , respectively.
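A fourth-order Runge-Kutta simulation of a CSTR of this type can be sketched as follows. The equations below are the dimensionless exothermic CSTR form commonly used as a benchmark, and every numerical value (the parameters `DA`, `GAMMA`, `B_HEAT`, `BETA`, the initial state, the step size, the constant input, and the additive placement of the disturbances) is an assumption for illustration, since Table 1 and the simulation settings are not reproduced in this text.

```python
import numpy as np

# Assumed benchmark-style parameter values (not taken from Table 1).
DA, GAMMA, B_HEAT, BETA = 0.072, 20.0, 8.0, 0.3

def cstr_rhs(x, u, d=(0.0, 0.0)):
    """Right-hand side: x[0] is concentration, x[1] is temperature, u is jacket temperature."""
    x1, x2 = x
    rate = DA * (1.0 - x1) * np.exp(x2 / (1.0 + x2 / GAMMA))
    dx1 = -x1 + rate + d[0]                                   # disturbance assumed additive
    dx2 = -(1.0 + BETA) * x2 + B_HEAT * rate + BETA * u + d[1]
    return np.array([dx1, dx2])

def rk4_step(x, u, h):
    """One classical fourth-order Runge-Kutta step with the input held constant."""
    k1 = cstr_rhs(x, u)
    k2 = cstr_rhs(x + 0.5 * h * k1, u)
    k3 = cstr_rhs(x + 0.5 * h * k2, u)
    k4 = cstr_rhs(x + h * k3, u)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

x = np.array([0.1, 0.5])     # assumed initial state
h = 0.01                     # assumed integration step
for _ in range(5000):
    x = rk4_step(x, u=0.0, h=h)
final_state = x
final_rate = cstr_rhs(final_state, 0.0)
```

With the input held at zero, the state in this sketch settles at a steady state, which can be checked by confirming that `final_rate` is close to zero. Generating identification data would instead require exciting `u` over the operating region of interest.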
For the internal model, the order of internal model is set as ; 2000 groups of samples are chosen as the training data and 500 groups of samples are chosen as test data.
For the inverse model ; ; 2000 groups of data are chosen for inverse model training so as to get the controller, and the remaining 500 groups of data are chosen for the inverse model test.
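The modeling performance is measured by the root mean square error (RMSE), which can be expressed by
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\big(y_i - \hat{y}_i\big)^2},$$
where $y_i$ is the actual output, $\hat{y}_i$ is the model output, and $N$ is the number of samples.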
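For reference, the RMSE indicator can be computed as in the following short sketch. The function name `rmse` and the sample values are illustrative only.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between actual and model outputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Only the third sample differs (by 2), so the mean squared error is 4/3.
example = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```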
Identification results of the internal model and inverse model with QL-ELM and the corresponding errors are shown in Figures 3 and 7. The results and corresponding modeling errors of the comparative ELM method are shown in Figures 4 and 8, respectively. Validation results of the SVM and QL-SVM based internal models are shown in Figure 5(a), and their corresponding errors are shown in Figures 5(b) and 5(c). The enlarged detail of Figure 5(a) is shown in Figure 6. Similarly, validation results of the SVM and QL-SVM based inverse models are shown in Figure 9(a), and their corresponding errors are shown in Figures 9(b) and 9(c). The enlarged detail of Figure 9(a) is shown in Figure 10. In both internal model and inverse model identification, the errors of the proposed method are smaller than those of the other methods. The quasilinear methods achieve better generalization performance than their standard counterparts, and the ELM-based methods reduce the regression error. Combining both advantages, the proposed QL-ELM method provides high precision and better generalization. To illustrate the effectiveness of the proposed method, the performance indicators of the different identification methods are listed in Table 2. In the comparison of time performance, it is worth mentioning that, for SVM, the time refers to a single run with the optimal parameters and does not include the time spent on parameter adjustment. According to Table 2, QL-ELM provides higher precision and less time consumption than the other methods.
Figure 3: (a) QL-ELM method output and the actual output of internal model; (b) error of QL-ELM internal model.
Figure 4: (a) ELM method output and the actual output of internal model; (b) error of ELM internal model.
Figure 5: (a) Identification results of internal model using SVM and QL-SVM; (b) test error of internal model using SVM; (c) test error of internal model using QL-SVM.
Figure 7: (a) QL-ELM method output and the actual output of inverse model; (b) error of QL-ELM inverse model.
Figure 8: (a) ELM method output and the actual output of inverse model; (b) error of ELM inverse model.
Figure 9: (a) Identification results of inverse model using SVM and QL-SVM; (b) test error of inverse model using SVM; (c) test error of inverse model using QL-SVM.
5.1. Control under Set-Point Change
The set-point is given as follows: for , for , for , and for . The initial condition of nonlinear CSTR process is designed as , respectively. Besides, the first order filter is , where . Simulation results by IMC based on QL-ELM, ELM, and SVM are displayed in Figure 11. It can be easily seen that the proposed method has faster regulation time, smaller overshoot, and smaller steady-state error. Therefore, it is not difficult to conclude that the proposed method is superior to the ELM and SVM method in this control case.
Figure 11: (a) Control output using IMC based on QL-ELM, ELM, and SVM; (b) error of output using IMC based on QL-ELM, ELM, and SVM.
5.2. Tracking Control for a Sinusoidal Wave Input
In this case, the nonlinear system output response to track a desired sinusoidal function is . The first order filter is , where . All the parameters settings for CSTR system are the same as Section 5.1. Comparisons of output for tracking control and error of output by the proposed method and ELM method are displayed in Figure 12. It is clearly seen that the system output can track the desired sinusoidal function perfectly. It can be concluded that the proposed method gives better tracking performance.
Figure 12: (a) Control output using IMC based on QL-ELM and ELM; (b) error of output using IMC based on QL-ELM and ELM.
Although the QL-ELM model shows better approximation performance in all the above experiments, the proposed method still lacks versatility in modeling extremely strongly nonlinear processes. This limitation is caused by its essentially linear structure.
6. Conclusions
A nonlinear control strategy based on QL-ELM based IMC is proposed and tested on the concentration control of a SISO CSTR process. The internal model and the inverse model controller are learned using the QL-ELM regression algorithm to improve the modeling accuracy. The proposed QL-ELM method offers great flexibility and simplicity through a quasilinear ARX model with ELM coefficients. It can not only approximate continuous nonlinear functions but also separate the nonlinear complexity, so that fewer hidden neurons are needed in the ELM. Owing to the high accuracy of QL-ELM modeling and the additional training samples around the stable points, the steady-state error is reduced in the IMC system. Simulation results show the superiority of the proposed method in terms of tracking ability and stability in nonlinear process control.
Conflict of Interests
The authors declare that they have no conflict of interests regarding the publication of this paper.
Acknowledgments
This research was supported by the National Natural Science Foundation of China (Grants no. 61273132, 61104098, and 61473024).
References
[1] H. Deng and H.-X. Li, “A novel neural approximate inverse control for unknown nonlinear discrete dynamical systems,” IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics, vol. 35, no. 1, pp. 115–123, 2005.
[2] H. Yu, H. R. Karimi, and X. Zhu, “Research of smart car’s speed control based on the internal model control,” Abstract and Applied Analysis, vol. 2014, Article ID 274293, 5 pages, 2014.
[3] I. Rivals and L. Personnaz, “Nonlinear internal model control using neural networks: application to processes with delay and design issues,” IEEE Transactions on Neural Networks, vol. 11, no. 1, pp. 80–90, 2000.
[4] Y.-N. Wang and X.-F. Yuan, “SVM approximate-based internal model control strategy,” Acta Automatica Sinica, vol. 34, no. 2, pp. 172–179, 2008.
[5] I. Rivals and L. Personnaz, “Nonlinear internal model control using neural networks: application to processes with delay and design issues,” IEEE Transactions on Neural Networks, vol. 11, no. 1, pp. 80–90, 2000.
[6] H.-X. Li and H. Deng, “An approximate internal model-based neural control for unknown nonlinear discrete processes,” IEEE Transactions on Neural Networks, vol. 17, no. 3, pp. 659–670, 2006.
[7] Y. Huang and D. Wu, “Nonlinear internal model control with inverse model based on extreme learning machine,” in Proceedings of the International Conference on Electric Information and Control Engineering (ICEICE '11), pp. 2391–2395, Wuhan, China, April 2011.
[8] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, “Extreme learning machine: theory and applications,” Neurocomputing, vol. 70, no. 1–3, pp. 489–501, 2006.
[9] G.-B. Huang and L. Chen, “Convex incremental extreme learning machine,” Neurocomputing, vol. 70, no. 16–18, pp. 3056–3062, 2007.
[10] G.-B. Huang and L. Chen, “Enhanced random search based incremental extreme learning machine,” Neurocomputing, vol. 71, no. 16–18, pp. 3460–3468, 2008.
[11] Y. Song and J. Zhang, “Automatic recognition of epileptic EEG patterns via Extreme Learning Machine and multiresolution feature extraction,” Expert Systems with Applications, vol. 40, no. 14, pp. 5477–5489, 2013.
[12] Y. Chen, Z. Zhao, S. Wang, and Z. Chen, “Extreme learning machine-based device displacement free activity recognition model,” Soft Computing, vol. 16, no. 9, pp. 1617–1625, 2012.
[13] J. Cao, T. Chen, and J. Fan, “Fast online learning algorithm for landmark recognition based on BoW framework,” in Proceedings of the 9th IEEE Conference on Industrial Electronics and Applications, pp. 1163–1168, 2014.
[14] P. K. Wong, Z. Yang, C. M. Vong, and J. Zhong, “Real-time fault diagnosis for gas turbine generator systems using extreme learning machine,” Neurocomputing, vol. 128, pp. 249–257, 2014.
[15] G. Wang, Y. Zhao, and D. Wang, “A protein secondary structure prediction framework based on the Extreme Learning Machine,” Neurocomputing, vol. 72, no. 1–3, pp. 262–268, 2008.
[16] R. K. Pearson and B. A. Ogunnaike, Identification and Control Using Volterra Models, Springer, 2002.
[17] K. T. Iskakov and Z. O. Oralbekova, “Resolving power of algorithm for solving the coefficient inverse problem for the geoelectric equation,” Mathematical Problems in Engineering, vol. 2014, Article ID 545689, 9 pages, 2014.
[18] S. I. Biagiola and J. L. Figueroa, “Identification of uncertain MIMO Wiener and Hammerstein models,” Computers & Chemical Engineering, vol. 35, no. 12, pp. 2867–2875, 2011.
[19] Y. Tang, Z. Li, and X. Guan, “Identification of nonlinear system using extreme learning machine based Hammerstein model,” Communications in Nonlinear Science and Numerical Simulation, vol. 19, no. 9, pp. 3171–3183, 2014.
[20] A. Wills, T. B. Schön, L. Ljung, and B. Ninness, “Identification of Hammerstein-Wiener models,” Automatica, vol. 49, no. 1, pp. 70–81, 2013.
[21] D. Wang and F. Ding, “Extended stochastic gradient identification algorithms for Hammerstein-Wiener ARMAX systems,” Computers & Mathematics with Applications, vol. 56, no. 12, pp. 3157–3164, 2008.
[22] J. G. Smith, S. Kamat, and K. P. Madhavan, “Modeling of pH process using wavenet based Hammerstein model,” Journal of Process Control, vol. 17, no. 6, pp. 551–561, 2007.
[23] L. Zhou, X. Li, and F. Pan, “Gradient-based iterative identification for MISO Wiener nonlinear systems: application to a glutamate fermentation process,” Applied Mathematics Letters, vol. 26, no. 8, pp. 886–892, 2013.
[24] S. McLoone, M. D. Brown, G. Irwin, and G. Lightbody, “A hybrid linear/nonlinear training algorithm for feedforward neural networks,” IEEE Transactions on Neural Networks, vol. 9, no. 4, pp. 669–684, 1998.
[25] H. Peng, T. Ozaki, V. Haggan-Ozaki, and Y. Toyoda, “A parameter optimization method for radial basis function type models,” IEEE Transactions on Neural Networks, vol. 14, no. 2, pp. 432–438, 2003.
[26] H. Peng, T. Ozaki, Y. Toyoda et al., “RBF-ARX model-based nonlinear system modeling and predictive control with application to a NOx decomposition process,” Control Engineering Practice, vol. 12, no. 2, pp. 191–203, 2004.
[27] B. Hu, Z. Zhao, and J. Liang, “Multi-loop nonlinear internal model controller design under nonlinear dynamic PLS framework using ARX-neural network model,” Journal of Process Control, vol. 22, no. 1, pp. 207–217, 2012.
[28] J. Hu, K. Kumamaru, and K. Inoue, “A hybrid quasi-ARMAX modeling scheme for identification of nonlinear systems,” Transactions of the Society of Instrument and Control Engineers, vol. 34, no. 8, pp. 977–985, 1998.
[29] Y. Cheng and J. Hu, “Nonlinear system identification based on SVR with quasi-linear kernel,” in Proceedings of the Annual International Joint Conference on Neural Networks (IJCNN '12), pp. 1–8, June 2012.
[30] G. Prasad, E. Swidenbank, and B. W. Hogg, “A local model networks based multivariable long-range predictive control strategy for thermal power plants,” Automatica, vol. 34, no. 10, pp. 1185–1204, 1998.
[31] Y. Zhang and Z. Zheng, “Based on inverse system of internal model control,” in Proceedings of the International Conference on Computer Application and System Modeling (ICCASM '10), pp. V14–19–V14–21, Taiyuan, China, October 2010.
[32] C.-T. Chen and S.-T. Peng, “Intelligent process control using neural fuzzy techniques,” Journal of Process Control, vol. 9, no. 6, pp. 493–503, 1999.
[33] C.-T. Chen and S.-T. Peng, “Learning control of process systems with hard input constraints,” Journal of Process Control, vol. 9, no. 2, pp. 151–160, 1999.
Copyright © 2015 Dazi Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.