Special Issue: Machine Learning with Applications to Autonomous Systems
Research Article | Open Access
Twin Support Vector Machine Method for Identification of Wiener Models
Twin support vector regression is applied to identify nonlinear Wiener systems, which consist of a linear dynamic block in series with a static nonlinearity. The linear block is expanded in terms of basis functions, such as Laguerre or Kautz filters, and the static nonlinear block is determined using twin support vector machine regression. Simulations of a control valve model and a pH neutralization process are presented to show the advantages of the proposed algorithm over the support vector machine based algorithm.
1. Introduction
The goal of system identification is to find a model, within a selected class of models, that produces the best predictions of a system’s output. In general, one forms a cost function that depends on some norm of the prediction errors and finds the model that minimizes this cost function. Since the model is an approximation to the true system, there is a trade-off between the complexity of the model structure and the accuracy of its predictions. In many cases, linear models produce accurate predictions of a system’s behavior, particularly if the system is restricted to operating within a narrow region. However, if the model is required to cover a broader operating region, then a nonlinear model may be required. Block structured models, which are cascades of static nonlinearities and dynamic linear systems, are often a good trade-off: they can represent some dynamic nonlinear systems very accurately but are nonetheless quite simple. Common nonlinear models of this type are the Wiener and Hammerstein models. Many algorithms have been proposed to identify Wiener models [3–11]. As these studies show, the extensive knowledge about linear time invariant (LTI) system representations has been applied to the dynamic linear blocks. On the other hand, finding an effective representation for the nonlinearity is an active area of research.

Traditionally, the nonlinearity is represented by a polynomial because it is simple and easy to estimate. However, a polynomial approximation cannot deal with many common nonlinearities (saturation, threshold, dead zone, etc.). Better approximation of these nonlinearities can be achieved by spline functions. However, spline functions are defined by a series of knot points, which must either be chosen a priori or treated as model parameters and included in the (nonconvex) optimization. Neural networks are another tool to approximate nonlinear functions. Their powerful approximation abilities make them attractive. However, the need to specify the network topology in terms of the number of nodes and layers, and the need to solve a nonconvex optimization problem, complicate their implementation.

Recently, support vector machines (SVMs) and least squares support vector machines (LS-SVMs) have demonstrated powerful abilities in approximating linear and nonlinear functions [12, 13]. In contrast with other approximation methods, SVMs do not require a priori structural information. Furthermore, there are well established methods with guaranteed convergence (ordinary least squares, quadratic programming) for fitting LS-SVMs and SVMs. One of the common drawbacks of SVM based algorithms is that they are computationally heavy. Recently, a twin support vector machine (TSVM) regression algorithm has been proposed. The formulation of TSVM is very close to that of the classical SVM, except that it generates two nonparallel planes such that each plane is closer to one class and is as far as possible from the other. This formulation replaces one large quadratic program by two smaller ones, which reduces the computational complexity and has made TSVM one of the common methods in machine learning. Lately, many extensions of the TSVM classifier have been proposed to improve its performance in certain aspects. The TSVM classifier has been extended to multitask learning. A generalized framework of TSVM for learning from labeled and unlabeled data was investigated in [17, 18].
Recently, Tötterman and Toivonen have developed an algorithm to identify Wiener models based on SVM regression, in which the linear part is described by a basis filter expansion and the nonlinear part is represented by an SVM. In this work, TSVM regression is used to formulate an identification algorithm for Wiener models. Simulation examples are presented to show the advantages of the proposed algorithm over the algorithm of Tötterman and Toivonen. The outline of this paper is as follows. TSVM theory is reviewed in Section 2. In Section 3, an algorithm for the identification of Wiener models based on the twin support vector machine is proposed. Sections 4 and 5 present two illustrative examples that test the proposed algorithm. Concluding remarks are given in Section 6.
2. Twin Support Vector Machines Regression
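Twin support vector regression (TSVR) is obtained by solving the following pair of quadratic programming problems (QPPs):

$$\min_{w_1,\,b_1,\,\xi}\ \frac{1}{2}\left\|Y - e\varepsilon_1 - \left(K(A, A^T)w_1 + e b_1\right)\right\|^2 + C_1 e^T \xi \quad \text{s.t.}\quad Y - \left(K(A, A^T)w_1 + e b_1\right) \ge e\varepsilon_1 - \xi,\ \ \xi \ge 0, \tag{1}$$

$$\min_{w_2,\,b_2,\,\eta}\ \frac{1}{2}\left\|Y + e\varepsilon_2 - \left(K(A, A^T)w_2 + e b_2\right)\right\|^2 + C_2 e^T \eta \quad \text{s.t.}\quad \left(K(A, A^T)w_2 + e b_2\right) - Y \ge e\varepsilon_2 - \eta,\ \ \eta \ge 0, \tag{2}$$

where $C_1, C_2 > 0$ and $\varepsilon_1, \varepsilon_2 \ge 0$ are parameters, $\xi$ and $\eta$ are slack variables, $e$ is a vector of ones, and $K(\cdot,\cdot)$ is a nonlinear kernel function. Given $m$ training data points $(A, Y)$, where the rows of $A$ and the entries of $Y$ represent the input and output vectors, respectively, the TSVR algorithm finds two functions: $f_1(x) = K(x^T, A^T)w_1 + b_1$, which determines the $\varepsilon_1$-insensitive down bound regressor, and $f_2(x) = K(x^T, A^T)w_2 + b_2$, which determines the $\varepsilon_2$-insensitive up bound regressor. The end regressor is computed as the mean of these two functions. The geometric interpretation is given in Figure 1.

The objective function of (1) or (2) is the sum of the squared distances from the shifted function $f_1(x) + \varepsilon_1$ or $f_2(x) - \varepsilon_2$ to the training points. Therefore, minimizing it leads to the function $f_1(x)$ or $f_2(x)$. The constraints require the estimated function $f_1(x)$ or $f_2(x)$ to be at a distance of at least $\varepsilon_1$ or $\varepsilon_2$ from the training points. That is, the training points should be larger than the function $f_1(x)$ by at least $\varepsilon_1$, while they should be smaller than the function $f_2(x)$ by at least $\varepsilon_2$. The slack variables $\xi$ and $\eta$ are introduced to measure the error whenever the distance is closer than $\varepsilon_1$ or $\varepsilon_2$. The second term of the objective function minimizes the sum of the error variables, thus attempting to overfit the training points.

To solve (1) and (2), their dual problems are derived. The optimization problems (1) and (2) are the primal problems for regression. To formulate the dual of (1), the Lagrangian function $L$ is written. Then, $L$ is minimized with respect to the weight and slack variables $w_1$, $b_1$, and $\xi$ and maximized with respect to the Lagrange multipliers $\alpha$ and $\beta$. By carrying out this optimization, $u_1 = [w_1^T\ \ b_1]^T$ can be written in terms of the Lagrange multipliers. Finally, substituting the value of $u_1$ and simplifying with the help of the Karush-Kuhn-Tucker (KKT) conditions, the following dual problem is obtained:

$$\max_{\alpha}\ -\frac{1}{2}\alpha^T H (H^T H)^{-1} H^T \alpha + f^T H (H^T H)^{-1} H^T \alpha - f^T \alpha \quad \text{s.t.}\quad 0 \le \alpha \le C_1 e, \tag{3}$$

where $H = [K(A, A^T)\ \ e]$ and $f = Y - e\varepsilon_1$. Then, problem (3) leads to

$$-H^T (f - H u_1) + H^T \alpha = 0. \tag{4}$$

That is,

$$u_1 = (H^T H)^{-1} H^T (f - \alpha). \tag{5}$$

The solution of (5) requires the inverse of $H^T H$. Sometimes, the matrix $H^T H$ may be ill-conditioned. This situation may be avoided by adding a regularization term, $\sigma I$, to $H^T H$. Here, $I$ is an identity matrix of suitable dimension. Therefore, (5) is reformulated as

$$u_1 = (H^T H + \sigma I)^{-1} H^T (f - \alpha). \tag{6}$$

Similarly, problem (2) is considered and its dual is obtained as

$$\max_{\gamma}\ -\frac{1}{2}\gamma^T H (H^T H)^{-1} H^T \gamma - h^T H (H^T H)^{-1} H^T \gamma + h^T \gamma \quad \text{s.t.}\quad 0 \le \gamma \le C_2 e, \tag{7}$$

where $h = Y + e\varepsilon_2$. Hence, problem (7) leads to

$$u_2 = [w_2^T\ \ b_2]^T = (H^T H + \sigma I)^{-1} H^T (h + \gamma). \tag{8}$$

Note that, in the duals (3) and (7), the inverse of the matrix $H^T H$, of size $(m+1) \times (m+1)$, must be computed. Once the vectors $u_1$ and $u_2$ are known from (6) and (8), the up and down bound functions are obtained. Then, the estimated regressor is constructed as follows:

$$f(x) = \frac{1}{2}\left(f_1(x) + f_2(x)\right) = \frac{1}{2}(w_1 + w_2)^T K(A, x) + \frac{1}{2}(b_1 + b_2). \tag{9}$$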
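The TSVR procedure of this section can be illustrated with a short numerical sketch. It is not the implementation used in the paper: it assumes a Gaussian kernel, solves each box-constrained dual with a plain projected gradient iteration instead of a dedicated QP solver, and uses hypothetical names (`rbf_kernel`, `solve_box_qp`, `fit_tsvr`, `predict_tsvr`, `width`) and illustrative parameter values. Each dual is written as a minimization, for example $\min_\alpha \frac{1}{2}\alpha^T P \alpha + (f - Pf)^T\alpha$ with $P = H(H^TH + \sigma I)^{-1}H^T$.

```python
import numpy as np

def rbf_kernel(X, Z, width):
    """Gaussian kernel matrix between the rows of X and the rows of Z."""
    sq_dist = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def solve_box_qp(Q, c, upper, iters=200):
    """Minimize 0.5 a'Qa + c'a subject to 0 <= a <= upper (projected gradient)."""
    step = 1.0 / max(np.linalg.eigvalsh(Q)[-1], 1e-12)  # 1 / largest eigenvalue
    a = np.zeros(len(c))
    for _ in range(iters):
        a = np.clip(a - step * (Q @ a + c), 0.0, upper)
    return a

def fit_tsvr(A, Y, C1=1.0, C2=1.0, eps1=0.1, eps2=0.1, width=1.0, sigma=1e-4):
    """Return u1 = [w1; b1] and u2 = [w2; b2] for the down and up bound regressors."""
    m = len(Y)
    H = np.hstack([rbf_kernel(A, A, width), np.ones((m, 1))])  # H = [K(A, A') e]
    R = H.T @ H + sigma * np.eye(m + 1)                        # regularized H'H
    P = H @ np.linalg.solve(R, H.T)
    P = 0.5 * (P + P.T)                                        # remove round-off asymmetry
    f, h = Y - eps1, Y + eps2
    alpha = solve_box_qp(P, f - P @ f, C1)   # dual of the down bound problem
    gamma = solve_box_qp(P, P @ h - h, C2)   # dual of the up bound problem
    u1 = np.linalg.solve(R, H.T @ (f - alpha))
    u2 = np.linalg.solve(R, H.T @ (h + gamma))
    return u1, u2

def predict_tsvr(A, u1, u2, X, width=1.0):
    """Evaluate the down bound, the up bound, and their mean at the rows of X."""
    Hx = np.hstack([rbf_kernel(X, A, width), np.ones((len(X), 1))])
    f1, f2 = Hx @ u1, Hx @ u2
    return f1, f2, 0.5 * (f1 + f2)

# Illustrative data: a noise-free sine sampled at 40 points.
A = np.linspace(-3.0, 3.0, 40).reshape(-1, 1)
Y = np.sin(A[:, 0])
u1, u2 = fit_tsvr(A, Y)
f1, f2, y_hat = predict_tsvr(A, u1, u2, A)
max_err = float(np.max(np.abs(y_hat - Y)))
```

The projected gradient solver is chosen only to keep the sketch dependency-free beyond NumPy; any solver for box-constrained quadratic programs could be substituted.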
3. Identification of Wiener Models
The Wiener cascade, a linear filter followed by a static nonlinearity as shown in Figure 2, is often used to represent certain higher-order nonlinear systems.
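As a minimal illustration of the cascade structure, the following sketch simulates a hypothetical Wiener system: a first-order linear filter with unit static gain followed by a `tanh` saturation. The filter, its pole, and the nonlinearity are assumptions made for illustration and are not the systems studied in this paper.

```python
import math

def simulate_wiener(u, pole=0.8, nonlinearity=math.tanh):
    """Simulate x(k+1) = pole*x(k) + (1 - pole)*u(k) and y(k) = nonlinearity(x(k))."""
    x = 0.0
    xs, ys = [], []
    for uk in u:
        xs.append(x)                      # output of the linear block
        ys.append(nonlinearity(x))        # output of the static nonlinearity
        x = pole * x + (1.0 - pole) * uk  # linear dynamics
    return xs, ys

u = [2.0] * 60                            # step input of amplitude 2
x, y = simulate_wiener(u)
```

The intermediate signal `x` settles at the input amplitude, while the measured output `y` is compressed by the saturation. Only the input and the measured output would be available to an identification algorithm.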
In this section, the development of Tötterman and Toivonen is followed, up to the point where the SVM optimization is introduced (where TSVM is used instead). The output error Wiener model can be described as

$$x(k) = G(q)\,u(k), \qquad y(k) = g(x(k)) + v(k), \tag{10}$$

where $u(k)$ and $x(k)$ are the input and output of the linear block $G(q)$, $g(\cdot)$ is the static nonlinearity, and $y(k)$ is the measured output signal, for $k = 1, \ldots, N$. The innovation $v(k)$ is assumed to be white. Identifying output error models is computationally heavier than identifying autoregressive exogenous (ARX) models, even for linear systems. Several techniques have been suggested to deal with linear output error (OE) models, such as instrumental variables (IV), subspace methods, algorithms based on the Steiglitz-McBride method, and orthonormal filter expansions using Laguerre or Kautz functions. In this paper, the linear part is represented by Laguerre filters. The truncated discrete Laguerre expansion of the transfer function $G(q)$ is given by

$$G(q) = \sum_{i=1}^{n} c_i L_i(q), \qquad L_i(q) = \frac{\sqrt{1 - a^2}}{q - a}\left(\frac{1 - aq}{q - a}\right)^{i-1}, \tag{11}$$

where $a$, with $|a| < 1$, is the Laguerre filter pole, $n$ is the filter order, and the function $G(q)$ is assumed to be strictly proper, analytic in $|q| > 1$, and continuous in $|q| \ge 1$.
Remark 1. Notice that the finite impulse response (FIR) model can be obtained from (11) when $a = 0$.
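The Laguerre filter outputs can be computed with a simple recursion. The sketch below assumes the standard discrete Laguerre filters, $L_1(q) = \sqrt{1-a^2}/(q-a)$ and $L_i(q) = L_{i-1}(q)\,(1-aq)/(q-a)$, with zero initial conditions; the function name `laguerre_outputs` is hypothetical. The usage example sets the pole to zero, which reduces the filter bank to pure delays, as stated in Remark 1.

```python
import math

def laguerre_outputs(u, pole, order):
    """Return psi[i][k], the output of the (i+1)-th Laguerre filter at time k."""
    gain = math.sqrt(1.0 - pole ** 2)
    n_samples = len(u)
    psi = [[0.0] * n_samples for _ in range(order)]
    for k in range(n_samples - 1):
        # First filter: sqrt(1 - a^2) / (q - a)
        psi[0][k + 1] = pole * psi[0][k] + gain * u[k]
        # Later filters: previous output passed through the all-pass (1 - a q) / (q - a)
        for i in range(1, order):
            psi[i][k + 1] = (pole * psi[i][k] + psi[i - 1][k]
                             - pole * psi[i - 1][k + 1])
    return psi

u = [1.0, 2.0, 3.0, 4.0, 5.0]
psi = laguerre_outputs(u, pole=0.0, order=2)
# With pole = 0 the i-th filter is a delay of i samples (the FIR case).
```

For a nonzero pole, each filter output depends on the entire past input, which is why a low filter order can often describe slow dynamics that would need many FIR taps.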
Substituting (11) into the first equation of (10) results in

$$x(k) = \sum_{i=1}^{n} c_i L_i(q)\,u(k), \tag{12}$$

which can be written as

$$x(k) = \sum_{i=1}^{n} c_i \psi_i(k), \tag{13}$$

where $\psi_i(k) = L_i(q)\,u(k)$ is the output of the $i$th Laguerre filter. Applying this expansion to the linear part of the Wiener model enables us to rewrite the Wiener model (10) as follows:

$$y(k) = g\!\left(\sum_{i=1}^{n} c_i \psi_i(k)\right) + v(k). \tag{14}$$

Remark 2. If the nonlinearity is modeled as a polynomial of a given order, then model (14) can be seen as a bank of parallel linear filters $L_i(q)$ followed by a multi-input-single-output (MISO) nonlinear block, as shown in Figure 3.
Remark 4. Model (16) can be identified by solving (3)–(8). Solving (3)–(8), which amounts to two simple quadratic programming problems without equality constraints, is easier than solving the quadratic program developed by Tötterman and Toivonen.
4. Example 1
To show the features of the proposed algorithm, a simulation example taken from the literature is identified. Consider a valve for fluid flow control described by a Wiener model, where $u(k)$ is a pneumatic control signal applied to the valve, $x(k)$ is the position of the valve plug, and $y(k)$ is the fluid flow. The fluid flow measurement is corrupted by a noise term $v(k)$, which consists of an independent sequence of normally distributed random numbers with variance 0.0025. Two input-output sequences consisting of 1000 data points each were generated as follows. A pseudorandom binary signal fluctuating between −1 and +1 with a basic clock period of seven sampling intervals was produced. Then, in each time interval, the input was multiplied by a uniformly distributed random factor between 0 and 0.4. Finally, a bias of 0.5 was added, resulting in a signal with an amplitude between 0.1 and 0.9. This procedure has been described and used previously in the literature. The training and testing data sets are shown in Figures 4 and 5, respectively. The hyperparameters (the TSVR parameters together with the Laguerre filter pole and order) were selected by a cross-validation method in which one parameter was varied while the others were kept fixed. For each parameter, the value that gives the least root mean square error (RMSE) was chosen, as shown in Tables 1 and 2. Figure 6 shows the estimates of the TSVR based algorithm together with the measured output for the last 100 samples of the test data. It is clear from the figure that the algorithm produces accurate estimates.
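The input generation procedure of this example can be sketched as follows. The sketch makes two assumptions: the binary level is drawn with Python's `random` module as a stand-in for a true pseudorandom binary sequence generator, and the random amplitude factor is redrawn once per clock period. The function name `make_input` and the seeds are hypothetical.

```python
import random

def make_input(n_samples=1000, clock_period=7, max_scale=0.4, bias=0.5, seed=0):
    """Binary +/-1 signal held for clock_period samples, randomly scaled, then biased."""
    rng = random.Random(seed)
    u = []
    while len(u) < n_samples:
        level = rng.choice((-1.0, 1.0))      # binary level for this clock period
        scale = rng.uniform(0.0, max_scale)  # random amplitude factor in [0, 0.4]
        u.extend([bias + scale * level] * clock_period)
    return u[:n_samples]

u_train = make_input(seed=0)   # training input sequence
u_test = make_input(seed=1)    # independent testing input sequence
```

The bias and the maximum scale together keep every sample between 0.1 and 0.9, matching the amplitude range stated in the text.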
5. Example 2
Consider a pH neutralization process consisting of a continuous stirred tank reactor (CSTR) in which a strong base (NaOH) reacts with the feed stream, a strong acid (HCl). The process is shown in Figure 7. The process input is the flow rate of the strong base, and the process output is the pH of the effluent solution. The acid flow rate and the volume of the tank are assumed to be constant. The identification algorithm described in Section 3 is used to estimate a Wiener model of a simulated system taken from the literature. The nominal operating conditions of the system are given in Table 3. The model is highly nonlinear due to the implicit output equation, known as the titration curve. The system was excited with band-limited white noise, with zero mean and variance 0.01, around the nominal value of the base flow rate. The output of the system was corrupted with additive Gaussian white noise with zero mean and standard deviation 0.001.
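The strong nonlinearity of the titration curve can be seen from a small steady-state calculation. The sketch below assumes complete dissociation of both the acid and the base and ideal mixing, so that the charge balance $[\mathrm{H}^+] - K_w/[\mathrm{H}^+] = c_{\text{acid}} - c_{\text{base}}$ determines the pH. The feed concentrations and flow rates are hypothetical and are not the operating conditions of Table 3.

```python
import math

KW = 1.0e-14  # ion product of water at 25 degrees C

def ph_strong_acid_base(acid_conc, base_conc):
    """pH from the charge balance [H+] - KW/[H+] = acid_conc - base_conc."""
    x = acid_conc - base_conc
    root = math.sqrt(0.25 * x * x + KW)
    # Use the algebraically equivalent form that avoids cancellation when x < 0.
    h = 0.5 * x + root if x >= 0.0 else KW / (root - 0.5 * x)
    return -math.log10(h)

def steady_state_ph(base_flow, acid_flow=1.0, acid_feed=0.01, base_feed=0.01):
    """Steady-state effluent pH of an ideally mixed tank (hypothetical values)."""
    total = acid_flow + base_flow
    return ph_strong_acid_base(acid_feed * acid_flow / total,
                               base_feed * base_flow / total)

curve = [(fb, steady_state_ph(fb)) for fb in (0.5, 0.9, 1.0, 1.1, 1.5)]
```

Near the equivalence point, a small change in the base flow rate moves the pH by several units, whereas far from it the same change has little effect. This varying gain is what the static nonlinear block of the Wiener model has to capture.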
The training and testing data sets are shown in Figures 8 and 9, respectively. The TSVM and SVM hyperparameters and the Laguerre filter pole and order were chosen by the cross-validation method. For TSVM, the hyperparameter values resulting from cross-validation are given in Table 5, and the selected Laguerre filter pole and order are given in Table 4. For SVM, the selected hyperparameters are given in Table 7, and the selected Laguerre filter pole and order are given in Table 6.
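The selection procedure used in both examples, in which one parameter is varied while the others are kept fixed, can be sketched as a coordinate-wise search. In the sketch, `score` stands for "fit the model with these hyperparameters and return the validation RMSE"; the toy score function, the parameter names, and the grids are hypothetical placeholders rather than the values used in the paper.

```python
def one_at_a_time_search(score, start, grids, sweeps=1):
    """Vary one hyperparameter over its grid while the others stay fixed."""
    best = dict(start)
    for _ in range(sweeps):
        for name, candidates in grids.items():
            trial_scores = {v: score({**best, name: v}) for v in candidates}
            best[name] = min(trial_scores, key=trial_scores.get)
    return best, score(best)

def toy_rmse(p):
    """Hypothetical stand-in for the validation RMSE of a fitted model."""
    return ((p["C"] - 10.0) ** 2 + (p["eps"] - 0.1) ** 2
            + (p["pole"] - 0.5) ** 2) ** 0.5

grids = {"C": [1.0, 10.0, 100.0],
         "eps": [0.01, 0.1, 1.0],
         "pole": [0.1, 0.5, 0.9]}
start = {"C": 1.0, "eps": 0.01, "pole": 0.1}
best, best_score = one_at_a_time_search(toy_rmse, start, grids)
```

A one-at-a-time search evaluates far fewer combinations than a full grid, at the cost of possibly missing interactions between parameters; repeating the sweep (`sweeps > 1`) is one simple way to reduce that risk.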
Comparing the least RMSE (0.1248) and CPU time (21.36 s) of the TSVR algorithm (Table 5) with the RMSE (0.1311) and CPU time (72.58 s) of the SVR algorithm (Table 7) shows that, for this example, the TSVR algorithm outperforms the SVR algorithm in both accuracy and speed of computation: its RMSE is about 5% lower and its CPU time is about 3.4 times shorter.
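The two figures of merit used in this comparison can be computed as in the following sketch. The helper names `rmse` and `timed` are hypothetical, and `time.process_time` is only one possible way to measure CPU time; the paper does not state how its timings were obtained. The last two lines restate the reported values as relative differences.

```python
import math
import time

def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def timed(fn, *args):
    """Run fn(*args) and return its result together with the CPU time used."""
    start = time.process_time()
    result = fn(*args)
    return result, time.process_time() - start

example_rmse, cpu_seconds = timed(rmse, [1.0, 2.0, 3.0], [1.0, 2.0, 5.0])

# Relative differences implied by the values reported in the text.
rmse_reduction = (0.1311 - 0.1248) / 0.1311   # fraction by which the RMSE is lower
speedup = 72.58 / 21.36                       # ratio of the CPU times
```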
6. Conclusion
In this paper, a new algorithm for the identification of Wiener systems using TSVR has been derived and used to identify simulation examples. The algorithm was able to model the nonlinearity without requiring any a priori assumptions regarding its structure. The linear block was represented by Laguerre filters. The main advantage of the proposed algorithm over the method of Tötterman and Toivonen is that it results in two smaller quadratic programs, which are easier to solve than the single quadratic program of that method. Example 2 showed that the TSVM algorithm is computationally lighter than the SVM algorithm.
Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
Mujahed Al-Dhaifallah would like to acknowledge the support provided by the Deanship of Scientific Research (DSR) at King Fahd University of Petroleum and Minerals (KFUPM) for funding this work through Project no. FT131015.
References
[1] J. Schoukens, R. Pintelon, T. Dobrowiecki, and Y. Rolain, “Identification of linear systems with nonlinear distortions,” Automatica, vol. 41, no. 3, pp. 491–504, 2005.
[2] D. Westwick and R. Kearney, Identification of Nonlinear Physiological Systems, John Wiley & Sons, Piscataway, NJ, USA, 2003.
[3] D. Westwick and M. Verhaegen, “Identifying MIMO Wiener systems using subspace model identification methods,” Signal Processing, vol. 52, no. 2, pp. 235–258, 1996.
[4] D. Q. Wang and F. Ding, “Hierarchical least squares estimation algorithm for Hammerstein-Wiener systems,” IEEE Signal Processing Letters, vol. 19, no. 12, pp. 825–828, 2012.
[5] L. Zhou, X. Li, and F. Pan, “Least-squares-based iterative identification algorithm for Wiener nonlinear systems,” Journal of Applied Mathematics, vol. 2013, Article ID 565841, 6 pages, 2013.
[6] F. Ding, J. Ma, and Y. Xiao, “Newton iterative identification for a class of output nonlinear systems with moving average noises,” Nonlinear Dynamics, vol. 74, no. 1-2, pp. 21–30, 2013.
[7] Y. Hu, B. Liu, Q. Zhou, and C. Yang, “Recursive extended least squares parameter estimation for Wiener nonlinear systems with moving average noises,” Circuits, Systems, and Signal Processing, vol. 33, no. 2, pp. 655–664, 2014.
[8] M. Schoukens, G. Vandersteen, Y. Rolain, and F. Ferranti, “Fast identification of Wiener-Hammerstein systems using discrete optimisation,” Electronics Letters, vol. 50, no. 25, pp. 1942–1944, 2014.
[9] M. Schoukens, A. Marconato, R. Pintelon, G. Vandersteen, and Y. Rolain, “Parametric identification of parallel Wiener-Hammerstein systems,” Automatica, vol. 51, pp. 111–122, 2015.
[10] S. L. Lacy and D. S. Bernstein, “Identification of FIR Wiener systems with unknown, noninvertible, polynomial nonlinearities,” in Proceedings of the American Control Conference, pp. 893–898, May 2002.
[11] M. Pawlak, Z. Hasiewicz, and P. Wachel, “On nonparametric identification of Wiener systems,” IEEE Transactions on Signal Processing, vol. 55, no. 2, pp. 482–492, 2007.
[12] V. N. Vapnik, Statistical Learning Theory, John Wiley & Sons, New York, NY, USA, 1998.
[13] J. A. K. Suykens, T. van Gestel, J. de Brabanter, B. de Moor, and J. Vandewalle, Least Squares Support Vector Machines, World Scientific, Singapore, 2002.
[14] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2004.
[15] X. Peng, “TSVR: an efficient twin support vector machine for regression,” Neural Networks, vol. 23, no. 3, pp. 365–372, 2010.
[16] Y. Tian and Z. Qi, “Review on: twin support vector machines,” Annals of Data Science, vol. 1, no. 2, pp. 253–277, 2014.
[17] Z. Qi, Y. Tian, and Y. Shi, “Laplacian twin support vector machine for semi-supervised classification,” Neural Networks, vol. 35, pp. 46–53, 2012.
[18] Z. X. Yang, “Nonparallel hyperplanes proximal classifiers based on manifold regularization for labeled and unlabeled examples,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 27, no. 5, Article ID 1350015, pp. 1–19, 2013.
[19] S. Tötterman and H. T. Toivonen, “Support vector method for identification of Wiener models,” Journal of Process Control, vol. 19, no. 7, pp. 1174–1181, 2009.
[20] L. Ljung, System Identification: Theory for the User, Prentice Hall PTR, Upper Saddle River, NJ, USA, 1999.
[21] S. A. AlSabbah, M. A. Al-Khedher, M. K. Abu Zalata, and T. M. Younes, “Evaluation of multiregional fuzzy cascade control for pH neutralization process,” International Journal of Research Reviews in Applied Sciences, vol. 10, no. 2, pp. 193–199, 2012.
Copyright © 2015 Mujahed Al-Dhaifallah. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.