Abstract

To tackle the sensitivity to outliers in system identification, a new robust dynamic partial least squares (PLS) model based on an outliers detection method is proposed in this paper. An improved radial basis function network (RBFN) is adopted to construct a predictive model from the input and output datasets, and a hidden Markov model (HMM) is applied to detect outliers. After the outliers are removed, a more robust dynamic PLS model is obtained. In addition, an improved generalized predictive control (GPC) with tuned weights under the dynamic PLS framework is proposed to deal with the interaction caused by model mismatch. The results of two simulations demonstrate the effectiveness of the proposed method.

1. Introduction

In the past decades, PLS, a multivariable regression method, has been applied in many areas, such as quality prediction, process monitoring, and chemometrics [1]. PLS is not only able to extract principal components from both input and output datasets, but also able to determine the directions along which the input and output data have the largest covariance [2]. Considering its advantages of dimension reduction and automatic decoupling, many researchers have applied PLS to the modeling and control of dynamic systems. Kaspar and Ray [3, 4] proposed a dynamic PLS framework by utilizing the PLS loading matrices to construct precompensators and postcompensators. Chen et al. [5] proposed another dynamic PLS framework with an ARX model. Laurí et al. [6] proposed a PLS-based model predictive control (MPC) relevant identification method.

There are many uncertain factors in industrial fields, which means that clean data cannot be acquired directly from actual systems. It is well known that the standard algorithms for PLS regression (NIPALS and SIMPLS) [7] are very sensitive to outliers in the dataset. Outliers can distort the fitted curve of the data, which may lead to undesirable results if standard PLS is applied directly in practical applications. To overcome this problem, several robust PLS methods have been proposed in recent years. One of them robustifies the sample covariance matrix between the input and output datasets based on the SIMPLS algorithm [8, 9]. Another is PLS calibration with outliers detection, which randomly selects a subset of the samples to obtain an initial residual set and then detects and discards outliers in the subset [10]. These methods are not suitable for the dynamic PLS control framework because the samples form a time series and cannot be selected randomly. Moreover, these methods focus only on regression; they are not suited to dynamic modeling.

There is one way to get a robust dynamic PLS model: for contaminated data, data preprocessing methods can be applied to detect outliers and eliminate their influence. Various methods for outliers detection have been proposed in the literature, such as statistics-based, machine learning-based, and neural network-based algorithms. In these methods, outliers are points far away from the majority of the data; however, this assumption does not always hold [11]. Outliers can also lie close to the main trend of the data. In order to detect such general outliers in a dataset used for modeling, the relationship between inputs and outputs should be known. Liu et al. [11, 12] proposed an autoregressive-structure RBFN to predict the outputs of the system and detected outliers with this predictive model combined with an HMM. The limitation is that their RBFN predicts both the inputs and the outputs of the process, yet the prediction of the system inputs is useless: outliers are usually contained in the output dataset, whereas the inputs are computed by the controllers. Therefore, it is not worthwhile to increase the dimension of the network weights to estimate the values of the inputs. In addition, Liu et al. [11] analyzed the performance of the algorithm only for a single-input single-output (SISO) system at a low rate of outliers. There is no analysis of more complex situations such as multiple-input multiple-output (MIMO) systems and high rates of outliers.

On the other hand, many control schemes have been applied under the dynamic PLS framework. Kaspar and Ray proposed a PID control scheme under this framework [3], and Chen et al. [5] designed multiloop adaptive PID controllers based on a modified decoupling PLS framework. Hu et al. [13, 14] proposed a multiloop internal model controller in the dynamic PLS framework and achieved better performance for disturbance rejection. Lü and Liang [15] proposed a multiloop constrained MPC scheme. Ideally, with the PLS decoupled structure, there should be no interaction between PLS components [3]. However, some interaction is always observed due to plant/model mismatch caused by contaminated data and incompletely filtered colored noise.

Aiming at these limitations, a robust dynamic PLS modeling method and an improved GPC control scheme are proposed in this paper. A new RBFN structure is applied to predict the outputs of the system. Based on the difference between the prediction and the real data, a simple HMM algorithm is used to detect the outliers, which are then replaced by the mean of nearby clean data. After outliers detection, the dynamic PLS model becomes more robust. Based on this model, a dynamic tuning weight sequence is introduced into the cost functions of the improved GPC in the PLS latent space to reduce the interaction between different outputs caused by mismatch between the PLS model and the real process.

The rest of the paper is organized as follows: in Section 2, an improved RBFN training method and an HMM-based outliers detection algorithm are presented. In Section 3, the dynamic PLS modeling method is introduced and an improved GPC scheme is proposed. In Section 4, two examples are used to demonstrate the effectiveness and the improved control performance of the proposed method. In the last section, the conclusions are summarized.

2. Outliers Detection for Modeling Data

2.1. Improved Radial Basis Function Network

As a feedforward network, the RBFN has three layers: an input layer, a hidden layer, and an output layer. In many practical applications, a Gaussian RBF is used in the hidden layer. In theory it can approximate linear and nonlinear functions with arbitrary precision. When fitting dynamic systems, several past input and output samples should be used as the input of the RBFN. To introduce the new structure of the RBFN, three assumptions are first given as follows: (1) the system has $m$ inputs and $p$ outputs; (2) only the outputs of the system contain outliers; and (3) all input signals share the same lag and all output signals share the same lag. With assumption (1), systems of arbitrary input and output dimensions can be handled. Outliers in the outputs are usually caused by unmeasured factors such as unknown sensor failures, which is why they are the focus here (assumption (2)); outliers in the inputs are easy to detect, because the inputs are calculated by the controllers from set points and measured values. Assumption (3) is employed to simplify the analysis, since signals with different lags can be treated as a special case of signals with the same lags. The past samples used as the inputs of the network have the form
$$x(k)=\left[u(k-1),\ldots,u(k-n_u),\,y(k-1),\ldots,y(k-n_y)\right],\qquad(1)$$
where $u(k-i)$ is an $m$-dimension vector and $y(k-j)$ is a $p$-dimension vector; $u(k-1),\ldots,u(k-n_u)$ denote the input values from time $k-n_u$ to $k-1$, and $y(k-1),\ldots,y(k-n_y)$ denote the output values from time $k-n_y$ to $k-1$. $n_u$ and $n_y$ denote the lags of the input and output, respectively.

The structure of the modified RBFN is shown in Figure 1. The $n_u$ past input vectors and $n_y$ past output vectors enter the RBFN. The output of the RBFN is the predictive vector $\hat y(k)$, which is calculated by summing the products of the hidden layer outputs $\phi_j^{x_i}$ and the output layer weights $w_j^{x_i}$. The superscript $x_i$ denotes which lagged input or output vector the node belongs to, and the subscript $j$ denotes which hidden node it belongs to. $\phi_j^{x_i}$ is computed from the hidden node center vector $c_j^{x_i}$ with the Gaussian function, where $j=1,\ldots,q$ and $q$ is the number of hidden nodes assigned to each lagged vector. Then the output of the RBFN is
$$\hat y(k)=\sum_{i=1}^{n_u+n_y}\sum_{j=1}^{q}w_j^{x_i}\,\phi_j^{x_i}.\qquad(2)$$
When a lagged vector $x_i$ (one of $u(k-i)$ or $y(k-i)$) enters the RBFN, there are $q$ hidden nodes for it to cluster to. The outputs of the hidden layer with the Gaussian function for input $x_i$ are
$$\phi_j^{x_i}=\exp\left(-\frac{\|x_i-c_j^{x_i}\|^2}{2\sigma^2}\right),\quad j=1,\ldots,q.\qquad(3)$$
It is noted that $c_j^{x_i}$ is an $m$-dimension vector when $x_i$ is an input vector or a $p$-dimension vector when $x_i$ is an output vector, because the RBFN inputs comprise both $u$ and $y$ vectors. The input of a conventional RBFN would be the whole $(n_u m+n_y p)$-dimension vector $x(k)$, and its center vector has the same dimension, denoted by $C_l$. Each $C_l$ is assembled from the component centers $c_j^{x_i}$; therefore, for each $x(k)$ there are $q^{\,n_u+n_y}$ possible combinations. In other words, a conventional RBFN needs $q^{\,n_u+n_y}$ hidden nodes to be equivalent to the improved RBFN with $q$ hidden nodes per lagged vector, that is, $q(n_u+n_y)$ nodes in total. Since $q(n_u+n_y)$ is far less than $q^{\,n_u+n_y}$, the improved RBFN has fewer hidden nodes.
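To make the structure concrete, the following Python sketch gives a forward pass of such a network under the notation assumed above: each lagged input or output vector has its own set of $q$ Gaussian centers and its own output-weight matrix, and the contributions of all lagged vectors are accumulated. The function and variable names are illustrative only and are not taken from the paper.

```python
import numpy as np

def rbf_layer(x, centers, sigma=1.0):
    """Gaussian RBF activations of one lagged sample vector x
    against its own q centers (rows of `centers`)."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def improved_rbfn_predict(past_inputs, past_outputs, centers, weights, sigma=1.0):
    """One-step-ahead prediction of the p system outputs.

    past_inputs : list of n_u vectors u(k-1..k-n_u), each of dimension m
    past_outputs: list of n_y vectors y(k-1..k-n_y), each of dimension p
    centers     : list of (q x dim) center matrices, one per lagged vector
    weights     : list of (q x p) output-weight matrices, one per lagged vector
    """
    lagged = list(past_inputs) + list(past_outputs)
    y_hat = np.zeros(weights[0].shape[1])
    for x, C, W in zip(lagged, centers, weights):
        phi = rbf_layer(np.asarray(x), C, sigma)   # q activations for this lag
        y_hat += phi @ W                           # accumulate its contribution
    return y_hat
```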

2.2. Convergence of Improved RBFN

First of all, to analyze the convergence of the improved RBFN, the relationship between the improved RBFN and the conventional RBFN is established in Lemma 1.

Lemma 1. The improved RBFN is equivalent to the conventional RBFN.

Proof. Suppose that the lagged vectors $x_1,\ldots,x_n$ (with $n=n_u+n_y$) are independent of each other. The output of the $l$th hidden node of the conventional RBFN is given by
$$\Phi_l=\exp\left(-\frac{\|x(k)-C_l\|^2}{2\sigma^2}\right)=\prod_{i=1}^{n}\exp\left(-\frac{\|x_i-c_{l_i}^{x_i}\|^2}{2\sigma^2}\right)=\prod_{i=1}^{n}\phi_{l_i}^{x_i}.\qquad(4)$$
The output of the conventional RBFN can be written as
$$\hat y(k)=\sum_{l}W_l\Phi_l,\qquad(5)$$
where $W_l$ is the weight of the output layer. For each $W_l$, time-varying weights $w_{l_i}^{x_i}(k)$ can always be found to make the following hold:
$$W_l\Phi_l=\sum_{i=1}^{n}w_{l_i}^{x_i}(k)\,\phi_{l_i}^{x_i}.\qquad(6)$$
According to (6), (5) can be rewritten as
$$\hat y(k)=\sum_{l}\sum_{i=1}^{n}w_{l_i}^{x_i}(k)\,\phi_{l_i}^{x_i}.\qquad(7)$$
Equation (7) can be rewritten, by grouping the terms of each lagged vector, as
$$\hat y(k)=\sum_{i=1}^{n}\sum_{j=1}^{q}\Bigl(\sum_{l:\,l_i=j}w_{l_i}^{x_i}(k)\Bigr)\phi_{j}^{x_i}.\qquad(8)$$
Comparing (8) and (2), it is obvious that (6) always holds with the Gaussian function RBFN by choosing
$$w_{l_i}^{x_i}(k)=\frac{W_l}{n}\prod_{j\neq i}\phi_{l_j}^{x_j}(k).\qquad(9)$$
From (9), we can see that the improved RBFN with time-varying output weights is completely equivalent to the conventional RBFN.

To analyze the convergence of the improved RBFN, we give two assumptions as follows: (1) each output must be bounded as $k\to\infty$; in other words, there must be a constant $M$ satisfying $|y_i(k)|\le M$, $i=1,\ldots,p$; (2) the first sample of the time series is accurate, so that $\hat y(1)=y(1)$ and $e(1)=0$.

From (6), we know that the output weights of the improved RBFN are time-varying and are determined by the outputs of the hidden nodes. In order to analyze the convergence, we divide the output of the improved RBFN into parts corresponding to the outputs of the hidden nodes of the conventional RBFN. We take only one output of the improved RBFN for analysis; the other parts of the RBFN have the same property. From (5) and (6), the contribution of the $l$th hidden node to one output of the RBFN is denoted by the left side of (6) at time $k$, and the right side gives
$$g_l(k)=W_l\Phi_l(k)=w_l(k)^{T}\phi_l(k),\qquad(10)$$
where $w_l(k)$ is the time-varying weight vector and $\phi_l(k)$ is the vector of hidden layer outputs.

From (10), we can get the output of the hidden layer of the improved RBFN at time $k+1$:
$$g_l(k+1)=w_l(k+1)^{T}\phi_l(k+1).\qquad(11)$$
From (6), (10), and (11), we know that $g_l(k)=W_l\Phi_l(k)$ and $g_l(k+1)=W_l\Phi_l(k+1)$, and then the output weights of the improved RBFN at times $k$ and $k+1$ have the following relation:
$$w_l(k+1)^{T}\phi_l(k+1)=\frac{\Phi_l(k+1)}{\Phi_l(k)}\,w_l(k)^{T}\phi_l(k).\qquad(12)$$
So the deviation of the improved RBFN at time $k+1$ is
$$e(k+1)=\hat y(k+1)-y(k+1)=\sum_{l}W_l\Phi_l(k+1)-y(k+1).\qquad(13)$$
Since $\Phi_l(k+1)$ is the hidden layer output of the conventional RBFN, by the approximation property of the conventional RBFN there is the relationship $\bigl|\sum_{l}W_l\Phi_l(k+1)-y^{*}(k+1)\bigr|\le\varepsilon$, where $y^{*}(k+1)$ denotes the true value of the output and $\varepsilon$ denotes an arbitrarily small value. Then $e(k+1)$ can be written as
$$e(k+1)=\Bigl(\sum_{l}W_l\Phi_l(k+1)-y^{*}(k+1)\Bigr)+\bigl(y^{*}(k+1)-y(k+1)\bigr),\qquad(14)$$
where $y^{*}(k+1)$ is the true value of the system output. According to assumption (2), we know that $e(1)=0$ and $y(1)$ is accurate. Thus $y^{*}(k+1)$ is used to replace $y(k+1)$ and the following is obtained:
$$\bigl|e(k+1)\bigr|\le\varepsilon.\qquad(15)$$
Because of the approximation property of the conventional RBFN, $e(k+1)$ is bounded and can also be made arbitrarily small. Thus the convergence of the improved RBFN is proved.

2.3. Outliers Detection Using Hidden Markov Model

The real value of the $i$th output at time $k$ is denoted by $y_i^{*}(k)$. From the approximation property of the RBFN, the prediction $\hat y_i(k)$ should be close to its true value $y_i^{*}(k)$. For the $i$th output, the following relationship holds:
$$y_i(k)=\hat y_i(k)+e_i(k),\qquad(16)$$
where $e_i(k)=y_i(k)-\hat y_i(k)$ is the estimate of the white noise, which obeys the Gaussian distribution $N(0,\sigma_i^{2})$. The probability density function of $e_i(k)$ is a Gaussian-type membership function and is given by [11, 12]
$$p_i(k)=\exp\left(-\frac{e_i(k)^{2}}{2\sigma_i^{2}}\right),\qquad(17)$$
where $0<p_i(k)\le1$. The higher the probability is, the closer $e_i(k)$ is to 0 (the closer $y_i(k)$ is to $\hat y_i(k)$). In other words, if $p_i(k)$ equals or is close to 0 at time $k$, $y_i(k)$ is considered to be a possible outlier.
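A minimal sketch of the observation probability in (17) is given below, assuming the residual standard deviation $\sigma_i$ has been estimated from data judged to be clean; the names are illustrative.

```python
import numpy as np

def observation_probability(y, y_hat, sigma):
    """Gaussian-type membership p_i(k) = exp(-e^2 / (2*sigma^2)),
    where e = y - y_hat is the prediction residual of output i."""
    e = y - y_hat
    return float(np.exp(-e ** 2 / (2.0 * sigma ** 2)))

# residuals near zero give p close to 1; large residuals give p close to 0,
# marking the sample as a possible outlier
```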

In order to make outliers detection accurate, an HMM is introduced. An HMM is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved states. One of the classic problems in HMMs is choosing the most likely state chain given the observation chain and the model; this problem can be solved by the Viterbi algorithm [16]. In this setting, the observation probabilities and the state transition probabilities are the only parameters, and $p_i(k)$ is well suited to serve as the observation probability. Hence, an HMM is introduced to detect whether the data are outliers or not. It is noted that the number of chains equals the dimension of the system output, one chain per output.

An HMM for outliers detection is characterized by the following [17]:
(1) Outliers detection is done separately for each system output.
(2) $p_i(k)$ represents the chain which has been observed. The hidden chain $s_k$ is the state of the observation.
(3) There are two states in the HMM: "1" denotes that the state is normal, and "0" means it is an outlier.
(4) The transition probability of the hidden chain is defined as follows:
$$a_{uv}=\frac{n_{uv}}{\sum_{v}n_{uv}},\qquad(18)$$
where $a_{uv}$ ($u,v=0$ or $1$) denotes the transition probability from state $u$ to state $v$ and $n_{uv}$ denotes the total number of transitions from $u$ to $v$.

To find the best detection result, the probability is defined as
$$\delta_k(s)=\max_{s_1,\ldots,s_{k-1}}P\bigl(s_1,\ldots,s_{k-1},s_k=s,\;O_1,\ldots,O_k\mid\lambda\bigr),\qquad(19)$$
where $O_1,\ldots,O_k$ is the sequence of observation probabilities calculated by (17); the element $s_k$ in the state sequence denotes the state at time $k$; and $\delta_k(s)$ is the best score (highest probability) for finding the best state sequence ending in state $s$. Since the outliers are detected along a time sequence and the data before time $k$ have already been detected, the conventional Viterbi algorithm can be simplified. Equation (19) is rewritten as
$$\delta_k(s)=\max_{s_{k-1}}\bigl[\delta_{k-1}(s_{k-1})\,a_{s_{k-1}s}\bigr]\,b_s(O_k).\qquad(20)$$

Because the states before time $k$ are already fixed, the state transition probability is used to express all the influence of the historical observation probabilities and states, and (20) can be rewritten as
$$\delta_k(s)=a_{s_{k-1}s}\,b_s(O_k).\qquad(21)$$

$p_i(k)$ obtained by (17) is the observation probability that the state is "1", that is, $b_1(O_k)=p_i(k)$ and $b_0(O_k)=1-p_i(k)$. Then the state of the data at time $k$ can be determined:

If $\delta_k(0)>\delta_k(1)$, then $s_k=0$, and $y_i(k)$ is an outlier.

Else $s_k=1$, and $y_i(k)$ is normal.

Then the data which are detected as outliers are removed and replaced by the mean of clean data nearby.
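The sequential two-state decision of (20)-(21) together with the replacement step can be sketched as follows. In the sketch, the transition probabilities are re-estimated from the states already decided, the observation probability of state "1" is $p_i(k)$ and that of state "0" is $1-p_i(k)$, and a detected outlier is imputed with the mean of the neighboring clean samples. This is only an illustrative sketch under these assumptions, not the exact implementation used in the paper.

```python
import numpy as np

def transition_probs(states):
    """Estimate a[s] = P(next state is '1' | current state s) from past decisions."""
    counts = {0: [0, 0], 1: [0, 0]}             # counts[s][j] = number of s -> j moves
    for s, s_next in zip(states[:-1], states[1:]):
        counts[s][s_next] += 1
    a = {}
    for s in (0, 1):
        total = sum(counts[s])
        a[s] = counts[s][1] / total if total else 0.5
    return a

def detect_and_impute(y, y_hat, sigma, window=3):
    """Sequentially flag outliers in one output channel and replace them."""
    y = np.asarray(y, dtype=float).copy()
    states = [1]                                # assume the first sample is clean
    for k in range(1, len(y)):
        p_obs = np.exp(-(y[k] - y_hat[k]) ** 2 / (2 * sigma ** 2))
        a = transition_probs(states)
        s_prev = states[-1]
        score_normal = a[s_prev] * p_obs            # delta_k(1)
        score_outlier = (1 - a[s_prev]) * (1 - p_obs)  # delta_k(0)
        if score_outlier > score_normal:            # flagged as outlier
            states.append(0)
            lo = max(0, k - window)
            clean = [y[j] for j in range(lo, k) if states[j] == 1]
            if clean:
                y[k] = np.mean(clean)               # impute with nearby clean data
        else:
            states.append(1)
    return y, states
```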

3. Dynamic PLS Modeling and Controller Design

3.1. Dynamic PLS Modeling

The conventional PLS consists of an outer model, which extracts the latent variables, and an inner model, which estimates the relation between the input and output scores. Consider an $N\times m$ input dataset $X$ and an $N\times p$ output dataset $Y$, where $N$, $m$, and $p$ denote the number of samples, the input dimension, and the output dimension, respectively. It is also assumed that the dataset has been processed by the outliers detection algorithm and scaled to zero mean and unit variance. The outer model is built by decomposing the matrices $X$ and $Y$ as follows:
$$X=\sum_{i=1}^{a}t_i p_i^{T}+E,\qquad Y=\sum_{i=1}^{a}u_i q_i^{T}+F,\qquad(22)$$
where $a$ is the number of latent variables; $p_i$ and $q_i$ are the $i$th loading vectors of the loading matrices $P$ and $Q$, respectively; $t_i$ and $u_i$ are the $i$th score vectors of the score matrices $T$ and $U$, respectively; and $E$ and $F$ are the residual matrices of $X$ and $Y$, respectively.

In the inner model, the score matrices $T$ and $U$ are related by a diagonal matrix $B$, which is obtained by the least squares (LS) method:
$$\hat u_i=b_i t_i,\qquad i=1,\ldots,a,\qquad(23)$$
where $b_i$ is the $i$th diagonal element of the inner relation matrix $B$.
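The outer decomposition and the diagonal inner relation can be computed with a standard NIPALS-style iteration. The sketch below assumes $X$ is $N\times m$, $Y$ is $N\times p$, and both are already autoscaled; it follows the textbook algorithm rather than any particular implementation in the paper.

```python
import numpy as np

def pls_outer_inner(X, Y, a, n_iter=100, tol=1e-10):
    """Return score/loading matrices T, P, U, Q and inner coefficients b (length a)."""
    X, Y = X.copy(), Y.copy()
    T, P, U, Q, b = [], [], [], [], []
    for _ in range(a):
        u = Y[:, 0]
        for _ in range(n_iter):
            w = X.T @ u; w /= np.linalg.norm(w)     # input weights
            t = X @ w                               # input score
            q = Y.T @ t; q /= np.linalg.norm(q)     # output loading
            u_new = Y @ q                           # output score
            if np.linalg.norm(u_new - u) < tol:
                u = u_new; break
            u = u_new
        p = X.T @ t / (t @ t)                       # input loading
        b_i = (u @ t) / (t @ t)                     # inner model: u_i ~ b_i * t_i
        X -= np.outer(t, p)                         # deflate X
        Y -= b_i * np.outer(t, q)                   # deflate Y
        T.append(t); P.append(p); U.append(u); Q.append(q); b.append(b_i)
    return (np.column_stack(T), np.column_stack(P),
            np.column_stack(U), np.column_stack(Q), np.array(b))
```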

It is noted that PLS has the effect of dimensionality reduction, since the number of latent variables $a$ is smaller than the original input dimension $m$ in most cases. Moreover, nonsquare and ill-conditioned system control problems can be dealt with, due to the fact that PLS decomposes the multivariate regression problem into a series of univariate regression problems [2].

The major drawback of PLS is that it is not suitable for dynamic systems. Researchers have proposed many different dynamic PLS models by embedding structures such as time series terms or dynamic filters into the PLS structure [4, 18]. In this paper, an autoregressive exogenous (ARX) model is integrated into the inner model of PLS [19] to represent the dynamic character of the process. It can be expressed as follows:
$$\hat u_i(k)=G_i(z^{-1})\,t_i(k)=\frac{B_i(z^{-1})}{A_i(z^{-1})}\,t_i(k),\qquad i=1,\ldots,a,\qquad(24)$$
where $G_i(z^{-1})$ denotes the ARX model of the $i$th subsystem in the latent variable space. With (24), (23) is transformed to
$$\hat u_i(k)=G_i(z^{-1})\,t_i(k),\qquad i=1,\ldots,a.\qquad(25)$$
An LS algorithm is used to identify the parameters of the ARX models in the latent space.
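The ARX inner model of each latent subsystem can be identified by ordinary least squares on the score sequences. The sketch below assumes a regressor built from $n_a$ past output scores and $n_b$ past input scores; the orders and names are illustrative, not prescribed by the paper.

```python
import numpy as np

def fit_latent_arx(t_score, u_score, n_a=2, n_b=2):
    """LS fit of u(k) = -a1*u(k-1)-...-a_na*u(k-n_a) + b1*t(k-1)+...+b_nb*t(k-n_b)."""
    n0 = max(n_a, n_b)
    rows, targets = [], []
    for k in range(n0, len(u_score)):
        row = [-u_score[k - i] for i in range(1, n_a + 1)] + \
              [t_score[k - j] for j in range(1, n_b + 1)]
        rows.append(row)
        targets.append(u_score[k])
    Phi, y = np.array(rows), np.array(targets)
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # least squares estimate
    return theta[:n_a], theta[n_a:]                   # (a coefficients, b coefficients)
```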

3.2. Dynamic PLS Framework for Controller Design

The controller design in the latent variable space proposed by Kaspar and Ray [4] is shown in Figure 2. The figure contains the scaling matrices of the inputs and outputs; the transfer function matrices of the controller in the latent variable space, the controlled plant, and the disturbances; and the loading matrices $P$ and $Q$ of the input data and output data together with the appropriate (generalized) inverse of $Q$. Since the MIMO system is decomposed into multiple SISO subsystems in the latent variable space, the original MIMO control problem is decomposed into multiple independent SISO control problems. A variety of conventional control schemes can be applied in this control framework [4]. In this paper, GPC is employed.

3.3. Improved GPC Controller Design in Dynamic PLS Framework

GPC is one of the model predictive control methods. It has been successfully implemented in many industrial and academic applications, showing good performance and a certain degree of robustness [20]. A GPC controller in dynamic PLS was proposed by Chi et al. [2]. This method has some particular advantages, such as a simple controller design procedure and low computing time. The predictive model used in GPC is the controlled autoregressive integrated moving average (CARIMA) model:
$$A(z^{-1})\,y(k)=B(z^{-1})\,u(k-1)+\frac{C(z^{-1})}{\Delta}\,\xi(k),\qquad(26)$$
where $A(z^{-1})$, $B(z^{-1})$, and $C(z^{-1})$ are polynomials in the backward shift operator $z^{-1}$, $\xi(k)$ is white noise, and $\Delta=1-z^{-1}$.

As demonstrated by Laurí et al. [21], a filtered ARX model is equivalent to the CARIMA model for both the SISO case and the MIMO case. Hence, the ARX models obtained in the latent variable space (Section 3.1) can be used for GPC. Replacing the controller in Figure 2, the GPC scheme in the dynamic PLS framework is illustrated in Figure 3. The setpoints in the original space are transferred into the latent variable space; then the output scores are subtracted from them to form the inputs of the GPC controllers. The outputs of the GPC controllers are transferred back to the original space as the inputs of the real process.

The cost function for the $i$th GPC controller is
$$J_i=\sum_{j=1}^{N_p}\gamma_j\bigl[\hat u_i(k+j)-r_i(k+j)\bigr]^{2}+\sum_{j=1}^{N_c}\lambda_j\bigl[\Delta t_i(k+j-1)\bigr]^{2},\qquad(27)$$
where $\hat u_i(k+j)$ is the predicted output score $j$ steps ahead, $r_i$ is the setpoint score, and $N_c$ and $N_p$ are the control horizon and predictive horizon, respectively. To keep the discussion simple, the same $N_p$ and $N_c$ are used for all latent subsystems in the following discussion. $\gamma_j$ and $\lambda_j$ are weighting sequences. $\gamma_j$ is usually a constant weighting sequence that weights the future behavior of the controllers; it is chosen based on the system features and the experience of the engineers, while in this paper it is dynamically tuned based on the residual between the setpoints and the outputs of the system. To denote the time-varying weighting sequence, it is rewritten as $\gamma_j(k)$. Then, (27) is rewritten as
$$J_i=\sum_{j=1}^{N_p}\gamma_j(k)\bigl[\hat u_i(k+j)-r_i(k+j)\bigr]^{2}+\sum_{j=1}^{N_c}\lambda_j\bigl[\Delta t_i(k+j-1)\bigr]^{2}.\qquad(28)$$
$\gamma_j(k)$ is calculated by the following equation:
$$\gamma_j(k)=\gamma_0+\alpha\sum_{l\neq i}\bigl|r_l(k+j)-\hat u_l(k+j)\bigr|,\qquad(29)$$
where $\gamma_0$ is a threshold and $\alpha$ is a weight coefficient, and $\gamma_j(k)=\gamma_0$ when the residuals of the other outputs are zero.
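One possible realization of the time-varying weight of (29), consistent with the reconstruction above, is sketched below: each latent output's weight grows with the current setpoint residuals of the other outputs and falls back to the threshold $\gamma_0$ when those residuals vanish. The exact form used by the authors may differ, so this is an assumed illustration.

```python
import numpy as np

def tuned_weights(residuals, gamma0, alpha):
    """residuals[l] = r_l - u_hat_l for each latent output l.
    The weight of output i is raised by the residuals of the other outputs."""
    residuals = np.abs(np.asarray(residuals, dtype=float))
    total = residuals.sum()
    # gamma_i = gamma0 + alpha * (sum of the other outputs' residuals);
    # it equals gamma0 when all the other residuals are zero
    return gamma0 + alpha * (total - residuals)
```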

$\gamma_j(k)$ affects the weight on the residual between setpoints and predicted outputs. In the case of incomplete decoupling caused by model mismatch, this method can restrain the interaction in the presence of partial coupling. The principle of interaction rejection with $\gamma_j(k)$ is as follows. When the residual between the $i$th output and its setpoint increases, the weights are adjusted by (29): the adjustment of the outputs other than the $i$th output is accelerated because their weights are increased according to (29), so their adjustment time and overshoot are reduced. In this way, the interaction of the $i$th output on the other outputs is reduced.

Take a two-input two-output latent variable model as an example. When a deviation between the first output $\hat u_1$ and its setpoint $r_1$ is produced, the weight of the second output is increased. Conversely, when the deviation of the first output is reduced to zero, the weight is reduced back to the threshold $\gamma_0$. The weight is adjusted with the deviation as shown in Figure 4.

According to the prediction scheme of GPC, $\gamma_j(k)$ in (29) consists of the predicted residuals of the other outputs. The cost function (28) is therefore a multivariate minimization problem: the latent variables influence each other, which reflects the interactions among variables due to plant/model mismatch. There are $a$ cost functions for the GPC controllers in the latent space; this series of minimization problems $\min J_i$ ($i=1,\ldots,a$) is equal to the following minimization problem:
$$\min_{\Delta t_1,\ldots,\Delta t_a}\;J=\sum_{i=1}^{a}J_i.\qquad(30)$$
And the total cost function is
$$J=\sum_{i=1}^{a}\Bigl\{\sum_{j=1}^{N_p}\gamma_j(k)\bigl[\hat u_i(k+j)-r_i(k+j)\bigr]^{2}+\sum_{j=1}^{N_c}\lambda_j\bigl[\Delta t_i(k+j-1)\bigr]^{2}\Bigr\}.\qquad(31)$$
Equation (31) is rewritten in matrix form as
$$J=(\hat U-R)^{T}\Gamma(\hat U-R)+\Delta T^{T}\Lambda\,\Delta T,\qquad(32)$$
where $\hat U$, $R$, and $\Delta T$ are the stacked vectors of the predicted output scores, the setpoint scores, and the future control increments, and $\Gamma$ and $\Lambda$ are the corresponding weight matrices.

Let $e_l=r_l-\hat u_l$ denote the residual vector of the $l$th latent output over the prediction horizon; then (29) is rewritten in vector form as
$$\gamma^{(i)}(k)=\gamma_0\mathbf{1}+\alpha\sum_{l\neq i}\bigl|e_l(k)\bigr|,\qquad(33)$$
so that the $i$th output weight vector is $\gamma^{(i)}(k)$.

Stacking the weight vectors of all latent outputs, the weight matrix has the form
$$\Gamma(k)=\operatorname{diag}\bigl(\gamma^{(1)}(k),\gamma^{(2)}(k),\ldots,\gamma^{(a)}(k)\bigr).\qquad(34)$$
Replacing the weight factor in (32) with (34), the cost function of the improved MIMO GPC in the latent variable space is written as follows:
$$\min_{\Delta T}\;J=(\hat U-R)^{T}\Gamma(k)(\hat U-R)+\Delta T^{T}\Lambda\,\Delta T.\qquad(35)$$
Equation (35) is a nonlinear optimization problem because $\Gamma(k)$ itself depends on the predicted residuals; the interior point method [22] is a typical method to solve this kind of problem.

From the discussion above, the improved GPC has four types of parameters (predictive horizon $N_p$, control horizon $N_c$, weight threshold $\gamma_0$, and weight coefficient $\alpha$) which influence the control result, and they should be chosen properly. At each step of the control period, the following procedure is carried out to update the control increments (a sketch is given after this list):
(1) Get the prediction of the outputs.
(2) Solve (35) by the interior point method.
(3) Send the first element of each optimal increment sequence to the corresponding score vector, that is, $t_i(k)=t_i(k-1)+\Delta t_i(k)$, $i=1,\ldots,a$.
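The following sketch shows one control step of the latent-space MIMO GPC described above. A generic nonlinear solver (SciPy's minimize) stands in for the interior point routine, and predict_scores is a placeholder for the latent ARX predictor; both are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def gpc_step(predict_scores, r_future, t_prev, n_c, gamma_fn, lam, n_lat):
    """One control step of the latent-space MIMO GPC (schematic).

    predict_scores(dT) -> array (n_p x n_lat): predicted output scores for the
        stacked future control increments dT (shape n_c x n_lat); placeholder
        for the latent ARX predictor.
    r_future : array (n_p x n_lat) of future setpoint scores.
    gamma_fn : callable mapping current residuals to the tuned weights of (29).
    """
    def cost(x):
        dT = x.reshape(n_c, n_lat)
        u_hat = predict_scores(dT)                 # predicted output scores
        gamma = gamma_fn(r_future[0] - u_hat[0])   # time-varying weights
        tracking = np.sum(gamma * (u_hat - r_future) ** 2)
        effort = lam * np.sum(dT ** 2)
        return tracking + effort

    x0 = np.zeros(n_c * n_lat)
    res = minimize(cost, x0, method="SLSQP")       # stands in for interior point
    dT_opt = res.x.reshape(n_c, n_lat)
    return t_prev + dT_opt[0]                      # apply only the first move
```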

4. Examples

In this section, two examples are presented to demonstrate the advantages of the proposed modeling and control strategy.

4.1. Example 1

The first example is an industrial-scale polymerization reactor control problem proposed by Chien et al. [23]. The basic control strategy is similar to the terpolymerization reactor example of Ogunnaike et al. [24]. The advanced product quality control loops are used to track the setpoints of the reactor condition control loops. The process model for the reactor setpoint control loops is the two-input two-output transfer function matrix with one disturbance input given in [23].

The controlled variables are the two measurements representing the reactor condition; the manipulated variables are the setpoints of the two reactor feed flow loops; the load disturbance is the purge flow of the reactor.

In order to build the dynamic PLS model, two random step signals are applied to excite the system, and random signals are added to both disturbance inputs separately in order to simulate the disturbance influence. The generated datasets are plotted in Figure 5.

In order to demonstrate the performance of the outliers detection method, a number of outliers are added to the outputs of the modeling dataset, and several performance criteria are used to evaluate the detection performance.

In this example, the number of latent variables is 2. An outlier is defined as a sample 3 times larger than the true value, simulating a failure in which some samples are measured abnormally. Outliers are randomly added to the outputs at different rates. Data with more than 50% outliers are considered invalid because the useful information they carry is not sufficient. Since the positions of the outliers are random, the results of different simulations may differ; for example, the result with 100 discontinuous outliers is better than with continuous ones, because a discontinuous outlier can be detected and imputed easily based on the adjacent true points.

For each rate of outliers, three performance criteria are calculated to evaluate the performance of the outliers detection method. Let $N_s$, $N_d$, $N_e$, and $N_m$ denote the number of samples, the number of detected outliers, the number of error detections (normal samples detected as outliers), and the number of missing detections (outliers that are not detected), respectively, and let $N_o$ denote the number of actual outliers. The criteria are described as follows (a computation sketch is given after this list):
(1) Accurate rate is the ratio of the actual outliers among the detections ($N_d-N_e$) to the actual outliers ($N_o$):
$$R_a=\frac{N_d-N_e}{N_o}.$$
(2) Error detection means that a true value is considered an outlier. Error rate is the ratio of the number of error detections ($N_e$) to the number of actual true samples ($N_s-N_o$):
$$R_e=\frac{N_e}{N_s-N_o}.$$
(3) Missing detection means that outliers are considered true samples. Missing rate is the ratio of the number of missing detections ($N_m$) to the number of actual outliers ($N_o$):
$$R_m=\frac{N_m}{N_o}.$$
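The three criteria can be computed directly from boolean masks of the true and detected outlier positions, as in the following sketch with illustrative names.

```python
import numpy as np

def detection_rates(is_outlier_true, is_outlier_detected):
    """Accurate, error, and missing rates from boolean masks over all samples."""
    t = np.asarray(is_outlier_true, dtype=bool)
    d = np.asarray(is_outlier_detected, dtype=bool)
    n_outliers = t.sum()
    n_normal = (~t).sum()
    accurate_rate = (t & d).sum() / n_outliers    # correctly detected outliers
    error_rate = (~t & d).sum() / n_normal        # clean samples flagged as outliers
    missing_rate = (t & ~d).sum() / n_outliers    # outliers left undetected
    return accurate_rate, error_rate, missing_rate
```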

The simulation results are shown in Table 1. It is clear that the fewer outliers the data contain, the better the accurate rate obtained. It is noted that the results in Table 1 may differ slightly between runs because of the randomness of the outliers.

The detection results for 10% outliers are shown in Figures 6(b) and 6(d). As observed in Figure 6 and Table 1, most outliers are detected correctly when the rate of outliers is small. When the outlier rate increases, the probability that the samples near an outlier are also outliers becomes larger; that is, the useful information nearby is missing, and the RBFN does not get enough information to predict the next output correctly. The samples between 600 and 750 are all detected as true values because the values of these samples are very small and the outliers are submerged in the noise.

The data with 10% outliers are used for further dynamic PLS modeling. The samples that have been detected as outliers are replaced with the corresponding clean samples next to them. In order to evaluate the robustness of dynamic PLS modeling with outliers detection, a comparison criterion is applied [25]: the angle between the loading vectors of the method under test and those obtained from clean data, defined as $\theta_i=\operatorname{angle}(p_i^{0},p_i)$, where $\operatorname{angle}(\cdot,\cdot)$ is the function that calculates the angle between two loading vectors, $p_i^{0}$ denotes the loading vector derived from the data without outliers, and $p_i$ denotes the loading vector of the modeling method with outliers detection or of the conventional modeling method.

The simulation results are shown in Table 2. From Table 2, it is clear that the improved method gives smaller angles than the conventional method. A model validation test of the two methods is performed and shown in Figure 7; another 500 samples without outliers are used for this test. The improved dynamic PLS with outliers detection and the conventional method are denoted ODPLS and DPLS for short. The integral of squared error (ISE) is applied to evaluate the deviation between the real data and the modeling result and is described as
$$\mathrm{ISE}_i=\sum_{k}\bigl[y_i(k)-\hat y_i(k)\bigr]^{2},$$
where $y_i(k)$ denotes the real data, $\hat y_i(k)$ denotes the modeling result, and $i$ denotes the output number.
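A direct computation of the ISE criterion per output channel, with illustrative names, is as follows.

```python
import numpy as np

def ise(y_real, y_model):
    """Integral of squared error per output channel (columns are outputs)."""
    y_real, y_model = np.asarray(y_real), np.asarray(y_model)
    return np.sum((y_real - y_model) ** 2, axis=0)
```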

In Figure 7, the ISEs of ODPLS for the two outputs are smaller than those of DPLS. It is clear that dynamic PLS modeling with outliers detection is more robust than the conventional method.

In addition, SISO GPC (SGPC) controllers and the improved MIMO GPC (MGPC) controller in the latent variable space are implemented with the two modeling methods. The results are shown in Figures 8 and 9. The predictive horizon $N_p$ and control horizon $N_c$ take the same values in the four simulations. It is noted that SGPC and MGPC with the DPLS model become unstable when $N_p$ and $N_c$ are larger, because of the mismatch between the DPLS model and the real process caused by the outliers. It is observed that SGPC and MGPC with ODPLS obtain better control performance than with DPLS in terms of rise time and ISE, because ODPLS partly removes the outliers and makes the model more accurate than DPLS. With the same model (DPLS or ODPLS), it can also be observed that MGPC obtains better results than SGPC in terms of rise time and ISE. In summary, MGPC with ODPLS has the least interaction between the two outputs among the four simulations.

4.2. Example 2

The methanol-water distillation column model reported by Wood and Berry [26] is a typical 2-by-2 process with strong interaction. The reflux and the reboiler steam flow rates are the process inputs, and the compositions of the top and bottom products are the process outputs. The continuous transfer function matrix is given as [6, 26]
$$G(s)=\begin{bmatrix}\dfrac{12.8e^{-s}}{16.7s+1} & \dfrac{-18.9e^{-3s}}{21s+1}\\[2mm]\dfrac{6.6e^{-7s}}{10.9s+1} & \dfrac{-19.4e^{-3s}}{14.4s+1}\end{bmatrix},$$
and the model is discretized with a fixed sample time (in minutes) for simulation.

In order to build the dynamic PLS model, two random step signals of 1000 samples are applied to excite the system and generate the outputs. White noise with an SNR (signal-to-noise ratio) of 10 dB is added to both outputs to simulate the actual measurements. The generated dataset is plotted in Figure 10.

In this example, the number of latent variables is 2. An outlier is defined as a zero or very small value, simulating a measuring equipment failure in which some samples are recorded abnormally. Outlier rates of 10% and 25% are added. The effect of the RBFN parameters on the outliers detection result is shown in Tables 3 and 4. These parameters include the input and output lags ($n_u$, $n_y$) and the number of hidden node centers ($q$), where the hidden node centers of each RBFN input have the same number $q$. Comparing Tables 3 and 4, the RBFN parameters have more impact on the detection result at the 25% outlier rate than at 10%, because a high rate of outliers makes outliers detection more difficult and degrades the robustness of the RBFN. It is obvious from Table 4 that the RBFN with parameters (4, 4, 2) gives the better detection result. It is noted that the true detection results such as those in Tables 3 and 4 cannot be obtained for a real process; the parameters may instead be selected based on the ratio of ISE to the integral of the output (ISE/IO), as shown in Table 5, which indicates how properly the samples are fitted. Figure 11 shows the detection results for 10% outliers with RBFN parameters (4, 4, 2), which give the better fitting result and are used for the further control simulation.

Figure 12 shows the control results of MGPC with two different parameter sets, shown as the solid line and the dotted line, respectively. The interaction between the two outputs is clearly visible at times 250 and 750, where one output is disturbed by the setpoint change of the other. From Figure 12, the control result shown by the dotted line has less overshoot and a shorter convergence time. It can be concluded that the parameters $\gamma_0$ and $\alpha$ in (31) have a great influence on the control result, and a better control result can be obtained with appropriate parameters.

In a real process, there are disturbance signals acting under the control framework. In order to evaluate the disturbance rejection performance, the two setpoints are held at 0.1 and 0, respectively, and a 50% change is added to the disturbance input. The result is shown in Figure 13. MGPC is less affected by the disturbance than SGPC for both outputs, and the convergence time of MGPC is also shorter.

5. Conclusions

Since dynamic PLS modeling is sensitive to outliers in the data, a new RBFN structure combined with an HMM for outliers detection is proposed to deal with contaminated data. The improved RBFN takes past samples as the input of the network for dynamic system prediction. It has fewer hidden nodes, and its equivalence to the conventional RBFN and its convergence are proved. With the HMM, there is no need to preselect a threshold to detect outliers. With the data processed by this method, a more robust dynamic PLS model is obtained. Nevertheless, mismatch between the model and the real process remains for many reasons. The inner-model cost function in dynamic PLS is therefore treated as a special MIMO problem, and a GPC based on adaptive weights is used to tackle it. Two examples are used to demonstrate the performance in outlier detection, accurate modeling, and control.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61174114), the Research Fund for the Doctoral Program of Higher Education in China (20120101130016), and the Zhejiang Provincial Science and Technology Planning Projects of China (2014C31019).