#### Abstract

A robust online fault prediction method is presented for unknown nonlinear systems; it combines sliding autoregressive moving average (ARMA) modeling with online least squares support vector regression (LS-SVR) compensation. First, we design an online LS-SVR algorithm for nonlinear time series prediction. Based on this, a combined time series prediction method is developed for nonlinear system prediction. The sliding ARMA model is used to approximate the nonlinear time series, while the online LS-SVR compensates for the nonlinear modeling error and external disturbance. As a result, one-step-ahead prediction of the nonlinear time series is achieved, and it can be extended to *n*-step-ahead prediction. The result of the *n*-step-ahead prediction is then used to judge the fault with an abnormality estimation algorithm that uses only normal data from the system. Accordingly, online fault prediction is implemented with a reduced amount of computation. Finally, the proposed method is applied to fault prediction for an F-16 fighter whose model is unknown. The experimental results show that the method can predict faults of the nonlinear system both accurately and quickly.

#### 1. Introduction

Fault prediction deals with faults that have yet to happen, inferring them from the past and current states of a system, which helps avoid major disasters. With growing requirements on system reliability and security, fault prediction has attracted considerable attention in aircraft, weapons, and almost all industrial applications [1].

Fault prediction is regarded as a challenging task within fault diagnosis. So far little headway has been made, and only a few results are available. The main methods in this field fall into two categories: model-based methods and data-driven methods. Model-based methods require establishing an exact system model, filtering the measured data, and estimating the future state. The basic idea is to judge the fault according to the predicted output of the model and to make a decision about the future state of the system. Extended Kalman filtering (EKF) [2–4], strong tracking filtering [5, 6], and particle filtering [7, 8] are typical representatives of this category. For these methods to be highly effective, the system model must be known beforehand and must be accurate. Unfortunately, nonlinearity and uncertainty are widespread in system models, and for some complex nonlinear systems accurate models cannot be obtained. This can easily degrade the output estimate and cause either missed detections or false alarms [9].

Owing to increased automation, faster sampling rates, and advances in computing power, large amounts of data are available online. Consequently, researchers have paid much attention to data-driven methods [10]. Because these methods rely on sensor or historical data, they avoid dependence on a system model, and more and more effort is being devoted to them. At present, most data-driven fault prediction methods are associated with time series [11]. One common approach is to apply classical time series analysis and prediction theory to fault prediction. Ho and Xie [12] established an autoregressive integrated moving average (ARIMA) model to predict the time of the next fault occurrence. Li and Kang [13] forecasted the failure rate with an autoregressive moving average (ARMA) model. Zhao et al. [14] developed an ARMA model to forecast equipment faults. Methods based on classical time series prediction theory are mature and have achieved some success in practical applications, especially for linear systems. Nevertheless, these methods fit the data sequence with a linear statistical model, so in essence they are not suitable for predicting nonlinear systems. Moreover, they are affected by modeling error, parameter perturbation, and outside disturbance, so their robustness is unsatisfactory [15].

As another kind of data-driven method, intelligent fault prediction approaches such as neural networks, support vector regression (SVR), and other computational intelligence methods have attracted considerable research interest in the last decade. Neural network-based methods have drawn increasing attention because of their robust implementation, good learning performance, and excellent capacity for nonlinear mapping. Methods combining time series forecasting with neural networks have been studied and applied especially often in recent years, and a large number of publications on fault prediction have appeared [16–22]. However, the application of neural networks suffers from a number of drawbacks. One is that the network structure is difficult to determine and is usually chosen empirically: different people may design different networks and choose different training and testing parameters, and may therefore obtain different results on the same problem. Another significant drawback is that network performance is limited by the number and distribution of the chosen samples. In addition, a neural network is usually trained without any constraint conditions; if there are constraints on the inputs and/or outputs, it is very hard to train the network to meet them [9].

As a novel machine learning method, SVR, proposed by Vapnik on the basis of statistical learning theory, adopts the structural risk minimization principle instead of the empirical risk minimization principle used in neural networks, so it has a simpler structure and better generalization ability with small samples [23]. In addition, SVR offers a global optimum, efficient handling of high-dimensional nonlinear data, and other advantages. For these reasons, it has become an effective method for modeling time series and is widely applied in nonlinear time series prediction [24, 25]. In recent years, SVR has also been used successfully in fault prediction [26–30]. However, some problems arise in applying SVR. Training an SVR requires solving a convex quadratic programming problem. Although the solution obtained is optimal, the complexity of the algorithm depends on the number of samples: the more sample data there is, the slower the computation and the more memory it takes.

Least squares support vector regression (LS-SVR) is a development of standard SVR. It converts the convex quadratic programming problem into a system of linear equations [31, 32], which greatly reduces the computational complexity. In other words, LS-SVR is computationally more efficient than standard SVR. Nevertheless, LS-SVR is more sensitive to outliers than standard SVR, which can make it less robust. For some complex nonlinear and nonstationary time series, it is hard to achieve the expected prediction result with LS-SVR alone. Furthermore, training a traditional LS-SVR is a batch learning process whose heavy, time-consuming computation hinders real-time fault prediction.

The purpose of this paper is to develop an approach for real-time fault prediction of unknown nonlinear systems. First, an online LS-SVR algorithm is derived for nonlinear time series prediction. On this basis, a combined online prediction method is developed for nonlinear system prediction. The sliding ARMA model is used to approximate the nonlinear time series, and the LS-SVR is added to compensate online for the nonlinear modeling error and external disturbance. The combined method is then used to predict the nonlinear system. Based on the forecast outputs, fault prediction is achieved using a density function estimation method that needs only normal data from the system.

The remainder of this paper is organized as follows. An online LS-SVR algorithm is designed for nonlinear time series prediction in Section 2. In Section 3, sliding ARMA modeling combined with online LS-SVR compensation is discussed and an effective prediction method is developed for nonlinear system prediction. In Section 4, an abnormality estimation method is used to predict the fault based on the prediction results of the system. In Section 5, a simulation example is provided to illustrate the implementation procedure and prediction performance of the proposed method. Finally, conclusions and future work are given in Section 6.

#### 2. Time Series Prediction Based on Online LS-SVR

Compared with standard SVR, LS-SVR greatly reduces the computational complexity, so it has drawn more attention in practical applications. However, traditional LS-SVR learns off-line, so its real-time performance is poor. In this section, we briefly introduce the basic concepts of LS-SVR [33] and design an online LS-SVR algorithm for nonlinear time series prediction.

##### 2.1. A Brief Introduction to LS-SVR

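Given a time series training set $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}$, a linear regression function with regard to $w$ and $b$ is constructed as

$$f(x) = w^{T}\varphi(x) + b, \tag{1}$$

where $w$ is a vector in a high-dimensional feature space $F$; it determines the margin of support vectors. $\varphi(\cdot)$ is a nonlinear function which maps $x$ to a vector in $F$. The $w$ and $b$ in formula (1) are obtained by solving an optimization problem:

$$\min_{w,\,b,\,e}\; J(w, e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^{2} \quad \text{s.t.}\quad y_i = w^{T}\varphi(x_i) + b + e_i,\; i = 1, \ldots, N. \tag{2}$$

In formula (2), the regularization parameter $\gamma$ determines the trade-off between fitting error and smoothness, and the error variable $e_i$ is used to construct a soft margin hyperplane. This optimization problem, including the constraints, can be solved with the Lagrange function

$$L(w, b, e, \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \left[ w^{T}\varphi(x_i) + b + e_i - y_i \right], \tag{3}$$

where $\alpha_i$ is the Lagrange coefficient. According to the Karush-Kuhn-Tucker (KKT) optimality conditions, the following system of linear equations is obtained by eliminating $w$ and $e$:

$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & K + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}, \tag{4}$$

where $y = [y_1, \ldots, y_N]^{T}$, $\mathbf{1} = [1, \ldots, 1]^{T}$, $\alpha = [\alpha_1, \ldots, \alpha_N]^{T}$, $K_{ij} = \varphi(x_i)^{T}\varphi(x_j) = k(x_i, x_j)$, $i, j = 1, \ldots, N$, and $k(\cdot, \cdot)$ is a kernel function. Different kernel functions give different mappings from the input space to the feature space, so the LS-SVR model changes with the kernel function. In most cases, the radial basis function (RBF) is employed as the kernel function because of its robustness.

Given the solution of formula (4), the LS-SVR model can be expressed as follows:

$$f(x) = \sum_{i=1}^{N} \alpha_i\, k(x, x_i) + b. \tag{5}$$

Formula (5) is a nonlinear function with regard to $x$. Normally, those samples with nonzero $\alpha_i$ are called support vectors of the regression function, because it is these critical samples in the training set that solely determine the formulation of (5).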
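As an illustration of batch LS-SVR training, the following minimal Python sketch builds the bordered linear system for the bias and the Lagrange coefficients and solves it with NumPy. The function names (`rbf_kernel`, `lssvr_fit`, `lssvr_predict`), the parameter names `gamma` (regularization) and `sigma` (RBF width), and the Gaussian form of the RBF kernel are assumptions made for this sketch; they are not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvr_fit(X, y, gamma=100.0, sigma=1.0):
    """Solve the LS-SVR linear system; returns bias b and coefficients alpha."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                       # first row: sum(alpha) = 0
    A[1:, 0] = 1.0                       # bias column
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvr_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Evaluate the kernel expansion sum_i alpha_i k(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Toy usage: fit one period of a sine wave (illustrative data only).
X = np.linspace(0.0, 2.0 * np.pi, 40).reshape(-1, 1)
y = np.sin(X).ravel()
b, alpha = lssvr_fit(X, y, gamma=1000.0, sigma=0.5)
y_hat = lssvr_predict(X, b, alpha, X, sigma=0.5)
max_err = float(np.max(np.abs(y_hat - y)))
```

Because training reduces to one dense linear solve, the cost grows roughly cubically with the number of samples, which is the motivation for the online variant discussed next.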

##### 2.2. The Design of Online Learning LS-SVR Algorithm

The LS-SVR training algorithm introduced in Section 2.1 is a batch algorithm. That is, whenever a new sample is added into the training set, the existing regression function can only be updated by retraining the whole training set, which is not an efficient way to implement our prediction algorithm. Fortunately, we have recently derived an incremental LS-SVR training algorithm [34] as follows.

At the moment , let the sample be . The learning sample sets can be represented as , where , , and , .

So the kernel function matrix , the to-be-computed Lagrange coefficient , and the constant bias term are all functions of . That is to say, at the moment , they can be denoted separately by , , , and , so the output of LS-SVR (8) is transformed to be

Let , where is a unit matrix, so formula (7) can be rewritten as follows:

Let , , ; then where

Substituting formula (8) into (7) yields

At the moment a new sample is added and the kernel matrix

One problem with the algorithm is that the longer the prediction goes on, the bigger the training set becomes and the more support vectors are involved in the LS-SVR regression function (6). In environments with limited memory and computational power, the growing complexity of the LS-SVR model (6) can exhaust the system resources. One way to deal with this problem is to adopt a “forgetting” strategy: the training set is capped at a maximum size, and once it reaches this maximum the oldest sample is removed from the LS-SVR model (6) before the next new sample is used to update the model.

To sum up, online learning LS-SVR is a rolling optimization process over time; the whole process is described as follows.

*Step 1. *Initialization, .

*Step 2. *Select new data, while removing the oldest ones.

*Step 3. *Compute kernel matrices and .

*Step 4. *Compute and and predict .

*Step 5. *Consider ; return to Step 2.
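The rolling procedure in Steps 1 to 5 can be sketched in Python as follows. This is a minimal illustration and not the incremental update of [34]: for simplicity it re-solves the LS-SVR linear system on every update instead of updating the inverse recursively. The class name `SlidingLSSVR`, the parameters `window`, `gamma`, and `sigma`, and the Gaussian RBF kernel are assumptions made for the sketch.

```python
import numpy as np
from collections import deque

class SlidingLSSVR:
    """LS-SVR over a fixed-size window; the oldest sample is forgotten first."""

    def __init__(self, window=50, gamma=100.0, sigma=1.0):
        self.X = deque(maxlen=window)    # a full deque drops its oldest item
        self.y = deque(maxlen=window)
        self.gamma, self.sigma = gamma, sigma
        self.b, self.alpha = 0.0, np.zeros(0)

    def _kernel(self, A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def update(self, x, y):
        """Steps 2-4: take in one sample, rebuild the kernel matrix, re-solve."""
        self.X.append(np.atleast_1d(np.asarray(x, dtype=float)))
        self.y.append(float(y))
        Xw = np.vstack(list(self.X))
        n = len(self.y)
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = A[1:, 0] = 1.0
        A[1:, 1:] = self._kernel(Xw, Xw) + np.eye(n) / self.gamma
        sol = np.linalg.solve(A, np.concatenate(([0.0], list(self.y))))
        self.b, self.alpha = sol[0], sol[1:]

    def predict(self, x):
        xq = np.atleast_2d(np.asarray(x, dtype=float))
        k = self._kernel(xq, np.vstack(list(self.X)))
        return float((k @ self.alpha)[0] + self.b)

# Toy usage: stream samples of a sine wave through a 30-sample window.
model = SlidingLSSVR(window=30, gamma=1000.0, sigma=0.5)
ts = np.arange(0.0, 10.0, 0.1)
for t in ts:
    model.update([t], np.sin(t))
window_len = len(model.y)
fit_err = abs(model.predict([ts[-1]]) - np.sin(ts[-1]))
```

Re-solving costs more per step than a recursive inverse update, but it keeps the sketch short and makes the forgetting behaviour easy to see: the model never holds more than `window` samples.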

##### 2.3. Online LS-SVR for Time Series Prediction

Let the dimension of the reconstruction space be equal to ; the corresponding input and output samples for online learning are then constructed as follows:

Along with the progression of the time , we can get the following prediction function through the continuous learning samples:

From formula (13), the one-step-ahead prediction value can be obtained at current time :

Considering moving horizon prediction, we take the prediction value as the known condition of the next step prediction and construct the next input value . So we can obtain the two-step-ahead prediction value according to formula (6).

By analogy, the *n*-step prediction is then achieved as follows:
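The moving-horizon idea, in which each predicted value is fed back as an input for the next step, can be sketched independently of the underlying predictor. In the Python sketch below, `recursive_forecast`, `one_step`, and the embedding length `m` are hypothetical names chosen for illustration; any one-step model (such as an online LS-SVR) could be passed in as `one_step`.

```python
import numpy as np

def recursive_forecast(one_step, history, m, n_steps):
    """Iterate a one-step predictor, feeding each prediction back as input."""
    window = list(history[-m:])            # the last m observed values
    preds = []
    for _ in range(n_steps):
        y_next = float(one_step(np.asarray(window)))
        preds.append(y_next)
        window = window[1:] + [y_next]     # drop the oldest, append the prediction
    return preds

# Toy predictor that extrapolates a straight line: y[t] = 2*y[t-1] - y[t-2].
linear_rule = lambda w: 2.0 * w[-1] - w[-2]
preds = recursive_forecast(linear_rule, [1.0, 2.0, 3.0], m=2, n_steps=3)
# preds == [4.0, 5.0, 6.0]
```

Since every step reuses earlier predictions, one-step errors accumulate as the horizon grows, so long horizons should be treated with caution.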

#### 3. Nonlinear Time Series Prediction Based on Sliding ARMA Combined with Online LS-SVR

##### 3.1. Review of Sliding Time Window ARMA

For the stationary time series, the ARMA model is defined as where , , and is called the order of the model ~ ARMA. In ARMA model, the current state can be represented using its past value , current white noise , and past white noise value in a linear regression form. The basic procedures of time series prediction utilizing ARMA can be described as follows.

###### 3.1.1. Data Processing

If the data exhibit clear nonstationary and periodic features, we should apply preprocessing such as differencing of a suitable order and subtracting the mean of the series. After this processing, the time series is much closer to stationary and better suited to the ARMA model.

###### 3.1.2. Autocorrelation Analysis

The autocorrelation of the existing time series is then analyzed. Here, the autocovariance function (ACVF) and the autocorrelation function (ACF) are used to verify the stationarity of the time series.
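As a small illustration of this check, the sketch below computes the sample ACF with NumPy and compares a random walk (nonstationary) with its first difference. The function name `sample_acf` and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation at lags 0..max_lag (biased ACVF estimator)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    c0 = np.dot(xc, xc) / n                      # lag-0 autocovariance
    return np.array([np.dot(xc[:n - k], xc[k:]) / n / c0
                     for k in range(max_lag + 1)])

# A random walk has a slowly decaying ACF; differencing removes that trend.
rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(500))
acf_walk = sample_acf(walk, 5)
acf_diff = sample_acf(np.diff(walk), 5)
```

An ACF that stays close to one over many lags, as for `walk`, suggests that differencing is needed before fitting an ARMA model, while an ACF that drops quickly, as for the differenced series, is consistent with stationarity.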

###### 3.1.3. Modeling and Parameter Estimation

The parameters are estimated according to the ACF and the partial ACF. In addition, the Bayesian information criterion (BIC) and Akaike’s information criterion (AIC) are usually employed to determine the minimum order of the ARMA model. A more detailed introduction to ARMA modeling can be found in [35].
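Order selection by an information criterion can be sketched as follows. To stay short, the sketch fits a pure AR(p) model by ordinary least squares, which is a simplification of full ARMA estimation; the names `fit_ar_ls` and `aic_ar` and the simulated AR(2) series are assumptions for illustration.

```python
import numpy as np

def fit_ar_ls(x, p):
    """Least-squares AR(p) fit; returns coefficients and residual variance."""
    x = np.asarray(x, dtype=float)
    # Column i-1 holds the lag-i values x[t-i] for the targets x[p:].
    Z = np.column_stack([x[p - i:len(x) - i] for i in range(1, p + 1)])
    target = x[p:]
    phi, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ phi
    return phi, float(resid @ resid) / len(target)

def aic_ar(x, p):
    """AIC up to a constant: n*log(sigma^2) + 2p, with n the usable samples."""
    _, s2 = fit_ar_ls(x, p)
    return (len(x) - p) * np.log(s2) + 2 * p

# Simulate a stationary AR(2) process and compare candidate orders.
rng = np.random.default_rng(1)
e = rng.standard_normal(400)
x = np.zeros(400)
for t in range(2, 400):
    x[t] = 1.5 * x[t - 1] - 0.75 * x[t - 2] + e[t]
phi, s2 = fit_ar_ls(x, 2)
best_p = min(range(1, 5), key=lambda p: aic_ar(x, p))
```

The criterion penalizes each extra coefficient, so an order is accepted only if it reduces the residual variance enough to justify the added complexity.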

###### 3.1.4. Time Series Prediction

According to the ARMA model defined by (16), we can obtain the following prediction function:

So the one-step-ahead prediction value at current time can be described as

Correspondingly, the two-step-ahead prediction value at current time can be obtained as

Substituting formula (18) into (19), formula (19) is then represented as

In a similar manner, the *n*-step-ahead prediction value can be derived as

As discussed above, the ARMA model can be employed to forecast the future *n*-step values of a time series. However, when the time series is updated in an online application, the ARMA model should vary with the data set. To ensure efficiency and reduce the computational burden, the model is built in a local neighborhood by means of a sliding time window.
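The recursive multi-step ARMA forecast can be sketched in Python as below. The sketch assumes the convention x_t = sum_i phi_i x_{t-i} + eps_t + sum_j theta_j eps_{t-j}, which may differ in sign from the convention used in (16), and it assumes the coefficients and past innovations are already available from a fit on the current window. The function name `arma_forecast` is hypothetical.

```python
import numpy as np

def arma_forecast(x, eps, phi, theta, n_steps):
    """n-step forecast; unknown future noise is replaced by its mean, zero."""
    xs = list(x)        # observed values, extended with forecasts
    es = list(eps)      # past innovations, extended with zeros
    out = []
    for _ in range(n_steps):
        ar = sum(phi[i] * xs[-1 - i] for i in range(len(phi)))
        ma = sum(theta[j] * es[-1 - j] for j in range(len(theta)))
        x_hat = ar + ma
        out.append(x_hat)
        xs.append(x_hat)    # the forecast stands in for the unseen value
        es.append(0.0)      # the expected future innovation is zero
    return out

# ARMA(1,1) toy example with phi_1 = 0.5 and theta_1 = 0.4.
forecasts = arma_forecast(x=[1.0, 2.0], eps=[0.2, 1.0],
                          phi=[0.5], theta=[0.4], n_steps=3)
# forecasts is approximately [1.4, 0.7, 0.35]
```

In a sliding-window setting, the coefficients would be re-estimated on the most recent window before each call, so the forecast always reflects the local behaviour of the series.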

##### 3.2. Nonlinear Prediction Using Sliding ARMA Revised by Online LS-SVR

The sliding ARMA above can be used to approximate a time series online and predict it. However, the ARMA model is essentially linear. When it is used to fit a nonlinear time series online, we obtain only a local linear model of the nonlinear system, and the unknown modeling errors caused by the nonlinearity and the outside disturbance of the system cannot be ignored on the whole. To compensate for the ARMA modeling errors and the unknown disturbance, the online LS-SVR model is adopted to revise the nonlinear prediction errors of the sliding ARMA. The flow of the combined online prediction algorithm is described in Figure 1.

The initial data are first preprocessed and used to train the ARMA model. In the local neighborhood, the sliding ARMA model is built for the future *n*-step prediction of the nonlinear time series. At the same time, the modeling errors of the sliding ARMA compose a new time series, and the online LS-SVR algorithm is trained for the *n*-step prediction of this new series. The final predicted value is obtained from the outputs of both models. More detailed steps are as follows.

*Step 1. *Data processing.

*Step 2. *Initial training of the ARMA model.

*Step 3. *At time , the new sample is added to online train the sliding ARMA model (in the neighborhood of ) and is then predicted.

*Step 4. *The modeling errors are added to online train the LS-SVR and is then predicted.

*Step 5. *Final predicted value .

*Step 6. *When data updates, repeat Step 1.
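The steps above can be sketched as a one-step-ahead loop in Python. This is a simplified illustration under several assumptions: a least-squares AR(p) model stands in for the sliding ARMA model, the LS-SVR is re-solved on each step rather than updated incrementally, and all names and default values (`combined_predict`, `W`, `p`, `m`, `gamma`, `sigma`) are hypothetical.

```python
import numpy as np

def ar_one_step(window, p):
    """Fit AR(p) by least squares on `window` and predict the next value."""
    w = np.asarray(window, dtype=float)
    Z = np.column_stack([w[p - i:len(w) - i] for i in range(1, p + 1)])
    phi, *_ = np.linalg.lstsq(Z, w[p:], rcond=None)
    return float(w[-1:-p - 1:-1] @ phi)     # most recent value pairs with lag 1

def lssvr_one_step(series, m, gamma, sigma):
    """Train LS-SVR on m-lag embeddings of `series`; predict its next value."""
    s = np.asarray(series, dtype=float)
    X = np.array([s[i:i + m] for i in range(len(s) - m)])
    y = s[m:]
    n = len(y)
    kern = lambda A, B: np.exp(
        -((A[:, None] - B[None]) ** 2).sum(-1) / (2.0 * sigma ** 2))
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = kern(X, X) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return float((kern(s[None, -m:], X) @ sol[1:])[0] + sol[0])

def combined_predict(y, W=40, p=2, m=3, gamma=100.0, sigma=1.0):
    """One-step predictions: sliding linear model plus residual correction."""
    y = np.asarray(y, dtype=float)
    ar_pred, resid, final = [], [], []
    for t in range(W, len(y)):
        a = ar_one_step(y[t - W:t], p)              # Step 3: sliding model
        hist = resid[-W:]                           # recent modeling errors
        c = (lssvr_one_step(hist, m, gamma, sigma)  # Step 4: error forecast
             if len(hist) > m + 1 else 0.0)
        ar_pred.append(a)
        final.append(a + c)                         # Step 5: combined output
        resid.append(y[t] - a)                      # error feeds the LS-SVR
    return np.array(ar_pred), np.array(final)

# Toy usage on a mildly nonlinear, lightly noisy series (illustrative data).
rng = np.random.default_rng(0)
t = np.arange(160)
y = np.sin(0.1 * t) + 0.3 * np.sin(0.1 * t) ** 3 + 0.01 * rng.standard_normal(160)
ar_pred, final = combined_predict(y)
```

The correction term is zero until enough modeling errors have accumulated to train the LS-SVR, after which the final prediction is the sum of the two model outputs.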

#### 4. Fault Prediction Based on Time Series Abnormality Estimation

For many complex engineering systems, fault data can generally be acquired only at the cost of equipment damage and economic loss, so it is impractical to obtain complete fault training data or prior knowledge [15]. Even if fault training data can be gained, the data gradually become obsolete because of the time variance and uncertainty of the actual system. Therefore, fault prediction using only data from normal conditions is of practical significance.

Based on the combined prediction algorithm proposed above, the -step-ahead prediction value at time , , , can be obtained. This result includes the future information corresponding to fault. So we can use it to predict the fault.

Let denote the normal state and denote abnormal or fault state. It is assumed that system is normal at the beginning and all of time series data , in the past are known. The probability density functions of the time series sample , can be estimated by [15]

For a given sample at the time , we want to judge whether or not. We construct a test set including all of the normal data up to now, namely, . For an arbitrary sample , let its log-likelihood be denoted by . Similarly, the log-likelihood of is denoted by . We test the hypothesis with the following rule: where is a preset threshold and . The threshold sets the confidence limit of the hypothesis test and influences the performance of the fault predictor. A larger threshold raises the missed alarm rate and lowers the false alarm rate; conversely, a smaller threshold lowers the missed alarm rate and raises the false alarm rate. The threshold can therefore be adjusted to obtain the predictor with the lowest possible error.
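One way such a test could look is sketched below. The decision rule is an assumption chosen to match the qualitative behaviour described above (a larger threshold yields fewer alarms); it is not necessarily the exact rule of [15]. The density is a Gaussian kernel estimate built from normal data only, and the names `log_density`, `is_fault`, `theta`, and the bandwidth `h` are hypothetical.

```python
import numpy as np

def log_density(x, data, h):
    """Log of a Gaussian kernel density estimate built from 1-D `data`."""
    z = (np.asarray(x, dtype=float)[..., None] - data) / h
    dens = np.exp(-0.5 * z ** 2).sum(axis=-1) / (len(data) * h * np.sqrt(2 * np.pi))
    return np.log(dens + 1e-300)         # small offset avoids log(0)

def is_fault(x_new, normal, theta=0.95, h=0.3):
    """Flag x_new if more than a fraction theta of normal samples are likelier."""
    normal = np.asarray(normal, dtype=float)
    ll_normal = log_density(normal, normal, h)   # log-likelihoods of normal data
    ll_new = log_density(x_new, normal, h)       # log-likelihood of the new sample
    frac_more_likely = float(np.mean(ll_normal > ll_new))
    return frac_more_likely > theta

# Toy usage: normal operation scattered around 0 (illustrative data only).
rng = np.random.default_rng(0)
normal = rng.standard_normal(300)
typical_flag = is_fault(0.0, normal)
outlier_flag = is_fault(6.0, normal)
```

With this rule, raising `theta` makes the test more conservative, which reduces false alarms at the price of more missed alarms.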

#### 5. Experiments and Results

In order to demonstrate the potential of the proposed method in engineering practice, a practical case study of the US-made F-16 fighter from [36] is provided in this section. Suppose that the fighter initially flies straight at a height of 500 m with a speed of Mach 0.45. The original trimming angle of attack ; the elevator deflection angle ; the composite linear velocity m/s and the sampling period is ms. In this paper, we consider only faults of the fighter’s structure. According to the features of such faults, the pitching angular velocity is chosen as the criterion for checking the fault condition.

Suppose that left elevator is locked into at 1.5 s. Using the method presented in this paper, the structure of simulation process is shown in Figure 2.

At the th sampling time point, a sliding ARMA model can be obtained using the approach in Section 3. According to the error between the actual system output and the ARMA model output in (16), the online LS-SVR algorithm can be designed to revise the nonlinear prediction errors of the sliding ARMA. To simplify the calculating process, the order of ARMA model is set as 2. The other parameters selected are the embedding dimension and the parameters of LS-SVR with RBF kernel , , .

Figure 3 shows the one-step prediction results and the prediction errors when the system has no external disturbance. The solid line is the actual output of the system and the dashed line is the output predicted by our approach. According to Figure 3, the future dynamics of the system can be predicted accurately.

**Figure 3:** (a) The predicted output and the actual output of the system; (b) the prediction error.

To further demonstrate the effectiveness of the proposed method, external disturbance of the system is then considered. In Figure 4, the prediction errors of our approach are compared with the results obtained with online LS-SVR only. The solid line represents the errors without online LS-SVR, while the dashed line represents the error of our approach. Figure 4(a) shows the result when the variance of the external disturbance is 0.02, and Figure 4(b) shows the result when the variance is 0.1. As the variance of the external disturbance increases, the errors obtained using only online LS-SVR grow rapidly, while the prediction errors of our approach remain almost unchanged. Evidently, the proposed approach can compensate for the external disturbance effectively.

**Figure 4:** Comparison of the prediction error when the disturbance variance is (a) 0.02 and (b) 0.1.

Finally, the fault judgment result obtained through the hypothesis test is shown in Figure 5, when . The result shows that the fault is predicted at 1.5125 s when the variance of the external disturbance is 0.1, which means that the fault can be predicted within one sampling period. The aircraft has a flexible element in its input channels to avoid abrupt changes of control. Generally, it can be regarded as an inertial element; the time needed to apply a control completely to the input side of the system is 0.05 s. By the same argument, the time for the fault signal to reach the input side of the system is also taken as 0.05 s. This means that the interval from when the system deviates from the normal condition until the fault has happened is 0.05 s, namely, four sampling periods. Under such circumstances, the proposed approach can predict the fault three sampling periods in advance. So the approach accomplishes online fault forecasting successfully. It is worth mentioning that the fault can be predicted using only data from normal conditions.
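The timing argument can be checked with a short calculation. The sampling period used below is inferred from the statement that 0.05 s corresponds to four sampling periods; the symbols $T_s$, $t_{\text{fault}}$, and $t_{\text{detect}}$ are introduced here only for illustration.

```latex
\begin{aligned}
T_s &= \frac{0.05\ \text{s}}{4} = 0.0125\ \text{s}, \\
t_{\text{detect}} - t_{\text{fault}} &= 1.5125\ \text{s} - 1.5\ \text{s} = 0.0125\ \text{s} = 1 \cdot T_s, \\
\text{lead time} &= (4 - 1)\, T_s = 3 \times 0.0125\ \text{s} = 0.0375\ \text{s}.
\end{aligned}
```

The fault is therefore flagged one sampling period after the elevator locks, which leaves three sampling periods before the fault has fully developed.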

#### 6. Conclusion

This paper presents a robust online fault prediction approach for nonlinear systems with unknown models. The sliding ARMA model is used to approximate the nonlinear system, and the online LS-SVR is designed to compensate for the nonlinear modeling error and external disturbance. The prediction results from the two models are used to judge the fault based on a density function estimation method. The proposed method can predict faults of a nonlinear system accurately using only normal data from the system. It learns and predicts with a small amount of computation while the system is running, so it is believed to have good real-time capability and applicability. The simulation results on the F-16 fighter show the efficiency of the proposed method.

The purpose of the paper is to supply an effective approach for online fault prediction of actual systems with complex nonlinearity. As with other approaches, it is impossible for the proposed approach to succeed in all scenarios. However, it can at least provide an alternative and complementary solution to some problems in which other available techniques may fail. In the future, we should further improve the speed of our method by optimizing how the sliding ARMA model is updated. In addition, parameter selection and the interaction between the two models will be subjects of further research.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

This work is supported by Innovation Program of Shanghai Municipal Education Commission under Grant no. 12YZ156, the Fund of SUES under Grant no. 2012gp45, Shanghai Municipal Natural Science Foundation under Grant no. 12ZR1412200, and National Natural Science Foundation of China under Grant no. 61271114.