Abstract

The prediction of the aero-engine performance parameters is very important for aero-engine condition monitoring and fault diagnosis. In this paper, the chaotic phase space of engine exhaust temperature (EGT) time series which come from actual air-borne ACARS data is reconstructed through selecting some suitable nearby points. The partial least square (PLS) based on the cubic spline function or the kernel function transformation is adopted to obtain chaotic predictive function of EGT series. The experiment results indicate that the proposed PLS chaotic prediction algorithm based on biweight kernel function transformation has significant advantage in overcoming multicollinearity of the independent variables and solve the stability of regression model. Our predictive NMSE is 16.5 percent less than that of the traditional linear least squares (OLS) method and 10.38 percent less than that of the linear PLS approach. At the same time, the forecast error is less than that of nonlinear PLS algorithm through bootstrap test screening.

1. Introduction

Aeroengine condition monitoring is applied as a better way to estimate aero-engine condition and reliability amongst most of the commercial airlines [1, 2]. Predicting and detecting abnormal behavior of main aeroengine performance parameters is of great importance in the aeroengine condition monitoring. With the development of modern industry, one typical aeroengine becomes a more and more integrative and complex system, of which working condition and fault are very difficult to predict and judge, and the influence and harm caused by system faults are also more serious than ever. Therefore, there will be a growing demand for effective prediction methods of long-term trend of aeroengine system to predict abnormal behavior of significant performance parameters and potential faults. Then we can take preventive maintenance measure to avoid faults occurrence, shorten aircraft down time, thereby reducing maintenance cost and improving aeroengine life.

The aeroengine system is a complex system with nonlinear and dynamic chaos action, so it is difficult to establish an exact analytic mathematical model of performance parameters. As a result, analysis usually depends on time series obtained from observation. But in practical monitoring the availability of obtaining characteristic data is very limited, and one fault usually shows various changes in performance parameters, and a same change may result from different faults [3, 4]. However, extracting multiple fault characteristic of parameters usually brings complexity and redundancy, which may bring difficulty for the following prediction. The theory of chaotic prediction is a mathematics method to deal with inexactness, uncertainty, and incompletion, which is remarkable efficiency in aspects of data reduction and feature extraction.

Studies on chaos prediction can be traced back to Parker and Chua (1989) [5]. More sophisticated approaches of nonlinear chaotic time series prediction mainly rely on neural network [6], support vector theory [7], fuzzy rules [8], a simple least square algorithm [9]. In recent years, considerable interest has been aroused in the study of nonlinear chaos prediction with the actual observation of aeroengine dynamic system [10, 11]. These papers summarize the prediction model of nonlinear chaos system based on ordinary least squares (OLS); however, there are some practical problems in those models such as serious multi-collinearity and poor robustness of regression function.

Partial least squares (PLS) can overcome information overlaps caused by multi-collinearity since it adopts canonical correlation analysis, which improves accuracy of nonlinear system modeling [12]. What is more, PLS adopts cross validity method, which rejects redundancy principal component and enhances regression equations robustness. Hence, in our investigation the designed chaotic algorithm is based on the nonlinear partial least squares (NPLS) regression and the prediction research on the time series.

2. Data Pretreatment

Typical aeroengine performance parameters are exhaust gas temperature (EGT), low-spool rotor speed (NI), high-spool rotor speeds (N2), and fuel flow (WF) [13]. These four measurements are often called the four basic parameters, which can be recorded in the air-borne equipment on most new and old aeroengines. Note that we take the most important performance parameter, EGT, as example to analyze in this paper; the other parameters can be fitted in the same method.

2.1. Similarly Correcting EGT Sequence

According to operating principle of the aeroengine, the data of main performance parameters of the same aeroengine in different external flight conditions such as the external atmosphere total temperature, total pressure, MACH numbers are quite different, so the original engine exhaust temperature EGT data usually cannot be directly used for comparison analysis. To solve this problem, similar theory is adopted to eliminate the influence of external conditions on the aeroengine EGT sequence [14]. EGTcor=EGTobs𝜃𝑇2,(2.1) where EGTobs is the original EGT data, EGTcor is the corrected value of EGTobs,𝜃𝑇2=(TAT+273.15)/288.15, TAT is the total air temperature.

2.2. Eliminating Outliers According to Layida Criterion

If outliers exist in the data, they will seriously affect the prediction accuracy of an algorithm. Layida criterion is the most commonly used for discrimination outliers, of which the precondition is that there is sufficient data [15]. For time series {𝑋𝑖}(𝑖=1,,𝑛), if 𝑛 is large enough (normally 𝑛>80), and only random disturbance exists in all data. Assume that the random time series is following a normal distribution whose mean is μ and deviation is 𝜎2, so the probability that the deviation (𝑋𝑖𝜇) falls in ±3𝜎 ranges is about 0.3%. According to Layida criterion, if the absolute deviation of a time series data 𝑋𝑖(𝑖=1,,𝑛) is larger than 3𝜎, then 𝑋𝑖 can be judged as an outlier and should be removed.

Let the number of sample be 𝑁, the observed value of time series is {𝑥𝑖}(𝑖=1,,𝑁), the sample mean and standard deviation is 𝑥 and 𝑆, respectively. If the sample point 𝑥𝑖 meets |𝑥𝑖𝑥|>3𝑆, then 𝑥𝑖 is an outlier which will be eliminated. 𝑥 and 𝑆 should be computed again after all the outliers are removed.

3. Chaos Prediction Model Based on Nonlinear Partial Least Squares (PLS) Regression

Suppose observed time series is {𝑥𝑛}(𝑛=1,2,,𝑁), the chaotic phase space 𝑠𝑛 is reconstructed by [16]. 𝑠𝑛=𝑥𝑛,𝑥𝑛𝜏,,𝑥𝑛(𝑚1)𝜏𝑅𝑚,𝑛=(𝑚1)𝜏+1,(𝑚1)𝜏+2,,𝑁,(3.1) where 𝜏 is the delay time, m is the embedding dimension and 𝑁 is the length of the time series.

3.1. Characterizing Chaos through Lyapunov Exponents

Through the years, the existence of chaos has been characterized in time series by means of several methods, among others: analysis of Fourier spectra, fractal power spectra, entropy, fractal dimension, and calculation of Lyapunov exponents. However, several of these methods have proven not to be very efficient [16]. The Lyapunov exponents provide all the information about the local and global complexity of chaos, therefore the measurement of Lyapunov exponents has been an effective method to judge whether a time series is caused by a chaotic dynamical system.

The Lyapunov exponents 𝜆1,𝜆2,,𝜆𝑑 are the average rates of expansion (𝜆𝑖>0) or contraction (𝜆𝑖<0) near a limit set of a dynamical system [16]. In fact, the LE is a quantitative measure of the divergence of several trajectories in the system. There is one LE for each direction in the d-dimensional phase space where the system is embedded. The variable d represents the Lyapunov dimension or phase space dimension. The LE does not change when the initial conditions of a trajectory are modified or when some perturbation occurs in the system. If at least one LE is positive, the system presents chaotic motion [17]; hence if chaos exists in a dynamical system, the largest Lyapunov exponent must be positive.

We will calculate the largest Lyapunov exponents according to the algorithm offered by Wolf et al. (1985) [17]; the numerical results are shown in Table 1.

3.2. Chaos Prediction Model

Let state vector at time 𝑇 be 𝑠𝑇, we obtain the nearest neighbor points 𝑠𝛼1,𝑠𝛼2,,𝑠𝛼𝐾 by calculating the Euclidean distance between the target points 𝑠𝑇 and any one of reconstruction vectors [18], where 𝑠𝛼𝑖=𝑥𝛼𝑖,𝑥𝛼𝑖𝜏,,𝑥𝛼𝑖(𝑚1)𝜏,𝑖=1,2,,𝐾.(3.2)

Let 𝑋 be the sample matrix of reconstructed variable, 𝑥𝑋=𝛼1𝑥𝛼1𝜏𝑥𝛼1(𝑚1)𝜏𝑥𝛼2𝑥𝛼2𝜏𝑥𝛼2(𝑚1)𝜏𝑥𝛼𝐾𝑥𝛼𝐾𝜏𝑥𝛼𝐾(𝑚1)𝜏𝑅𝐾×𝑚.(3.3)

Let 𝑦 be the prediction variable, then the nonlinear chaos prediction model is described as 𝑥𝑦=𝑓1,𝑥2,,𝑥𝑚,(3.4) where 𝑥𝑖 is the 𝑖th column of 𝑋,𝑖=1,2,,𝑚,𝑦=[𝑥𝛼1+𝑝,𝑥𝛼2+𝑝,,𝑥𝛼𝐾+𝑝]𝑅𝐾, 𝑝 is the length of prediction step.

In practice, the observed data sets of independent variable and dependent variable can usually be obtained. However, the model form of their relations is unknown especially in case of higher dimensions, and the problem becomes more complicated when the relationship between the dependent variable and independent variables is nonlinear. Usually (3.4) is fitted by additive model or multiplicative model. For the sake of computation convenience, we consider the addition model of each independent variable; that is, 𝑦=𝑓1𝑥1+𝑓2𝑥2++𝑓𝑚𝑥𝑚+𝜀.(3.5)

At present, the nonlinear relationship between the original variable usually is into a quasi-linear relationship, after then, the prediction model is established by an effective linear theory method. Spline function and kernel function are commonly used in the transformation of the basic functions.

(1) Cubic Spline Function Transformation

Let 𝑓𝑗(𝑥𝑗)(𝑗=1,2,,𝑚) be the cubic spline fitting function, then, 𝑦=𝛽0+𝑓1𝑥1+𝑓2𝑥2𝑓++𝑚𝑥𝑚+𝜀,(3.6) where 𝑓𝑗𝑥𝑗=𝑀𝑗+2𝑙=0𝛽𝑗,𝑙Ω3𝑥𝑗𝜉𝑗,𝑙1𝑗,Ω(3.7)3𝑥𝑗𝜉𝑗,𝑙1𝑗=𝑧𝑗,𝑙=13!3𝑗4𝑘=0(1)𝑘4𝑘𝑥𝑗𝜉𝑗,𝑙3+𝑘3𝜉,(3.8)𝑗,𝑙1𝑥=min𝑗+(𝑙1)𝑗,𝑗=𝑥max𝑗𝑥min𝑗𝑀𝑗𝑙=0,1,,𝑀𝑗,+2(3.9) where 𝛽0,𝛽𝑗,𝑙 are the model parameters to be determined; 𝜉𝑗,𝑙1,𝑗,𝑀𝑗 are range points, segment length and section number are divided on the variable 𝑥𝑗, respectively.

The minimum observed value of variables is recorded as min(𝑥𝑗) and the maximum as max(𝑥𝑗), then the nonlinear prediction function relationship between independent variables and dependent variable can be transformed as the following equation: 𝑦=𝛽0+𝑚𝑗=1𝑓𝑗𝑥𝑗+𝜀=𝛽0+𝑝𝑀𝑗=1𝑗+2𝑙=0𝛽𝑗,𝑙Ω3𝑥𝑗𝜀𝑗,𝑙1𝑗+𝜀.(3.10)

Equations (3.6)–(3.8) show that the relationship between 𝑦 and 𝑧𝑗,𝑙 is linear, and the relationship between 𝑥𝑗 and 𝑧𝑗,𝑙 is nonlinear, so the chaos prediction model with cubic spline function transformation is a quasi-linear regression model (see in (3.10)).

(2) Kernel Function Transformatio

The principle of Kernel function transformation is same as spline function transformation; that is, a kernel function transformation based on one dimensional nonlinear function 𝑓𝑗(𝑥𝑗), and the cubic spline function{Ω3((𝑥𝑗𝜉𝑗,𝑙1)/𝑗)(𝑙=0,1,,𝑀𝑗+2)} is replaced by basic function {𝐾((𝑥𝑗𝜉𝑗,𝑙1)/𝑗)(𝑙=0,1,,𝑀𝑗+2)}. Thus, the chaos prediction model with kernel function transformation is as follows [19]: ̂𝑦=𝛽0+𝑝𝑗=1𝑓𝑗𝑥𝑗=𝛽0+𝑝𝑀𝑗=1𝑗+2𝑙=0𝛽𝑗,𝑙𝐾𝑥𝑗𝜉𝑗,𝑙1𝑗.(3.11)

The commonly used kernel functions include uniform kernel function, Epanechnikov kernel function, biweight kernel function, tri-weight kernel function and Gaussian kernel function.

3.3. Nonlinear Partial Least Squares Regression (NPLS)

However, there exist some problems when establishing the above quasi-linear regression model based on ordinary least squares method, because the dimension of regression model will increase significantly after a function change, which may lead to the result that the number of sample points is less than the number of transformed variables; on the other hand, the data of reconstructed variable 𝑥1,𝑥2,,𝑥𝑚 are come from the same series, and the nonlinear correlation exists between each other. So it is not appropriate to employ the ordinary least squares method to estimate the model parameters when the multicollinear relationship among the new transformed independent variables is found. One of the effective methods to solve this problem is to use partial least squares method (PLS) to estimate parameter of the predictive model.

In this paper, a modified chaos prediction approach based on nonlinear PLS regression with basic function transformation is developed. The cubic spline function and various kernel function transformation are used in our model, and the comparative analysis between them is presented in Section 5.

3.4. Prediction Evaluation

Normally, prediction accuracy is evaluated by the mean squared error MSE, 1MSE=𝑃𝑃𝑝=1𝑥𝑇+𝑝̂𝑥𝑇+𝑝2,(3.12) where ̂𝑥𝑇+𝑝 is predict value, 𝑥𝑇+𝑝 is observed value. However MSE is concerned with the order of the magnitude of the observed data. In this paper, the normalized mean square error NMSE is used as the evaluation criterion of prediction. NMSE=MSE𝜎2,(3.13) where 𝜎2 is sample variance of the prediction sequence. Our investigation adopts the mean of 20 predicted values NMSE in order to overcome contingency.

4. Algorithm and Algorithm Flowchart

Step 1. The observational data of aeroengine performance parameter EGT are collected and similarly corrected, and then the outliers are removed according to Layida criterion.

Step 2. For the selected embedding dimension m (1𝑚𝑚_max) and delay time 𝜏(1𝜏𝜏_max), the chaos phase space on pretreated EGT time series is reconstructed, and the appropriate nearby points are selected to construct independent variable matrix 𝐗 and dependent variable vector y.

Step 3. Choose section number M (1𝑀𝑀_max); each row vector 𝑥𝑗(𝑗=1,2,,𝑚) in independent variable matrix 𝐗 is transformed to 𝐙𝑗 by various basis function transformation such as (3.8), then 𝑍𝑗=𝑧𝑗,0,𝑧𝑗,1,,𝑧𝑗,𝑀+2.(4.1)

Step 4. The new independent variable 𝐙𝑗 and dependent variable 𝐲 are standardized and marked as (𝑍𝑍,̃𝑦)=(1,𝑍2𝑍,,𝑚,̃𝑦), so the prediction model is described as ̃𝑦=𝑝𝑗=1𝑀+2𝑙=0𝛼𝑗,𝑙̃𝑧𝑗,𝑙+𝜀,(4.2) where 𝛼𝑗,𝑙(𝑗=1,2,,𝑚;𝑙=0,1,,𝑀+2) are the parameter of regression model (4.2), 𝜀 is the stochastic error and follows the normal distribution with zero mean and the same variance.

Step 5. The partial least squares (PLS) is applied to build regression model (4.2). The number of appropriate PLS components is extracted by effective cross-method [20]; estimate the regression coefficients 𝛼𝑗,𝑙(𝑗=1,2,,𝑚;𝑙=0,1,,𝑀+2), and obtain the regression coefficients 𝛽0,𝛽𝑗,𝑙(𝑗=1,2,,𝑚;𝑙=0,1,,𝑀+2) of original model (3.5) by returning 𝛼𝑗,𝑙.

Step 6. If estimation of prediction model for all the combination of 𝑀 are completed, then go to step 7, otherwise go to step 3.

Step 7. Calculate the predicted value according to prediction function (3.6) and calculate the corresponding predicted NMSE.

Step 8. If the calculation of the combination of all the embedding dimension 𝑚 and delay time 𝜏 is completed, then go to step 9, otherwise go to step 2.

Step 9. Find out the optimal prediction model with the smallest NMSE, and obtain the corresponding dimension 𝑚, delay time 𝜏 and section number 𝑀.

Algorithm flow chart is shown in Figure 1.

5. Results and Analysis

The in-the-wing ACARS data of four PW4077D aeroengines have been collected for one year, and about 1400 EGT data of each engine were selected to form a time series used in numerical example analysis.

The maxim value of embedding dimension 𝑚_max, the delay time 𝜏_max, and the segment number 𝑀_max in the above algorithm are estimated after a large amount of experiment. In our research, the range of 𝑚 is among 1–30, 𝜏 is 1–20, 𝑀 is 1–8, respectively. The cubic spline function and five kernel function introduced in Section 2 are employed for transformation, and we minimize chaotic prediction NMSE based on PLS algorithm.

The numerical results of the first engine are shown in Table 1, in which line 1 to line 6 show the prediction results based on various NPLS, line 7 to line 8 show the prediction results based on the common PLS and OLS.

As can be seen in Table 1, all the largest Lyapunov exponents are positive and it is clarified that the nonlinear and chaotic natures exist in EGT sequence. The first seven NMSE calculated based on PLS are all less than that based on OLS, which indicates that there is an obvious advantage of PLS over OLS in fitting chaotic prediction models. The NMSE of PLS based on various nonlinear transform methods is even less than that of free transformed PLS, and it indicates that there is a greater advantage of chaotic prediction based on PLS in forecasting aeroengine EGT. PLS prediction algorithm based on biweight Kernel transformation is the optimal chaotic prediction algorithm, estimated optimal embedding dimension is 8, the delay time is 19, and the number of nonlinear transformation section is 2.

PLS regression model includes all of the original selected variables, and there is multi-collinearity in the independent variables, which can affect the prediction accuracy. The model parameters estimated based on PLS is of a pretty complex nonlinear character, so the independent variables cannot be selected by the parameter test of linear methods. In this paper, the nonparametric statistical method (Bootstrap [20]) is adopted to select the independent variables transformed with kernel function and Cubic spline function, and to find the minimum NMSE; the optimal results of the first engine are shown in Table 2.

Table 2 shows the predictive results based on various NPLS algorithm after selecting independent variable by Bootstrap method. Only the optimal predictive NMSE based on triweight Kernel function transformation decreases, and the minimum NMSE in Table 2 is larger than that in Table 1; it indicates that better NMSE cannot be achieved by Bootstrap test for selecting independent variables.

Figure 2 shows the comparison of the prediction values based on OLS, PLS, and biweight Kernel transformation chaos algorithm with the original value. The PLS algorithm based on biweight Kernel function transformation has the optimal NMSE in Figure 2. However, not all of the predictive value has the minimum forecast NMSE.

The deviation between the predictive value and original value based on the above three algorithms are shown in Figure 3. The fluctuations of the deviations curve based on biweight Kernel method with the optimal NMSE is the smallest, and larger predicted error is not appeared in its deviation sequence.

Through the numerical results of the first engine we also find that the search range of designed algorithm is enormous and it takes a long time, so we hope to further reduce search range and the algorithm complexity. The percentages of the number of NMSE in Table 1 which falls in [min(NMSE),min(NMSE)+0.03] are statistically analyzed, and the results are shown in Figures 4, 5 and 6.

As can be seen from Figure 4 to Figure 6, the percentage of NMSE falling in [min(NMSE),min(NMSE)+0.03] is lower for different value of 𝑚, τ, M, which indicates that the probability of obtaining suboptimal solution by our algorithm is very low, and also illustrates that the selection of variables is of significant importance to improve the prediction accuracy. From Figure 4 we can see that when the dimension of predicted model is over 15, there is no optimal NMSE which falls within the range above mentioned, So the range of 𝑚 in the algorithm can be selected among ranging from 1 to 15. Figures 5 and 6 also show that the predicted NMSE distribution of various delay time and section number are more dispersed, so the search range of 𝜏 and 𝑀 cannot be minimized.

The brief computational results of the other three engines are as in Table 3.

From the above calculation results we can see that the algorithm based on various nonlinear PLS (Cubic Spline Interpolation, Uniform Kernel, Epanechnikov Kernel, biweight Kernel, Tri-weight Kernel, and Gaussian Kernel) is superior to that based on PLS, and the algorithm based on PLS is superior to that based on OLS. The results in Tables 1 and 3 show that the predictive results derived from the algorithm based on biweight Kernel are the optimal among the four aeroengines. And there are no significant difference between the first aeroengines and the second aeroengines in the same aircraft in predictive NMSE, so we can conclude that the proposed algorithm is stable for predicting the aeroengine parameter EGT.

Tables 1 and 3 also show that the optimal values of the embedding dimension (m), delay time (𝜏), subset numbers (𝑀) searched by various NPLS algorithm are approximately the same, but they are quite different among the four aeroengines.

6. Conclusion

The chaos prediction model based on NPLS through Kernel function or Cubic Spline function transformation is developed and is applied in predicting the aeroengine performance parameters EGT. Numerical results show that the proposed model is able to effectively predict the performance parameter EGT with a higher degree of accuracy. The optimal embedding dimension is 8, the delay time is 19, and the number of segments is 2. NPLS chaotic prediction algorithm based on biweight Kernel function transformation is the optimal EGT prediction algorithm because of the smallest predictive NMSE. Our prediction NMSE is 16.5 percent less than that of the traditional linear least squares (OLS) method and 10.38 percent less than that of the linear PLS approach. The NPLS algorithm of chaotic prediction with selection variables through Bootstrap test fails to make NMSE further reduced. And we find that reducing calculation time complexity by lessening search range of embedding dimension will not affect the prediction accuracy of EGT. It is proved that the proposed algorithm is robust by the predictive results from four aeroengines.

In particular, the conservative baselines of main aeroengine performance parameters are given in ECM program provided by jet engine manufacturers [14]. Set the baseline value of EGT be EGTbase, the deviation between EGTcor and EGTbase can be defined as ΔEGT=EGTcorEGTbase.(6.1)

The trend plot of the deviation (ΔEGT) sequence in continuous flights is routinely used by airlines for the performance monitoring and diagnostics [13]. On the basis of our prediction algorithm, the EGTcor of upcoming flights can be predicted, and then a layout for trending of the deviation of main performance parameters for upcoming flights can be provided. Experienced power maintainer or engineer look at a given trend shift in the measurement deviations can identify significant behavior disparities, evaluate the aeroengine condition, and isolate the faulty module or component using the ECM trend plot reports and related PW4000 fingerprint.

In summary, accurately forecasting the aeroengine performance parameters is of great importance in the aeroengine condition monitoring. The anomalies of the aeroengine performance parameters can be detected in time by the predictive value calculated from our algorithm, then maintenance measures are taken early to prevent the aeroengine from failure effectively, which assists to reduce aircraft down time, and improve the reliability of the aeroengine.

Acknowledgment

This work is supported by the Fundamental Research Funds for the Central Universities under Grant no. ZXH2012 K005.