Abstract

We apply polynomial functions to approximate the functional coefficients of the state-dependent autoregressive model for chaotic time series prediction. We present a novel local nonlinear model, called the local polynomial coefficient autoregressive prediction (LPP) model, based on phase space reconstruction. The LPP model can effectively fit the nonlinear characteristics of chaotic time series with a simple structure and has excellent one-step forecasting performance. We also propose a kernel LPP (KLPP) model, which applies the kernel technique to the LPP model to obtain better multistep forecasting performance. The proposed models are flexible enough to analyze complex and multivariate nonlinear structures. Both simulated and real data examples are used for illustration.

1. Introduction

Chaos has been observed widely in nature and in many scientific areas since Lorenz found chaotic motion in his research on meteorology [1]. Over recent decades, researchers in many fields have paid much attention to chaos [2–13], and the modeling and prediction of chaotic time series have become popular in meteorology [1, 2], medicine [3], economics [4], signal processing [5], traffic flow [6], power load [7], Sunspot prediction [8–11], and many other areas [12, 13].

Although prediction of chaotic time series is very difficult, chaos theory [14] provides useful tools for it. With the development of chaos theory and its applications, two main categories of methods have been proposed: global [15] and local [16]. The global method attempts to approximate the whole time series over the entire attractor and seeks a function that is valid at every point. Future values are then obtained from knowledge of all previous elements of the time series. However, the parameters may change when new information is added to the model. The local method uses only part of the past information to approximate the local attractor, and future values are inferred from the neighborhood of the current point. Many prediction methods based on these two categories have gained popularity in practice during the last decades, such as adaptive prediction [17–19], the support vector machine (SVM) [20–27], polynomial estimation [28–30], and neural networks [8–11, 31–36].

Recently, some studies have shown that local methods generally obtain better results than global methods [18]. Some researchers have also found that forecast accuracy can be improved by combining techniques, in both the global and the local setting. With such combinations, the parameters can be obtained faster, and the residuals can be modeled further when they are not random. Examples include SVM combined with a neurofuzzy model [21], SVM combined with PCA [26], ARMA combined with RESN [8], and a neural network combined with a neurofuzzy model [34]. These methods generally obtain better results than a single model, but they are complex, depend on personal experience, and are prone to overfitting.

To overcome these disadvantages, we propose to approximate the local attractor with a local polynomial coefficient autoregressive prediction (LPP) model. The functional coefficients of the state-dependent autoregressive model can effectively fit the nonlinear characteristics of nonlinear time series. By approximating these functional coefficients with polynomial functions, we can use the LPP model to approximate the local attractor, which is obtained by means of the Takens embedding theorem [10]. Kernel prediction can improve forecast accuracy [37, 38], so we combine the LPP model with the kernel technique to develop a kernel LPP (KLPP) model that improves multistep forecast accuracy. The KLPP model uses the kernel function to choose a proper radius for the nearest neighbors. In this work, we compare our models with those reported in the literature to illustrate their effectiveness in predicting chaotic time series.

The paper is organized as follows. In Section 2, the state-dependent autoregressive model, the LPP model, and the KLPP model are introduced, and the optimal parameter set is determined by using generalized degrees of freedom (GDF), a tool for evaluating the goodness of the model with the chosen number of neighbors [12]. In Section 3, three simulated chaotic systems and one real-life time series are used as examples to evaluate the proposed models, and the prediction errors are computed to analyze prediction effectiveness. The conclusion of this paper is given in Section 4.

2. The LPP Model and KLPP Model

2.1. Data Structure in Phase Space Reconstruction

Suppose that we have a scalar chaotic time series . The first step of a local prediction is the phase space reconstruction. According to the Takens embedding theorem, the phase space can be reconstructed with phase points , where , and . The embedding dimension and the time delay can be obtained by the Cao method [39]. Then, a continuous vector mapping or can be used to describe the unknown evolution from to . That is, and .
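The reconstruction step can be illustrated with a minimal sketch. It assumes the usual delay-coordinate convention, in which each phase point collects the current value and its delayed copies; the function name delay_embed and the toy values of m and tau are hypothetical, and in practice m and tau would come from the Cao method.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Return delay vectors (x[t], x[t-tau], ..., x[t-(m-1)*tau]) as rows."""
    x = np.asarray(x, dtype=float)
    start = (m - 1) * tau              # first index that has a full history
    if start >= len(x):
        raise ValueError("series too short for this m and tau")
    # column j holds the series delayed by j*tau samples
    cols = [x[start - j * tau: len(x) - j * tau] for j in range(m)]
    return np.column_stack(cols)

# toy series 0, 1, ..., 9 with assumed m = 3 and tau = 2
series = np.arange(10.0)
X = delay_embed(series, m=3, tau=2)
print(X.shape)        # (6, 3)
print(X[0].tolist())  # [4.0, 2.0, 0.0]
```

Each row of X is one phase point, so the target paired with a row is simply the next value of the series.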

2.2. The LPP Model

The state-dependent autoregressive model can be described as follows: where and the functional coefficient , is a continuous vector mapping .

One main purpose of this work is to estimate the mapping by using this model. The state-dependent autoregressive model in an -dimensional reconstructed phase space can be given as follows: where , is the embedding dimension, and is the time delay.

We approximate by a polynomial function , where is the lag variable and . Then, the LPP model in an -dimensional phase space can be described as follows: where , is the lag variable , and is the length of the polynomial function. Then, we have

The classical state-dependent autoregressive model has shortcomings such as low accuracy and low processing speed, and it depends on personal experience. The proposed method uses polynomial functions to approximate the functional coefficients of the state-dependent autoregressive model, which turns it into a parametric model, reduces the dependence on personal experience, and improves processing speed. The proposed method is also combined with chaos theory, which helps it achieve high prediction accuracy.

2.2.1. The LPP Estimation and Determination of the Optimal Parameters

In order to obtain the LPP model as the approximation of the mapping at the current state point in the reconstructed phase space, we select nearest neighbor points by using Euclidean distances and fit an LPP model to predict the future point of . We can estimate the parameters through the least squares equation where and .

Define and set

It follows from least squares theory that

By using (4), the prediction value can be calculated. We add to the training set and utilize the prediction point as the last phase point . Then, we can compute the multistep prediction by the same scheme in the forecast of .
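The local fitting procedure can be sketched as follows. This is an illustration under stated assumptions, not the exact estimator of this section: the model is taken to be an autoregression on the coordinates of the phase point whose coefficients are polynomials in one lag variable, with no intercept term. The names lpp_design and lpp_predict, and the parameters k, lag, and order, are hypothetical.

```python
import numpy as np

def lpp_design(X, lag, order):
    """Rows of features u**p * X[:, j] for p = 0..order, with u = X[:, lag]."""
    u = X[:, lag]
    powers = np.vander(u, order + 1, increasing=True)        # shape (n, order+1)
    return (powers[:, :, None] * X[:, None, :]).reshape(len(X), -1)

def lpp_predict(X, y, query, k=20, lag=0, order=1):
    """Fit the model by least squares on the k nearest neighbours of query."""
    dist = np.linalg.norm(X - query, axis=1)                  # Euclidean distances
    idx = np.argsort(dist)[:k]                                # k nearest neighbours
    A = lpp_design(X[idx], lag, order)
    theta, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    return float((lpp_design(query[None, :], lag, order) @ theta)[0])

# toy data (assumption): logistic map x[t+1] = 4 x[t] (1 - x[t])
x = np.empty(400)
x[0] = 0.3
for t in range(399):
    x[t + 1] = 4.0 * x[t] * (1.0 - x[t])
X = np.column_stack([x[1:-1], x[:-2]])    # phase points (x[t], x[t-1])
y = x[2:]                                 # targets x[t+1]
pred = lpp_predict(X[:-1], y[:-1], X[-1], k=20, lag=0, order=1)
print(abs(pred - y[-1]))                  # small: the toy map lies in the model class
```

The toy map is chosen because it can be written exactly as a first-order polynomial coefficient times the current value, so the local fit should reproduce the held-out point almost exactly.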

By using the concept of generalized degrees of freedom (GDF) [12, 13], the error variance can be measured, and the optimal parameters of the prediction model can be obtained by comparing the GDF calculated with different parameters. The unbiased estimators can be obtained in every step of the prediction with different parameters as follows: where , , and is the number of nearest neighbor points.

2.3. The KLPP Model

We combine the LPP model with the kernel technique to develop a kernel LPP (KLPP) model. The KLPP model uses the kernel function to choose a proper radius for the nearest neighbors. We suppose that the spatial correlation between phase points can be measured by the kernel function. For the choice of the kernel function, we adopt the Epanechnikov kernel. It is a truncated function and is defined as follows: where . The Epanechnikov kernel is the optimal kernel, in the sense of minimizing the mean square error, for kernel density estimation [40] and local polynomial estimation [41]. Because it is a truncated function, it also helps in choosing the neighborhood radius. We therefore adopt the Epanechnikov kernel for prediction, although other kernel functions, such as the normal kernel, are available.
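For reference, a minimal sketch of the Epanechnikov kernel in its standard form, K(u) = 0.75(1 - u^2) on |u| <= 1 and zero elsewhere. The scaled version K(d/h)/h used for the weights is a common convention and is an assumption here, as are the function names.

```python
import numpy as np

def epanechnikov(u):
    """Standard Epanechnikov kernel: 0.75 * (1 - u**2) on |u| <= 1, else 0."""
    u = np.asarray(u, dtype=float)
    return 0.75 * np.clip(1.0 - u ** 2, 0.0, None)

def kernel_weights(dist, h):
    """Assumed scaling K(d / h) / h; points farther than h get zero weight."""
    return epanechnikov(np.asarray(dist, dtype=float) / h) / h

w = kernel_weights([0.0, 0.5, 1.5], h=1.0)
print(w.tolist())   # [0.75, 0.5625, 0.0]
```

Because the kernel vanishes outside the bandwidth, choosing h is equivalent to choosing the radius of the neighborhood that contributes to the fit.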

The estimator is the minimizer of the sum of weighted squares: where , , and . The parameter is used to control the boundary of the nonnegative kernel function, which is introduced to choose the radius of the nearest neighbors and to emphasize neighboring observations around when estimating . The prediction accuracy of the KLPP model is clearly sensitive to the radius of the neighbors. In this study, we choose the parameter , in terms of GDF, and the optimal value of is the minimizer of the estimators .

Thus, the estimators can be obtained from least squares theory as where and is a diagonal matrix with as its th diagonal element, is a matrix with as its th row, and . The GDF method chooses to minimize where , , and is the number of nearest neighbor points with Euclidean distance less than .
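The weighted least squares step can be sketched as below. The sketch assumes the standard closed form, in which the estimator minimizes the weighted sum of squared residuals with a diagonal weight matrix; rescaling the rows by the square roots of the weights gives the same minimizer without forming the normal equations. The function name and toy data are hypothetical.

```python
import numpy as np

def weighted_least_squares(A, y, w):
    """Minimize sum_i w[i] * (y[i] - A[i] @ theta)**2 by rescaling the rows."""
    sw = np.sqrt(np.asarray(w, dtype=float))
    theta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return theta

# toy example: the last observation is an outlier that receives zero weight
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 100.0])
w = np.array([1.0, 1.0, 1.0, 0.0])
theta = weighted_least_squares(A, y, w)
print(np.round(theta, 6).tolist())   # [1.0, 2.0]
```

With truncated kernel weights, observations outside the radius receive weight zero and drop out of the fit in the same way as the outlier in this toy example.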

Now, we outline the algorithm based on the basic idea of our models.

Step 1. For a scalar time series , reconstruct the phase space with the embedding dimension and the time delay obtained by the autocorrelation function method and the Cao method; that is, .

Step 2. Compute the Euclidean distances and select the Epanechnikov kernel to measure the spatial correlation between phase points. Then, the kernel coefficient can be calculated by (9).

Step 3. Identify the parameters from (11).

Step 4. Calculate the in every step of the prediction with different parameters , , and and select the optimal parameters that minimize .

Step 5. Fit the KLPP with the optimal parameters and calculate the prediction value .

Some additional remarks are now in order.

(i) With the Epanechnikov kernel, let be from 1 to 15 and in Step 4, where is the Euclidean distance between the th nearest neighbor and the reference point and is the Euclidean distance between the th nearest neighbor and the reference point.

(ii) We obtain one-step prediction values by adding the real value to the training set as the prediction proceeds, and multistep prediction values by adding the predicted value instead.

(iii) To speed up the computation and avoid overfitting, we may let .

(iv) For the algorithm of LPP model, we calculate the parameters from (7) in Step 3.
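The difference between the one-step and multistep schemes in remark (ii) can be sketched as follows. The predictor is passed in as a function; the toy predictor double_last is purely hypothetical and stands in for a fitted LPP or KLPP model.

```python
import numpy as np

def iterate_forecast(history, one_step, steps):
    """Multistep scheme: each prediction is appended as if it were observed."""
    work = list(history)
    out = []
    for _ in range(steps):
        nxt = one_step(np.asarray(work))
        out.append(nxt)
        work.append(nxt)              # feed the prediction back in
    return np.array(out)

def rolling_forecast(history, future, one_step):
    """One-step scheme: after each prediction the real value is appended."""
    work = list(history)
    out = []
    for actual in future:
        out.append(one_step(np.asarray(work)))
        work.append(actual)           # feed the real observation back in
    return np.array(out)

def double_last(s):
    """Toy predictor (hypothetical): next value is twice the last value."""
    return 2.0 * s[-1]

print(iterate_forecast([1.0], double_last, 3).tolist())           # [2.0, 4.0, 8.0]
print(rolling_forecast([1.0], [5.0, 7.0], double_last).tolist())  # [2.0, 10.0]
```

In the multistep scheme errors accumulate because every later prediction depends on earlier predictions, which is why multistep accuracy is the harder test.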

3. Numerical Experiments and Performance Evaluation

Here, we consider three simulated chaotic systems (Lorenz, Mackey-Glass, and Henon) and one real-life time series, the Sunspot time series, as examples to evaluate the proposed models. The results of the proposed models are then compared with results reported in the literature for the same examples. In multistep prediction of chaotic time series, data overflow can occur when the order of the polynomial function is large, and normalization avoids this. Thus, the chaotic time series are scaled between as follows:
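A minimal sketch of such a scaling step, assuming ordinary min-max normalization. The target interval is left as the hypothetical parameters lo and hi, with defaults of 0 and 1 chosen only for illustration.

```python
import numpy as np

def minmax_scale(x, lo=0.0, hi=1.0):
    """Map the smallest value of x to lo and the largest to hi, linearly."""
    x = np.asarray(x, dtype=float)
    xmin, xmax = x.min(), x.max()
    return lo + (hi - lo) * (x - xmin) / (xmax - xmin)

print(minmax_scale([2.0, 4.0, 6.0]).tolist())               # [0.0, 0.5, 1.0]
print(minmax_scale([2.0, 4.0, 6.0], lo=-1.0, hi=1.0).tolist())  # [-1.0, 0.0, 1.0]
```

Keeping the scaled values bounded limits the growth of the polynomial terms when predictions are fed back repeatedly.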

The prediction errors on the generated time series are computed to analyze prediction effectiveness and to compare the presented models with results in the literature. The error measures are the maximum absolute error (MAE), root mean squared error (RMSE), and normalized mean squared error (NMSE). Namely,
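The three error measures can be sketched as below. The maximum absolute error and RMSE follow directly from their names; for the NMSE, normalization by the sum of squared deviations of the real values from their mean is assumed, which is one common convention and may differ from the definition used here.

```python
import numpy as np

def max_abs_error(y, p):
    """Largest absolute difference between real and predicted values."""
    return float(np.max(np.abs(np.asarray(y) - np.asarray(p))))

def rmse(y, p):
    """Root of the mean squared prediction error."""
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(p)) ** 2)))

def nmse(y, p):
    """Squared error divided by squared deviation of y from its mean (assumed form)."""
    y, p = np.asarray(y, dtype=float), np.asarray(p, dtype=float)
    return float(np.sum((y - p) ** 2) / np.sum((y - y.mean()) ** 2))

real = [1.0, 2.0, 3.0]
pred = [1.0, 2.0, 5.0]
print(max_abs_error(real, pred))   # 2.0
print(nmse(real, pred))            # 2.0
```

Under this convention an NMSE of 1 corresponds to predicting the mean of the real series, so values well below 1 indicate a useful model.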

3.1. Performance Analysis in Lorenz Equations

The Lorenz time series can be produced as follows [1]: where , , and are dimensionless parameters, most commonly selected to be , , and . The standard fourth-order Runge-Kutta method is used to generate the Lorenz time series, and the -coordinate is used for prediction. A time series with a length of 2000 is randomly generated; 1800 samples are used for training, and the rest are used for testing. The results of one-step and multistep predictions for the Lorenz time series with the LPP and KLPP models are shown in Figure 1.
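A sketch of how such a series could be generated with the classical fourth-order Runge-Kutta scheme. The standard Lorenz equations with the widely used parameter values 10, 28, and 8/3 are assumed; the step size 0.01, the initial state (1, 1, 1), and the choice of the x-coordinate are illustrative assumptions.

```python
import numpy as np

def lorenz_rhs(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the standard Lorenz system."""
    x, y, z = s
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_series(rhs, s0, dt, n):
    """Integrate ds/dt = rhs(s) with classical RK4 and return n states."""
    s = np.asarray(s0, dtype=float)
    out = np.empty((n, len(s)))
    for i in range(n):
        out[i] = s
        k1 = rhs(s)
        k2 = rhs(s + 0.5 * dt * k1)
        k3 = rhs(s + 0.5 * dt * k2)
        k4 = rhs(s + dt * k3)
        s = s + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return out

traj = rk4_series(lorenz_rhs, [1.0, 1.0, 1.0], dt=0.01, n=2000)
x_series = traj[:, 0]                                   # assumed coordinate
train, test = x_series[:1800], x_series[1800:]          # 1800 / 200 split
print(len(train), len(test))                            # 1800 200
```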

From Figure 1, it can be seen that, for the one-step prediction, both the prediction values of LPP and KLPP models have small error values. For the multistep prediction, we find out that KLPP model has smaller error values than LPP model and the values of prediction are in good agreement with the real time series.

The results of prediction are shown in Figures 2 and 3. From Figure 2, we can see that the LPP model parameters are in good agreement with the KLPP model parameters. The polynomial orders are mainly one or two, which avoids overfitting. The lag variables are selected from to at different phase points. The numbers of nearest neighbor points of both the LPP model and the KLPP model are chosen in the vicinity of 50. The radius of the neighbors changes from 0.05 to 0.3. From Figures 1 and 2, we can see that, for one-step prediction, the LPP and KLPP models have similar prediction performance. For the multistep prediction, the prediction values of the LPP model start diverging significantly from the 120th time step, and its errors are larger than those of the KLPP model. In Figure 3, it can be seen that the parameters hardly change compared with those in Figure 2. This implies that the kernel can improve the performance of the multistep prediction of chaotic time series.

3.2. Performance Analysis in Mackey-Glass Equations

The Mackey-Glass model has been used in the literature as a benchmark model due to its chaotic characteristics [42]. The Mackey-Glass time series is generated by the following discrete form: where , , and and initial conditions . Thus, we can obtain a scalar chaotic time series of length 2300 from (16). A segment of 1800 samples is used for training, while the remaining part is used as testing data. Figure 4 compares the real values with the predicted values for the testing samples.
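A sketch of one common discretization of the Mackey-Glass equation is given below. The unit-step form, the parameter values a = 0.2, b = 0.1, c = 10, the delay of 17, and the constant initial history of 1.2 are all assumptions taken from common benchmark practice; they are not necessarily the settings behind (16).

```python
import numpy as np

def mackey_glass(n, a=0.2, b=0.1, c=10, tau=17, x0=1.2):
    """x[t+1] = x[t] + a*x[t-tau] / (1 + x[t-tau]**c) - b*x[t] (assumed form)."""
    x = np.full(n + tau, x0)          # leading entries serve as constant history
    for t in range(tau, n + tau - 1):
        delayed = x[t - tau]
        x[t + 1] = x[t] + a * delayed / (1.0 + delayed ** c) - b * x[t]
    return x[tau:]                    # drop the artificial history

mg = mackey_glass(2300)
train, test = mg[:1800], mg[1800:]    # 1800 training and 500 testing samples
print(len(train), len(test))          # 1800 500
```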

From Figure 4, we can see that both the one-step and the multistep prediction values of the LPP and KLPP models have small errors. The multistep prediction values follow the real time series over the whole test length of 500. The multistep predictions are more accurate than those for the Lorenz time series, which means that the proposed models perform differently on different chaotic time series.

From Figures 5 and 6, we can see that the parameters are almost the same. The polynomial orders mainly take one or two. The number of nearest neighbor points is chosen in the vicinity of 50 and the radius of the neighbors changes from 0.05 to 0.3.

3.3. Performance Analysis in Henon Equations

The Henon map is a well-known dynamical system [43]. Its equations are written as where and . A series of length 1800 is generated to reconstruct the state space as training data, and 500 samples are used for testing. Results are shown in Figures 7, 8, and 9.
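A sketch of generating a Henon series, assuming the standard form of the map, x(n+1) = 1 - a x(n)^2 + y(n) and y(n+1) = b x(n), with the classical values a = 1.4 and b = 0.3. The initial point (0, 0) is an illustrative assumption.

```python
import numpy as np

def henon(n, a=1.4, b=0.3, x0=0.0, y0=0.0):
    """Return the x-coordinate of n iterates of the standard Henon map."""
    xs = np.empty(n)
    x, y = x0, y0
    for i in range(n):
        xs[i] = x
        x, y = 1.0 - a * x * x + y, b * x   # both updates use the old x
    return xs

hx = henon(2300)
train, test = hx[:1800], hx[1800:]          # 1800 training and 500 testing samples
print(hx[:2].tolist())                      # [0.0, 1.0]
```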

From Figure 7, we can see that the one-step prediction values of the LPP and KLPP models have small errors. The multistep prediction values of the LPP and KLPP models start diverging significantly from the 25th and 35th time steps, respectively. From Figures 8 and 9, we can see that the polynomial orders are mainly 3, 4, and 5, and that the number of nearest neighbor points is chosen from 50 to 100. This suggests that the proposed models overfit this series.

3.4. Performance Analysis in Sunspot Time Series

The Sunspot time series is a good indicator of solar activity over solar cycles. The monthly smoothed Sunspot time series was obtained from the SIDC (World Data Center for the Sunspot Index). To compare the results with different models in the literature, the data are selected under the same conditions as reported in [9, 10]. The Sunspot series from November 1834 to June 2001 (2000 points) is selected and scaled between . The first 1000 samples are used to train the prediction models, and the remaining 1000 samples are kept to test them.

From Figure 10, it can be seen that, for the one-step prediction, both the LPP and KLPP models give accurate predictions. For the multistep prediction, the KLPP model has smaller errors than the LPP model, and its predictions are in good agreement with the real time series. The predictions of the LPP and KLPP models start diverging significantly from the 10th and 320th time steps, respectively. This implies that the kernel can improve the performance of the multistep prediction of chaotic time series.

From Figures 11 and 12, we can see that the polynomial orders are mainly one or two. The numbers of nearest neighbor points of both the LPP and KLPP models are chosen in the vicinity of 20. The radius of the neighbors changes from 0.05 to 0.3 except for a few phase points. For one-step prediction, the LPP and KLPP models have good prediction performance for the Sunspot time series. For multistep prediction of the Sunspot time series, the KLPP model is more accurate than the LPP model.

3.5. Results and Discussion

We compare the proposed models with some of the models reported in the literature, and the results are shown in Tables 1–4. In Table 1, we can see that the proposed models are better than some of the existing methods, but the best results are those of ERNN [9], ARMA-RESN [8], and IEC-LSSVM [20]. This is because these methods have optimized the embedding dimensions for the phase space reconstruction. They also benefit from the architectural properties of different models in residual analysis, or from combining different models, which further improves the results. The same can be seen for the other time series in Tables 2–4. Moreover, the data could not always be selected under the same conditions as reported in the literature; thus, the conclusions drawn from Tables 1–4 may contain some inaccuracies.

Table 2 shows that the proposed models are better than most of the existing methods except for IEC-LSSVM [20]. In Table 3, the proposed models have the best results. For the real-world time series in Table 4, the proposed models are better than some of the existing methods and are close to the best results reported in the literature.

4. Conclusions

In this paper, we apply polynomial functions to approximate the functional coefficients of the state-dependent autoregressive model. Based on phase space reconstruction, we present a novel local nonlinear model, called the LPP model, for chaotic time series prediction. The LPP model can effectively fit the nonlinear characteristics of chaotic time series with a simple structure and has good one-step forecasting performance. However, we find that the LPP model has poor multistep forecasting performance. We therefore propose a kernel LPP model, which applies the kernel technique to the LPP model so that it has better multistep forecasting performance than the LPP model.

The LPP and KLPP models are simple and do not depend on personal experience. Simulated and real data examples illustrate that the proposed models are flexible enough to analyze complex and multivariate nonlinear structures. The numerical experiments show that the KLPP model is more accurate than the LPP model for multistep prediction. We compare the LPP and KLPP models with different models, and the results show that our models are feasible.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The project was supported by Natural Science Foundation Project of China (Grant no. 11471060), Fundamental and Advanced Research Project of CQ CSTC of China (Grant no. cstc2014jcyjA40003), and Natural Science Foundation Project of CQ CSTC of China (Grant no. CSTC2012jjA00037).