Discrete Dynamics in Nature and Society

Volume 2015, Article ID 329487, 13 pages

http://dx.doi.org/10.1155/2015/329487

## Local Functional Coefficient Autoregressive Model for Multistep Prediction of Chaotic Time Series

School of Mathematics and Statistics, Chongqing University of Technology, Chongqing 400054, China

Received 19 June 2015; Revised 13 August 2015; Accepted 19 August 2015

Academic Editor: Ivan Area

Copyright © 2015 Liyun Su and Chenlong Li. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

A new methodology, which combines a nonparametric method based on the local functional coefficient autoregressive (LFAR) form with chaos theory and a regional method, is proposed for multistep prediction of chaotic time series. The objective of this study is to improve the performance of long-term forecasting of chaotic time series. Obtaining the predicted values involves three steps. Firstly, the original time series is reconstructed in *m*-dimensional phase space with a time delay *τ* by using chaos theory. Secondly, the nearest neighbor points are selected by using a local method in the *m*-dimensional phase space. Thirdly, the nearest neighbor points are used to fit an LFAR model. The parameters of the proposed model are selected by a modified generalized cross validation (GCV) criterion. Both simulated data (Lorenz and Mackey-Glass systems) and real data (Sunspot time series) are used to illustrate the performance of the proposed methodology. By detailed investigation and comparison of our results with published research, we find that the LFAR model can effectively fit the nonlinear characteristics of chaotic time series with a simple structure and has excellent performance for multistep forecasting.

#### 1. Introduction

In recent decades, researchers have paid much attention to chaotic motion in many fields, such as meteorology, medicine, economics, signal processing, traffic flow, power load, and Sunspot prediction [1–12], and have developed many new models for predicting chaotic time series. In the late 1960s, researchers found that it is difficult to forecast a chaotic time series, which is the evolution of a chaotic system's observations, by using traditional time series forecasting methods [1]. A series of theories and methods was then established for understanding the essence of chaotic motion, such as Takens' embedding theorem [13]. Chaos theory has since become an important part of nonlinear science and is used for forecasting chaotic time series.

Modeling chaotic systems from observed data and predicting one or several future values of the time series have become an important issue [14]. Many prediction methods have been proposed, such as adaptive prediction [15], the support vector machine (SVM) [16–20], polynomial estimation [21–24], and neural networks (NN) [25–29]. Most of the published literature considers single-step prediction. For multistep prediction, there are two main categories: direct and iterative methods. Direct multistep prediction does not feed predicted values back into the predictor; iterative multistep prediction builds on a short-term predictor applied recursively, so that later values are computed from the predictor's own earlier outputs. However, multistep prediction is difficult because the largest Lyapunov exponent of the chaotic system limits how far ahead the series can be predicted. Some researchers have focused on multistep prediction and used NN or its extended models to improve performance [29–31]. Some studies show that the accuracy of prediction can be improved by using hybrid techniques, such as combined SVM and Neuro-Fuzzy [32] and neural network and Neuro-Fuzzy [25, 26]. Studies also show that hybrid techniques that exploit the prediction error can perform well, such as the combined PCA and SVM [19] and ARMA and RESN [7]. Generalized nonlinear filtering methods are investigated for 5-step prediction of chaotic time series in [33]. These methods generally produce better results than single models, but they are complex, affected by personal experience, and prone to overfitting.
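The iterative scheme described above can be sketched in a few lines. This is only an illustration: the function name `iterate_forecast` and the logistic-map predictor are hypothetical stand-ins, not part of the proposed method. The second part of the example starts two recursions from nearly identical values to show how small errors can grow when predictions are fed back.

```python
import numpy as np

def iterate_forecast(history, one_step, steps):
    """Recursive multistep forecast: each prediction is appended and fed back."""
    buf = list(history)
    out = []
    for _ in range(steps):
        nxt = one_step(np.asarray(buf))   # one-step-ahead prediction
        out.append(nxt)
        buf.append(nxt)                   # the prediction becomes an input
    return np.array(out)

# Hypothetical one-step predictor: the logistic map x -> 4x(1 - x).
logistic = lambda h: 4.0 * h[-1] * (1.0 - h[-1])

three_steps = iterate_forecast([0.25], logistic, 3)
print(three_steps)

# Two recursions started 1e-10 apart; compare the gap at a few horizons.
a = iterate_forecast([0.3], logistic, 40)
b = iterate_forecast([0.3 + 1e-10], logistic, 40)
print(np.abs(a - b)[[0, 19, 39]])
```

A direct method would instead fit a separate predictor for each horizon and never reuse its own outputs as inputs.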

In this paper, we propose to use a functional coefficient autoregressive (FAR) model instead of a local linear structure to approximate the local attractor in the reconstructed phase space. As in [34], it is a nonparametric estimate of the nonlinear dynamics. The proposed method combines chaos theory with a local technique and has excellent spatial adaptation, so it can effectively fit the nonlinear characteristics of chaotic time series. Unlike the RBF-AR model, the LFAR model is reasonably simple to implement and is rarely affected by personal experience. Furthermore, the LFAR model can avoid overfitting by controlling the dimension of the primary functions used for estimating its functional coefficients. In this study, an algorithm based on the dynamic least squares criterion is proposed for estimating the local functional coefficients. The effectiveness of the proposed model is demonstrated on simulated data (Lorenz and Mackey-Glass systems) and real data (Sunspot time series). In these cases, we analyze and estimate the functional coefficients by using the proposed algorithm and examine the properties of iterative multistep prediction.

The remainder of this paper is organized as follows. Section 2 reviews the concept of the LFAR model and establishes the optimal parameter set by using GCV. Section 3 uses simulated chaotic systems and one real-life time series as examples to evaluate the proposed model, discusses the properties of the model's parameters, and compares the results with published research. Section 4 concludes the paper.

#### 2. Methodology

##### 2.1. Phase Space Reconstruction for Chaotic Time Series

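For a scalar chaotic time series $\{x(t)\}_{t=1}^{N}$, the phase space can be reconstructed by Takens' embedding theorem, and the reconstructed phase points are $X(t) = (x(t), x(t-\tau), \ldots, x(t-(m-1)\tau))^{T}$, where $t = 1+(m-1)\tau, \ldots, N$. The embedding dimension $m$ and the time delay $\tau$ can be obtained by using Cao's method [38]. Then a continuous vector mapping $F: \mathbb{R}^{m} \to \mathbb{R}^{m}$ or $f: \mathbb{R}^{m} \to \mathbb{R}$ describes the unknown evolution from $X(t)$ to $X(t+1)$ or $x(t+1)$. That is, $X(t+1) = F(X(t))$ or $x(t+1) = f(X(t))$.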
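As an illustration of the reconstruction step, the following minimal sketch builds delay vectors from a scalar series. It assumes the standard delay-coordinate form, in which each phase point collects the current value and its $m-1$ predecessors spaced $\tau$ samples apart; the function name `delay_embed` and the toy data are hypothetical.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Stack delay vectors (x(t), x(t - tau), ..., x(t - (m-1)*tau)) as rows.

    Row i corresponds to the 0-based time index t = (m - 1) * tau + i.
    """
    x = np.asarray(x, dtype=float)
    start = (m - 1) * tau
    if m < 1 or tau < 1 or start >= len(x):
        raise ValueError("need m >= 1, tau >= 1 and len(x) > (m - 1) * tau")
    # Column j holds x(t - j*tau) for every admissible t.
    return np.column_stack(
        [x[start - j * tau : len(x) - j * tau] for j in range(m)]
    )

series = np.arange(10.0)            # toy data standing in for a chaotic series
X = delay_embed(series, m=3, tau=2)
print(X.shape)   # (6, 3)
print(X[0])      # [4. 2. 0.]
```

In practice `m` and `tau` would come from the selection methods cited in the text rather than being fixed by hand.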

##### 2.2. The LFAR Model and Estimation Method

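A chaotic time series prediction model for describing the evolution from $X(t)$ to $x(t+1)$ can be written as

$$x(t+1) = f(X(t)).$$

The continuous vector mapping $f$ is the best prediction function in the sense that it minimizes the expected prediction error:

$$E\left[x(t+1) - f(X(t))\right]^{2} = \min_{g} E\left[x(t+1) - g(X(t))\right]^{2}.$$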

The saturated nonparametric function $f$ cannot be estimated with reasonable accuracy due to the curse of dimensionality [30]. An LFAR model for chaotic reconstruction data is therefore presented in the $m$-dimensional reconstructed phase space. That is,

$$x(t+1) = \sum_{j=1}^{m} a_j\big(x(t-d)\big)\, x(t-(j-1)\tau),$$

where $d$ is the lag of the model-dependent variable $x(t-d)$, $m$ is the embedding dimension, $\tau$ is the time delay, and the functional coefficients $a_j(\cdot)$ are continuous functions.

The functional coefficients $a_j(\cdot)$ are difficult to estimate because they are nonparametric and have no fixed form. Many nonlinear forms could be used, but finding a good one by trying one model after another is hard. Here the local nonparametric method is applied to estimate the unknown functional coefficients $a_j(\cdot)$. Using Taylor's series expansion, $a_j(u)$ with $p$-order derivative near the point $u_0$ can be described as follows:

$$a_j(u) = \sum_{i=0}^{p} \frac{a_j^{(i)}(u_0)}{i!}\,(u-u_0)^{i} + o\big((u-u_0)^{p}\big).$$

Let , , , is the bandwidth for controlling the number of the nearest neighbor points, and , . Ignoring the higher order infinitesimal, we can approximate by the -dimensional primary functions as follows:where is used to replace variable and represent the weight coefficients of primary functions.

The LFAR model at the current state point in $m$-dimensional phase space can be described as follows: where , . To obtain an LFAR model at the current state point in the reconstructed phase space, we select nearest neighbor points by using the Euclidean distances . The estimates of the parameters of the LFAR model at state point can be obtained by solving the following weighted least squares (WLS) regression problem: where and is a kernel function, that is, a nonnegative function which emphasizes neighbor observations around . The parameter is used to determine the weights of neighboring observations around in estimating .

Let ; we have . Then we can obtain a prediction of next point:

SetThen the WLS solution iswhere is a diagonal matrix with as its th diagonal element, which entails .
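The neighbor selection and weighted least squares step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each functional coefficient is taken to be a degree-`p` polynomial in the scaled coefficient variable, the bandwidth `h` is taken to be the largest neighbor distance, and a tricube kernel supplies the weights (the kernel used in the paper has its own parameters). The name `lfar_predict` and its arguments are hypothetical. Solving least squares on rows scaled by the square roots of the weights is equivalent to the usual closed form $(D^{T} W D)^{-1} D^{T} W y$ when $D$ has full column rank.

```python
import numpy as np

def lfar_predict(X, y, xq, k, p=1, u_col=0):
    """One-step prediction at state xq from a kernel-weighted local fit.

    Assumptions (not from the paper): polynomial coefficients of degree p,
    bandwidth h = largest neighbor distance, tricube kernel weights.
    """
    X, y, xq = np.asarray(X, float), np.asarray(y, float), np.asarray(xq, float)
    dist = np.linalg.norm(X - xq, axis=1)          # Euclidean distances to xq
    idx = np.argsort(dist)[:k]                     # indices of k nearest neighbors
    h = dist[idx].max() + 1e-12                    # bandwidth
    w = (1.0 - (dist[idx] / h) ** 3) ** 3 + 1e-8   # tricube weights, kept positive
    z = (X[idx, u_col] - xq[u_col]) / h            # scaled coefficient variable
    # Design columns: x_j * z**l for every component j and power l = 0..p.
    D = np.hstack([X[idx] * z[:, None] ** l for l in range(p + 1)])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(D * sw[:, None], y[idx] * sw, rcond=None)
    # At the query point z = 0, so only the l = 0 block contributes.
    return float(xq @ beta[: X.shape[1]])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([0.5, -0.2])       # toy linear dynamics, for checking only
pred = lfar_predict(X, y, xq=[0.1, -0.3], k=20)
print(round(pred, 6))
```

The toy data follow an exactly linear rule, so the local fit should reproduce $0.5 \cdot 0.1 - 0.2 \cdot (-0.3)$ up to rounding; real chaotic data would not be fit exactly.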

##### 2.3. Determination of Optimal Parameters

In the process of modeling LFAR, the parameter set needs to be estimated. The embedding dimension $m$ and the time delay $\tau$ can be calculated by Cao's method and the autocorrelation function method, respectively. For the remaining part of the parameter set, the accuracy of prediction based on the LFAR model is sensitive to the kernel function and the parameter . Here we have the form of kernel function as follows: Conventionally, we have and . Let ; then for any . And let , the kernel for any , which corresponds to the local linear model. For , the kernel changes from to , which means that the weight at the same neighbor point changes from to . The parameter can adjust the weight and the parameter can adjust the convergence speed of the kernel when or . In this study, we consider simulated chaotic systems and real-life time series as examples to investigate a proper parameter to achieve the best performance for modeling an LFAR model.

It is also crucial to choose a proper dimension of primary functions to achieve the best performance for modeling chaotic time series. A low dimension leads to a poor fit, and a high dimension increases the complexity of computation and leads to overfitting. Hence, the main purpose of this section is to determine the parameter set . Generally, the optimal dimension should be selected to minimize the mean squared error (MSE) or its improved versions. Here we use a simple and quick method proposed in [34], which can be regarded as a modified generalized cross validation criterion. Let and be two given positive integers that satisfy . First we use subseries with sample size to estimate the unknown coefficient functions and then compute the multistep forecasting errors of the next part with sample size based on the estimated model. For example, the data with sample size is used to get the estimated model, and the prediction errors for the next data are computed. Then, the data with sample size is used and so on. The average prediction error or the standard prediction error [31] for the subseries is given by where . The overall prediction error is given by Fan et al. [34] set and , and Meng and Peng [28] set and . The selected bandwidth does not critically depend on the choice of and as long as is reasonably large. In practical implementations, we adopt the setting of Meng and Peng.
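The rolling evaluation described above can be sketched as follows. This is a hedged illustration: the names `Q` (number of subseries) and `block` (size of each held-out part) are hypothetical labels for the two positive integers, the error measure is taken to be the mean squared error, and `persistence` is a toy forecaster standing in for the fitted LFAR model.

```python
import numpy as np

def overall_prediction_error(x, fit_predict, Q, block):
    """Average the forecasting error over Q rolling hold-out blocks.

    fit_predict(train, steps) must return `steps` forecasts that follow `train`.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if n <= Q * block:
        raise ValueError("series must be longer than Q * block")
    errors = []
    for q in range(1, Q + 1):
        split = n - q * block                       # training size for this subseries
        pred = np.asarray(fit_predict(x[:split], block))
        errors.append(np.mean((x[split:split + block] - pred) ** 2))
    return float(np.mean(errors))

# Hypothetical baseline forecaster: repeat the last observed value.
persistence = lambda train, steps: np.full(steps, train[-1])

ape = overall_prediction_error(np.arange(20.0), persistence, Q=4, block=2)
print(ape)   # 2.5
```

For the toy ramp, every block is missed by 1 and then 2, so each block contributes a mean squared error of 2.5.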

We select the proper bandwidth by minimizing . The function is minimized by comparing its values in a finite set of scale parameters in a grid . We can obtain different with different , , and . It is also very important to choose the parameter set . Finally we can determine the optimal parameter set .
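A grid comparison of this kind can be sketched generically. The helper `grid_search`, the parameter names `k` and `p`, and the toy criterion are all hypothetical; in an actual run the criterion would be the overall prediction error of the fitted model for each parameter combination.

```python
from itertools import product

def grid_search(criterion, grid):
    """Return the parameter combination in `grid` with the smallest criterion value."""
    names = list(grid)
    best_params, best_value = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        value = criterion(**params)
        if value < best_value:                  # keep the first strict minimum
            best_params, best_value = params, value
    return best_params, best_value

# Toy criterion standing in for the overall prediction error.
toy = lambda k, p: (k - 12) ** 2 + (p - 2) ** 2

best = grid_search(toy, {"k": range(5, 21), "p": [1, 2, 3]})
print(best)   # ({'k': 12, 'p': 2}, 0)
```

An exhaustive grid is simple and reproducible, at the cost of one model fit per combination.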

##### 2.4. Algorithm Description and Multistep Prediction

The executable LFAR algorithm selects the optimal parameter set , which comprises the phase space reconstruction parameter set , the number of nearest neighbor points , the dimension of primary functions , and the weight parameter of neighbor observations . Here, we choose the number of nearest neighbor points from to . To speed up the computation and avoid overfitting, we let . We select and and set . The selection of the optimal parameters is described in detail in Algorithm 1.