Mathematical Problems in Engineering

Volume 2016 (2016), Article ID 3791504, 13 pages

http://dx.doi.org/10.1155/2016/3791504

## Chinese Stock Index Futures Price Fluctuation Analysis and Prediction Based on Complementary Ensemble Empirical Mode Decomposition

^{1}School of Economics & Trade, Hunan University, Changsha, Hunan 410082, China

^{2}Financial Research Institute, Wenzhou University, Wenzhou, Zhejiang 325035, China

Received 8 January 2016; Accepted 24 May 2016

Academic Editor: Xiaodong Lin

Copyright © 2016 Ruoyang Chen and Bin Pan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Since the CSI 300 index futures officially began trading on April 15, 2010, the analysis and prediction of Chinese stock index futures price fluctuations have become an active area of research. In this paper, the Complementary Ensemble Empirical Mode Decomposition (CEEMD) method is used to decompose the sequences of Chinese stock index futures prices into residue terms, low-frequency terms, and high-frequency terms to reveal the fluctuation characteristics of the sequences over different time scales. Then, the CEEMD method is combined with the Particle Swarm Optimization (PSO) algorithm-based Support Vector Machine (SVM) model to forecast Chinese stock index futures prices. The empirical results show that the residue term determines the long-term trend of stock index futures prices. The low-frequency term, which represents medium-term price fluctuations, is shown by the Iterated Cumulative Sums of Squares (ICSS) algorithm to be mainly affected by policy regulations, whereas short-term market disequilibrium, which is represented by the high-frequency term, plays an important local role in stock index futures price fluctuations. In addition, in forecasting the daily or even intraday price data of Chinese stock index futures, the combination prediction model is superior to the single SVM model, which implies that the accuracy of predicting Chinese stock index futures prices can be improved by considering fluctuation characteristics at different time scales.

#### 1. Introduction

As a powerful financial tool, stock index futures can help curb abnormal stock market fluctuations and effectively avoid investment risk by virtue of their primary functions of price discovery, hedging, and arbitrage; moreover, they are widely used in both developed and developing countries. The CSI 300 index futures officially began trading on April 15, 2010, which was a significant milestone for the Chinese capital markets. However, the Chinese stock index futures market remains at an emerging stage and lacks mature market management mechanisms; thus, shocks (such as government policies and excessive market speculation) often induce wild fluctuations in stock index futures prices. Moreover, these price fluctuations can spread rapidly and be amplified by certain trading mechanisms, including margin trades, two-way trades, and mandatory liquidations, which can easily lead to market risk. Therefore, effectively analyzing and accurately forecasting fluctuations in Chinese stock index futures prices is of great theoretical and practical significance for promoting the healthy development of the capital markets.

Many analyses of stock index futures prices frequently use traditional statistical and econometric techniques. For example, Meneu and Torró [1] studied volatility spillovers between the spot and futures markets in Spain using asymmetric multivariate GARCH structures. Zhong et al. [2] investigated the relationship between Mexican stock index futures and the spot market using a modified EGARCH model and cointegration test. Xiong et al. [3] analyzed the long-term and short-term price discovery function of SGX FTSE/Xinhua China A50 Index Futures and A-Share Market using a cointegration test, error correction model, and the impulse response function. Moreover, using high-frequency CSI 300 stock index and futures data in China, Zuo et al. [4] investigated the Granger causality and long-run equilibrium between daily realized variance and realized bipower variation and jump variation. Based on five-minute high-frequency data, Chen and Zhang [5] researched the effects of the CSI 300 index futures trading on the jump risk of the spot market by employing a nonparametric method and the Granger causality test. However, there are some key drawbacks with respect to these traditional models. On one hand, many factors (including economic fundamentals, the effects of shocks from significant economic and financial events, and investors’ risk preferences [6]) stir up price fluctuations in stock index futures at different time scales. Peters [7] suggested that traders in the financial market would disperse and reflect various time scales due to differences in investment philosophy, ultimately engaging in transactions in different investment time scales: short-term, medium-term, and long-term. As a new product in an emerging market, Chinese stock index futures have attracted a variety of investor types, and their different investment strategies also have a great impact on stock index futures price fluctuations in different time scales. 
However, traditional models conduct their analyses based solely on the entire price series, and they cannot explain the inner driving forces of price fluctuations [8]. On the other hand, stock index futures price series are nonlinear and nonstationary financial time series, whereas traditional models assume that the data are linear and normally distributed; thus, these models are unable to adequately capture the nonlinear patterns in the price series.

The Empirical Mode Decomposition (EMD) method [9] and the Ensemble EMD (EEMD) method [10] are effective tools to address nonlinear, nonstationary data, and these methods can analyze the fluctuation characteristics of a time series in different time scales by decomposing it into Intrinsic Mode Functions (IMFs) with independent information. In particular, the EEMD method is a substantial improvement over the EMD method and overcomes the drawback of mode mixing in the EMD method by adding white noise. Thus, the EEMD method has been widely applied in the financial sector. For example, Zhang et al. [8] used the EEMD method to decompose crude oil prices into a trend, a slowly varying part, and a fluctuating process, and then analyzed the long-term, medium-term, and short-term fluctuation characteristics of crude oil prices. Ruan and Bao [11] applied the EEMD method to decompose housing prices into a trend term, a low-frequency term, and a high-frequency term to reveal the intrinsic characteristics of fluctuations in housing prices. Li and Feng [12] employed the EEMD method to separately decompose investor sentiment and the stock index price series into a short-term fluctuation term, a medium-term significant event term, and a long-term trend term, integrating econometric models to expand their research. However, the EEMD method cannot effectively offset the residue noise from the added white noise. Therefore, Yeh et al. [13] further proposed the Complementary Ensemble Empirical Mode Decomposition (CEEMD) method. This method, which has been used increasingly in recent years [14, 15], provides a fast and accurate way of processing data that not only addresses the mode mixing in the EMD method but also completely offsets the residual added white noise that the EEMD method generates in the decomposition process. However, this latest data processing method has not yet been applied to the stock index futures market.

In addition, when combined with other prediction models, EMD methods can also improve the predictive precision of a single prediction model. For example, Yu et al. [16] built an EMD-Fuzzy Neural Network- (FNN-) Adaptive Linear Neural Network (ALNN) model to forecast crude oil prices; this model yielded superior predictive results in comparison with either a single FNN model or a single ALNN model. In addition, Yang et al. [17] built an EMD-SVM-SVM model to forecast crude oil prices and also improved the predictive results of a single SVM model. Meanwhile, the Support Vector Machine (SVM) model [18] is now widely applied in the stock and stock index futures markets. Huang et al. [19] used the SVM model, a Back Propagation Neural Network (BPNN), Linear Discriminant Analysis (LDA), and other methods to make predictions regarding the Nikkei 225 Index and found that the SVM model exhibited the highest predictive performance among these prediction models. Ince and Trafalis [20] demonstrated that the SVM model has greater precision in predicting stock prices than the Multilayer Perceptron (MLP) and Autoregressive Integrated Moving Average (ARIMA) methods. Using five real futures contracts, such as the Standard & Poor 500 stock index futures, Tay and Cao [21] examined the feasibility of SVM modeling by comparing it with a BP neural network; these authors showed that SVMs are better for forecasting futures series. Sai et al. [22] optimized four SVM models with different kernel functions based on the Genetic Algorithm (GA) and the Particle Swarm Optimization (PSO) algorithm; these authors built eight different programs to forecast Chinese stock index futures prices and discovered that the linear kernel function SVM model based on the PSO algorithm yielded superior predictive results for Chinese stock index futures prices. The above papers show that the SVM model has been applied to the stock index futures market, although mainly as a single prediction model.

To improve the analysis of Chinese stock index futures prices and the ability to forecast them, this paper first draws on ideas from Zhang et al. [8] and other scholars: we use the CEEMD method to decompose the Chinese stock index futures price series into a residue term, a low-frequency term, and a high-frequency term, and then analyze the long-term, medium-term, and short-term intrinsic fluctuation characteristics of Chinese stock index futures prices. Because substantial fluctuations in the low-frequency term are related to the impact of significant events, we use the Iterated Cumulative Sums of Squares (ICSS) algorithm [23–25] to detect the breakpoints in the low-frequency term and thereby identify the events that play a role in it. Then, we focus on the problem of forecasting Chinese stock index futures prices: referring to Yu et al. [16] and Yang et al. [17], we build a PSO-based [26–28] CEEMD and SVM combination prediction model. In addition, although many previous studies on stock and stock index futures markets are based on low-frequency daily, weekly, and monthly price data [1–3, 19–22, 29], intraday high-frequency data have gradually become a research focus [4, 5, 30]. Therefore, we use this model to forecast both the daily data and the intraday high-frequency data of Chinese stock index futures prices. The main innovation of this paper is the use of the CEEMD method to reveal the intrinsic fluctuation characteristics of Chinese stock index futures prices at various time scales, which provides a new research perspective on the fluctuation analysis of stock index futures prices. The fluctuation characteristics of the various scales are then introduced into the prediction of stock index futures prices, thus improving the predictive ability of a single SVM model with respect to the stock index futures market.

The remainder of this paper is structured as follows. Section 2 introduces the methods used in this paper. Section 3 analyzes Chinese stock index futures prices at various time scales. Section 4 forecasts Chinese stock index futures prices. Finally, Section 5 provides a summary.

#### 2. Introduction of Methodology

##### 2.1. The CEEMD Method

The EMD family of methods, namely, the EMD method, the EEMD method, and the CEEMD method, consists of self-adaptive data processing methods that can decompose a time series into a series of Intrinsic Mode Functions (IMFs) with independent information. These methods are effective tools to address nonlinear, nonstationary data. In this paper, we employ the latest extension, that is, the CEEMD method, to analyze Chinese stock index futures prices. The CEEMD method is introduced in detail as follows.

The EMD method proposed by Huang et al. [9] is the first method to extract a series of IMFs and a residue term from the original time series. In particular, IMFs must satisfy the following two conditions: the number of extrema and the number of zero-crossing points must be equal or differ by at most one, and the mean value of the upper and lower envelopes must be zero at all times. Accordingly, the EMD decomposition ends when the last residue term (abbreviated as res) is a monotonic function or a constant; then the original series $x(t)$ can be expressed as

$$x(t) = \sum_{j=1}^{n} \mathrm{IMF}_j(t) + \mathrm{res}_n(t), \tag{1}$$

where $\mathrm{res}_n(t)$ is recorded as $\mathrm{IMF}_{n+1}(t)$. Because mode mixing can easily emerge in the EMD method, Wu and Huang [10] further proposed the EEMD method to address this drawback. The EEMD method assumes that the observed data are amalgamations of the true time series and noise and that the ensemble means of data with different noises are closer to the true time series. Hence, adding white noise as an additional step may help extract the true IMFs in the time series, and the noise can be offset by ensemble averaging. However, the added white noise still leaves residue noise. Therefore, Yeh et al. [13] proposed the CEEMD method, which adds pairs of positive and negative white noise to generate complementary sequences. Specifically, the complementary sequences are obtained by the following equation:

$$\begin{bmatrix} x_i^{+}(t) \\ x_i^{-}(t) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x(t) \\ n_i(t) \end{bmatrix}. \tag{2}$$

In (2), $n_i(t)$ represents the $i$th added white noise, and $x_i^{+}(t)$ and $x_i^{-}(t)$ represent the sequences after positive and negative white noise is added for the $i$th time to the original series. We can generate complementary IMFs via the EMD method, and these are expressed as follows:

$$x_i^{+}(t) = \sum_{j=1}^{n+1} \mathrm{IMF}_{ij}^{+}(t), \qquad x_i^{-}(t) = \sum_{j=1}^{n+1} \mathrm{IMF}_{ij}^{-}(t). \tag{3}$$

Then, the final IMFs with no residue noise are extracted by averaging the two complementary IMFs over the $N$ added noise pairs:

$$\mathrm{IMF}_j(t) = \frac{1}{2N} \sum_{i=1}^{N} \left[ \mathrm{IMF}_{ij}^{+}(t) + \mathrm{IMF}_{ij}^{-}(t) \right]. \tag{4}$$
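The complementary-noise averaging described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: `toy_decompose` is a hypothetical stand-in for the EMD sifting routine (it merely splits a series into a moving-average part and a remainder), and `n_pairs`, `noise_std`, and the sample `prices` are illustrative values. The sketch also assumes that every run returns the same number of modes, which real EMD does not guarantee.

```python
import random


def toy_decompose(series):
    """Hypothetical stand-in for EMD: returns [fast, slow] components.

    'slow' is a 3-point moving average and 'fast' is the remainder, so the
    two components always sum back to the input series.
    """
    n = len(series)
    slow = []
    for t in range(n):
        window = series[max(0, t - 1):min(n, t + 2)]
        slow.append(sum(window) / len(window))
    fast = [x - s for x, s in zip(series, slow)]
    return [fast, slow]


def ceemd(series, decompose, n_pairs=50, noise_std=0.2, seed=0):
    """Average the modes of +noise and -noise copies over n_pairs noise draws."""
    rng = random.Random(seed)
    n = len(series)
    totals = None
    for _ in range(n_pairs):
        noise = [rng.gauss(0.0, noise_std) for _ in range(n)]
        plus = [x + e for x, e in zip(series, noise)]    # series with +noise
        minus = [x - e for x, e in zip(series, noise)]   # series with -noise
        for noisy in (plus, minus):
            modes = decompose(noisy)
            if totals is None:
                totals = [[0.0] * n for _ in modes]
            for j, mode in enumerate(modes):
                for t in range(n):
                    totals[j][t] += mode[t]
    # Divide by 2 * n_pairs: each pair contributes two decompositions.
    return [[v / (2 * n_pairs) for v in mode] for mode in totals]


prices = [10.0, 10.4, 9.8, 10.9, 11.3, 10.7, 11.8, 12.1]
modes = ceemd(prices, toy_decompose)
```

Because each noise draw enters once with a positive and once with a negative sign, the averaged modes sum back to the original series with no leftover noise, whatever decomposer is plugged in (provided each decomposition sums to its input). With the linear toy decomposer the cancellation is also exact mode by mode; with real EMD sifting, which is nonlinear, only the reconstruction is guaranteed to be noise-free.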

##### 2.2. ICSS Algorithm

The ICSS algorithm proposed by Inclán and Tiao [23] is a relatively mature approach to detecting breakpoints in sequences. The algorithm assumes that, in the stable time series $\{r_t\}$, $t = 1, 2, \ldots, T$, $\mu$ denotes the mean return for time series $\{r_t\}$, and $\varepsilon_t = r_t - \mu$ is the residual series with a mean of 0 and a variance of $\sigma_t^2$. Letting $C_k = \sum_{t=1}^{k} \varepsilon_t^2$, $k = 1, 2, \ldots, T$, be the cumulative sum of squares of the residual series $\varepsilon_t$, the statistic $D_k$ can be defined as follows:

$$D_k = \frac{C_k}{C_T} - \frac{k}{T}, \quad k = 1, 2, \ldots, T, \qquad D_0 = D_T = 0. \tag{5}$$

If breakpoints do not occur in the variance in the iteration process, the value of the statistic $D_k$ will vary around 0. By contrast, if there are one or more breakpoints in a sequence, the value of the statistic $D_k$ will deviate significantly upward or downward from 0. Let $k^*$ be the value of $k$ at which $\max_k |D_k|$ is attained. If $\sqrt{T/2}\,|D_{k^*}|$ exceeds the preset credible interval, $k^*$ will be taken as an estimated break point, where $\sqrt{T/2}$ is a normalizing factor.
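The single-break test at the core of the ICSS algorithm can be sketched as follows. This is a minimal illustration, not the full iterative procedure: it computes the centered cumulative sum of squares statistic once over the whole series and reports the most likely break point. The function name `icss_single_break` and the sample series are hypothetical, and the default threshold of 1.358 is the asymptotic 95% critical value tabulated by Inclán and Tiao, which is an assumption about the credible interval chosen here.

```python
import math


def icss_single_break(series, critical=1.358):
    """Return (k_star, statistic, is_break) for one pass of the D_k test.

    k_star is 1-indexed: the variance is estimated to change after the
    k_star-th observation. Assumes the series is not constant.
    """
    T = len(series)
    mean = sum(series) / T
    resid_sq = [(x - mean) ** 2 for x in series]   # squared residuals
    total = sum(resid_sq)                          # C_T
    cum = 0.0
    best_k, best_abs = 0, 0.0
    for k, e2 in enumerate(resid_sq, start=1):
        cum += e2                                  # C_k
        d_k = cum / total - k / T                  # centered statistic D_k
        if abs(d_k) > best_abs:
            best_k, best_abs = k, abs(d_k)
    stat = math.sqrt(T / 2) * best_abs             # normalized max |D_k|
    return best_k, stat, stat > critical


# Toy series: standard deviation jumps from 1 to 3 after observation 100.
toy = [1.0, -1.0] * 50 + [3.0, -3.0] * 50
k_star, stat, is_break = icss_single_break(toy)
```

The full algorithm would reapply this test to the sub-series on either side of each detected break until no further breaks are found, and then refine the candidate points.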

##### 2.3. SVM Prediction Model

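The SVM model employs a kernel-based method in which the basic idea is to construct a hyperplane with low risk in a high-dimensional feature space. This method has the advantages of good generalization capability and good function approximation. Among the various types of SVM models, the $\varepsilon$-insensitive model proposed by Vapnik [18] is mainly used to resolve prediction problems. Given a set of training patterns $\{(x_i, y_i)\}_{i=1}^{l}$, where $x_i \in \mathbb{R}^n$ is the input variable and $y_i \in \mathbb{R}$ is the output variable, the prediction model is learned from these patterns and is used to forecast the output variables of unseen input variables. Specifically, the objective function and constraints for the model are

$$\begin{aligned} \min_{w,\,b,\,\xi,\,\xi^*} \quad & \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) \\ \text{s.t.} \quad & y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \\ & w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \\ & \xi_i,\ \xi_i^* \ge 0, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{6}$$

In (6), $\varphi(\cdot)$ maps the input into the high-dimensional feature space; $\xi_i$ and $\xi_i^*$ are the slack variables that represent the allowable upper limit and lower limit of training errors, respectively, under the constraint of the $\varepsilon$-insensitive loss; and $C$ is the penalty factor controlling the degree of penalty on samples whose errors exceed $\varepsilon$.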
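To make the roles of the slack variables, the insensitive-loss width, and the penalty factor concrete, the sketch below evaluates the primal objective for a fixed linear model. It is an illustration under stated assumptions rather than a solver: the feature map is taken to be the identity, the function names and the toy data are hypothetical, and the slack for each sample is set to its smallest feasible value, which is the amount by which the absolute error exceeds the insensitive width.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Errors inside the eps-tube cost nothing; beyond it the cost is linear."""
    return max(0.0, abs(y_true - y_pred) - eps)


def svr_primal_objective(w, b, X, y, C, eps):
    """0.5 * ||w||^2 + C * (sum of minimal slacks) for f(x) = w . x + b."""
    reg = 0.5 * sum(wj * wj for wj in w)
    slack_total = 0.0
    for xi, yi in zip(X, y):
        pred = sum(wj * xj for wj, xj in zip(w, xi)) + b
        # Minimal feasible slack for this sample (upper or lower, whichever binds).
        slack_total += eps_insensitive_loss(yi, pred, eps)
    return reg + C * slack_total


X = [[1.0], [2.0], [3.0]]
y = [1.0, 2.0, 4.0]
objective = svr_primal_objective(w=[1.0], b=0.0, X=X, y=y, C=10.0, eps=0.5)
```

In this toy case only the third sample falls outside the tube, by 0.5, so a larger penalty factor raises the cost of that single violation while leaving the other two samples free.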

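The Lagrange function must be introduced to resolve (6). According to duality theory and the saddle point condition, the dual form of (6) can be obtained as follows:

$$\begin{aligned} \max_{\alpha,\,\alpha^*} \quad & -\frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*) \\ \text{s.t.} \quad & \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i,\ \alpha_i^* \le C, \quad i = 1, 2, \ldots, l. \end{aligned} \tag{7}$$

In (7), $\alpha_i$ and $\alpha_i^*$ are the nonnegative Lagrangian multipliers of the two inequality constraints on the training errors in (6), and $K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j)$ represents the kernel function. Any function that satisfies Mercer’s condition can be a kernel function. The common kernel functions include the linear kernel function $K(x_i, x_j) = x_i^{T} x_j$ and the Radial Basis Function (RBF) kernel function $K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$ with $\gamma > 0$. Ultimately, the hyperplane for the underlying prediction problem is given by the following:

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x_i, x) + b. \tag{8}$$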
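Once the dual problem has been solved, prediction only requires kernel evaluations against the training points with nonzero multipliers. The sketch below shows the two kernels named above and the resulting prediction function. It assumes that the multiplier differences and the offset have already been obtained by some solver; the values used here are hypothetical and chosen only so that the differences sum to zero, as the equality constraint of the dual requires. The name `gamma` for the RBF width parameter is also an assumption.

```python
import math


def linear_kernel(u, v):
    """Inner product of two input vectors."""
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(u, v, gamma=0.5):
    """exp(-gamma * squared Euclidean distance); gamma is an assumed width."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svr_predict(x, support_vectors, dual_coefs, b, kernel):
    """Kernel expansion: sum of (multiplier difference) * K(x_i, x), plus b."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + b


# Hypothetical solved model: two support vectors with differences +0.5 and -0.5.
support_vectors = [[1.0], [3.0]]
dual_coefs = [0.5, -0.5]
offset = 2.0

pred_linear = svr_predict([2.0], support_vectors, dual_coefs, offset, linear_kernel)
pred_rbf = svr_predict([2.0], support_vectors, dual_coefs, offset, rbf_kernel)
```

Swapping the `kernel` argument is all that is needed to move between the linear and RBF models, which is why the kernel type and its parameters can be treated as tunable settings of the predictor.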

##### 2.4. PSO Algorithm

The PSO algorithm proposed by Kennedy [26] is an effective algorithm to address function parameter optimization. Each particle in the algorithm denotes a potential solution in the solution space and has three indicators: velocity, position, and fitness value. The fitness value is determined by the fitness function and measures the merit of the particle. The velocity of the particle determines its movement direction and distance, and the velocity is adjusted dynamically according to the movement experience of the particle itself and of the other particles, thus enabling the search for the optimal individual in the solution space. The details are as follows.

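Suppose that, in a $D$-dimensional solution space, the population comprising $n$ particles is $X = (X_1, X_2, \ldots, X_n)$, the position of the $i$th particle is $X_i = (X_{i1}, X_{i2}, \ldots, X_{iD})^{T}$, and the fitness value of each particle can be calculated based on the objective function and the position of each particle. The velocity of the $i$th particle is $V_i = (V_{i1}, V_{i2}, \ldots, V_{iD})^{T}$, its individual extremum is $P_i = (P_{i1}, P_{i2}, \ldots, P_{iD})^{T}$, and the group extremum is $P_g = (P_{g1}, P_{g2}, \ldots, P_{gD})^{T}$. In each iterative process, the particle updates its velocity and position through the individual extremum and the group extremum. The updating equation is as follows:

$$\begin{aligned} V_{id}^{k+1} &= \omega V_{id}^{k} + c_1 r_1 \left( P_{id}^{k} - X_{id}^{k} \right) + c_2 r_2 \left( P_{gd}^{k} - X_{id}^{k} \right), \\ X_{id}^{k+1} &= X_{id}^{k} + V_{id}^{k+1}, \end{aligned} \tag{9}$$

where $\omega$ represents the inertia weight, $d = 1, 2, \ldots, D$, $i = 1, 2, \ldots, n$, $k$ is the current iteration number, $V_{id}$ is the velocity component of the particle, $c_1$ and $c_2$ are the acceleration factors, which are nonnegative constants, and $r_1$ and $r_2$ are random numbers in the range $[0, 1]$. Because the SVM prediction model is sensitive to the model parameters and because the literature [22, 27, 28] indicates that the PSO algorithm performs very well in the parameter optimization process of the SVM model, this paper uses the PSO algorithm to optimize the parameters of the SVM prediction model.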
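The velocity and position update can be sketched as a small optimizer. This is an illustrative sketch only: the inertia weight, acceleration factors, swarm size, iteration count, and search bounds are assumed values, not the settings used in the paper, and `toy_error` is a hypothetical quadratic standing in for the validation error of an SVM as a function of two of its parameters.

```python
import random


def pso(objective, dim, n_particles=20, n_iter=100, w=0.7, c1=1.5, c2=1.5,
        lower=-5.0, upper=5.0, seed=0):
    """Minimize `objective` with a basic particle swarm; returns (best, value)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                     # individual extrema
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]    # group extremum
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward own best + pull toward the group best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
        # Refresh the group extremum once per iteration.
        g = min(range(n_particles), key=lambda i: pbest_val[i])
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g][:], pbest_val[g]
    return gbest, gbest_val


def toy_error(p):
    """Hypothetical stand-in for an SVM validation error; minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, best_val = pso(toy_error, dim=2)
```

In an actual parameter search, the objective would train the SVM with the candidate parameters and return a cross-validated error, and positions would typically be clamped to the admissible parameter range, which this sketch omits for brevity.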

#### 3. Analysis of Chinese Stock Index Futures Prices in Different Time Scales

##### 3.1. Data Selection and Description

The CSI 300 index futures are the earliest stock index futures listed in China, and they have been widely used in many studies [4, 5, 22]. Therefore, this paper selects the daily closing price of the CSI 300 index futures as this section’s research object. The sample period runs from April 16, 2010, to November 20, 2014, giving a total of 1,115 observations; the data are from the Wind database. See Figure 1 and Table 1 for the sample data distribution and the descriptive statistics, respectively.