Abstract

The interaction between crude oil prices and stock market indices in China is investigated in the present paper, and the corresponding statistical behaviors are analyzed. The database consists of the crude oil prices of Daqing and Shengli in the 7-year period from January 2003 to December 2009 and the indices SHCI, SZCI, SZPI, and SINOPEC over the same period. A jump stochastic time effective neural network model is introduced and applied to forecast the fluctuations of the time series of the crude oil prices and the stock indices, and the corresponding statistical properties are studied by comparison. The experimental analysis shows that when the price fluctuation is small, the predictive values are close to the actual values, and when the price fluctuation is large, the predictive values deviate from the actual values to some degree. Moreover, the correlation properties are studied by detrended fluctuation analysis, and the results show that there are positive correlations in the absolute returns of both the actual data and the predictive data.

1. Introduction

The objective of this work is to investigate the relationships between the crude oil market and the stock market, and to examine whether shocks in crude oil prices that are transmitted to the Chinese stock market receive considerable attention from investors. In the past decade, China’s crude oil demand has grown rapidly, and China has become the second-largest oil importer in the world, after the United States. Fourteen years ago, China changed from an oil-exporting country into a net oil-importing country. Since then, the movement of crude oil prices has had a strong influence on the economic behavior of individuals and firms and, as a result, it affects economic development directly. In addition, since July 2009 China has replaced Japan as the world’s second-largest stock market, and the stock market plays an important part in its economy. China has two stock markets: the Shanghai Stock Exchange and the Shenzhen Stock Exchange. The indices studied in the present paper are the Shanghai Composite Index (SHCI) and the Shenzhen Compositional Index (SZCI), the two most influential indices in the Chinese stock markets. We also consider the Shenzhen Petrochemical Index (SZPI) and the stock price of China’s largest oil company, China Petroleum & Chemical Corporation (SINOPEC). The Daqing and Shengli oil fields are the largest and second-largest oil fields in China, respectively, and the crude oil prices of Daqing and Shengli have a strong impact on the Chinese energy market. The data for these crude oil prices and indices over the 7-year period are selected and analyzed by statistical methods and the neural network method.

Recently, some progress has been made in the study of fluctuations in the financial market and the energy market in China; see, for example, [1–7]. Artificial neural networks (ANNs) are one of the technologies that have made great progress in the study of stock markets [3, 8–11]. ANNs have good self-learning ability and strong antijamming capability, and they have been widely used in financial fields such as stock prices, profits, exchange rates, and risk analysis and prediction. Although historical data have a great influence on investors’ positions, we think that the impacts of different historical data on the stock price are not the same. In the present paper, we suppose that the degree of impact of a datum depends on the date (or time) at which it occurs, and we give a datum a high level of effect when it is very near to the current state. Furthermore, we introduce Brownian motion and Poisson jumps into the model [3, 6, 11–15], so that the model has the effect of random movement and random jumps while maintaining the original trend. In a financial market, jumps in financial assets play a crucial role in volatility forecasting, and jumps have a positive and mostly significant impact on future volatility. In this work, the artificial neural network model based on the jump stochastic time effective function is applied to forecast the fluctuations of SHCI, SZCI, SZPI, Daqing, Shengli, and SINOPEC. We study the statistical behaviors and the linear regression for these indices, and the simulation plots and the comparisons with the observed data are given. We introduce the mean absolute error (MAE), mean relative error (MRE), Theil’s inequality coefficient (Theil’s IC), bias proportion (BP), variance proportion (VP), and covariance proportion (CP) to evaluate the predictive results. Detrended fluctuation analysis (DFA) is used to study both the stock markets and the crude oil markets [16–19].
DFA is a statistical analysis method that is applied to study the extent of long-range correlations in time series. It gives a statistical approach that reduces the effects of nonstationary market trends and focuses on the intrinsic autocorrelation structure of market fluctuations over different time horizons. DFA provides a simple quantitative parameter, the scaling exponent 𝛼, to represent the correlation properties of time series. In the last part of Section 3, the empirical analysis shows the positive correlations in the absolute returns of the actual data and the predictive data by calculating the scaling exponent 𝛼.

In this paper, we introduce a new method, the jump stochastic time effective function in the neural network, to investigate the relationships between the crude oil market and the stock market. An intelligent system, the artificial neural network, is integrated with random theory in this work. The method is different from the methods used in previous papers [13, 14, 20], which also investigate the relationships between the crude oil market and the stock market. This paper also extends the method in [3] by introducing the random jump process, which gives the model the effect of random jumps while maintaining the original trend, and the statistical analysis carried out here differs from that in [3]. In the present paper, we improve the forecasting method in the neural network: each historical datum is given a weight (random, with jumps) that depends on the time at which it occurs in the model, and we also use probability density functions to classify the various variables from the training samples. The empirical research shows that the improved neural network model has some advantage over the traditional neural network models.

2. A Brief Description of Oil Market and Stock Market in China

The Chinese oil market is attracting more and more attention from all over the world. China has been the world’s second-largest oil consumer since 2003, and its oil demand reached 9% of the world’s total demand in 2006. Figure 1 shows the monthly output and the monthly growth rate of crude oil production in China from January 2003 to December 2009. The plot indicates that the crude oil output has almost reached its upper limit, whereas the oil demand will grow by 4.5% in the coming three years. This shows that the relationship between the international oil market and the Chinese oil market is becoming stronger.

In fact, China has been a net importer of crude oil since 1996, and its import dependence exceeded 51% in 2008. Figures 2(a) and 2(b) present China’s monthly crude oil imports and consumption in the recent 7 years. The trends of the curves in Figures 2(a) and 2(b) are similar, which implies that the oil demand relies heavily on the international oil market. At the same time, the total value of A shares in the Chinese stock markets reached 3.21 trillion US dollars on July 15, 2009, ranking as the world’s second-largest stock market. The listed oil companies are usually large-cap companies, so their market capitalization is not only a main part of the stock market value but also an important component of the stock market indices. Although some research has been done on the relationship between the crude oil market and the stock market [4, 13, 14, 20–22], there has been relatively little empirical work analyzing this relationship in China. In this paper, we select the data of SHCI, SZCI, SZPI, Daqing (Daqing crude oil price), Shengli (Shengli crude oil price), and the price of SINOPEC for each trading day in the 7-year period from January 2, 2003 to December 31, 2009. The corresponding statistical behaviors and comparisons of price changes are studied in the following.

3. Forecasting and Statistical Analysis

In the real crude oil market, understanding the process by which oil prices evolve is fundamental to our knowledge of this market. Much empirical evidence, such as the asymmetric and leptokurtic features of return distributions and volatilities, strongly suggests that Brownian motion as used in the Black-Scholes model is inappropriate. More precisely, it is often observed that the return distribution is skewed to zero and has a higher peak and fatter tails than the corresponding normal distribution. To explain these empirical phenomena, many researchers have proposed innovative models such as normal jump diffusion models (see [12–15]), and continuous-time stochastic volatility models are becoming an increasingly popular way to describe moderate- and high-frequency financial data. These models introduce discontinuities, or jumps, into the volatility process, which can improve their empirical performance. The distributional behavior of jumps in oil prices often represents an important piece of the temporal crude oil price dynamics. We establish the presence of jumps in the data of the financial model, where the jumps that disrupt the entire term structure represent the most significant jump events. For example, in the present paper, these jump events may include changes in international energy markets, the amount of oil production in China, the crude oil reserve in China, Chinese oil consumption, Chinese energy policy, wars, political events in the world, and so on. These random events may be responsible for generating jumps in crude oil price dynamics. Since the fluctuation behaviors of the crude oil prices are also nonlinear, unstable, and random, we introduce the stochastic time effective function in the neural network. The function is supposed to follow a Brownian motion plus a compound Poisson process with a random jump distribution, in order to describe the above-mentioned empirical phenomena.
We assume that the historical data of the crude oil market reflect these random events and affect the price volatility of the current oil market. In the model, the proposed stochastic time effective function may reflect the large fluctuations of the oil prices. Further, the function is a time-dependent random variable, and it expresses that recent information has a stronger effect on investors than old information.

3.1. Jump Stochastic Time Effective Neural Network Model for Forecasting

There are various methods to forecast the volatilities of time series; for example, the autoregressive conditional heteroscedasticity model has been applied by many financial analysts [23]. These financial time series models are based on financial theories and require some strict assumptions on the distributions of the time series, so it is sometimes hard to reflect the market variables directly in the models. Stock prices can usually be seen as a random time sequence with noise. Artificial neural networks, as large-scale parallel processing nonlinear systems that depend on their own intrinsic link data, provide methods and techniques that can approximate any nonlinear continuous function without a priori assumptions about the nature of the generating process. The ANN model is a nonparametric method and can forecast future results by learning the pattern of market variables without any strict theoretical assumption [11]. Brooks demonstrated that ANNs can be applied to forecast the volatilities of financial time series [24].

First we introduce the three-layer BP neural network model in Figure 3 (for details see [8–10]). For any fixed neuron $n$ ($n = 1, 2, \ldots, N$), the model has the following structure: let $\{x_i(n) : i = 1, 2, \ldots, p\}$ denote the set of inputs of the neurons, $\{y_j(n) : j = 1, 2, \ldots, m\}$ the set of outputs of the hidden layer neurons, $V_i$ the weight that connects node $i$ in the input layer to node $j$ in the hidden layer, $W_j$ the weight that connects node $j$ in the hidden layer to node $k$ in the output layer, and $\{o_k(n) : k = 1, 2, \ldots, q\}$ the set of outputs of the neurons. Then the output value of a unit is given by

$$y_j(n) = f\Bigl(\sum_{i=1}^{p} V_i x_i(n) - \theta_j\Bigr), \qquad o_k(n) = f\Bigl(\sum_{j=1}^{m} W_j y_j(n) - \theta_k\Bigr), \qquad (3.1)$$

where $\theta_j$, $\theta_k$ are the neural thresholds and $f(x) = 1/(1 + e^{-\alpha x})$ is the sigmoid activation function. Let $T_k(n)$ be the actual value in the data sets; then the error of the corresponding neuron $k$ at the output is defined as $\varepsilon_k = T_k - o_k$.
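
The forward pass of such a three-layer network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights are stored as full matrices `V[i][j]` and `W[j][k]` (one weight per connection), the sigmoid parameter `alpha` defaults to 1, the layer sizes 5, 13, and 1 are the ones reported later in this section, and the input vector and the function names `sigmoid` and `forward` are made up for the example.

```python
import math
import random


def sigmoid(x, alpha=1.0):
    """Logistic activation 1 / (1 + exp(-alpha * x))."""
    return 1.0 / (1.0 + math.exp(-alpha * x))


def forward(x, V, theta_hidden, W, theta_out):
    """One forward pass through input -> hidden -> output layers.

    V[i][j] is the weight from input i to hidden node j and
    W[j][k] the weight from hidden node j to output node k (assumed layout).
    """
    p, m, q = len(x), len(theta_hidden), len(theta_out)
    y = [sigmoid(sum(V[i][j] * x[i] for i in range(p)) - theta_hidden[j])
         for j in range(m)]
    o = [sigmoid(sum(W[j][k] * y[j] for j in range(m)) - theta_out[k])
         for k in range(q)]
    return y, o


# Illustrative 5-13-1 network: weights uniform on (-1, 1), zero thresholds,
# and a made-up normalized input vector.
rng = random.Random(0)
p, m, q = 5, 13, 1
V = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(p)]
W = [[rng.uniform(-1, 1) for _ in range(q)] for _ in range(m)]
hidden, output = forward([0.2, 0.4, 0.1, 0.9, 0.5], V, [0.0] * m, W, [0.0] * q)
```

Because the sigmoid maps into (0, 1), the single output is directly comparable with data that have been rescaled to the unit interval.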

In general, the real data approximately follow a normal distribution. However, the tails of the real distribution are fatter than those of the normal distribution, which is called the fat-tail phenomenon. It is caused by drastic fluctuations of the stock price. Moreover, the log return of the stock price fluctuates rapidly at intervals. In view of these facts, the error of the output is defined as $\varepsilon = \varepsilon_k^2/2$, and the error of the sample $n$ ($n = 1, 2, \ldots, N$) is defined as

$$e(n, t) = \frac{1}{2}\,\phi(t) \sum_{k=1}^{q} \bigl(T_k(n) - o_k(n)\bigr)^2, \qquad (3.2)$$

where $\phi(t)$ is the jump stochastic time effective function. We define $\phi(t)$ as follows:

$$\phi(t_n) = \frac{1}{\tau} \exp\Bigl\{ -\int_{t_n}^{t_1} \mu(t)\,dt - \int_{t_n}^{t_1} \sigma(t)\,dB(t) + \sum_{l=1}^{N(t_1 - t_n)} J_l \Bigr\}, \qquad (3.3)$$

where $\tau$ ($> 0$) is the time strength coefficient, $t_1$ is the current time or the time of the newest data in the data set, and $t_n$ is an arbitrary time point in the data set. The jumps $J_l$ ($l = 1, 2, \ldots, N(t)$) are independent and identically distributed, and $J_l$ obeys the normal distribution with mean $\mu_J$ and variance $\sigma_J$. $N(t)$ ($t \geq 0$) is a Poisson process with intensity $\lambda$, $\mu(t)$ is the drift function (or the trend term), $\sigma(t)$ is the volatility function, and $B(t)$ is the standard Brownian motion [5]. The stochastic time effective function implies that recent information has a stronger effect on investors than old information: the more recently an event happened, the more strongly investors and the market are affected. Then the total error of the whole training set at the output layer with the jump stochastic time effective function is defined as

$$E = \frac{1}{N} \sum_{n=1}^{N} e(n, t) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{\tau} \exp\Bigl\{ -\int_{t_n}^{t_1} \mu(t)\,dt - \int_{t_n}^{t_1} \sigma(t)\,dB(t) + \sum_{l=1}^{N(t_1 - t_n)} J_l \Bigr\} \sum_{k=1}^{q} \frac{1}{2} \bigl(T_k(n) - o_k(n)\bigr)^2. \qquad (3.4)$$
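
To make the role of the time effective function concrete, the following Python sketch draws one random weight for a datum of a given age and applies it to a sample error. It is a sketch under assumptions, not the authors' implementation: the drift and volatility are taken as constants, the weight has the form (1/τ)·exp(drift + Brownian term + sum of jumps) with the drift signed so that older data are discounted, the Poisson count is simulated from exponential waiting times, and all function names are hypothetical.

```python
import math
import random


def poisson_count(rate, horizon, rng):
    """Number of arrivals of a Poisson process with intensity `rate`
    in an interval of length `horizon` (via exponential waiting times)."""
    if rate <= 0 or horizon <= 0:
        return 0
    count, t = 0, rng.expovariate(rate)
    while t < horizon:
        count += 1
        t += rng.expovariate(rate)
    return count


def time_effective_weight(age, tau, mu, sigma, lam, mu_j, sd_j, rng):
    """Sample (1/tau) * exp(-mu*age - sigma*B(age) + sum of jumps).

    `age` = t_1 - t_n >= 0 is how far the datum lies behind the newest one.
    Constant mu and sigma and the sign convention are assumptions.
    """
    brownian = rng.gauss(0.0, math.sqrt(age))  # B(age) ~ N(0, age)
    jumps = sum(rng.gauss(mu_j, sd_j)
                for _ in range(poisson_count(lam, age, rng)))
    return math.exp(-mu * age - sigma * brownian + jumps) / tau


def sample_error(weight, targets, outputs):
    """Weighted squared error of one sample: 0.5 * weight * sum (T_k - o_k)^2."""
    return 0.5 * weight * sum((t - o) ** 2 for t, o in zip(targets, outputs))
```

With zero volatility and no jumps the weight reduces to a deterministic exponential decay in the age of the datum, which is the "recent data count more" effect described in the text; the Brownian and jump terms perturb that decay randomly.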

The data are divided into two parts: the data from 2003 to 2007 are used for training and the rest are used for testing. For the stock indices, the input layer takes five quantities: the daily opening price, daily closing price, daily highest price, daily lowest price, and daily trade volume; the output layer gives one quantity, the closing price of the next trading day. For the crude oil prices, the input layer takes five prices: the crude oil prices of Brent, WTI, Dubai, Daqing, and Shengli; the output layer gives the crude oil price of Daqing (or Shengli) on the next trading day. The number of neural nodes is 5 in the input layer, 13 in the hidden layer, and 1 in the output layer. In this section, we take 𝜇𝐽 and 𝜎𝐽 to be the mean and the variance of the real historical data of SHCI, and let the intensity 𝜆 be 1/30. That is to say, a jump happens 10 times a year on average. Moreover, we suppose that the values of the vector (𝜇(𝑡),𝜎(𝑡)) are (1,1). The training procedure of the neural network is described as follows.

Step 1. Normalize the data as follows: $S(t)' = (S(t) - \min S(t))/(\max S(t) - \min S(t))$.

Step 2. At the beginning of data processing, the connective weights $V_i$ and $W_j$ follow the uniform distribution on $(-1, 1)$, and the neural thresholds $\theta_k$, $\theta_j$ are set to 0.

Step 3. Introduce the jump stochastic time effective function 𝜙(𝑡) in the error function 𝑒(𝑛,𝑡). Choose different volatility parameters. Give the transfer function from the input layer to the hidden layer and the transfer function from the hidden layer to the output layer.

Step 4. Establish an error-acceptable model and set the preset minimum error. If the output error is below the preset minimum error, go to Step 6; otherwise go to Step 5.

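Step 5. Modify the connective weights by calculating $\delta$ backward. For the node in the output layer,

$$\delta_o(n) = \frac{1}{\tau} \exp\Bigl\{ -\int_{t_n}^{t_1} \mu(t)\,dt - \int_{t_n}^{t_1} \sigma(t)\,dB(t) + \sum_{l=1}^{N(t_1 - t_n)} J_l \Bigr\} \, [T(n) - o(n)] \, o(n) \, [1 - o(n)]. \qquad (3.5)$$

For the node in the hidden layer,

$$\delta_k(n) = \frac{1}{\tau} \exp\Bigl\{ -\int_{t_n}^{t_1} \mu(t)\,dt - \int_{t_n}^{t_1} \sigma(t)\,dB(t) + \sum_{l=1}^{N(t_1 - t_n)} J_l \Bigr\} \, o(n) \, [1 - o(n)] \sum_{h} W_j \delta_h(n), \qquad (3.6)$$

where $o(n)$ is the output of the neuron $n$, $T(n)$ is the actual value of the neuron $n$ in the data sets, $o(n)[1 - o(n)]$ is the derivative of the sigmoid activation function, and $h$ runs over the nodes in the next layer that are connected to node $k$. Then modify the weights from this layer to the previous layer:

$$W_j(n+1) = W_j(n) + \eta\,\delta_o(n)\,y(n) \quad \text{or} \quad V_i(n+1) = V_i(n) + \eta\,\delta_k(n)\,x(n), \qquad (3.7)$$

where $\eta$ is the learning step, which usually takes a constant value between 0 and 1.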

Step 6. Output the predictive value.
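
The six steps can be sketched as a small weighted backpropagation loop. The code below is a toy illustration, not the authors' program: it assumes one output node, keeps the thresholds fixed at 0, takes the time effective weight of each sample as a precomputed number `phi`, and applies the plain chain rule, so that `phi` enters once through the output delta. The toy data, the function names, and the parameter values (`m=4`, `eta=0.5`, `epochs=300`) are made up for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def normalize(series):
    """Step 1: rescale a series to [0, 1] by min-max normalization."""
    lo, hi = min(series), max(series)
    return [(s - lo) / (hi - lo) for s in series]


def train(samples, weights, m=13, eta=0.1, epochs=200, tol=1e-4, seed=0):
    """samples: list of (inputs, target); weights: one phi value per sample.
    Returns the trained weights and the mean weighted error per epoch."""
    rng = random.Random(seed)
    p = len(samples[0][0])
    # Step 2: weights uniform on (-1, 1); thresholds are zero and left fixed.
    V = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(p)]
    W = [rng.uniform(-1, 1) for _ in range(m)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for (x, target), phi in zip(samples, weights):
            y = [sigmoid(sum(V[i][j] * x[i] for i in range(p)))
                 for j in range(m)]
            o = sigmoid(sum(W[j] * y[j] for j in range(m)))
            total += 0.5 * phi * (target - o) ** 2      # Step 3: weighted error
            d_o = phi * (target - o) * o * (1.0 - o)    # Step 5: output delta
            d_h = [y[j] * (1.0 - y[j]) * W[j] * d_o for j in range(m)]
            for j in range(m):
                W[j] += eta * d_o * y[j]
                for i in range(p):
                    V[i][j] += eta * d_h[j] * x[i]
        history.append(total / len(samples))
        if history[-1] < tol:                           # Step 4: stopping rule
            break
    return V, W, history


# Toy data: two inputs on a grid and a smooth target inside (0, 1).
grid = [0.0, 0.5, 1.0]
samples = [([a, b], 0.25 + 0.25 * (a + b)) for a in grid for b in grid]
weights = [math.exp(-0.1 * age) for age in range(len(samples))]
V, W, history = train(samples, weights, m=4, eta=0.5, epochs=300)
```

Samples with a small `phi` contribute little to both the error and the weight updates, which is how older observations are discounted during training.
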

Next, based on computer simulations of the given neural network model, we compare the predictive data of the model with the actual data of SHCI, SZCI, SZPI, Daqing, Shengli, and SINOPEC. The comparison results are plotted in Figure 4.

In Figure 5, we use the linear regression method to compare the predictive data of the neural network model with the actual data of SHCI, SZCI, SZPI, Daqing, Shengli, and SINOPEC. Linear regression models the relationship between two variables by fitting a linear equation to observed data, and it is commonly used to fit a predictive model to an observed data set of two variables. The regression analysis yields a different linear equation for each of SHCI, SZCI, SZPI, Daqing, Shengli, and SINOPEC in Figure 5. We set the predictive data as the 𝑥-axis and the actual data as the 𝑦-axis, and the linear equation is 𝑦=𝑎𝑥+𝑏. A valuable numerical measure of association between two variables is the correlation coefficient 𝑟. Table 1 shows the values of 𝑎, 𝑏, and 𝑟 for the indices.
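
As an illustration of this regression step, the slope, intercept, and correlation coefficient can be obtained by ordinary least squares. The sketch below assumes that predictive values play the role of x and actual values the role of y; the function name `linear_fit` and the sample numbers are hypothetical and are not taken from Table 1.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares fit y = a*x + b and Pearson correlation r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = my - a * mx
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


# Made-up predictive (x) and actual (y) values lying exactly on a line.
a, b, r = linear_fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

For a perfect forecast the fitted line would be y = x, so a slope near 1, an intercept near 0, and r near 1 indicate close agreement between predictive and actual data.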

3.2. Experiment Analysis

In Section 3.1, the financial price series are modeled by the neural network system. In order to evaluate the prediction of the model, we introduce some statistics in this section: mean absolute error (MAE), mean relative error (MRE), Theil’s inequality coefficient (Theil’s IC), bias proportion (BP), variance proportion (VP), and covariance proportion (CP). We let $x_i$, $y_i$, $\bar{x}$, $\bar{y}$, $\sigma_x$, $\sigma_y$, and $r$ denote the predictive value, the actual value, the mean of the predictive values, the mean of the actual values, the standard deviation of the predictive values, the standard deviation of the actual values, and the correlation coefficient, respectively. These statistics are defined as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |x_i - y_i|, \qquad \mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{x_i - y_i}{y_i} \right|, \qquad \text{Theil's IC} = \frac{\sqrt{(1/n) \sum_{i=1}^{n} (x_i - y_i)^2}}{\sqrt{(1/n) \sum_{i=1}^{n} x_i^2} + \sqrt{(1/n) \sum_{i=1}^{n} y_i^2}}, \qquad (3.8)$$

where the value of Theil’s IC lies in [0,1], and a smaller value means a better prediction of the model. Further,

$$\mathrm{BP} = \frac{(\bar{x} - \bar{y})^2}{\sum_{i=1}^{n} (x_i - y_i)^2 / n}, \qquad \mathrm{VP} = \frac{(\sigma_x - \sigma_y)^2}{\sum_{i=1}^{n} (x_i - y_i)^2 / n}, \qquad \mathrm{CP} = \frac{2(1 - r)\sigma_x \sigma_y}{\sum_{i=1}^{n} (x_i - y_i)^2 / n} = 1 - \mathrm{BP} - \mathrm{VP}, \qquad (3.9)$$

where BP denotes the normalized difference between the mean of the predictive values and the mean of the actual values, and VP denotes the normalized difference between the spread of the predictive values and the spread of the actual values. Their values range from 0 to 1. The prediction of the model is effective when the value of CP is close to 1. Table 2 presents the computed values of the above statistics, and it also describes the degree of deviation between the predictive data and the actual data.
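
The six evaluation statistics can be computed directly from two aligned lists of predictive and actual values. The following Python sketch is illustrative only: it assumes that the dispersion terms are population standard deviations (the choice under which BP, VP, and CP sum to one), and the function name `forecast_metrics` and the sample values are made up.

```python
import math


def forecast_metrics(pred, actual):
    """MAE, MRE, Theil's IC and the bias/variance/covariance proportions."""
    n = len(pred)
    mse = sum((x - y) ** 2 for x, y in zip(pred, actual)) / n
    mx, my = sum(pred) / n, sum(actual) / n
    # Population standard deviations (an assumption of this sketch).
    sx = math.sqrt(sum((x - mx) ** 2 for x in pred) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in actual) / n)
    r = sum((x - mx) * (y - my) for x, y in zip(pred, actual)) / (n * sx * sy)
    return {
        "MAE": sum(abs(x - y) for x, y in zip(pred, actual)) / n,
        "MRE": sum(abs((x - y) / y) for x, y in zip(pred, actual)) / n,
        "TheilIC": math.sqrt(mse) / (math.sqrt(sum(x * x for x in pred) / n)
                                     + math.sqrt(sum(y * y for y in actual) / n)),
        "BP": (mx - my) ** 2 / mse,
        "VP": (sx - sy) ** 2 / mse,
        "CP": 2.0 * (1.0 - r) * sx * sy / mse,
    }


# Made-up predictive and actual values.
metrics = forecast_metrics([1.0, 2.0, 3.0, 5.0], [1.0, 2.0, 4.0, 4.0])
```

The function divides by the mean squared error, so it is undefined for a perfect forecast; a production version would need to handle that case.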

Next, we discuss the relationship between the crude oil price fluctuation of Daqing and the predictive values of the model. Figure 6(a) shows that when the fluctuation is small, the predictive values are close to the actual values; when the fluctuation is large, the predictive values deviate from the actual values to some extent. Figures 6(b) and 6(c) also show that small fluctuations lead to small relative errors and small error bars, whereas large fluctuations lead to large relative errors and large error bars. So there is a relationship between the fluctuation and the prediction. To investigate this relationship, we choose the predictive values and the actual values of Daqing as the research object. First, we measure the fluctuation by the absolute return, denoted by |𝑅(𝑡)|. Then we divide the data into five groups by absolute return intervals. The intervals are [0,0.01), [0.01,0.02), [0.02,0.03), [0.03,0.04), and [0.04,𝑀], where 𝑀 denotes the maximum of the absolute returns. Table 3 shows the relationship between the actual fluctuation and the prediction by absolute return interval.
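
The grouping by absolute return intervals can be sketched as follows. This is an assumption-laden illustration: the statistic reported per group here is the count and the mean relative prediction error, which may differ from the exact contents of Table 3, and the names `EDGES`, `interval_index`, and `error_by_fluctuation` as well as the price values are hypothetical.

```python
EDGES = [0.0, 0.01, 0.02, 0.03, 0.04]  # lower edges of the five intervals


def interval_index(abs_return):
    """Index of the interval containing abs_return; the last interval
    [0.04, M] collects everything from 0.04 upward."""
    g = 0
    while g + 1 < len(EDGES) and abs_return >= EDGES[g + 1]:
        g += 1
    return g


def error_by_fluctuation(actual, predicted):
    """Per interval: (number of days, mean relative error or None)."""
    sums = [0.0] * len(EDGES)
    counts = [0] * len(EDGES)
    for t in range(len(actual) - 1):
        # Absolute return of the actual price from day t to day t + 1.
        abs_ret = abs(actual[t + 1] / actual[t] - 1.0)
        # Relative error of the prediction made for day t + 1.
        rel_err = abs(predicted[t + 1] - actual[t + 1]) / actual[t + 1]
        g = interval_index(abs_ret)
        sums[g] += rel_err
        counts[g] += 1
    return [(c, s / c if c else None) for c, s in zip(counts, sums)]


# Made-up prices: one calm day followed by one day with a move of about 5%.
table = error_by_fluctuation([100.0, 100.5, 105.525], [100.0, 100.5, 100.5])
```

In this toy example the calm day falls in the first interval with zero error, while the large move falls in the last interval with a visibly larger relative error, mirroring the pattern described in the text.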

3.3. Return Analysis

In this section, we discuss the statistical properties of SHCI, SZCI, SZPI, Daqing, Shengli, and SINOPEC in the 7-year period from January 2003 to December 2009. Figure 7 presents the return time series of these indices. We denote the daily price at time $t$ by $S(t)$ ($t = 0, 1, 2, \ldots$); then the return of the stock price (or index) is given by

$$R(t) = \frac{S(t+1) - S(t)}{S(t)} = \frac{S(t+1)}{S(t)} - 1. \qquad (3.10)$$

Table 4 presents the statistical analysis of the returns for the actual data. Note that the daily price fluctuation is limited in China; that is, the daily returns of stock prices and stock indices are restricted to lie between $-10\%$ and $10\%$, whereas the returns of the crude oil price can vary over a larger range. Table 5 presents the statistical analysis of the returns for the predictive data. The two tables give the mean, variance, kurtosis, and skewness of the returns, so these values can be compared between the actual data and the predictive data.
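
The return series and its summary statistics can be computed with a few lines of Python. The sketch below makes two assumptions that the text does not settle: moments are population moments (divided by the number of observations), and kurtosis is reported without subtracting 3. The function names and the numbers are illustrative only.

```python
def returns(prices):
    """Simple returns R(t) = S(t+1) / S(t) - 1."""
    return [prices[t + 1] / prices[t] - 1.0 for t in range(len(prices) - 1)]


def moments(xs):
    """Mean, variance, skewness and kurtosis from population moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2


# Made-up prices and a made-up sample for the moment calculation.
daily_returns = returns([100.0, 110.0, 99.0])
mean, variance, skewness, kurtosis = moments([1.0, 2.0, 3.0, 4.0, 5.0])
```

Under this convention a normal distribution has kurtosis 3, so values above 3 in a return series indicate the fat tails discussed earlier.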

3.4. Detrended Fluctuation Analysis

Detrended fluctuation analysis (DFA) is a scaling analysis method that provides the scaling exponent $\alpha$ to represent the correlation properties [7, 16–18]. The DFA method has two advantages. One is that it permits the detection of long-range correlations embedded in seemingly nonstationary time series. The other is that it avoids the spurious detection of apparent long-range correlations that are artifacts of nonstationarity. Briefly, for a given stochastic time series $S(i)$, $i = 1, 2, \ldots, N$, with sampling period $\Delta t$, the DFA method can be implemented as follows.

Step 1. Compute the mean $\bar{S} = (1/N) \sum_{i=1}^{N} S(i)$ and obtain the integrated time series $y(j) = (1/N) \sum_{i=1}^{j} (S(i) - \bar{S})$. Then divide the integrated time series into boxes of equal size $n$.

Step 2. In each box, fit the integrated time series by a polynomial function $y_t(i)$. For order-$l$ DFA, a polynomial of order $l$ is used for the fitting; in this paper, $l = 2$. Then calculate the detrended fluctuation function as follows:

$$Y(i) = y(i) - y_t(i). \qquad (3.11)$$

Step 3. For a given box size $n$, calculate the root mean square fluctuation:

$$F(n) = \Bigl( \frac{1}{N} \sum_{i=1}^{N} [Y(i)]^2 \Bigr)^{1/2}. \qquad (3.12)$$

A power-law relation between $F(n)$ and the box size $n$ indicates the presence of scaling: $F(n) \sim n^{\alpha}$. The parameter $\alpha$, called the scaling exponent or correlation exponent, represents the correlation properties of the time series: if $\alpha = 0.5$, the time series is uncorrelated; if $\alpha < 0.5$, the signal is anticorrelated; if $\alpha > 0.5$, there are positive correlations in the time series.

In this paper, we use DFA to analyze the absolute returns of the actual data and the predictive data; see Figure 8. Let $\alpha_A$ and $\alpha_P$ denote the scaling exponents of the absolute returns for the actual data and the predictive data, respectively. Table 6 shows that $\alpha_A$ and $\alpha_P$ are all larger than 0.5, which means that there are positive correlations in the absolute returns of the actual data and the predictive data.
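
The three DFA steps can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' code: boxes are non-overlapping and leftover points at the end of the series are dropped, the polynomial fit solves the normal equations by Gaussian elimination, the scaling exponent is the least squares slope of log F(n) against log n, and the box sizes and the synthetic test series are assumptions.

```python
import math
import random


def polyfit_values(ys, order):
    """Least squares polynomial fit of `ys` on centred positions;
    returns the fitted values (normal equations, Gaussian elimination)."""
    n, k = len(ys), order + 1
    xs = [i - (n - 1) / 2 for i in range(n)]
    A = [[sum(x ** (r + c) for x in xs) for c in range(k)] for r in range(k)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, k))) / A[r][r]
    return [sum(coef[r] * x ** r for r in range(k)) for x in xs]


def dfa_fluctuation(series, n, order=2):
    """Root mean square fluctuation F(n) for box size n."""
    N = len(series)
    mean = sum(series) / N
    y, acc = [], 0.0
    for s in series:
        acc += s - mean
        y.append(acc / N)  # the constant factor 1/N does not affect the slope
    boxes = N // n
    sq = 0.0
    for k in range(boxes):
        seg = y[k * n:(k + 1) * n]
        fit = polyfit_values(seg, order)
        sq += sum((u - v) ** 2 for u, v in zip(seg, fit))
    return math.sqrt(sq / (boxes * n))


def scaling_exponent(series, box_sizes, order=2):
    """Slope of log F(n) versus log n."""
    xs = [math.log(n) for n in box_sizes]
    ys = [math.log(dfa_fluctuation(series, n, order)) for n in box_sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (v - my) for x, v in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


# Synthetic check: white noise versus its running sum (a random walk).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
walk, total = [], 0.0
for v in noise:
    total += v
    walk.append(total)
alpha_noise = scaling_exponent(noise, [8, 16, 32, 64, 128])
alpha_walk = scaling_exponent(walk, [8, 16, 32, 64, 128])
```

Uncorrelated noise should give an exponent near 0.5 and a random walk a much larger one, which provides a quick sanity check before the method is applied to absolute returns.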

4. Conclusion

In this paper, we introduce the jump stochastic time effective neural network model to forecast the fluctuations of SHCI, SZCI, SZPI, Daqing, Shengli, and SINOPEC. The corresponding statistical behaviors of these indices are investigated, and several kinds of comparisons between the actual data and the predictive data are given. Further, the absolute returns of the actual data and the predictive data are studied by statistical methods and detrended fluctuation analysis.

Acknowledgments

The authors were supported in part by National Natural Science Foundation of China Grant nos. 70771006 and 10971010, and BJTU Foundation grant no. S11M00010.