Crude Oil Price Prediction Based on a Dynamic Correcting Support Vector Regression Machine

Shu-rong, Li; Yu-lei, Ge

doi:https://doi.org/10.1155/2013/528678

Abstract and Applied Analysis

Special Issue

Artificial Intelligence and Data Mining: Algorithms and Applications

Research Article | Open Access

Volume 2013 | Article ID 528678 | https://doi.org/10.1155/2013/528678

Li Shu-rong¹ and Ge Yu-lei¹

Academic Editor: Fuding Xie

Received 10 Dec 2012

Accepted 28 Jan 2013

Published 18 Mar 2013

Abstract

A new accurate method for predicting crude oil price is presented, based on an ε-support vector regression (ε-SVR) machine with a dynamic correction factor that corrects forecasting errors. We also propose a hybrid RNA genetic algorithm (HRGA) in which the position displacement idea of bare bones particle swarm optimization (PSO) changes the mutation operator. The validity of the algorithm is tested on three benchmark functions. Comparing the results obtained with HRGA and with the standard RNA genetic algorithm (RGA), the mean optimum found by HRGA is smaller than that of RGA on all three functions. Finally, to make the forecasting result more accurate, HRGA is applied to optimize the parameters of ε-SVR. Over twenty months of WTI crude oil prices, the resulting predictions have a largest absolute relative error of 7.35% and a root-mean-square relative error of 3.87%. The proposed method can be readily used to predict crude oil price in practice.

1. Introduction

In recent years, crude oil prices have experienced four jumps and two slumps. The fluctuation of crude oil price adds uncertainty to the development of the world economy. Grasping the changes in oil price can provide guidance for economic development [1]. Therefore, it is very important to predict the crude oil price accurately.

Predicting methods fall into two groups. One is qualitative [2]; the other is quantitative, such as econometric models and statistical models [3, 4], and it is the one adopted by most scholars. Predicting crude oil price is nevertheless difficult, since the price is a nonlinear and nonstationary time series [5]. Traditional predicting methods are based on linear models. They are only suitable for linear prediction and cannot be applied to model and predict nonlinear time series [6]. Wang obtained a predicting model by using time series and an artificial neural network in 2005 [7], Xie proposed a new method for crude oil price forecasting based on the support vector machine (SVM) in 2006 [8], Mohammad proposed a hybrid artificial intelligence model for crude oil price forecasting by means of feed-forward neural networks and a genetic algorithm in 2007 [9], and Guo proposed a hybrid time series model on the basis of the GMTD model in 2010 [10]. The experimental results show that the prediction accuracy of these methods is better than that of the traditional models. But the results still contain sizable errors, especially when the crude oil price fluctuates violently.

Neural network techniques provide a favorable tool for nonlinear time series forecasting. But the predictive ability of conventional neural networks is low because of problems such as local minima, overlearning, and the lack of theoretical guidance for selecting the hidden layer nodes. The SVM was proposed in the 1990s [11]; it can obtain the optimal result on the basis of the currently available information. The basic idea of SVM is to control the capacity of the fitting function by regulating the upper bound of the VC dimension, which is also reflected in the number of support vectors. Compared with neural networks [12, 13], SVM has strong generalization ability when learning from small samples and depends less on the quantity of data. But the prediction performance of SVM is very sensitive to parameter selection, and research on parameter optimization of SVM is still scarce. The parameters are usually determined by experience or by trial and error, and if they are not suitably chosen, the SVM will predict poorly. So it is important to find a good method for obtaining the optimal parameters of SVM.

In this paper, an ε-support vector regression machine with dynamic correction factor is proposed, together with a novel hybrid RNA genetic algorithm (HRGA) for obtaining the optimal parameters of the SVM. The HRGA builds on the development of biological science and technology, through which the structure and information of RNA molecules have become well understood. To improve the optimization performance of genetic algorithms, a genetic algorithm based on base coding and biological molecular operations has attracted wide attention [14]. This method improves search efficiency and optimization performance by coding the individuals as biological molecules by means of bases [15, 16]. An appropriate mutation operator can improve population diversity and prevent premature convergence. However, the mutation operator of the classical RNA genetic algorithm (RGA) is fixed, so a suitable method for determining the mutation operator is needed. In 2003, Kennedy improved particle swarm optimization (PSO) and proposed the bare bones particle swarm algorithm [17].

In the proposed HRGA, the position displacement idea of bare bones PSO is applied to change the mutation operator. The nucleotide base encoding, RNA recoding operation, and protein folding operation are retained in the new algorithm, so the strong global search capability is kept. At the same time, to give the local search a direction, the optimal experience of the whole population and the historical experience of individuals are used, which improves the convergence speed and solution precision. Furthermore, to test the validity of HRGA, three benchmark functions are adopted. The mean optimum found by HRGA is smaller than that of the traditional RNA genetic algorithm.

Once the support vector regression machine is designed optimally, it can be used to predict crude oil price. The dynamic correction factor is introduced to improve the predictive effect and to strengthen the robustness of the system. To test the performance of the proposed predicting method, we also provide the predicting results of a back propagation neural network and a traditional support vector regression machine, both likewise improved with the dynamic correction factor [7, 8]. The results show that our predicting method is more accurate than the other two.

The paper is organized as follows. Section 2 discusses the support vector regression machine with dynamic correction factor. Section 3 presents HRGA based on bare bones PSO, and some testing examples are applied to verify the effectiveness of the algorithm. Section 4 applies the dynamic correcting ε-SVR to predict the crude oil price. Section 5 concludes the paper.

2. Support Vector Regression Machine with Dynamic Correction Factor

Consider the training sample set $\{(x_i, y_i)\}$, $i = 1, 2, \ldots, l$, with $x_i \in \mathbb{R}^n$ as the input variable and $y_i \in \mathbb{R}$ as the output variable.

The basic idea of SVM is to find a nonlinear mapping $\phi$ from input space to output space [18–20]. The data are mapped to a high-dimensional characteristic space by this nonlinear mapping. The estimating function of linear regression in the characteristic space is as follows:

$$f(x) = w \cdot \phi(x) + b, \tag{1}$$

where $w$ denotes the weight vector and $b$ denotes the threshold value.

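The function approximation problem is equivalent to minimizing the following function:

$$R_{\mathrm{reg}}(f) = C \sum_{i=1}^{l} L\big(y_i, f(x_i)\big) + \lambda \|w\|^2 = C\,R_{\mathrm{emp}}(f) + \lambda \|w\|^2, \tag{2}$$

where $R_{\mathrm{reg}}$ denotes the objective function, $R_{\mathrm{emp}}$ denotes the empirical risk function (the loss $L$ summed over the samples), $l$ denotes the sample quantity, $\lambda$ denotes the adjusting constant, and $C$ denotes the error penalty factor. $\|w\|^2$ reflects the complexity of $f$ in the high-dimensional characteristic space.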

Since the linear $\varepsilon$-insensitive loss function has better sparsity, we adopt the following loss function:

$$L_\varepsilon\big(y, f(x)\big) = \max\{0,\ |y - f(x)| - \varepsilon\}. \tag{3}$$

The empirical risk function is as follows:

$$R_{\mathrm{emp}}(f) = \sum_{i=1}^{l} \max\{0,\ |y_i - f(x_i)| - \varepsilon\}. \tag{4}$$
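As a small illustration of the ε-insensitive loss and the empirical risk built from it, the following sketch may help. It assumes the empirical risk is the plain sum of the per-sample losses (an averaged variant would divide by the number of samples); the function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def eps_insensitive_loss(y, f_x, eps):
    """Return max(0, |y - f(x)| - eps): errors inside the eps-tube cost nothing."""
    return max(0.0, abs(y - f_x) - eps)


def empirical_risk(ys, preds, eps):
    """Sum of eps-insensitive losses over all samples (assumed: sum, not average)."""
    if len(ys) != len(preds):
        raise ValueError("ys and preds must have the same length")
    return sum(eps_insensitive_loss(y, p, eps) for y, p in zip(ys, preds))


# Illustrative values only (not data from the paper).
ys = [20.0, 21.0, 19.0]
preds = [20.3, 22.0, 19.0]
risk = empirical_risk(ys, preds, eps=0.5)
```

Only the second sample contributes here: its error of 1.0 exceeds the tube width 0.5, while the other two errors fall inside the tube and are ignored.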

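According to statistical learning theory, we introduce two groups of nonnegative slack variables $\xi_i$ and $\xi_i^*$. Then the problem can be converted to the following nonlinear ε-support vector regression machine (ε-SVR) problem:

$$\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*)$$

$$\text{s.t.}\quad y_i - w \cdot \phi(x_i) - b \le \varepsilon + \xi_i,\quad w \cdot \phi(x_i) + b - y_i \le \varepsilon + \xi_i^*,\quad \xi_i,\ \xi_i^* \ge 0,\quad i = 1, \ldots, l, \tag{5}$$

where $\varepsilon$ denotes the parameter of the insensitive loss function and $C$ is used to balance the complexity term and the training error of the model.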

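Introducing the Lagrange multipliers $\alpha_i$ and $\alpha_i^*$, the convex quadratic programming problem above can be changed into the following dual problem:

$$\max_{\alpha,\,\alpha^*}\ -\frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*)$$

$$\text{s.t.}\quad \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0,\quad 0 \le \alpha_i,\ \alpha_i^* \le C,\quad i = 1, \ldots, l, \tag{6}$$

where $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$ denotes the inner product kernel satisfying the Mercer theorem.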

Solving the above dual problem gives the regression function:

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x_i, x) + b. \tag{7}$$
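To make the form of the resulting regression function concrete, here is a minimal sketch that evaluates a kernel expansion of the standard ε-SVR type, a weighted sum of kernel values plus a threshold. The Gaussian kernel scaling `2 * sigma**2`, the function names, and the coefficient values are assumptions for illustration; in practice the coefficients come from solving the dual problem.

```python
import math


def gaussian_kernel(x, z, sigma):
    """Gaussian kernel exp(-||x - z||^2 / (2 sigma^2)); the 2*sigma^2 scaling is assumed."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i^*) K(x_i, x) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i^* for support vector i.
    """
    return sum(c * gaussian_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, dual_coefs)) + b


# Hypothetical coefficients, for illustration only.
svs = [(0.0,), (1.0,)]
coefs = [0.5, -0.5]
value_at_sv = svr_predict((0.0,), svs, coefs, b=2.0, sigma=1.0)
value_midway = svr_predict((0.5,), svs, coefs, b=2.0, sigma=1.0)
```

Midway between the two support vectors the two kernel terms cancel, so the prediction equals the threshold `b`.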

When ε-SVR is used for prediction, it may show a certain error because the data, such as the crude oil price, fluctuate violently. To reduce this error as far as possible, we introduce the dynamic correction factor $\beta$. The main idea of the dynamic correction factor is to use the error of the previous step, multiplied by $\beta$, to revise the current predicting result, which reduces the current predicting error. The dynamic correcting SVR can be defined as follows:

$$\hat{y}(k) = f(k) + \beta\, e(k-1), \tag{8}$$

where $e(k-1)$ is the error of the previous step with respect to the real result $y(k-1)$, $\hat{y}$ denotes the final prediction result, $f$ denotes the initial predicting result, $\beta$ denotes the dynamic correction factor, and $k$ denotes the prediction step.
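The correction idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the previous-step error is taken to be the real value minus the initial (uncorrected) prediction, the first step is left uncorrected because no earlier error exists, and the names `dynamic_correct` and `beta` are hypothetical.

```python
def dynamic_correct(initial_preds, actuals, beta):
    """Correct each initial prediction with the scaled error of the previous step.

    initial_preds[k] : uncorrected model output at step k
    actuals[k]       : observed value at step k (only steps before k are used)
    beta             : dynamic correction factor
    Assumption: the previous-step error is actual minus *initial* prediction.
    """
    corrected = []
    for k, pred in enumerate(initial_preds):
        if k == 0:
            corrected.append(pred)  # no earlier error is available at the first step
        else:
            prev_error = actuals[k - 1] - initial_preds[k - 1]
            corrected.append(pred + beta * prev_error)
    return corrected


# Illustrative numbers only: the model lags a rising series by 1.0 at every step.
actuals = [18.0, 19.0, 20.0, 21.0]
initial = [17.0, 18.0, 19.0, 20.0]
final = dynamic_correct(initial, actuals, beta=0.5)
```

With a persistent lag of 1.0 and a factor of 0.5, every corrected prediction after the first moves halfway toward the real value.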

In order to make the predicting results more accurate, the optimal value of the dynamic correction factor $\beta$ (in (8)) and the parameters of ε-SVR, including $C$, $\varepsilon$, and $\sigma$ (the variable in the Gaussian kernel function), should be designed. To this end, an HRGA is studied below to optimize the following problem:

3. HRGA Based on Bare Bones PSO

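Assume that the population size is $N$ and the dimension of each particle is $D$. The position of particle $i$ at generation $t$ is $x_i(t) = (x_{i1}(t), x_{i2}(t), \ldots, x_{iD}(t))$. The speed of particle $i$ at generation $t$ is $v_i(t) = (v_{i1}(t), v_{i2}(t), \ldots, v_{iD}(t))$. The historic optimal value of individual $i$ is $p_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$.

Let the global optimal value be $p_g = (p_{g1}, p_{g2}, \ldots, p_{gD})$.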

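For the standard particle swarm, with $p_i$ the historic optimum of particle $i$ and $p_g$ the global optimum, the speed and position are updated as

$$v_{ij}(t+1) = \omega\, v_{ij}(t) + c_1 r_1 \big(p_{ij} - x_{ij}(t)\big) + c_2 r_2 \big(p_{gj} - x_{ij}(t)\big), \qquad x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1), \tag{10}$$

where $\omega$ denotes the inertia weight [21], $c_1$ and $c_2$ denote the accelerating operators, and $r_1$ and $r_2$ are uniformly distributed random numbers in $[0, 1]$.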
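A minimal sketch of one update of the standard particle swarm for a single particle is given below. The default values of `omega`, `c1`, and `c2` are common illustrative choices and are not settings reported in this paper; the function name is hypothetical.

```python
import random


def pso_step(x, v, pbest, gbest, omega=0.7, c1=1.5, c2=1.5, rng=random):
    """One velocity-and-position update for a single particle (lists of floats)."""
    new_x, new_v = [], []
    for j in range(len(x)):
        r1, r2 = rng.random(), rng.random()  # uniform random numbers in [0, 1)
        vj = (omega * v[j]
              + c1 * r1 * (pbest[j] - x[j])   # pull toward the particle's own best
              + c2 * r2 * (gbest[j] - x[j]))  # pull toward the population's best
        new_v.append(vj)
        new_x.append(x[j] + vj)
    return new_x, new_v


# When the particle already sits on both optima, only inertia moves it.
new_x, new_v = pso_step([1.0, 2.0], [0.5, -1.0], [1.0, 2.0], [1.0, 2.0])
```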

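In the bare bones particle swarm optimization (PSO), (10) is replaced by (11) as the evolution equation of the particle swarm algorithm:

$$x_{ij}(t+1) = N\!\left(\frac{p_{ij} + p_{gj}}{2},\ |p_{ij} - p_{gj}|\right), \tag{11}$$

where $p_i$ is the historic optimum of particle $i$ and $p_g$ is the global optimum.

The position of each particle is thus a random number drawn from the Gaussian distribution. The distribution has the mean value $(p_{ij} + p_{gj})/2$ and the standard deviation $|p_{ij} - p_{gj}|$.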
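The bare bones position update can be sketched as follows, assuming the usual form in which each coordinate is sampled from a Gaussian centred midway between the individual and global optima, with their distance as the standard deviation. The function name is hypothetical.

```python
import random


def bare_bones_position(pbest, gbest, rng=random):
    """Sample a new position, coordinate by coordinate, from a Gaussian
    centred midway between pbest and gbest with spread |pbest - gbest|."""
    return [rng.gauss((p + g) / 2.0, abs(p - g)) for p, g in zip(pbest, gbest)]


rng = random.Random(0)  # seeded generator so the sketch is repeatable
sample = bare_bones_position([1.0, 4.0], [3.0, 4.0], rng)
collapsed = bare_bones_position([1.0, 2.0], [1.0, 2.0], rng)
```

Where the individual and global optima coincide in a coordinate, the standard deviation is zero and that coordinate stops moving, which is why the search narrows as the population converges.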

The RNA genetic algorithm is based on base coding and biological molecular operations. In biological molecules, every three bases compose one amino acid, so the number of bases in an individual must be divisible by 3. For the RNA recoding and protein folding operations [22], to reduce calculation and to control the population size, we assume that the protein folding operation only occurs on individuals that have not undergone RNA recoding. The most important remaining task is then to change the mutation probability [23, 24].
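As an illustration of base coding, the sketch below reads a string of RNA bases as a base-4 number and scales it onto a parameter interval, and splits an individual into three-base groups. The digit assigned to each base, the linear scaling, and both function names are assumptions made for this example; the text above does not specify them.

```python
BASE_DIGITS = {"C": 0, "U": 1, "A": 2, "G": 3}  # assumed mapping, for illustration


def decode_bases(bases, low, high):
    """Map a string of RNA bases to a real number in [low, high].

    The string is read as a base-4 integer and scaled linearly onto the interval.
    """
    value = 0
    for base in bases:
        value = value * 4 + BASE_DIGITS[base]
    return low + (high - low) * value / (4 ** len(bases) - 1)


def codons(individual):
    """Split an individual into three-base groups, one per amino acid."""
    if len(individual) % 3 != 0:
        raise ValueError("number of bases must be divisible by 3")
    return [individual[i:i + 3] for i in range(0, len(individual), 3)]


lowest = decode_bases("CCC", 0.0, 1.0)
highest = decode_bases("GGG", 0.0, 1.0)
groups = codons("AUGCCU")
```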

Angeline pointed out in 1998 that position updating in a particle swarm is in essence a mutation operation [25]. The traditional RNA genetic algorithm mutates with a fixed mutation probability, and its mutation is random rather than guided. HRGA, in contrast, reflects the historic information of individuals and the shared information of the population, so every individual mutates in a directed way and the search efficiency improves. Moreover, HRGA keeps the strong global search capability, since it does not change the selection and crossover operators.

The procedure of HRGA based on bare bones particle swarm algorithm to optimize the parameters and the dynamic correction factor is as follows.

Step 1. Randomly generate one group of parameters and the dynamic correction factor, code every parameter, and get the initial RNA population with $N$ individuals, crossover probability $p_c$, and mutation probability $p_m$. Assign values for every $p_i$ (individual's historic optimal solution) and $p_g$ (population's global optimal solution).

Step 2. Compute the error function of each individual and get its fitness value. Compare it with the corresponding fitness values of the individual's historic optimal solution $p_i$ and the population's global optimal solution $p_g$, then update $p_i$ and $p_g$.

Step 3. Execute the selection operation. Get the current generation by copying individuals from the initial or the last generation.

Step 4. Decide whether the value meets the RNA recoding condition or not. If , recode RNA, then go to Step 6. If , go to Step 5.

Step 5. Decide meet the protein mutual folding condition or not. If , execute the protein mutual folding operation. If , execute the protein own folding operation.

Step 6. Execute the mutation operation as (11) for all the crossover individuals, on the basis of the individual historic optimal solutions $p_i$ and the global optimal solution $p_g$ obtained in Step 2.

Step 7. Repeat Step 2 to Step 6 until the training target meets the condition. At last, we get the optimal parameters of ε-SVR and the dynamic correction factor.

The flowchart of HRGA to optimize the parameters and the dynamic correction factor is shown in Figure 1.
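The overall loop can be sketched in a heavily simplified form. This is not the full HRGA: crossover, RNA recoding, and protein folding are left out, individuals are plain real vectors instead of base strings, binary tournament selection is an assumed scheme, and the Sphere function stands in for the ε-SVR error function. Only the memory update, the selection, and the directional bare bones mutation are kept.

```python
import random


def sphere(x):
    return sum(v * v for v in x)


def simplified_hrga(fitness, dim, bounds, pop_size=20, generations=50, pm=0.3, seed=0):
    """Minimise `fitness` with selection plus bare-bones directional mutation.

    Crossover, RNA recoding and protein folding are deliberately left out, and
    individuals are plain real vectors rather than base strings.
    """
    rng = random.Random(seed)
    low, high = bounds
    pop = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(pop_size)]
    pbest = [ind[:] for ind in pop]      # historic best of each individual
    gbest = min(pop, key=fitness)[:]     # best solution seen by the population
    initial_best = fitness(gbest)

    for _generation in range(generations):
        # Update the individual and global memories.
        for i, ind in enumerate(pop):
            if fitness(ind) < fitness(pbest[i]):
                pbest[i] = ind[:]
            if fitness(ind) < fitness(gbest):
                gbest = ind[:]
        # Binary tournament selection (an assumed scheme); memories follow the copies.
        chosen = []
        for _slot in range(pop_size):
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            chosen.append(a if fitness(pop[a]) <= fitness(pop[b]) else b)
        pop = [pop[i][:] for i in chosen]
        pbest = [pbest[i][:] for i in chosen]
        # Directional mutation: resample genes around pbest and gbest.
        for i, ind in enumerate(pop):
            for j in range(dim):
                if rng.random() < pm:
                    mu = (pbest[i][j] + gbest[j]) / 2.0
                    sigma = abs(pbest[i][j] - gbest[j])
                    ind[j] = min(high, max(low, rng.gauss(mu, sigma)))
    # Final memory update so the last mutation round is taken into account.
    for ind in pop:
        if fitness(ind) < fitness(gbest):
            gbest = ind[:]
    return gbest, fitness(gbest), initial_best


best, best_value, start_value = simplified_hrga(sphere, dim=3, bounds=(-5.0, 5.0))
```

Because the global optimum is only replaced by a strictly better individual, the returned value can never be worse than the best member of the initial population.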

3.1. HRGA Testing

Three classical benchmark functions, shown in Table 1, are used to test the performance of HRGA.

Among the three functions, Sphere is a unimodal function, and the other two are multimodal functions.

With the population size fixed and the other parameters determined by repeated trials for each function, each function is tested with HRGA and the standard RGA in different dimensions. Each experiment is run 100 times, and the mean value of the optimum of the target function is recorded as in (12). The results are displayed in Table 2:

$$\bar{f} = \frac{1}{100} \sum_{k=1}^{100} f_k. \tag{12}$$

In this equation, $\bar{f}$ denotes the mean value of the target function's optimum and $f_k$ denotes the optimum of the benchmark function found in the $k$th experiment.
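The averaging over repeated experiments can be illustrated as below. The optimiser used here is a plain random search that serves only as a placeholder; it is neither HRGA nor RGA, and the dimension, bounds, and evaluation budget are arbitrary illustrative values.

```python
import random


def sphere(x):
    """Sphere benchmark: unimodal, minimum 0 at the origin."""
    return sum(v * v for v in x)


def random_search(fitness, dim, low, high, evaluations, rng):
    """Placeholder optimiser (NOT HRGA or RGA): best of uniformly random points."""
    return min(fitness([rng.uniform(low, high) for _ in range(dim)])
               for _ in range(evaluations))


def mean_optimum(optimiser, runs=100, seed=0):
    """Average the optimum found over independent runs."""
    rng = random.Random(seed)
    optima = [optimiser(rng) for _ in range(runs)]
    return sum(optima) / len(optima), optima


mean_value, optima = mean_optimum(
    lambda rng: random_search(sphere, dim=2, low=-5.0, high=5.0, evaluations=50, rng=rng)
)
```

For a minimisation benchmark whose true optimum is 0, a smaller mean value indicates that the algorithm lands closer to the optimum on average.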

In the experimental results, for every tested dimension and with the same number of iterations, the mean optimum of HRGA is smaller than that of RGA on all three benchmark functions, so the average performance of HRGA is closer to the optimum. Since the mutation operator of HRGA performs a directional local search, the mutation probability can be increased appropriately to enhance the convergence speed.

4. Crude Oil Price Prediction Based on a Dynamic Correcting SVR

In this paper, the crude oil price data are taken from the website of the US Energy Information Administration [26]. Since the oil price fluctuates violently, in order to facilitate the processing and decrease the error, we adopt the monthly Cushing, OK WTI Spot Price FOB (dollars per barrel) from January 1986 to the present. We take the one hundred data points from January 1986 to April 1994 as the training sample and give the dynamic predictions for the next 20 months, from May 1994 to December 1995. The relative error of forecasting is shown in Table 3. The prediction result of HRGA and ε-SVR with dynamic correction factor is shown in Figure 2. We use the Gaussian function as the kernel function of ε-SVR, which is given as follows:

The parameter settings of HRGA-ε-SVR are as follows: population size 100, maximum evolution generation 150, coding lengths of 9, 8, 13, and 8 for the four encoded variables, crossover probability 0.8, and mutation probability 0.1.
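The sample split described above (the first 100 monthly observations for building the model, followed by a 20-month prediction window) can be checked with a short sketch. The helper name is hypothetical and no price data are involved; only the month indexing is shown.

```python
def month_range(start_year, start_month, count):
    """Return `count` consecutive (year, month) pairs starting at the given month."""
    months = []
    year, month = start_year, start_month
    for _ in range(count):
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


months = month_range(1986, 1, 120)
fit_months = months[:100]       # January 1986 to April 1994
predict_months = months[100:]   # May 1994 to December 1995
```

The boundaries confirm the counts in the text: 100 months end in April 1994, and the following 20 months end in December 1995.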

The optimization interval is set to be

When analyzing the results, we define the evaluation index:

The forecasting error analysis results are shown in Figure 3. In this figure, SVM refers to ε-SVR. The BP neural network and ε-SVR are both equipped with the dynamic correction factor, which distinguishes them from the traditional methods. Figure 2 shows that the prediction result is very close to the real value, so HRGA-ε-SVR can be used to predict the crude oil price. Table 3 lists the relative errors of the WTI crude oil price predictions over the twenty months. Among the three methods, HRGA-ε-SVR has the smallest maximum absolute relative error, 7.35%, and the smallest root-mean-square relative error, 3.87%. In Figure 3, the fluctuation range of the HRGA-ε-SVR errors is clearly smaller than those of the other two methods. This means that HRGA-ε-SVR is the best of the three methods.
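The two summary figures quoted above, the largest absolute relative error and the root-mean-square of the relative error, can be computed as in the following sketch. The sign convention (prediction minus real value, divided by the real value) and the function names are assumptions, and the prices are illustrative rather than the paper's data.

```python
import math


def relative_errors(actuals, preds):
    """Signed relative error of each forecast, in percent (sign convention assumed)."""
    return [100.0 * (p - a) / a for a, p in zip(actuals, preds)]


def max_abs_relative_error(actuals, preds):
    """Largest absolute relative error over the forecast horizon, in percent."""
    return max(abs(e) for e in relative_errors(actuals, preds))


def rms_relative_error(actuals, preds):
    """Root-mean-square of the relative errors, in percent."""
    errs = relative_errors(actuals, preds)
    return math.sqrt(sum(e * e for e in errs) / len(errs))


# Illustrative prices only (not the paper's data).
actuals = [20.0, 25.0]
preds = [21.0, 24.0]
worst = max_abs_relative_error(actuals, preds)
rms = rms_relative_error(actuals, preds)
```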

5. Conclusions

In this paper, we have presented a novel method for predicting crude oil price. The method is based on an ε-support vector regression machine with a dynamic correction factor that corrects predicting errors. We also proposed the HRGA, in which the position displacement idea of bare bones PSO changes the mutation operator, to optimize the parameters of the ε-SVR. The crude oil price predictions show the validity of the proposed method. Thus, the ε-SVR model can also be applied to predict trends in other practical areas.

Acknowledgments

The research was partially supported by Grant no. 60974039 from the National Science Foundation of China and by Grant no. ZR2011FM002 from the Natural Science Foundation of Shandong Province.

References

[1] B. Hunt, P. Isard, and D. Laxton, “The macroeconomic effects of higher oil prices,” IMF Working Paper No. wp/01/14, 2001.
[2] Y. Fan, K. Wang, Y. J. Zhang et al., “International crude oil market analysis and price forecast in 2009,” Bulletin of Chinese Academy of Sciences, vol. 4, no. 1, pp. 42–45, 2009.
[3] C. Morana, “A semiparametric approach to short-term oil price forecasting,” Energy Economics, vol. 23, no. 3, pp. 325–338, 2001.
[4] S. Mirmirani and H. Cheng Li, “A comparison of VAR and neural networks with genetic algorithm in forecasting price of oil,” Advances in Econometrics, vol. 19, pp. 203–223, 2004.
[5] Z. J. Ding, Q. Min, and Y. Lin, “Application of ARIMA model in forecasting prude oil price,” Logistics Technology, vol. 27, no. 10, pp. 156–159, 2008.
[6] J. P. Liu, S. Lin, T. Guo, and H. Y. Chen, “Nonlinear time series forecasting model and its application for oil price forecasting,” Journal of Management Science, vol. 24, no. 6, pp. 104–112, 2011.
[7] S. Y. Wang, L. Yu, and K. K. Lai, “Crude oil price forecasting with TEI@I methodology,” Journal of Systems Sciences and Complexity, vol. 18, no. 2, pp. 145–166, 2005.
[8] W. Xie, L. Yu, S. Xu, and S. Wang, “A new method for crude oil price forecasting based on support vector machines,” Lecture Notes in Computer Science, vol. 3994, pp. 444–451, 2006.
[9] R. A. N. Mohammad and A. G. Ehsan, “A hybrid artificial intelligence approach to monthly forecasting of crude oil price time series,” in The Proceedings of the 10th International Conference on Engineering Applications of Neural Networks, pp. 160–167, 2007.
[10] S. Guo and P. Lai, “The time series mixed model and its application in price prediction of international crude oil,” Journal of Nanjing University of Information Science & Technology, vol. 2, no. 3, pp. 280–283, 2010.
[11] Y. B. Hou, J. Y. Du, and M. Wang, Neural Networks, Xidian University Press, Xi’an, China, 2007.
[12] H. Zhu, L. Qu, and H. Zhang, “Face detection based on wavelet transform and support vector machine,” Journal of Xi'an Jiaotong University, vol. 36, no. 9, pp. 947–950, 2002.
[13] R. Feng, C. L. Song, Y. Z. Zhang, and H. H. Shao, “Comparative study of soft sensor models based on support vector machines and RBF neural networks,” Journal of Shanghai Jiaotong University, vol. 37, pp. 122–125, 2003.
[14] J. Tao and N. Wang, “DNA computing based RNA genetic algorithm with applications in parameter estimation of chemical engineering processes,” Computers & Chemical Engineering, vol. 31, no. 12, pp. 1602–1618, 2007.
[15] K. Wang and N. Wang, “A protein inspired RNA genetic algorithm for parameter estimation in hydrocracking of heavy oil,” Chemical Engineering Journal, vol. 167, no. 1, pp. 228–239, 2011.
[16] K. Wang and N. Wang, “A novel RNA genetic algorithm for parameter estimation of dynamic systems,” Chemical Engineering Research & Design, vol. 88, no. 11, pp. 1485–1493, 2010.
[17] D. Bratton and J. Kennedy, “Defining a standard for particle swarm optimization,” in Proceedings of the IEEE Swarm Intelligence Symposium (SIS '07), pp. 120–127, April 2007.
[18] N. Y. Deng and Y. J. Tian, A New Method of Data Mining and Germany: Support Vector Machines, Science Press, Beijing, China, 2004.
[19] U. Thissen, R. Van Brakel, A. P. De Weijer, W. J. Melssen, and L. M. C. Buydens, “Using support vector machines for time series prediction,” Chemometrics and Intelligent Laboratory Systems, vol. 69, no. 1-2, pp. 35–49, 2003.
[20] K. J. Kim, “Financial time series forecasting using support vector machines,” Neurocomputing, vol. 55, no. 1-2, pp. 307–319, 2003.
[21] Y. Shi and R. Eberhart, “Modified particle swarm optimizer,” in Proceedings of the IEEE Congress on Evolutionary Computation, pp. 519–523, 1998.
[22] D. P. Clark, Molecular Biology: Understanding the Genetic Revolution, Academic Press, New York, NY, USA, 2005.
[23] J. Lis, “Genetic algorithm with the dynamic probability of mutation in the classification problem,” Pattern Recognition Letters, vol. 16, no. 12, pp. 1311–1320, 1995.
[24] M. Serpell and J. E. Smith, “Self-adaptation of mutation operator and probability for permutation representations in genetic algorithms,” Evolutionary Computation, vol. 18, no. 3, pp. 491–514, 2010.
[25] P. J. Angeline, “Evolutionary optimization versus PSO: philosophy and performance differences,” Evolutionary Programming, vol. 7, pp. 601–610, 1998.
[26] http://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=PET&s=RWTC&f=M.

Copyright

Copyright © 2013 Li Shu-rong and Ge Yu-lei. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
