Mathematical Problems in Engineering

Volume 2015, Article ID 201686, 10 pages

http://dx.doi.org/10.1155/2015/201686

## Research on Combinational Forecast Models for the Traffic Flow

^{1}School of Computer Science and Information Technology, Northeast Normal University, Changchun 130117, China

^{2}Academy of Fine Art, Northeast Normal University, Changchun 130117, China

^{3}Key Laboratory of Intelligent Information Processing of Jilin Universities, Northeast Normal University, Changchun 130117, China

Received 13 February 2015; Accepted 22 April 2015

Academic Editor: Chih-Cheng Hung

Copyright © 2015 Zhiheng Yu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

To improve the prediction accuracy of traffic flow, this paper proposes two combinational forecast models based on GM, ARIMA, and GRNN. First, the paper proposes the concept of associate-forecast and a weight distribution method based on the reciprocal absolute percentage error and then uses GM(1,1), ARIMA, and GRNN to establish a combinational model of highway traffic flow with fixed weight coefficients. Then the paper proposes the use of neural networks to determine variable weight coefficients and establishes an Elman combinational forecast model based on GM(1,1), ARIMA, and GRNN, which integrates these three individual models. Lastly, the two combinational models are applied to highway traffic flow on Chongzun of China, and the experimental results verify their effectiveness compared with GM(1,1), ARIMA, and GRNN.

#### 1. Introduction

Traffic flow forecasting is an important research topic in modern intelligent transportation, and accurate traffic prediction is a prerequisite for traffic control and planning [1]. Long-term traffic flow forecasting takes hours, days, months, or even years as its unit of time [2] and is an important part of traffic forecasting. On the one hand, long-term traffic prediction contributes to the rational planning and construction of the road network. On the other hand, it helps the operating department carry out road maintenance and schedule construction work in a timely manner [3].

Over the years, scholars have devoted themselves to this field and developed a series of predictive models. Kim and Hobeika established a real-time traffic flow forecast using an ARIMA model [4]. Lee and Fambro established a sub-ARIMA forecasting model for short-term traffic flow [5]. Yao and Cao used an ARIMA method to predict the traffic trend and analyzed the applicability of ARIMA with real data [6]. Brian and Demetsky used a neural network to forecast short-term traffic flow and showed that the method gave better results [7]. Dougherty and Cobbett established a prediction model with a BP neural network to predict urban traffic [8], and Dia established a neural network forecasting model with time delay [9]. Ma et al. used BP (Backpropagation) and RBF (Radial Basis Function) networks to establish a dynamic traffic flow forecast model and proposed data preprocessing methods and a way to evaluate forecast models [10]. Tiefeng used an improved genetic neural network model for urban traffic flow prediction [11], and the experiment confirmed its effectiveness. Guo et al. took delay and nonlinearity into account in a model for predicting urban road traffic flow [12].

In fact, the forecast methods mentioned above each have their own advantages and disadvantages, and the predictive ability of any single forecast model is limited. If the models can be effectively combined so that they complement one another, the prediction precision can be improved greatly. A scientifically and rationally constructed combinational forecast model can largely avoid the adverse effects of relying on a single model while exploiting the respective advantages of its components. Bates and Granger first proposed a combinational forecast model [13], which has become an important development in forecasting technology. Li et al. used a dynamic weight combination of the historical average, Euclidean distance, and dynamic time warping distance to predict the intersection traffic flow in Xiamen Lotus [14]. Baochun et al. proposed an adjustable parameter genetic gray system theory to predict short-term traffic flow [15]. Zhuoping and Yuxian established a variable weight combinational model for railway cargo traffic and obtained better prediction accuracy [16]. Zhang and Wang proposed a combinational forecast model of the genetic algorithm and the time delay neural network [17]. In addition, the works in [18–25] incorporated data mining technology, rough set theory, fuzzy logic, support vector machines, particle swarm optimization, the ant colony algorithm, and the chaos algorithm, respectively, into forecast models to predict traffic flow.

From the above, we can see that the combinational model is an effective way to improve the accuracy of traffic flow forecasts. To address a practical long-term traffic flow problem, we propose two fusion forecast models based on GM, ARIMA, and GRNN. The first model is a fixed-weight combination built on the three individual models. After using the associate-forecast method and the reciprocal absolute percentage error to allocate fixed weights to the different models, we build a combinational forecast model and output the final forecast results. The second model is a combination with variable weight coefficients determined by an Elman neural network: GM(1,1), ARIMA, and GRNN are integrated to establish a variable-weight combination. The experimental results of the two combinational models are then compared with those of the three individual models to verify the rationality and accuracy of the proposed models.
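The idea of allocating fixed weights from reciprocal errors can be illustrated with a minimal sketch. It assumes that each model's weight is proportional to the reciprocal of its absolute percentage error and that the weights are normalised to sum to 1; the error values, forecasts, and function names below are hypothetical and are not taken from the paper.

```python
import numpy as np

def reciprocal_error_weights(errors):
    """Weights proportional to the reciprocal of each model's error (assumed rule)."""
    errors = np.asarray(errors, dtype=float)
    inv = 1.0 / errors            # a smaller error gives a larger reciprocal
    return inv / inv.sum()        # normalise so the weights sum to 1

def combine(forecasts, weights):
    """Weighted sum of individual forecasts (rows = models, columns = time points)."""
    return np.asarray(weights, dtype=float) @ np.asarray(forecasts, dtype=float)

# Hypothetical absolute percentage errors of three individual models.
errors = [0.10, 0.05, 0.20]
w = reciprocal_error_weights(errors)

# Hypothetical forecasts of the three models for two time points.
forecasts = [[100.0, 110.0],
             [ 90.0, 120.0],
             [120.0, 100.0]]
combined = combine(forecasts, w)
print(w, combined)
```

Under this rule the most accurate individual model receives the largest weight, while every model still contributes to the combined forecast.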

#### 2. GM(1,1)

In 1986, Julong proposed gray system theory [26], which takes systems with small samples, poor information, and uncertainty as its research object and extracts valuable information by processing and extending the partially known information. By uncovering the underlying regularities in the data, it achieves a correct understanding and effective control of system behavior. With the help of concepts such as space and smooth discrete functions, gray system theory establishes a differential dynamic gray model from a discrete data sequence, which is called GM (Grey Model).

The modeling mechanism of GM is as follows:

(1) The original irregular data are accumulated into a generated sequence by an accumulating operation.

(2) The data produced by GM must be restored through the inverse (reduction) operation before they can be used.

GM($n$, $h$) is a GM model with an $n$-order equation and $h$ variables, which is applied to the analysis of the dynamic relationship between variables but is not suitable for forecasting. GM(1,1), on the other hand, is a system model that contains a single variable and a 1-order differential equation and is applicable to predicting future changes from the previous values of that single variable. It is the most common of all the GM models.

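The general steps of modeling GM(1,1) are as follows:

(1) The original data $x^{(0)}(i)$ at each time point are accumulated sequentially to form a new sequence $x^{(1)}$. The calculation formula of $x^{(1)}$ is $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$, with $k = 1, 2, \ldots, n$.

(2) Then there is a 1-order whitening differential equation with a single variable: $\frac{dx^{(1)}}{dt} + a x^{(1)} = u$. Here $a$ and $u$ are undetermined coefficients, where $a$ is the development coefficient and $u$ is the amount of grey, which can be obtained by using the least squares method; namely,

$$\hat{a} = [a, u]^{T} = \left(B^{T}B\right)^{-1}B^{T}Y_n,$$

$$B = \begin{bmatrix} -\frac{1}{2}\left(x^{(1)}(1) + x^{(1)}(2)\right) & 1 \\ -\frac{1}{2}\left(x^{(1)}(2) + x^{(1)}(3)\right) & 1 \\ \vdots & \vdots \\ -\frac{1}{2}\left(x^{(1)}(n-1) + x^{(1)}(n)\right) & 1 \end{bmatrix}, \quad Y_n = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}.$$

(3) From the above result, the cumulative value of the sequence is obtained:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{u}{a}\right)e^{-ak} + \frac{u}{a}.$$

(4) Finally, the prediction value is obtained by a reduction process:

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k).$$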
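The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard GM(1,1) procedure, not the authors' implementation: it assumes the whitening equation $dx^{(1)}/dt + a x^{(1)} = u$ with the usual mean-generated background values, and the function names and the synthetic input series are hypothetical.

```python
import numpy as np

def gm11_fit(x0):
    """Estimate the development coefficient a and the grey amount u."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                          # step 1: accumulated sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])               # means of neighbouring accumulated values
    B = np.column_stack([-z1, np.ones_like(z1)])
    Y = x0[1:]
    (a, u), *_ = np.linalg.lstsq(B, Y, rcond=None)   # step 2: least squares
    return a, u

def gm11_predict(x0, a, u, horizon):
    """Fitted values for the observed points plus `horizon` forecasts."""
    x0 = np.asarray(x0, dtype=float)
    k = np.arange(len(x0) + horizon)
    x1_hat = (x0[0] - u / a) * np.exp(-a * k) + u / a   # step 3: cumulative values
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)                        # step 4: reduction
    return x0_hat

# Synthetic, smoothly growing series (illustrative data only).
series = 100.0 * 1.1 ** np.arange(6)
a, u = gm11_fit(series)
fitted = gm11_predict(series, a, u, horizon=2)
print(a, u, fitted[-2:])
```

A negative $a$ corresponds to a growing series, which is why the synthetic example yields $a < 0$.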

GM(1,1) requires neither large samples nor data that follow a particular distribution. It can produce a satisfactory forecast from only a small amount of data. Because its prediction curve is monotonic and smooth, it is suitable for dynamic prediction of short time series with few data points and modest volatility. When the amount of data is large and strongly stochastic, the prediction error is often large and the prediction accuracy tends to be low.

#### 3. ARIMA

ARIMA (Autoregressive Integrated Moving Average), proposed by Box and Jenkins, is a time series forecast method [27], also known as the Box-Jenkins model. For $\mathrm{ARIMA}(p, d, q)$, AR is autoregression and $p$ is the number of autoregressive terms. MA and $q$ are analogous to AR and $p$, while $d$ is the number of times the series is differenced to become stationary.

The data processed by the ARIMA model must be stationary; that is, the mean and covariance of the sequence do not change over time. The data sequence first needs to be checked for stationarity through a series of tests such as a visual inspection of its plot, an autocorrelation and partial autocorrelation function test, a run test, a characteristic root test, ADF, KPSS, and other methods. If the data sequence is not stationary, it must be made stationary by methods such as differencing and logarithmic differencing.
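As a minimal sketch of the two transformations mentioned above, the following NumPy code applies differencing and logarithmic differencing to two synthetic series. The helper names and the data are hypothetical; in practice the result would still be checked with a formal test such as ADF (for example, a library routine like `adfuller` in statsmodels), which is not run here.

```python
import numpy as np

def difference(x, d=1):
    """Apply d rounds of first differencing: x_t - x_{t-1}."""
    return np.diff(np.asarray(x, dtype=float), n=d)

def log_difference(x):
    """First difference of the logarithm; requires strictly positive data."""
    return np.diff(np.log(np.asarray(x, dtype=float)))

t = np.arange(1, 11)
linear_trend = 50.0 + 3.0 * t       # non-stationary mean (linear trend)
exp_growth = 100.0 * 1.05 ** t      # non-stationary mean (exponential growth)

d_linear = difference(linear_trend)     # constant after one difference
d_exp = log_difference(exp_growth)      # constant after log differencing
print(d_linear, d_exp)
```

A linear trend is removed by one difference, whereas exponential growth needs the logarithm first; the number of differences applied corresponds to $d$ in $\mathrm{ARIMA}(p, d, q)$.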

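After the data have been made stationary, ARMA (Autoregressive Moving Average) is used to fit them. It is a mixture of AR (Autoregressive) and MA (Moving Average) in the form of a difference equation. For a normal, stationary, zero-mean time series $\{x_t\}$, if $x_t$ relies not only on the historic values of the previous $p$ steps, $x_{t-1}, x_{t-2}, \ldots, x_{t-p}$, but also on the interferences of the previous $q$ steps, $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-q}$, we can get the general ARMA model according to multiple linear regression:

$$x_t = \varphi_1 x_{t-1} + \varphi_2 x_{t-2} + \cdots + \varphi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q}, \tag{5}$$

where $\varepsilon_t$ is a zero-mean white-noise term. Formula (5) represents a $p$-order AR and a $q$-order MA model, recorded as $\mathrm{ARMA}(p, q)$. $p$ and $q$ denote the orders of the AR and MA components, and $\varphi$ and $\theta$ denote, respectively, the parameters of each part.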
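To make the recursion concrete, the sketch below simulates an ARMA series directly from the difference equation. It assumes the Box-Jenkins sign convention, $x_t = \sum_i \varphi_i x_{t-i} + \varepsilon_t - \sum_j \theta_j \varepsilon_{t-j}$, with Gaussian white noise; some software uses a plus sign for the MA terms instead. The function name and parameter values are hypothetical.

```python
import numpy as np

def simulate_arma(phi, theta, n, sigma=1.0, seed=0):
    """Simulate n points of an ARMA(p, q) series (Box-Jenkins sign convention)."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    p, q = len(phi), len(theta)
    eps = rng.normal(0.0, sigma, size=n)     # white-noise interferences
    x = np.zeros(n)
    for t in range(n):
        # AR part: weighted sum of the previous p values (missing history = 0)
        ar = sum(phi[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        # MA part: weighted sum of the previous q interferences
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        x[t] = ar + eps[t] - ma
    return x, eps

# Illustrative ARMA(1, 1) with hypothetical parameters.
x, eps = simulate_arma(phi=[0.5], theta=[0.3], n=200)
print(x[:5])
```

Setting `theta=[]` gives a pure AR($p$) series and `phi=[]` a pure MA($q$) series, which shows how the two components are mixed.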

The general modeling steps of ARMA are as follows.

*(1) Model Identification and Order*. The choice of an appropriate model is usually based on the ACF (autocorrelation function) and the PACF (partial autocorrelation function). It is then necessary to determine the values of $p$ and $q$. The ACF and PACF give only a preliminary determination of the orders and the set of candidate models, from which the best model should then be selected. If a regression coefficient passes the $t$-test, it has a significant effect. Otherwise, some coefficients should be removed according to the actual situation, although not necessarily all of them, and the characteristic roots must be tested further.
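The sample ACF and PACF used for order identification can be computed as in the sketch below. It is a minimal illustration with hypothetical helper names: the ACF uses the usual biased estimator and the PACF is obtained with the Durbin-Levinson recursion. The test series is a simulated AR(1) process with coefficient 0.7, for which the PACF should be close to zero beyond lag 1.

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelations r_0 .. r_nlags."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([1.0] + [np.dot(x[:-k], x[k:]) / denom
                             for k in range(1, nlags + 1)])

def pacf(x, nlags):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(x, nlags)
    phi = np.zeros((nlags + 1, nlags + 1))
    out = np.zeros(nlags + 1)
    out[0] = 1.0
    for k in range(1, nlags + 1):
        prev = phi[k - 1, 1:k]                       # phi_{k-1,1} .. phi_{k-1,k-1}
        num = r[k] - np.dot(prev, r[k - 1:0:-1])     # r_k - sum phi_{k-1,j} r_{k-j}
        den = 1.0 - np.dot(prev, r[1:k])             # 1 - sum phi_{k-1,j} r_j
        phi[k, k] = num / den
        phi[k, 1:k] = prev - phi[k, k] * prev[::-1]  # update lower-order coefficients
        out[k] = phi[k, k]
    return out

# Simulated AR(1) series with coefficient 0.7 (illustrative data only).
rng = np.random.default_rng(0)
noise = rng.normal(size=5000)
series = np.zeros(5000)
for t in range(1, 5000):
    series[t] = 0.7 * series[t - 1] + noise[t]

r = acf(series, 5)
pa = pacf(series, 5)
print(r, pa)
```

A PACF that cuts off after lag $p$ suggests an AR($p$) component, while an ACF that cuts off after lag $q$ suggests an MA($q$) component.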

*(2) Parameter Estimation and Adaptation Test*. Once the initial orders have been fixed, the mean, the variance, and all the values of $\varphi$ and $\theta$ need to be determined before the model can be applied. Next, we must determine whether the model properly describes the given time series, namely, adaptive model testing; in essence, this tests whether the residual sequence is white noise.
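One common way to check whether a residual sequence is white noise is the Ljung-Box statistic. The paper does not name the test it uses, so the sketch below is an assumed choice for illustration, with hypothetical function and variable names. A large $Q$ relative to the chi-square critical value indicates remaining autocorrelation.

```python
import numpy as np

def ljung_box_q(residuals, h):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k)."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    n = len(e)
    denom = np.dot(e, e)
    lags = np.arange(1, h + 1)
    r = np.array([np.dot(e[:-k], e[k:]) / denom for k in lags])
    return n * (n + 2) * np.sum(r ** 2 / (n - lags))

rng = np.random.default_rng(0)
white = rng.normal(size=500)            # residuals of an adequate model (illustrative)
correlated = np.zeros(500)              # residuals that still contain AR structure
for t in range(1, 500):
    correlated[t] = 0.8 * correlated[t - 1] + white[t]

q_white = ljung_box_q(white, h=10)
q_corr = ljung_box_q(correlated, h=10)
# Under the white-noise hypothesis Q is approximately chi-square distributed;
# the 5% critical value for 10 degrees of freedom is 18.307. For residuals of
# a fitted ARMA(p, q) model the degrees of freedom are usually taken as h - p - q.
print(q_white, q_corr)
```

If $Q$ exceeds the critical value, the residuals are not white noise and the model orders or parameters should be revisited.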

ARIMA can be used for both short-term and long-term forecasts, and it has a smaller prediction error than AR. When the amount of data is sufficient, it has relatively high prediction accuracy. However, its parameter estimation is somewhat complex, and when the amount of data is insufficient and the regularity is weak, the large amount of computation results in an obvious time delay, which decreases the prediction accuracy.

#### 4. GRNN

GRNN (generalized regression neural network), proposed in 1991 by D. F. Specht, an American scholar, is a branch of the RBF network. Based on nonparametric regression, it has a strong nonlinear mapping ability and is well suited to curve-fitting problems. Compared with RBF, GRNN has advantages in approximation ability and learning speed, and its training process is also more convenient. Therefore, it has been widely used in decision-making control systems, signal processing, finance, energy, and other fields [28, 29].

##### 4.1. The Structure of GRNN

GRNN is composed of four layers, namely, the input layer, the hidden layer, the summation layer, and the output layer. When the network input is $X = [x_1, x_2, \ldots, x_n]^{T}$, the corresponding output is $Y = [y_1, y_2, \ldots, y_k]^{T}$. The structure of GRNN is shown in Figure 1.
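The four-layer structure can be sketched as a forward pass in NumPy. This is a minimal illustration of the standard GRNN formulation (a Gaussian kernel in the hidden layer and a normalised weighted sum of training targets), not the authors' implementation; the function name, the smoothing parameter `sigma`, and the toy data are hypothetical.

```python
import numpy as np

def grnn_predict(X_train, Y_train, X_new, sigma=1.0):
    """GRNN forward pass.

    X_train: (m, n) training inputs, Y_train: (m, k) training outputs,
    X_new: (b, n) query inputs. Returns a (b, k) array of predictions.
    """
    X_train = np.asarray(X_train, dtype=float)
    Y_train = np.asarray(Y_train, dtype=float)
    X_new = np.asarray(X_new, dtype=float)            # input layer
    # Hidden (pattern) layer: one Gaussian unit per training sample
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    P = np.exp(-d2 / (2.0 * sigma ** 2))              # shape (b, m)
    # Summation layer: plain sum and target-weighted sums of the activations
    denom = P.sum(axis=1, keepdims=True)              # shape (b, 1)
    numer = P @ Y_train                               # shape (b, k)
    # Output layer: ratio of the two kinds of sums
    return numer / denom

# Toy data: y = x^2 sampled at three points (illustrative only).
X = [[0.0], [1.0], [2.0]]
Y = [[0.0], [1.0], [4.0]]
sharp = grnn_predict(X, Y, [[1.0]], sigma=0.1)    # close to the nearest target
smooth = grnn_predict(X, Y, [[1.0]], sigma=1e6)   # close to the mean of the targets
print(sharp, smooth)
```

The only free parameter in this sketch is `sigma`: small values make the prediction follow the nearest training samples, while large values smooth it toward the average of all training targets.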