Discrete Dynamics in Nature and Society

Volume 2018, Article ID 7696592, 9 pages

https://doi.org/10.1155/2018/7696592

## Space-Time Hybrid Model for Short-Time Travel Speed Prediction

^{1}Jiangsu Key Laboratory of Urban ITS, Southeast University, Si Pai Lou #2, Nanjing 210096, China

^{2}School of Transportation, Southeast University, Si Pai Lou #2, Nanjing 210096, China

^{3}Zhuhai Institute of Urban Planning & Design, Mei Hua Dong Road #302, Zhuhai 519000, China

Correspondence should be addressed to Wei Wang; wangwei_transtar@163.com

Received 21 February 2017; Accepted 23 January 2018; Published 25 February 2018

Academic Editor: Gabriella Bretti

Copyright © 2018 Qi Fan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Short-time traffic speed forecasting is a significant issue in developing Intelligent Transportation Systems applications, and accurate speed forecasts are necessary inputs for the Intelligent Traffic Security Information System (ITSIS) and advanced traffic management systems (ATMS). This paper presents a hybrid model for travel speed prediction based on the analysis of temporal and spatial characteristics and on data fusion. The proposed methodology predicts speed by dividing the data into three parts: a periodic trend estimated by Fourier series, a residual part modeled by the ARIMA model, and the possible events caused by upstream or downstream traffic conditions. The aim of this study is to improve prediction accuracy by modeling the time and space variation of speed, so that the forecast results reflect both the periodic variation of traffic speed and emergencies. This information could provide decision-makers with a basis for developing traffic management measures. To achieve the research objective, one year of speed data was collected in the Twin Cities Metro, Minnesota. The experimental results demonstrate that the proposed method can be used to explore the periodic characteristics of speed data and can increase the accuracy of travel speed prediction.

#### 1. Introduction

Driven by the need to promote Intelligent Transportation Systems (ITS) and traffic safety management, short-time travel speed prediction is a crucial issue that has attracted a number of studies. Common forecasting methods can be divided into five main categories: methods based on traditional statistical theory, intelligent model methods based on knowledge discovery, methods based on nonlinear system theory, methods based on hybrid models, and other forecasting methods. Concrete methods include regression methods, neural network methods, wavelet network models, time series models, support vector regression methods, Kalman filtering methods, exponential smoothing methods, gray system models, trend extrapolation methods, and artificial intelligence.

Travel speed data often show periodic trends, with one or two troughs (speed minima) per weekday; however, the trend may differ from Monday to Sunday because trip characteristics vary across the days of the week. By observing the cyclical pattern of speed, one can grasp its daily variation characteristics, which can provide indicators for planning and design as well as basic services. Dendrinos [1] treated traffic as a combination of 17 periodic components and a residual part, fitting the trend with a Fourier series. Fei et al. [2] presented a dynamic linear model to predict short-term travel time, in which travel time is the sum of the median of historical travel times, time-varying random variations in travel time, and an evolution error model. Liu [3] identified an intrinsic mode function in speed data and predicted this part with ARIMA. Zhang et al. [4] analyzed the periodic variation of traffic flow with a statistical volatility model. It is important to note that periodic trend analysis cannot capture small-scale changes in speed, which makes the prediction of small-range speed fluctuations particularly important.

The residual part is the difference between the real-time series data and the periodic trend. Comparison and analysis of a large amount of data show that the residual part fluctuates noticeably, but its volatility stays within a certain range. For such series, the autoregressive integrated moving average (ARIMA) method has a distinct advantage in prediction. The ARIMA model is one of the most widely used regression techniques, and its applications in freeway traffic forecasting can be traced back to 1979 [5]. When the ARIMA model is used for traffic flow prediction [6], the data are differenced if they form a nonstationary sequence with a growth or decline trend. If the data are heteroscedastic, the nonstationary sequence must be smoothed by differencing until the autocorrelation function and the partial autocorrelation function of the processed data are not significantly different from zero.

Lund [7] simulated the ARIMA process in the R language with good results. However, Karlaftis and Vlahogianni [8] point out that the ARIMA model lacks the ability to capture long memory properties and does not jointly treat the mean and variance.

Current research on speed prediction focuses primarily on temporal variation at a single point on the road: it considers how speed changes with time but lacks spatial analysis of the target position and ignores the influence that speed changes on adjacent links have on the predicted results. However, links in a road network are not isolated; the speed at a target position follows its own pattern over time and is simultaneously affected by upstream and downstream traffic conditions. The speed of the upstream section propagates to the downstream section along the road with similar distribution characteristics, and the downstream traffic state in turn reacts on the upstream. Adding spatial correlation analysis can effectively detect the occurrence and influence of nearby bottlenecks, reflect the impact of emergencies on speed, warn of upstream and downstream traffic problems, and, through timely management measures, reduce the congestion and network paralysis caused by emergencies. Van Lint [9] first established a state-space neural network (SSNN) model to predict travel times directly from the data of adjacent sections. Pan et al. [10] developed a spatial and temporal dynamic model with stochastic cell transmission. Zou et al. [11] put forward a hybrid model that combines spatial analysis with several temporal analyses and compared the results.

Most methods suffer from a lack of accuracy or reliability when used separately for travel speed prediction, so we propose a hybrid model that combines them appropriately. We use real data to test the periodic trend combined with ARIMA, the ARIMA method alone, the spatial analysis regression model, and the proposed hybrid model; the comparison indexes of predictive performance, RAME, MAE, and MAPE, all indicate that the hybrid model predicts short-time travel speed accurately and reliably.

#### 2. Methodology

A single forecasting method for short-time travel speed has proved unable to provide accurate information. We therefore established a hybrid model that combines the results of temporal prediction and spatial prediction so that the real speed changes on freeways can be simulated more accurately.

##### 2.1. Periodic Trend Analysis

Traffic speed usually changes daily. The cyclical change of travel speed is obvious, especially within a workday. Tang et al. [12] discussed the periodic characteristics of speed in detail. The periodic trend is a significant feature in dealing with speed data and should be considered first in analyzing speed change rules. Periodic analysis [7] is adopted to investigate the cyclicality in the daily traffic data and is also effective in analyzing travel speed rules.

From the perspective of regression analysis, travel speed exhibits cyclical fluctuations, so wave theory is applicable to its analysis. A Fourier series, which is built from trigonometric functions, expresses the periodic trend in traffic speed. For a fixed period *T*, the speed at a point and the speed at the same point one period later often show similarity and consistency, which result jointly from the repeatability of driving behavior and the stationarity of road conditions. We assume that the cyclical change of travel speed conforms to the common expression of a Fourier series [1]; a complete cycle in a unit time interval is

$$v(t) = a_0 + \sum_{k=1}^{K}\left[a_k\cos(2\pi k t) + b_k\sin(2\pi k t)\right], \tag{1}$$

in which $a_k\cos(2\pi k t) + b_k\sin(2\pi k t)$ is the harmonic of order $k$. Because travel speed prediction covers the period from 0 a.m. to 12 p.m., (1) takes the general form

$$v(t) = a_0 + \sum_{k=1}^{K}\left[a_k\cos(2\pi k f t) + b_k\sin(2\pi k f t)\right], \tag{2}$$

in which $f = 1/T$ is a frequency index defined as cycles per unit time, $T$ is the cycle time, $k$ is the index of the cycle series, and $K$ is the order of periodic elements in the series. $a_0$ is the constant term, and $a_k$ and $b_k$ are parameters determined by historical data.
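As an illustration of how the Fourier coefficients might be estimated from historical data, the following minimal sketch fits a truncated series to one day of evenly spaced speed samples. It assumes the samples cover exactly one period, in which case the least-squares fit reduces to simple averages. The function names, the 5-minute sampling interval, and the synthetic speed values are illustrative assumptions and are not taken from the study.

```python
import math

def fit_fourier_trend(speeds, order):
    """Estimate a_0, a_k, b_k from one full cycle of evenly spaced speeds.

    Assumes the samples cover exactly one period, so the sampled sines and
    cosines are orthogonal and the least-squares fit reduces to averages.
    """
    n = len(speeds)
    if not 1 <= order < n / 2:
        raise ValueError("order must satisfy 1 <= order < len(speeds) / 2")
    a0 = sum(speeds) / n
    a, b = [], []
    for k in range(1, order + 1):
        angle = 2.0 * math.pi * k / n
        a.append(2.0 / n * sum(v * math.cos(angle * i) for i, v in enumerate(speeds)))
        b.append(2.0 / n * sum(v * math.sin(angle * i) for i, v in enumerate(speeds)))
    return a0, a, b

def fourier_trend(a0, a, b, t, period):
    """Evaluate the fitted periodic trend at time t, with f = 1 / period."""
    f = 1.0 / period
    return a0 + sum(
        a[k - 1] * math.cos(2.0 * math.pi * k * f * t)
        + b[k - 1] * math.sin(2.0 * math.pi * k * f * t)
        for k in range(1, len(a) + 1)
    )

# Synthetic day (hypothetical values): 288 five-minute intervals.
PERIOD_HOURS = 24.0
N = 288
times = [i * PERIOD_HOURS / N for i in range(N)]
observed = [
    60.0
    - 8.0 * math.cos(2.0 * math.pi * t / PERIOD_HOURS)
    - 5.0 * math.sin(4.0 * math.pi * t / PERIOD_HOURS)
    for t in times
]

a0, a, b = fit_fourier_trend(observed, order=2)
# The residual part is what remains after the periodic trend is removed.
residuals = [v - fourier_trend(a0, a, b, t, PERIOD_HOURS) for t, v in zip(times, observed)]
```

On this noise-free synthetic series the fit recovers the coefficients used to generate it, so the residuals are essentially zero; with real detector data the residuals would carry the non-periodic variation.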

##### 2.2. Autoregressive Integrated Moving Average Model

The autoregressive integrated moving average (ARIMA) model is widely used to handle stationarity, randomness, and periodicity in time series analysis and is one of the most general models for predicting spot speed from past speed data. The periodicity of traffic speed has been considered in the analysis above, so we remove the periodic trend from the daily speed data before using the ARIMA model to forecast travel speed. The remaining part represents variations caused by specific real-time traffic conditions.

ARIMA combines two algorithms, AR and MA, with I representing the integrated term. A nonseasonal ARIMA model is written as ARIMA($p$, $d$, $q$), in which AR means autoregressive, $p$ is the number of autoregressive terms, MA means moving average, $q$ is the order of the moving average, and $d$ is the number of differences needed to make the time series stationary. The mathematical representation of an ARIMA($p$, $d$, $q$) model is as follows:

$$\varphi(B)\,\nabla^{d}(x_t - \mu) = \theta(B)\,\varepsilon_t, \tag{3}$$

where $x_t$ is the original data series; $\varepsilon_t$ is a white noise sequence, that is, a sequence of random variables with mean zero and variance $\sigma^2$; $B$ is the lag operator, $Bx_t = x_{t-1}$; $\varphi(B)$ is the autoregression operator, $\varphi(B) = 1 - \varphi_1 B - \cdots - \varphi_p B^p$, and $p$ is the autoregression order of the model; $\theta(B)$ is the moving average operator, $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$, and *q* is the order of the moving average; $d$ is the differencing parameter, $\nabla^d = (1-B)^d$; and $\mu$ is the mean value.
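As a worked illustration, consider the special case $p = 1$, $d = 1$, $q = 1$. The derivation assumes the common operator form $\varphi(B)(1-B)^d(x_t - \mu) = \theta(B)\varepsilon_t$, with lag operator $Bx_t = x_{t-1}$, $\varphi(B) = 1 - \varphi_1 B$, and $\theta(B) = 1 - \theta_1 B$; this sign convention is an assumption made here for illustration.

$$
\begin{aligned}
(1-\varphi_1 B)(1-B)(x_t-\mu) &= (1-\theta_1 B)\,\varepsilon_t \\
(1-\varphi_1 B)(x_t - x_{t-1}) &= \varepsilon_t - \theta_1\varepsilon_{t-1} \\
x_t - x_{t-1} - \varphi_1 x_{t-1} + \varphi_1 x_{t-2} &= \varepsilon_t - \theta_1\varepsilon_{t-1} \\
x_t &= (1+\varphi_1)\,x_{t-1} - \varphi_1 x_{t-2} + \varepsilon_t - \theta_1\varepsilon_{t-1}
\end{aligned}
$$

The mean $\mu$ cancels in the second line because differencing removes any constant level. The one-step forecast then follows by setting the unknown future noise $\varepsilon_t$ to its expected value of zero.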

The process of ARIMA modeling and analysis, which is implemented in **R**, is as follows:

(1) Smoothing the data: if the data sequence is not stationary, we difference the data, and the number of differences is $d$, until we get a stationary time series.

(2) Model identification: the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the processed time series are used to determine, preliminarily, the autoregression order $p$ and the moving average order $q$ of the initial ARIMA model.

(3) Parameter estimation and model diagnostics: once the coefficients of the initial model have been estimated, we test the significance of the coefficients and, at the same time, test whether the model residuals are white noise.

(4) Forecast and analyze the data using a model with appropriate parameters.
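The four steps above can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the R implementation used in the study: it fixes the model to ARIMA(1, 1, 0), estimates the single autoregressive coefficient by the Yule-Walker relation (the coefficient equals the lag-1 autocorrelation), and uses a rough autocorrelation bound in place of a formal white noise test. All function names and the synthetic series are assumptions for illustration.

```python
import math
import random

def difference(series, d=1):
    """Step 1: apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

def acf(series, max_lag):
    """Step 2: sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)) / c0
        for lag in range(1, max_lag + 1)
    ]

def fit_ar1(stationary):
    """Step 3: Yule-Walker estimate for AR(1), phi = lag-1 autocorrelation."""
    mean = sum(stationary) / len(stationary)
    return mean, acf(stationary, 1)[0]

def ar1_residuals(stationary, mean, phi):
    """Residuals of the fitted AR(1) model, used for the white noise check."""
    return [
        (stationary[t] - mean) - phi * (stationary[t - 1] - mean)
        for t in range(1, len(stationary))
    ]

def forecast_next(series, diff_mean, phi):
    """Step 4: one-step forecast of the original series under ARIMA(1, 1, 0)."""
    last_diff = series[-1] - series[-2]
    return series[-1] + diff_mean + phi * (last_diff - diff_mean)

# Synthetic nonstationary series (hypothetical): cumulative sum of an AR(1)
# process with coefficient 0.6, generated with a fixed seed.
rng = random.Random(0)
increments, prev = [], 0.0
for _ in range(2000):
    prev = 0.6 * prev + rng.gauss(0.0, 1.0)
    increments.append(prev)
series = [0.0]
for w in increments:
    series.append(series[-1] + w)

stationary = difference(series, d=1)                 # step 1
sample_acf = acf(stationary, 5)                      # step 2
diff_mean, phi = fit_ar1(stationary)                 # step 3: estimation
resid_acf = acf(ar1_residuals(stationary, diff_mean, phi), 5)
white_noise_bound = 1.96 / math.sqrt(len(stationary))  # step 3: rough diagnostic
next_value = forecast_next(series, diff_mean, phi)   # step 4
```

If most residual autocorrelations in `resid_acf` fall within `white_noise_bound`, the residuals are roughly consistent with white noise. A full analysis would also compare candidate orders and estimate moving average terms, which this sketch omits.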

##### 2.3. The Spatial Correlation Analysis

The road consists of connected sections, and each spot on the road is spatially accessible from the others. We can learn from the theory of spatial analysis that the relationship between the difference of attribute values in space and the distance between two points obeys the first law of geography [13]. This means that, from a statistical point of view, the closer two points are, the more similar they are. The spatial correlation of traffic flow decreases as distance increases within a given space. For road sections at the same spatial distance, the spatial correlation generally increases as traffic load increases [14].

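A statistical correlation index can be used to analyze the spatial correlation of travel speed. Assume that there are two adjacent points A and B, and that $v_A(t)$ and $v_B(t)$ are the spot speeds of A and B at the same time $t$; the correlation index is as follows:

$$\rho_{AB} = \frac{\operatorname{Cov}\left(v_A(t), v_B(t)\right)}{\sqrt{D\left(v_A(t)\right)\,D\left(v_B(t)\right)}}, \tag{4}$$

in which $\operatorname{Cov}(v_A(t), v_B(t))$ is the covariance of speeds $v_A$ and $v_B$ at time $t$, $\operatorname{Cov}(v_A(t), v_B(t)) = E\{[v_A(t) - E(v_A(t))][v_B(t) - E(v_B(t))]\}$; $D(v_A(t))$ and $D(v_B(t))$ are the time series variances of speed at A and B at time $t$.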

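Because vehicles move along the road, a vehicle arrives downstream from upstream after a certain period of time, and upstream traffic flow is affected if congestion or an incident occurs downstream. Therefore, we introduce a lagged value when analyzing the spatial correlations. Assuming that a car travels from A to E, with the time-lagged value expressed as $\tau$, the correlation index becomes

$$\rho_{AE}(\tau) = \frac{\operatorname{Cov}\left(v_A(t), v_E(t+\tau)\right)}{\sqrt{D\left(v_A(t)\right)\,D\left(v_E(t+\tau)\right)}}, \tag{5}$$

in which $v_A(t)$ is the speed at A at time $t$, $v_E(t+\tau)$ is the speed at E at time $t+\tau$, and $\operatorname{Cov}$ and $D$ denote covariance and variance.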
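The time-lagged correlation index can be computed directly from two speed series. The sketch below pairs the upstream speed at time $t$ with the downstream speed at time $t + \tau$ and scans a range of lags for the strongest correlation. The function names, the lag range, and the synthetic series are assumptions for illustration, not part of the original method.

```python
import math

def lagged_correlation(upstream, downstream, lag):
    """Correlation between upstream[t] and downstream[t + lag]."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = min(len(upstream), len(downstream) - lag)
    if n < 2:
        raise ValueError("not enough overlapping samples")
    x = upstream[:n]
    y = downstream[lag:lag + n]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y)) / n
    var_x = sum((p - mean_x) ** 2 for p in x) / n
    var_y = sum((q - mean_y) ** 2 for q in y) / n
    return cov / math.sqrt(var_x * var_y)

def best_lag(upstream, downstream, max_lag):
    """Lag in 0..max_lag with the highest correlation."""
    return max(
        range(max_lag + 1),
        key=lambda lag: lagged_correlation(upstream, downstream, lag),
    )

# Synthetic example (hypothetical values): the downstream series repeats the
# upstream series three intervals later.
upstream = [60.0 + 10.0 * math.sin(0.3 * i) + 3.0 * math.sin(1.1 * i) for i in range(200)]
downstream = [upstream[0]] * 3 + upstream[:-3]

lag_found = best_lag(upstream, downstream, max_lag=6)
corr_at_lag = lagged_correlation(upstream, downstream, lag_found)
```

Here the scan recovers the three-interval delay built into the synthetic data. With real detector data, the lags whose correlation is strong would be the candidates to use as lagged predictors.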

The correlation index reflects the degree of linear correlation between different groups of data. After repeated verification, we found that the two points adjacent to the forecast point on the upstream side and on the downstream side are suitable for selection. If the forecast point has only one adjacent point, the sample size is not sufficient to predict accurately. If the forecast point has too many adjacent points, redundant calculations arise and the computational efficiency drops. Assuming that there are five adjacent points on one road as shown in Figure 1, we can build a multiple linear regression (MLR) model to forecast the travel speed of point C as follows:

$$v_C(t) = \lambda_0 + \sum_{k=1}^{i}\lambda_k v_A(t-\tau_k) + \sum_{k=i+1}^{j}\lambda_k v_B(t-\tau_k) + \sum_{k=j+1}^{m}\lambda_k v_D(t-\tau_k) + \sum_{k=m+1}^{n}\lambda_k v_E(t-\tau_k) + \varepsilon, \tag{6}$$

in which $\lambda_0, \lambda_1, \ldots, \lambda_n$ are the coefficients of the MLR model. Terms 1 to *i* carry the lagged values between A and C, terms $i+1$ to $j$ those between B and C, terms $j+1$ to $m$ those between C and D, and terms $m+1$ to $n$ those between C and E. $\tau$ is the time-lagged value between points A, B, D, or E and point C, which is an integer greater than or equal to zero. *i*, *j*, *m*, and *n* all come from formula (5). The value of $\tau$ is also obtained from a large amount of data by multiple linear regression from formula (5).

$\tau_A^{\min}$, $\tau_A^{\max}$, $\tau_B^{\min}$, $\tau_B^{\max}$, $\tau_D^{\min}$, $\tau_D^{\max}$, $\tau_E^{\min}$, and $\tau_E^{\max}$ are the minimum and maximum time-lagged values between points A and C, B and C, C and D, and C and E. The calculated time-lagged values come from formula (5). The value of $\tau$ between each point and point C can be determined after the spatial correlation analysis; for each point, we define the smallest $\tau$ between that point and point C as the minimum value and the largest as the maximum value. The time-lagged values are continuous between the minimum value and the maximum value.

$v_A(t-\tau)$, $v_B(t-\tau)$, $v_D(t-\tau)$, and $v_E(t-\tau)$ denote the speeds of points A, B, D, and E at the period $t-\tau$, respectively. Note that the number of terms of the multiple linear regression and the value of $\tau$ in each term can be determined only with specific data. $\varepsilon$ is the random error.
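To make the regression on lagged neighbor speeds concrete, the following sketch builds a design matrix from lagged speeds at points A, B, D, and E and estimates the coefficients by ordinary least squares through the normal equations. The chosen lags, the synthetic series, and all function names are assumptions for illustration; in practice the lags would come from the spatial correlation analysis, and a numerical library would normally replace the hand-written solver.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def build_design(neighbors, lags, target):
    """Rows [1, v_P(t - lag), ...] for each neighbor P (sorted by name) and lag."""
    max_lag = max(max(v) for v in lags.values())
    rows, ys = [], []
    for t in range(max_lag, len(target)):
        row = [1.0]
        for name in sorted(lags):
            row.extend(neighbors[name][t - lag] for lag in lags[name])
        rows.append(row)
        ys.append(target[t])
    return rows, ys

def fit_mlr(rows, ys):
    """Ordinary least squares via the normal equations (X^T X) c = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * y for r, y in zip(rows, ys)) for p in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic speeds (hypothetical values) at the four neighbors of point C.
n_obs = 300
neighbors = {
    "A": [60.0 + 8.0 * math.sin(0.21 * t) for t in range(n_obs)],
    "B": [58.0 + 7.0 * math.sin(0.37 * t + 1.0) for t in range(n_obs)],
    "D": [55.0 + 6.0 * math.cos(0.53 * t) for t in range(n_obs)],
    "E": [52.0 + 5.0 * math.cos(0.71 * t + 0.5) for t in range(n_obs)],
}
lags = {"A": [2], "B": [1], "D": [1], "E": [2]}  # assumed lags, one per neighbor

# Speed at C generated from known coefficients so the fit can be checked.
true_coeffs = [5.0, 0.4, 0.3, 0.2, 0.1]
target = [0.0] * n_obs
for t in range(2, n_obs):
    target[t] = (
        5.0
        + 0.4 * neighbors["A"][t - 2]
        + 0.3 * neighbors["B"][t - 1]
        + 0.2 * neighbors["D"][t - 1]
        + 0.1 * neighbors["E"][t - 2]
    )

rows, ys = build_design(neighbors, lags, target)
coeffs = fit_mlr(rows, ys)  # order: intercept, A, B, D, E
```

Because the synthetic target is an exact linear combination of the lagged neighbor speeds, the fitted coefficients match the generating values up to rounding error. Real data would add the random error term, and each neighbor would typically contribute several consecutive lags.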