Abstract

This paper addresses short-term traffic flow forecasting. The main goal is to put forward a traffic correlation model and a real-time correction algorithm for traffic flow forecasting. The traffic correlation model is established based on the temporal-spatial-historical correlation characteristic of traffic big data. To simplify the traffic correlation model, a correction coefficients optimization algorithm is presented. Considering the multistate characteristic of traffic big data, a dynamic part is added to the traffic correlation model. A real-time correction algorithm based on a Fuzzy Neural Network is presented to handle the nonlinear mapping problem. A case study based on a real-world road network in Beijing, China, is carried out to test the efficiency and applicability of the proposed modeling methods.

1. Introduction

It is of practical significance to predict traffic flow quickly, precisely, and in a timely manner. Short-term traffic flow forecasting provides an important basis for traffic guidance and control. Existing studies of short-term traffic flow forecasting can be classified into six categories in the transportation literature: (a) linear system theory based models, such as Autoregressive Integrated Moving Average (ARIMA) [1] and the Kalman Filtering model [2]; (b) data mining based models, such as Neural Network [3], Nonparametric Regression [4], and Support Vector Machine [5]; (c) nonlinear system theory based models, such as Wavelet Analysis [6], Catastrophe Theory [7], and Chaos Theory [8]; (d) simulation based models [9]; (e) combination models [10]; (f) other models.

The era of big data brings both opportunities and challenges to short-term traffic flow forecasting. During data processing, traffic big data faces the same difficulties as general big data, such as capture, storage, search, sharing, analytics, and visualization. Therefore, a short-term traffic flow forecasting method needs the capacity to deal with traffic big data. Traffic big data has several characteristics, such as temporal correlation, spatial correlation, historical correlation, and multistate. Taking advantage of traffic big data, data-driven mathematical models can be set up, and the physical meaning of these models can be described clearly. In addition, a real-time correction algorithm can be put forward to improve the accuracy of traffic flow forecasting.

However, existing research in this field still gives little consideration to traffic big data and to real-time correction for traffic flow forecasting, and further research on traffic big data analysis remains to be conducted. In this paper, a method of short-term traffic flow forecasting is proposed in detail. The remainder of this paper is organized as follows. Section 2 presents the basic mathematical model. In Section 3, the real-time corrected traffic correlation model is established. A case study based on a real-world road network is carried out in Section 4 to demonstrate the performance and applicability of the proposed method. Finally, conclusions are drawn in Section 5.

2. Basic Mathematical Model

2.1. Big Data Driven Based Traffic Correlation Model

Traffic big data has a strong temporal-spatial-historical correlation as follows. (i) In the temporal series, the current traffic flow can be regarded as a continuation of the traffic flow at the previous moment. Dynamic traffic flow data change continuously over time with a certain trend. (ii) In the spatial series, the traffic flow of downstream sections can be seen as a continuation of the upstream traffic flow. There is a spatial association between the traffic flow data of neighboring junctions or sections and those of the target junctions or sections. (iii) In the historical series, the traffic demand characteristics determine that traffic flow characteristics in the same period of the same day are similar. The periodicity of traffic flow is especially evident.

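Therefore, the basic form of the traffic correlation model [11] is expressed as

$$x_i(t) = \alpha\,\hat{x}_i^{T}(t) + \beta\,\hat{x}_i^{S}(t) + \gamma\,\hat{x}_i^{H}(t), \tag{1}$$

where $x_i(t)$ is the traffic flow parameter of section $i$ at time $t$, representing flow $q$, speed $v$, or occupancy $o$; $\hat{x}_i^{T}(t)$, $\hat{x}_i^{S}(t)$, and $\hat{x}_i^{H}(t)$ are the temporal, spatial, and historical estimates of $x_i(t)$; and $\alpha$, $\beta$, and $\gamma$ are the coefficients of these three variables.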

The temporal estimate is calculated by temporal correlation analysis, generally based on the Regression Analysis Model [12]. The spatial estimate is calculated by spatial correlation analysis, generally based on the Neighbor Regression Model [13]. The historical estimate is calculated by historical correlation analysis, generally based on the Discrete Fourier Transform Model [14].
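As an illustration of how the three coefficients of the basic model might be obtained, the sketch below fits them by ordinary least squares on synthetic data. The source does not state the fitting procedure, so the use of least squares, the function names `fit_combination` and `combine`, and the synthetic series are all assumptions made for this example.

```python
import numpy as np

def fit_combination(temporal, spatial, historical, actual):
    """Least-squares weights for the three component estimates (assumed method)."""
    # Each column holds one component estimate of the same target series.
    A = np.column_stack([temporal, spatial, historical])
    coef, *_ = np.linalg.lstsq(A, np.asarray(actual, dtype=float), rcond=None)
    return coef  # order: temporal, spatial, historical

def combine(temporal, spatial, historical, coef):
    """Weighted sum of the three component estimates."""
    return np.column_stack([temporal, spatial, historical]) @ coef

# Synthetic component estimates (hypothetical values, not data from the paper).
rng = np.random.default_rng(0)
temporal = rng.uniform(20, 80, 200)
spatial = rng.uniform(20, 80, 200)
historical = rng.uniform(20, 80, 200)
actual = 0.5 * temporal + 0.3 * spatial + 0.2 * historical

coef = fit_combination(temporal, spatial, historical, actual)
fitted = combine(temporal, spatial, historical, coef)
```

On real data the three component estimates are strongly correlated with each other, so the fitted weights may be poorly conditioned; a constrained or regularized fit could be considered in that case.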

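Thus, the simplified equation of the traffic correlation model, formulation (2), is obtained as a regression on the related traffic flow variables, in which each variable has its own regression coefficient and the number of variables sets the number of unknowns.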

2.2. Correction Coefficients Optimization Algorithm

Both the speed and the accuracy of data processing are important for a big data driven method. To improve the speed of the traffic correlation model, the number of unknown variables in formulation (2) must be reduced. However, reducing variables may decrease the accuracy of the model.

Therefore, this paper defines a threshold of computing speed and derives the maximum acceptable number of variables, which fixes the number of unknown variables. The correlation coefficients between the studied section and the related sections are used to choose the variables, up to this number.

In addition, for each related section, a unique time delay is determined, so many variables are removed. When the maximum correlation coefficient corresponding to each section is calculated, the corresponding time delay, and with it the unique variable for that section, is obtained.

If the maximum correlation coefficient of a variable is large enough, the variable is preserved. Otherwise, the variable is removed. After several data tests, it is found that, when the variables are ranked by correlation coefficient, the values are relatively high and differ little for the first four variables, and they decay rapidly to relatively low values beyond the eighth. Therefore, the number of retained variables is set between 4 and 8.

Variable reduction makes the original regression coefficients meaningless. So, a new weight, which is the normalized correlation coefficient of each retained variable, is selected to replace the regression coefficient.

Since this substitution introduces errors, which are likely to be systematic, a linear correction algorithm is presented. Two correction variables are introduced to calibrate the error, which gives the simplified traffic correlation model, formulation (6), fitted against the actual value of the traffic flow parameter.
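The variable reduction described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Pearson correlation coefficient is assumed, the lag search range `max_lag`, the number of retained variables `n_keep`, and all function names are hypothetical, the retained correlations are assumed positive so that normalization is meaningful, and the linear correction is assumed to be a slope and an intercept fitted by least squares.

```python
import numpy as np

def best_lag_correlation(target, related, max_lag):
    """Largest correlation between target[t] and related[t - lag], and its lag."""
    best_r, best_lag = -np.inf, 0
    for lag in range(max_lag + 1):
        x = related[: len(related) - lag]  # related section, shifted back by lag
        y = target[lag:]                   # studied section
        r = np.corrcoef(x, y)[0, 1]
        if r > best_r:
            best_r, best_lag = r, lag
    return best_r, best_lag

def select_variables(target, candidates, max_lag, n_keep):
    """Keep the n_keep sections with the largest peak correlation."""
    scored = []
    for name, series in candidates.items():
        r, lag = best_lag_correlation(target, series, max_lag)
        scored.append((r, lag, name))
    scored.sort(reverse=True)
    kept = scored[:n_keep]
    total = sum(r for r, _, _ in kept)
    # Normalized correlation coefficients replace the regression coefficients.
    return [(name, lag, r / total) for r, lag, name in kept]

def weighted_input(candidates, selection, t):
    """Weighted sum of the retained, time-delayed variables at time index t."""
    return sum(w * candidates[name][t - lag] for name, lag, w in selection)

def fit_linear_correction(z, actual):
    """Slope and intercept that map the weighted sum onto the actual values."""
    slope, intercept = np.polyfit(z, actual, 1)
    return slope, intercept

# Synthetic example: the target follows "upstream" with a delay of 2 intervals.
rng = np.random.default_rng(1)
n = 300
upstream = rng.uniform(10, 60, n)
noise = rng.uniform(10, 60, n)
# np.roll wraps the first two samples; they fall outside the lag-2 alignment.
target = 2.0 * np.roll(upstream, 2) + 5.0

candidates = {"upstream": upstream, "noise": noise}
selection = select_variables(target, candidates, max_lag=4, n_keep=1)
name, lag, weight = selection[0]

t_idx = np.arange(4, n)
z = np.array([weighted_input(candidates, selection, t) for t in t_idx])
slope, intercept = fit_linear_correction(z, target[t_idx])
```

Keeping one delay per section means the model size grows with the number of retained sections only, not with the number of candidate delays.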

3. Model Improvement

3.1. Real-Time Correction Problem Statement

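The basic mathematical model can be used for traffic flow forecasting. The error of traffic flow forecasting is written as the difference between the actual value and the value given by the basic mathematical model, formulation (7).

Therefore, the actual value can be expressed through the model value and the error, formulation (8).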

The variation range of the error reflects the accuracy of traffic flow forecasting. To improve the accuracy, the error is used as a compensation variable for the forecast value. Then, the actual value is replaced by the corrected forecast value in formulation (8), which gives formulation (9).

Congestion propagates differently under different traffic states, so the temporal-spatial-historical correlation variables are dynamic. As shown in Section 2, the basic mathematical model for traffic flow forecasting is built on the temporal-spatial-historical correlation characteristic of traffic big data. However, the multistate characteristic is largely ignored, and the temporal-spatial-historical correlation variables are treated as static. Although a linear fitting method is used to improve accuracy, the resulting error, which is the main part of the forecasting error, still exists.

An effectiveness analysis of the traffic correlation model is shown in Figure 1. The error in the congested stage is larger than that in the smooth stage, and the error at the previous moment may affect the error at the current moment.

The error changes with the traffic flow state and over time. It is difficult for a mathematical model to describe its characteristics. This paper therefore presents a real-time correction algorithm based on nonlinear mapping.

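It is assumed that the error at time $t$, $e(t)$, is influenced by the level of service at time $t$, $\mathrm{LOS}(t)$, and by the error at the previous moment, $e(t-1)$. Then, the nonlinear mapping can be written as

$$e(t) = g\big(\mathrm{LOS}(t),\, e(t-1)\big). \tag{10}$$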

3.2. Real-Time Correction Algorithm

The main goal of the real-time correction algorithm is to calculate the error term in formulation (9). Because the mapping is nonlinear, a Fuzzy Neural Network (FNN) is used to solve this problem. The structure of the Fuzzy Neural Network is shown in Figure 2.

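Every input unit in the Fuzzy Neural Network corresponds to a fuzzy subset of the input variables, which are the level of service and the error at the previous moment. The input signal is the vector of membership degrees of the two input variables in their fuzzy subsets.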

Every output unit in the Fuzzy Neural Network corresponds to a fuzzy subset of the output variable, which is the current error. The output signal is the vector of membership degrees of the current error in its fuzzy subsets.

To obtain the membership degree models of the level of service and the error, SAGA-FCM is used to find the clustering centers. The level of service is divided into six types [15]. The membership degree of each sample in each of the six types is calculated from the set of clustering centers for the level of service.

The error is divided into five types. The membership degree of each sample in each of the five types is calculated from the set of clustering centers for the error.

Historical data are used to train the Fuzzy Neural Network, with each input signal paired with its target output. After training, the network can be seen as a container of fuzzy relations. To draw further conclusions from the network, one only needs to feed in the input signal; the real value of the output is obtained after defuzzification, formulation (15).
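The fuzzification and defuzzification steps can be sketched as follows. This is a sketch under stated assumptions: the membership formula is the standard fuzzy c-means one with fuzzifier `m`, the defuzzification is a membership-weighted mean of the clustering centers, and the center values are hypothetical. The source's own formulas may differ, and the SAGA-FCM search for the centers is not shown.

```python
import numpy as np

def fcm_membership(value, centers, m=2.0):
    """Standard fuzzy c-means membership of a scalar in each cluster."""
    centers = np.asarray(centers, dtype=float)
    d = np.abs(value - centers)
    if np.any(d == 0):
        # The value coincides with a center: full membership in that cluster.
        u = (d == 0).astype(float)
        return u / u.sum()
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

def defuzzify(memberships, centers):
    """Centroid defuzzification: membership-weighted mean of the centers."""
    memberships = np.asarray(memberships, dtype=float)
    centers = np.asarray(centers, dtype=float)
    return float(memberships @ centers / memberships.sum())

# Hypothetical clustering centers for the five error types.
error_centers = [-10.0, -5.0, 0.0, 5.0, 10.0]

u = fcm_membership(2.0, error_centers)     # an error of 2 lies nearest the center 0
exact = fcm_membership(5.0, error_centers)  # an error exactly on a center
crisp = defuzzify(exact, error_centers)
```

With this choice the memberships of a sample always sum to one, so the same vector can be used directly as network input or as a training target.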

3.3. Real-Time Corrected Traffic Correlation Model

The real-time corrected traffic correlation model, which is seen as the improved traffic correlation model, is composed of a static part and a dynamic part. The static part is the simplified traffic correlation model, which shows the physical meaning of traffic correlation. The dynamic part is the error term, which shows the physical meaning of the traffic multistate characteristic. The traffic flow forecasting framework is outlined in Figure 3.

The steps of traffic flow forecasting are as follows.

Step 1 (traffic correlation model). Based on historical data, temporal data, and spatial data, the basic traffic correlation model, as shown in formulation (1), is built.

Step 2 (simplified traffic correlation model). The simplified traffic correlation model, as shown in formulation (6), is proposed based on the correction coefficients optimization algorithm. Thus, the static part is obtained.

Step 3 (real-time correction algorithm). Based on the Fuzzy Neural Network, the real-time correction algorithm is put forward. Thus, the dynamic part is obtained.

Step 4 (real-time corrected traffic correlation model). The real-time corrected traffic correlation model, as shown in formulation (8), is stored in the database as system knowledge.

Step 5 (short-term traffic flow forecasting). Based on system knowledge, real-time data are processed to calculate the forecasting results.
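The last two steps can be sketched as a simple online loop that adds the dynamic part to the static part at each interval. This is an illustration only: the error is assumed to be the actual value minus the static forecast, `corrector` is a hypothetical stand-in for the trained Fuzzy Neural Network, and the numbers are invented.

```python
def corrected_forecasts(static, actual, los, corrector):
    """Add a predicted error (dynamic part) to each static forecast."""
    forecasts = []
    prev_error = 0.0  # no error is known before the first interval
    for t in range(len(static)):
        # Dynamic part: predicted error from level of service and last error.
        e_hat = corrector(los[t], prev_error)
        forecasts.append(static[t] + e_hat)
        # Once the actual value arrives, update the error for the next interval.
        prev_error = actual[t] - static[t]
    return forecasts

# Hypothetical stand-in corrector: assume the previous error simply persists.
def persistence_corrector(level_of_service, prev_error):
    return prev_error

static = [100.0, 100.0, 100.0]   # static-part forecasts (invented)
actual = [110.0, 110.0, 110.0]   # observed values (invented)
los = ["C", "C", "C"]            # level-of-service labels (invented)

result = corrected_forecasts(static, actual, los, persistence_corrector)
```

In this toy case the static part is biased by 10 in every interval, and the correction removes the bias from the second interval onward.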

4. Case Study

4.1. Data Characteristics

A section of the Second Ring Road (Section 1, as shown in Figure 4) and its surrounding trunk roads in Beijing, China, are taken as the object of study to verify the effectiveness and feasibility of the proposed method. Basic traffic flow data are collected by microwave detectors at 5 min intervals.

4.2. Effective Analysis
4.2.1. Basic Mathematical Model

Based on historical data, the basic mathematical model, including the traffic correlation model and the simplified traffic correlation model, is built. The static part of the real-time corrected traffic correlation model is obtained, as shown in formulation (16).

The goodness of fit is shown in Table 1, in which the goodness-of-fit value shows the high effectiveness of formulation (16).

4.2.2. Model Improvement

The real-time correction algorithm is applied to obtain the dynamic part of the real-time corrected traffic correlation model, which is the error term. The steps are as follows.

Step 1 (clustering center search). Based on SAGA-FCM, the level of service is divided into six types and the error is divided into five types. The clustering centers are shown in Tables 2 and 3.

Step 2 (calculation of membership degree). The formulations are shown in Section 3.2.

Step 3 (training the Fuzzy Neural Network). The input layer has 11 nodes; the transfer function of the neurons in the hidden layer is tansig; the output layer has 5 nodes, and the transfer function of the neurons in the output layer is logsig; the training function is traingdx. Historical data are used to train the Fuzzy Neural Network.
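The layer sizes and transfer functions in Step 3 can be illustrated with a small NumPy network. This is a sketch under stated assumptions rather than the configuration used in the case study: the hidden layer size (8) is not given in the source and is chosen arbitrarily, plain batch gradient descent stands in for traingdx (which also uses momentum and an adaptive learning rate), and the training data are synthetic membership vectors with an invented target rule.

```python
import numpy as np

def tansig(x):
    """Hyperbolic tangent sigmoid transfer function."""
    return np.tanh(x)

def logsig(x):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in=11, n_hidden=8, n_out=5, seed=0):
    """Small random weights; n_hidden=8 is an assumption."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, n_out)), "b2": np.zeros(n_out),
    }

def forward(p, X):
    """11 inputs -> tansig hidden layer -> 5 logsig outputs."""
    H = tansig(X @ p["W1"] + p["b1"])
    Y = logsig(H @ p["W2"] + p["b2"])
    return H, Y

def train(p, X, T, lr=0.5, epochs=500):
    """Plain batch gradient descent on the squared error (not traingdx)."""
    losses = []
    n = len(X)
    for _ in range(epochs):
        H, Y = forward(p, X)
        err = Y - T
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the squared error, up to a constant factor.
        dZ2 = err * Y * (1.0 - Y) / n
        dZ1 = (dZ2 @ p["W2"].T) * (1.0 - H ** 2)
        p["W2"] -= lr * H.T @ dZ2
        p["b2"] -= lr * dZ2.sum(axis=0)
        p["W1"] -= lr * X.T @ dZ1
        p["b1"] -= lr * dZ1.sum(axis=0)
    return losses

# Synthetic inputs: 6 level-of-service memberships + 5 previous-error memberships.
rng = np.random.default_rng(42)
n_samples = 200
X = np.hstack([rng.dirichlet(np.ones(6), n_samples),
               rng.dirichlet(np.ones(5), n_samples)])
# Invented target rule: the current error type repeats the previous error type.
T = np.eye(5)[np.argmax(X[:, 6:], axis=1)]

params = init_params()
losses = train(params, X, T)
_, Y = forward(params, X)
```

The 11 input nodes match the six level-of-service types plus the five error types, and the 5 output nodes match the five error types.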

4.2.3. Short-Term Traffic Flow Forecasting

Taking one day as an example, the result is shown in Figure 5.

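Mean Absolute Percentage Error (MAPE) is used to evaluate the accuracy of the proposed model:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{\hat{x}(t) - x(t)}{x(t)}\right| \times 100\%, \tag{17}$$

where $x(t)$ is the actual value, $\hat{x}(t)$ is the forecast value, and $n$ is the number of forecast intervals.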
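For reference, the Mean Absolute Percentage Error in its standard form can be computed as in the sketch below. The function name `mape` and the sample values are illustrative, and the actual values are assumed to be nonzero.

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent (actual values must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((forecast - actual) / actual)) * 100.0)

# Invented values: relative errors of 10%, 10%, and 0%.
score = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
```

Because each term is divided by the actual value, intervals with very low flow can dominate the score, which is worth keeping in mind when comparing models across periods.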

The evaluation results are shown in Table 4. Comparing the two models, the real-time corrected traffic correlation model performs better than the basic mathematical model. The static part of the real-time corrected traffic correlation model reflects the temporal-spatial-historical correlation through a mathematical model, while the dynamic part modifies the mathematical model based on the multistate characteristic of traffic big data.

5. Conclusions

Traffic big data strongly shows temporal-spatial-historical correlation and a multistate characteristic. The traffic correlation model is established based on temporal-spatial-historical correlation. A correction coefficients optimization algorithm is put forward to reduce the number of parameters while preserving calculation accuracy. To improve the effectiveness of short-term traffic flow forecasting, a real-time correction algorithm is presented based on the multistate characteristic. A Fuzzy Neural Network is used to handle the nonlinear mapping. The case study shows that the real-time correction algorithm can improve the effectiveness of the traffic correlation model.

The core of this paper is a short-term traffic flow forecasting method based on traffic big data analysis. The advantages of the real-time corrected traffic correlation model for traffic flow forecasting are as follows. (1) The temporal-spatial-historical correlation, which is considered in the static part of the model, explains the physical meaning of traffic flow forecasting through a mathematical model. (2) The multistate characteristic, which is considered in the dynamic part of the model, explains the dynamic characteristic of traffic flow. (3) The real-time correction algorithm improves the accuracy of traffic flow forecasting. The case study shows the high efficiency and applicability of the proposed methods.

Moreover, the proposed methods can be extended to 15- or 30-minute-ahead forecasting. The next step of this work is to study traffic incidents and their influence on short-term traffic flow forecasting. In addition, longer-horizon traffic flow forecasting, such as one hour ahead or more, can also be studied.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors are grateful to the editor and anonymous reviewers for their valuable suggestions. This research was funded by the “Twelfth Five-Year” National Science & Technology Pillar Program (2014BAG01-B04) and Beijing Science and Technology Plan (no. Z121100000312101).