Abstract

Traffic congestion in the adjacent region between highways and urban expressways is becoming increasingly serious. This paper proposes a traffic speed forecasting method based on the Macroscopic Fundamental Diagram (MFD) and the Gated Recurrent Unit (GRU) model to provide the necessary traffic guidance information for travelers in this region. Firstly, considering that road traffic speed is affected by the macroscopic traffic state, the adjacent region between the highway and expressway is divided into subareas based on the MFD. Secondly, a spatial-temporal correlation coefficient is proposed to measure the correlation between subareas, and a matrix of regional traffic speed data is constructed. Thirdly, the matrix is input into the GRU prediction model to obtain the predicted traffic speed. The prediction performance of the proposed algorithm is verified using GPS data collected from the adjacent region between Beijing's highways and expressways.

1. Introduction

With the continuous growth of China's highway network and traffic volume, the traffic load on intercity highways around some developed cities is increasing. Many adjacent regions (networks of in- and out-of-town roads) between highways and expressways have become part of the urban commuting road network. In addition, because of traffic management restrictions, trucks are not allowed to enter the urban area during fixed hours. They can only drive in the adjacent regions between the highway and urban expressway, where they accumulate, resulting in low road capacity and service level during peak hours, severe traffic congestion, and frequent traffic accidents. Moreover, abnormal weather often leads to local traffic congestion, which gradually spreads through the regional network. In summary, it is essential to study traffic state prediction in the region where highways and urban expressways meet and to publish accurate traffic guidance information to travelers to alleviate traffic congestion.

The evolution of road traffic flow has complex nonlinear characteristics [1], which makes accurate traffic flow prediction challenging. Many machine learning algorithms have been applied in transportation research [2], especially to traffic flow prediction. Liu et al. proposed a hybrid road network traffic speed prediction model based on the state-space neural network and extended Kalman filter [3]. Zhang et al. predicted the traffic speed considering the heterogeneity of different roads [4]. Dhivyabharathi et al. proposed a method for predicting stream travel time using particle filtering [5]. Zhao et al. integrated the charging data and microwave detection data to predict traffic speed [6]. Zhao et al. proposed a prediction algorithm combining equal spacing interpolation and Sage–Husa adaptive Kalman filtering [7]. Wang et al. improved the reliability of driving time prediction based on the GPS point velocity distribution by calculating the variable velocity distribution coefficient [8]. Zhou et al. proposed a recurrent neural network based microscopic car-following model for predicting traffic oscillation [9]. Wang and Goodchild developed a logit model to determine the factors influencing truck routes and to estimate the driving time [10]. Jula et al. developed a hybrid method composed of dynamic programming and a genetic algorithm to find the shortest path for trucks [11]. Dong et al. proposed a traffic crash prediction method based on the support vector regression (SVR) model [12]. Yu et al. proposed the random forests based on the near neighbor (RFNN) method to predict bus travel time [13]. Xie and Wei proposed an Elman neural network to predict truck speed [14]. Wang and Xu constructed a short-term traffic flow prediction model for urban expressways based on the Long Short-Term Memory (LSTM) network [15]. Luo et al. proposed a short-term traffic flow prediction model based on deep learning that combines the features of a convolutional neural network and a support vector regression classifier [16]. Yao et al. discussed the application of Support Vector Machine theory to road travel time prediction [17]. Wu et al. proposed a traffic flow prediction model based on Deep Neural Networks (DNN) that utilizes the weekly/daily periodicity and space-time characteristics of traffic flow [18]. Jia et al. proposed a deep learning method for short-term traffic speed prediction, the deep belief network (DBN) model [19]. Because commonly used weight optimization algorithms cannot adjust the learning rate adaptively, Zhao et al. adopted Adam, Adadelta, and RMSprop to optimize the weights in the GRU model [20]. Wang et al. established a travel time prediction model based on LSTM that considers precipitation data [21]. Zang et al. proposed an all-day traffic speed prediction method for elevated highways based on deep learning [22]. Peng et al. proposed a 3D Convolutional Neural Network-Deep Neural Network method to recognize and predict traffic status from aerial videos [23]. The authors have also conducted several studies on improving prediction accuracy [24–26]. Conventional machine learning algorithms cannot fully capture the essential characteristics of traffic flow. Deep learning models, such as GRU, can learn the inherent complex features effectively and predict traffic flow without prior knowledge [27]. Using deep learning algorithms to mine the patterns of traffic flow has therefore become the main direction of traffic state prediction.

The prediction of road segment traffic speed belongs to the study of microscopic traffic flow characteristics. It is also affected by the traffic state at the macrolevel, such as that of the road network and neighboring regions. At present, macro- and microcharacteristics are rarely combined to predict traffic speed. The Macroscopic Fundamental Diagram (MFD) is a model reflecting the macroscopic traffic state of the road network. Based on specific indicators, an MFD-based method divides a large and complex road network into several independent subregions, and appropriate control optimization strategies are implemented according to the characteristics of each subregion. Based on the subregion division results, the road sections that are most similar to the predicted target road can be selected. Meanwhile, the spatial and temporal correlation between the traffic flows in the subregions can be analyzed. The subregions with strong correlation can be selected to construct the traffic flow sequence dataset that is input into the prediction model, which helps to improve the prediction accuracy. Many scholars have studied MFD-based subregion division methods. Ji and Geroliminis proposed a static normalized cut (Ncut) based subregion division method based on the spatial characteristics of traffic congestion and minimizing the subregion's vehicle density [28]. Ji et al. also proposed a dynamic subregion delineation method based on GPS data targeting the maximum connected element [29]. Haddad and Geroliminis proposed a division method based on the operational stability of the subregions [30]. Ma et al. used a spectral approach to divide traffic zones based on neighboring intersections [31]. Ncut is a graph theory-based partitioning algorithm derived from the field of image segmentation. This algorithm not only considers the similarities within regions but also uses them to normalize the similarities between regions. The cutting scheme that minimizes the normalized similarity between regions is then found. In this paper, the Ncut algorithm is used to divide the traffic subregions. Then, the stability of the MFD of each subregion is calculated and analyzed to justify the division results.

GPS data has a wide coverage area and can better reflect the characteristics of urban road traffic flow [32]. Moreover, in recent years, China has strengthened the supervision of freight vehicles: GPS devices must be installed on large heavy-haul trucks to monitor their running status. This produces a large amount of trajectory data, especially in the adjacent region between highway and expressway. In this paper, based on the average road section speed and flow data extracted from the truck GPS data, a short-term traffic flow prediction method combining MFD and GRU is proposed. Using the characteristics of the MFD, the road network area is divided into subregions, and the microscopic traffic flow characteristics and macroscopic traffic conditions are combined to develop a traffic forecasting method. Tests on real traffic flow data show that the proposed method has lower prediction errors than a GRU model based on the time series of a single road segment. It is a reasonable and effective method for short-term traffic flow prediction. The technical framework of this paper is shown in Figure 1.

2. Subdivision Method of Road Network Based on MFD

2.1. Construction of Road Network Weighted Graph Based on Traffic Operation Similarity

A stable MFD exists in a road network with operational homogeneity, so a large area can be divided into subregions based on the operational homogeneity of traffic. The starting point of MFD theory is the relationship between traffic demand and traffic supply in the road network. The maximum traffic volume directly reflects the traffic supply of each road section and of the overall road network, and traffic volume data can be easily obtained through traffic flow detection. Thus, the traffic volume is taken as the fundamental traffic characteristic of the road section in this paper, and the maximum traffic volume of each road section is used to define the traffic operation similarity between adjacent connected sections.

Let the similarity degree of traffic operation between adjacent connected sections i and j in road network G be

$$s_{ij} = \exp\left(-\left(q_i^{\max} - q_j^{\max}\right)^2\right),$$

where $q_i^{\max}$ is the maximum traffic volume of section i in road network G and $q_j^{\max}$ is the maximum traffic volume of section j in road network G. Through the exponential (natural constant) transformation, the squared difference of the maximum traffic volumes of adjacent connected sections is mapped to the interval 0-1. The closer the similarity is to 0, the less similar the traffic operation of the two road sections; a similarity of 1 indicates the greatest traffic operation similarity.

Based on graph theory, the road network first undergoes a “node-arc transformation,” so that the similarity degree of traffic operation between adjacent road sections is expressed as the weight of the corresponding arc in the graph. The Laplacian matrix, which is the basis for subgraph division in graph theory, is then constructed from the similarity degree of traffic operation. The details are as follows.

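Let $s_{ij}$ denote the similarity degree of traffic operation between sections i and j. When road segment i in road network G is connected with road segment j, $w_{ij} = s_{ij}$. When segment i is disconnected from segment j, $w_{ij} = 0$. When i is equal to j, $w_{ij} = 0$. The weighted adjacency matrix of road network G is W, and the element in W is

$$w_{ij} = \begin{cases} s_{ij}, & i \neq j \text{ and sections } i \text{ and } j \text{ are connected}, \\ 0, & \text{otherwise}. \end{cases}$$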

This paper adopts road sections as node V in the undirected graph G, so that . The matrix D is the diagonal matrix:

$$D = \mathrm{diag}(d_1, \dots, d_n), \qquad d_i = \sum_{j} w_{ij},$$

where $w_{ij}$ are the elements of W.

$D - W$ is the Laplacian matrix of road network G, in which every row and every column sums to zero. Based on these transformations and calculations, a road network graph weighted by the traffic operation similarity between adjacent connected sections is obtained.
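
The construction in this subsection can be illustrated with a short Python sketch. It assumes the similarity takes the form exp(-(q_i - q_j)^2) on maximum volumes that have been rescaled to a comparable range; the function names, the toy chain network, and the rescaling are hypothetical and are not taken from the paper.

```python
import math

def similarity(q_i, q_j):
    """Assumed similarity: exp of the negative squared volume difference, in (0, 1]."""
    return math.exp(-(q_i - q_j) ** 2)

def weighted_laplacian(q_max, edges):
    """Build W, D and L = D - W with road sections treated as graph nodes.

    q_max: (rescaled) maximum traffic volume of each section.
    edges: pairs (i, j) of adjacent connected sections.
    """
    n = len(q_max)
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        W[i][j] = W[j][i] = similarity(q_max[i], q_max[j])
    degree = [sum(row) for row in W]
    D = [[degree[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    L = [[D[i][j] - W[i][j] for j in range(n)] for i in range(n)]
    return W, D, L

# Toy chain of four sections 0-1-2-3 (hypothetical values).
q_max = [0.9, 0.8, 0.3, 0.2]
W, D, L = weighted_laplacian(q_max, [(0, 1), (1, 2), (2, 3)])
row_sums = [sum(row) for row in L]  # each row of the Laplacian sums to zero
```

In this toy network, sections 0 and 1 have similar maximum volumes and receive a weight close to 1, while the weight between sections 1 and 2 is noticeably smaller.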

2.2. Road Network Subarea Division Method Based on Normalized Cut

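Ncut is one of the subgraph partition methods in graph theory and operates at the macrolevel: the focus is not on the details of the graph but on its overall characteristics. For a partition of the node set V into two disjoint subsets A and B, the optimal normalized cut problem of the graph can be expressed as

$$\min \mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)},$$

where $\mathrm{cut}(A,B) = \sum_{i \in A,\, j \in B} w_{ij}$ is the total weight of the arcs removed by the partition and $\mathrm{assoc}(A,V) = \sum_{i \in A,\, j \in V} w_{ij}$ is the total weight connecting the nodes in A to all nodes of the graph.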

Finding the minimum of the normalized cut is an NP-hard problem. Spectral clustering is a widely used method that solves this NP-hard problem approximately by computing eigenvalues and eigenvectors. Therefore, the Fiedler method is used to calculate the eigenvalues and eigenvectors of the matrix and to divide the road network into subareas. The node set V of graph G is divided into two subsets A and B, and the optimal normalized cut problem can be transformed into

$$\min_{x} \mathrm{Ncut}(x) = \min_{y} \frac{y^{T}(D-W)\,y}{y^{T}D\,y},$$

where $y = (1+x) - b(1-x)$, $b = k/(1-k)$, $k = \sum_{x_i>0} d_i / \sum_i d_i$ with $d_i$ the i-th diagonal element of D, and $x$ is a column vector consisting of 1 and −1. When the i-th node is assigned to subregion A, $x_i = 1$; when it is assigned to subregion B, $x_i = -1$.

Since every row and column of the Laplacian matrix sums to 0, the matrix always has an eigenvalue of 0. If graph G is connected, the second-smallest eigenvalue is positive. The corresponding eigenvector is called the Fiedler vector, which contains important information about the graph: the numerical values of its elements reflect the correlation of the corresponding vertices. When the road network is divided according to the Fiedler vector, the vertices can be split by comparing the corresponding elements of the Fiedler vector with a critical value S. There are many methods for selecting S, among which the 0-point method (S = 0, that is, splitting by the sign of each element) is practical and straightforward.
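
A minimal sketch of the Fiedler-vector split with the 0-point method is given below. It is an illustration rather than the paper's implementation: it assumes a connected graph with positive degrees, and it obtains the second generalized eigenvector of (D − W)y = λDy by a simple deflated power iteration instead of a library eigensolver. The function name and the toy weights are hypothetical.

```python
import math

def fiedler_bipartition(W, iters=500):
    """Split nodes by the sign of the second generalized eigenvector of (D - W, D).

    Power iteration is run on I + D^{-1/2} W D^{-1/2} after removing the
    trivial eigenvector D^{1/2} 1, so it converges to the Fiedler direction.
    """
    n = len(W)
    s = [math.sqrt(sum(row)) for row in W]            # square roots of the degrees
    norm_s = math.sqrt(sum(v * v for v in s))
    u = [v / norm_s for v in s]                       # trivial eigenvector, unit length

    def deflate_and_normalize(v):
        proj = sum(a * b for a, b in zip(v, u))
        v = [a - proj * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        return [a / norm for a in v]

    z = deflate_and_normalize([float(i + 1) for i in range(n)])  # deterministic start
    for _ in range(iters):
        t = [z[i] / s[i] for i in range(n)]
        wz = [sum(W[i][j] * t[j] for j in range(n)) / s[i] for i in range(n)]
        z = deflate_and_normalize([z[i] + wz[i] for i in range(n)])
    y = [z[i] / s[i] for i in range(n)]               # back to the generalized eigenvector
    A = [i for i in range(n) if y[i] >= 0]            # 0-point method: critical value S = 0
    B = [i for i in range(n) if y[i] < 0]
    return y, A, B

# Two tightly connected groups of sections joined by one weak arc (hypothetical weights).
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
y, A, B = fiedler_bipartition(W)
```

The sign of an eigenvector is arbitrary, so which group is called A and which is called B may swap; only the grouping itself is meaningful. Applying the split recursively to each part is one common way to obtain more than two subareas.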

3. Stability Calculation of MFD

The rationality of the subregion division can be verified by calculating the stability of the MFD. MFD stability depends on stability in the critical state: in the critical state, the less the average traffic volume fluctuates at a given road network density, the more controllable the road network traffic operation is. Following the method in our previous research [33], the Fuzzy c-means (FCM) algorithm is used to divide the test data sets into three categories: unblocked, critical, and congested. Two indicators, the regional traffic volume and the regional density, are used to classify the three traffic states. FCM assigns the data points of a multidimensional data set to classes; each data point belongs to every class to some extent, and the membership degree indicates the degree to which a data point belongs to a given cluster. FCM divides the data vectors into fuzzy groups and calculates the clustering center of each group so as to minimize an objective function of the dissimilarity index. The traffic state is thereby divided into three stages: unblocked, critical, and congested.
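
The FCM step described above can be sketched as follows. This is a bare-bones illustration of the standard algorithm, not the procedure of [33]: the fuzzifier m = 2, the fixed number of iterations, the initial centers, and the rescaled (density, volume) toy points are all assumptions made for the example.

```python
def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Standard fuzzy c-means; returns the final centers and the membership matrix."""
    def memberships(cs):
        u = []
        for p in points:
            # Inverse squared distances; the small constant avoids division by zero.
            inv = [(1.0 / (sum((a - b) ** 2 for a, b in zip(p, c)) + 1e-12))
                   ** (1.0 / (m - 1.0)) for c in cs]
            total = sum(inv)
            u.append([v / total for v in inv])
        return u

    for _ in range(iters):
        u = memberships(centers)
        # Each center is the mean of all points weighted by membership ** m.
        centers = [
            tuple(sum((row[k] ** m) * p[d] for row, p in zip(u, points))
                  / sum(row[k] ** m for row in u)
                  for d in range(len(points[0])))
            for k in range(len(centers))
        ]
    return centers, memberships(centers)

# Hypothetical rescaled (density, volume) observations for one subarea.
points = [(0.10, 0.20), (0.15, 0.25), (0.12, 0.22),   # low density, low volume
          (0.50, 0.90), (0.55, 0.95), (0.52, 0.92),   # medium density, high volume
          (0.90, 0.30), (0.95, 0.25), (0.92, 0.28)]   # high density, low volume
initial = [(0.0, 0.0), (0.5, 1.0), (1.0, 0.0)]        # unblocked, critical, congested
centers, u = fuzzy_c_means(points, initial)
labels = [max(range(3), key=row.__getitem__) for row in u]
```

Each point is labeled with the class of highest membership; the points labeled as critical are the ones that would enter the dispersion calculation.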

The dispersion of road network traffic operation, that is, the dispersion of weighted average traffic volume of road network in the critical state, represents MFD’s stability in the critical state. The lower the dispersion of road network traffic operation is, the more stable the road network operation is, and the higher the MFD stability is. The higher the dispersion of road network traffic operation, the more unstable the road network operation will be and the lower the MFD stability. The dispersion of road network traffic operation is where is the average traffic volume in the test data. is the critical average traffic volume of road network, is the maximum average traffic volume of road network, and is undetermined parameters, .

The dispersion of the whole road network is calculated as the weighted average of the dispersion of each subarea. The result can be used as the judgment index of the subarea division to characterize the MFD stability of the whole network. If the entire road network is divided into N subareas, and the dispersion of road network traffic operation in subarea i is $\delta_i$, then the dispersion of the whole road network traffic operation is

$$\delta = \frac{\sum_{i=1}^{N} q_{c,i}\,\delta_i}{\sum_{i=1}^{N} q_{c,i}},$$

where $q_{c,i}$ is the critical average traffic volume in subarea i.
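
As a small worked example of the weighted average, the sketch below assumes that the weights are the critical average traffic volumes of the subareas, normalized by their sum. The function name and the numbers are hypothetical.

```python
def network_dispersion(subarea_dispersion, critical_volume):
    """Average of subarea dispersions, weighted by critical average traffic volume."""
    total = sum(critical_volume)
    return sum(d * q for d, q in zip(subarea_dispersion, critical_volume)) / total

# Three hypothetical subareas: the busiest subarea dominates the result.
overall = network_dispersion([0.04, 0.06, 0.05], [1000.0, 500.0, 500.0])
```

Comparing this value across candidate numbers of subareas is how a division scheme would be chosen under this assumption.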

4. Traffic Speed Prediction Method Based on the Spatial-Temporal Correlation of Subareas

The evolution of traffic speed on a road section in a specific subarea is affected not only by the temporal evolution law of the traffic flow on the road sections but also by the spatial influence of the road sections in other subareas. This paper proposes a traffic speed prediction algorithm that considers the spatial-temporal correlation of subareas.

4.1. Correlation Analysis of Subareas
4.1.1. Spatial Correlation Analysis

Firstly, the spatial correlation between each subarea is analyzed.

In spatial correlation analysis, it is necessary to measure the adjacency relationship of the neighboring subregions. This requires quantitatively describing the adjacency relationship of adjacent regions to perform the calculation of spatial correlation statistics.

In this paper, the spatial adjacency matrix is used to express the spatial relationship between subregions.

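Suppose that there are n subregions in the study area, and the spatial weight matrix W is used to express the spatial correlation between the subregions (in this section, W refers to subregions, not to the weighted adjacency matrix of road sections in Section 2):

$$W = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{bmatrix},$$

where W is the $n \times n$ spatial weight matrix and $w_{ij}$ is the spatial weight between regional units i and j. To ensure that a subregion cannot be adjacent to itself, it is specified that $w_{ij} = 0$ when $i = j$. Two subregions are adjacent when they share one or more nodes.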

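The standardized formula of the spatial weights is

$$w_{ij}^{*} = \frac{w_{ij}}{\sum_{j=1}^{n} w_{ij}},$$

where $w_{ij}$ is the spatial weight between subregions i and j and n is the number of subregions.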
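
The adjacency rule and the standardization can be sketched in Python as follows. The sketch assumes binary weights (1 for subregions sharing at least one node, 0 otherwise) and row standardization; the function names and the node sets are hypothetical.

```python
def spatial_weights(subarea_nodes):
    """Binary adjacency: 1 if two distinct subareas share at least one node, else 0."""
    n = len(subarea_nodes)
    return [[1.0 if i != j and subarea_nodes[i] & subarea_nodes[j] else 0.0
             for j in range(n)] for i in range(n)]

def row_standardize(W):
    """Divide each row by its sum; rows without neighbors stay all zero."""
    out = []
    for row in W:
        total = sum(row)
        out.append([w / total if total > 0 else 0.0 for w in row])
    return out

# Hypothetical node sets of four subareas; the last one shares no node with the others.
nodes = [{1, 2, 3}, {3, 4, 5}, {5, 6}, {7, 8}]
W = spatial_weights(nodes)
W_std = row_standardize(W)
```

After standardization, the weights of each subarea with at least one neighbor sum to 1, so a subarea with many neighbors does not dominate simply because of its neighbor count.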

4.1.2. Temporal Correlation Analysis

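The Pearson correlation coefficient formula is adapted to measure the temporal correlation of two regions. If two subregions in the study area are spatially adjacent, their temporal correlation over a given period of T time steps can be calculated by

$$r_{ij} = \frac{\frac{1}{T}\sum_{t=1}^{T}\left(q_i(t)-\bar{q}_i\right)\left(q_j(t)-\bar{q}_j\right)}{\sqrt{\sigma_i^{2}\,\sigma_j^{2}}},$$

where $q_i(t)$ and $q_j(t)$ are the traffic volumes of subregions i and j at time t; $\bar{q}_i$ and $\bar{q}_j$ are the mean traffic volumes of i and j; and $\sigma_i^{2}$ and $\sigma_j^{2}$ are the variances.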

According to the road network area studied in this paper, the regional correlation is calculated as , where is the spatial correlation of areas i and j and is the correlation of two areas in the study period.

The data of the K regions most relevant to the region containing the predicted target segment are selected to construct the input matrix of the prediction model. K is determined as follows: for each K = 0, 1, ..., 4, the data from the K most relevant regions are input into the GRU model, and the K value with the smallest prediction error is taken.
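
The selection of the K most relevant regions can be sketched as follows. This is only an illustration under stated assumptions: the temporal correlation is taken to be the plain Pearson coefficient, and the regional correlation is assumed to be the product of the spatial weight and the temporal correlation, a form that the text does not spell out. The function names and the toy volume series are hypothetical.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equally long, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def top_k_regions(target, volumes, spatial_w, k):
    """Rank the other subareas by spatial weight times temporal correlation."""
    scores = {}
    for j, series in enumerate(volumes):
        if j != target:
            scores[j] = spatial_w[target][j] * pearson(volumes[target], series)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical volume series of four subareas; subarea 0 contains the target section.
volumes = [[10, 20, 30, 40], [12, 19, 33, 41], [40, 30, 20, 10], [11, 21, 29, 42]]
spatial_w = [[0.0, 1.0, 1.0, 0.0]]          # only the row of the target subarea is needed
selected, scores = top_k_regions(0, volumes, spatial_w, k=1)
```

Subarea 3 is strongly correlated in time but not adjacent, so its score is zero under the assumed product form; subarea 2 is adjacent but moves in the opposite direction. Only subarea 1 is both adjacent and positively correlated.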

One input sample of the deep learning algorithm is where is timestep, is the number of sections of road included in Area , and represents the target section.
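
One way to build such samples from the 5-minute speed series is a sliding window, sketched below. The layout is an assumption for illustration: each row of `series` is the speed series of one road section (the target section plus the sections of the selected regions), each sample holds the previous L time steps of all rows, and the label is the target speed at the next step. The names `make_samples`, `series`, and `target_row` are hypothetical.

```python
def make_samples(series, target_row, L=12):
    """Slice speed series into (input window, next-step target) pairs.

    series:     one row per road section, each row a speed time series.
    target_row: index of the row that holds the target section.
    L:          number of past time steps per sample.
    Returns X indexed as [sample][time step][feature] and y with one value per sample.
    """
    T = len(series[0])
    X, y = [], []
    for t in range(L, T):
        window = [[row[s] for row in series] for s in range(t - L, t)]
        X.append(window)
        y.append(series[target_row][t])
    return X, y

# Three hypothetical sections with 15 time steps each; row 0 is the target section.
series = [[100 * r + t for t in range(15)] for r in range(3)]
X, y = make_samples(series, target_row=0, L=12)
```

With 15 time steps and L = 12, three samples are produced, and each time step of a sample contains one speed value per road section.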

4.2. GRU-Based Traffic Speed Prediction Algorithm

RNN (recurrent neural network) is a kind of deep neural network designed to process sequence data, which plays an important role in the field of sequence mining.

The GRU model is an improvement of recurrent neural network, which is one of the hot technologies of deep learning in recent years. Different from the traditional recurrent neural network, the internal structure of the GRU’s hidden layer nodes does not use a single activation function.

The specific calculation steps of GRU are as follows. Firstly, the current input $x_t$ and the output of the previous time step $h_{t-1}$ are fed into the update gate, which outputs a value between 0 and 1, where 0 represents completely discarding the information and 1 represents completely retaining it; the calculation is given in equation (12). Secondly, $x_t$ and $h_{t-1}$ enter the reset gate, whose sigmoid layer also outputs a value between 0 and 1. Meanwhile, the tanh layer creates a new candidate value vector $\tilde{h}_t$; the calculations are given in equations (13) and (14). Thirdly, the update gate is used as the weight vector, and the candidate vector and the output vector of the previous time step are averaged with these weights to obtain the output of the GRU cell, as given in equation (15):

$$z_t = \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right), \tag{12}$$

$$r_t = \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right), \tag{13}$$

$$\tilde{h}_t = \tanh\left(W_h x_t + U_h\left(r_t \odot h_{t-1}\right) + b_h\right), \tag{14}$$

$$h_t = z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t, \tag{15}$$

where $z_t$ represents the update gate vector; $r_t$ represents the reset gate vector; $b$ represents the bias vector; $W$ represents the input weight; $U$ represents the cyclic (recurrent) weight; $x_t$ represents the input vector at time t; and $h_t$ represents the output vector at time t.
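
The three steps can be traced in a small pure-Python sketch of a single GRU step. It is a didactic illustration, not the trained model: it assumes the convention in which the update gate weights the previous output and its complement weights the candidate (some libraries use the opposite convention), and the parameter dictionary with keys such as `W_z`, `U_z`, and `b_z` is a hypothetical layout.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gru_step(x, h_prev, p):
    """One GRU step; p holds input weights W_*, recurrent weights U_* and biases b_*."""
    def affine(gate, h):
        a, b = matvec(p["W_" + gate], x), matvec(p["U_" + gate], h)
        return [ai + bi + ci for ai, bi, ci in zip(a, b, p["b_" + gate])]

    z = [sigmoid(v) for v in affine("z", h_prev)]          # update gate
    r = [sigmoid(v) for v in affine("r", h_prev)]          # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]         # reset applied to previous output
    cand = [math.tanh(v) for v in affine("h", gated)]      # candidate vector
    # Assumed convention: z keeps the previous output, (1 - z) admits the candidate.
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h_prev, cand)]

# With all-zero parameters both gates equal 0.5 and the candidate is 0,
# so the new output is half of the previous output.
zeros = [[0.0, 0.0], [0.0, 0.0]]
p = {name: zeros for name in ("W_z", "U_z", "W_r", "U_r", "W_h", "U_h")}
p.update({"b_z": [0.0, 0.0], "b_r": [0.0, 0.0], "b_h": [0.0, 0.0]})
h = gru_step([1.0, 2.0], [0.4, -0.2], p)
```

Raising the update-gate bias pushes z toward 1, in which case the cell simply carries its previous output forward.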

Regularization is generally defined as a modification of the learning algorithm whose goal is to reduce the generalization error rather than the training error. Common regularization methods include L1 and L2 parameter norm penalties, Dropout, multitask learning, and early stopping. The L2 and L1 parameter norm penalty terms can be expressed as

$$\Omega_{L2}(w) = \frac{\lambda}{2}\lVert w \rVert_2^2, \qquad \Omega_{L1}(w) = \lambda \lVert w \rVert_1,$$

where $\lambda$ can be expressed as the reciprocal of the weight of each layer, indicating that, for a layer whose learned weights are too high, the updating degree should be reduced. Conversely, for nodes in the layer whose learned weights are too low, the updating degree should be increased, so as to even out all the weight values in the layer.

To sum up, the flow of the travel speed prediction algorithm proposed in this paper is shown in Figure 2. The input is a three-dimensional tensor whose dimensions are features, time steps, and samples. This 3D tensor is input to the GRU model with a dropout layer and a fully connected layer to obtain the predicted travel speed. One column of the matrix in equation (11) corresponds to the input of one time step of the GRU model.

5. Algorithm Validation

5.1. Road Network Subarea Division and Stability Calculation of MFD

This paper uses truck GPS data as the basis for algorithm validation. As shown in Figure 3, the experimental area is located in the southeast of Beijing. Expressways and main roads in this area, including the 5th and 6th ring roads and the Jingtai, Jinghu, and Jingha highways, are selected to verify the accuracy of the algorithm. The area is approximately 110 square kilometers, and the total length of roads in the road network is about 131 km. According to the analysis, the selected area has relatively many accidents and abundant truck GPS data.

The time range of the data used for validation is May 1, 2018, to July 31, 2018.

The methods for map matching, anomalous data processing, and traffic speed time series extraction from truck GPS data follow [20]. The collected truck GPS data is organized into a time series of road section traffic speed with a period of 5 minutes. Then, according to the chosen K value, it is organized into the form of equation (11). L is set to 12; that is, each prediction uses the data of the previous hour.

Sample size is a critical concern when using probe vehicles to collect real-time traffic information, and it is necessary to determine the number of probe vehicles needed for traffic state estimation. In this paper, the required sample size for different combinations of confidence levels of the study area is determined with reference to the method in [34].

May and June’s average speed data are used as the training data set for GRU model training. The rest of the data serves as a test set for the algorithm. In this paper, the study area’s road network is abstracted into the road network diagram, as shown in Figure 4, and there are 32 road sections and 21 nodes.

The subarea division under each scheme is shown in Figure 5. The dispersion of traffic operation in the subareas and in the whole road network is shown in Table 1. When the network is divided into 5 subareas, the dispersion of the whole road network is the smallest, 0.05673. At this point the network has already been divided into small regions; dividing it further changes the dispersion of the whole road network only slightly, while the dimensionality of the data input to the speed prediction model, and hence the prediction difficulty, increases. Therefore, the division into five subareas is taken as the optimal scheme.

The MFD of each subarea is shown in Figure 6. The traffic state classification results based on the FCM algorithm for subarea 1 are shown in Figure 7. The clustering centers are shown in Table 2.

5.2. Traffic Speed Prediction

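Two indexes, MAPE and RMSE, are selected to evaluate the prediction accuracy of the model. They are calculated as follows:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{\hat{v}_t - v_t}{v_t}\right| \times 100\%,$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(\hat{v}_t - v_t\right)^2},$$

where $\hat{v}_t$ is the predicted traffic speed at time t, $v_t$ is the actual traffic speed at time t, and n is the total number of predicted cycles.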
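
For reference, the two indexes can be computed with a few lines of Python. The sketch uses the standard definitions of MAPE and RMSE; the function names and the three speed values are made up for the example, and the actual speeds are assumed to be nonzero.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the unit of the speed data."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical actual and predicted speeds for three 5-minute periods.
actual = [50.0, 60.0, 40.0]
predicted = [55.0, 57.0, 40.0]
error_mape = mape(actual, predicted)
error_rmse = rmse(actual, predicted)
```

MAPE is relative and therefore penalizes the same absolute error more heavily at low speeds, whereas RMSE is absolute and emphasizes large individual deviations, which is why the two are reported together.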

This paper selects nodes 17 to 18 (Section 1) and 9 to 10 (Section 2) belonging to different regions as the experimental verification sections. The accuracy of the algorithm was verified in four different scenarios. The prediction results were compared with the GRU prediction algorithm based on a single time series of the road segment. This GRU model has the same parameter settings as the model presented in this paper.

The first step is to determine the number of regions K whose data are input to the GRU model, so the relationship between the number of input regions and the prediction accuracy of the model is analyzed. Tables 3 and 4 show the prediction accuracy of Section 1 and Section 2, respectively, for different K values. According to Tables 3 and 4, K is taken as 1 for both Section 1 and Section 2.

5.2.1. Working Days

(1) Section 1. The predicted results of Section 1 on July 2 (working day) are shown in Figure 8. The errors are shown in Figure 9. It can be seen that on July 2, the average speed of the section was low during the morning and noon rush hours, indicating a congested state, whereas the section was free-flowing at night. The algorithm proposed in this paper achieves acceptable prediction results. The error evaluation indicators MAPE and RMSE are shown in Table 5. For the proposed algorithm and the single-series GRU model, MAPE was 2.30% and 3.05%, and RMSE was 1.34 and 1.68, respectively.

(2) Section 2. The predicted results of Section 2 on July 2 (working day) are shown in Figure 10. The errors are shown in Figure 11. The algorithm proposed in this paper achieves acceptable prediction results. The error evaluation indicators MAPE and RMSE are shown in Table 6. For the proposed algorithm and the single-series GRU model, MAPE was 1.42% and 2.53%, and RMSE was 1.05 and 1.79, respectively.

5.2.2. Weekend

(1) Section 1. The predicted results of Section 1 on July 1 (weekend) are shown in Figure 12. The errors are shown in Figure 13. The algorithm proposed in this paper achieves acceptable prediction results. The error evaluation indicators MAPE and RMSE are shown in Table 5. For the proposed algorithm and the single-series GRU model, MAPE was 3.93% and 4.25%, and RMSE was 2.17 and 2.31, respectively.

(2) Section 2. The predicted results of Section 2 on July 1 (weekend) are shown in Figure 14. The errors are shown in Figure 15. The algorithm proposed in this paper achieves acceptable prediction results. The error evaluation indicators MAPE and RMSE are shown in Table 6. For the proposed algorithm and the single-series GRU model, MAPE was 2.46% and 3.93%, and RMSE was 1.89 and 2.18, respectively.

5.2.3. Rainy Day

(1) Section 1. The predicted results of Section 1 on July 5 (rainy day) are shown in Figure 16. The errors are shown in Figure 17. The algorithm proposed in this paper achieves acceptable prediction results. The error evaluation indicators MAPE and RMSE are shown in Table 5. For the proposed algorithm and the single-series GRU model, MAPE was 2.86% and 3.23%, and RMSE was 1.65 and 1.86, respectively.

(2) Section 2. The predicted results of Section 2 on July 5 (rainy day) are shown in Figure 18. The errors are shown in Figure 19. The algorithm proposed in this paper achieves acceptable prediction results. The error evaluation indicators MAPE and RMSE are shown in Table 6. For the proposed algorithm and the single-series GRU model, MAPE was 1.49% and 2.55%, and RMSE was 1.03 and 1.82, respectively.

5.2.4. Accident

The predicted results of Section 2 on July 11 are shown in Figure 20. The errors are shown in Figure 21. A traffic accident occurred in the early hours of the morning. The MAPE and RMSE results are shown in Table 6. For the proposed algorithm and the single-series GRU model, MAPE was 7.35% and 8.94%, and RMSE was 5.05 and 5.83, respectively.

It can be seen from the above figures and tables that, compared with the prediction method based on the time series of a single road segment, the prediction accuracy of the proposed algorithm is improved. On days without an accident, MAPE decreased by about 0.3 to 1.5 percentage points, and on the day of the accident, MAPE decreased by about 1.6 percentage points. The RMSE reduction on the accident day (0.78) was also larger than on most days without an accident.

6. Conclusions

The traffic speed in adjacent regions between highways and expressways has become important information for highway managers and travelers. This paper proposes a road traffic speed prediction method that considers both microscopic traffic flow characteristics and the macroscopic traffic state, based on the road section average speed and flow data extracted from GPS data.

Based on the MFD, road network subareas are divided and evaluated. Firstly, the Ncut algorithm is used to divide the road network. Secondly, to ensure the stability of the MFD of each subarea, a definition of the dispersion of road network traffic operation is proposed. The traffic states are classified with FCM, and the best subregion division scheme is selected by calculating the dispersion of the whole network.

The spatial-temporal correlation coefficient is proposed to measure the correlation between subareas. Then, the traffic speed time sequence of the study subarea and the related area is used to build a matrix of traffic speed. The regional matrix of traffic speed data is input into the GRU model, and the output result is the predicted traffic speed of the studied region.

This paper takes the adjacent region between the highways and expressways of Beijing as an example to verify the algorithm. The southeast corner of the Beijing road network is selected as the research area. The area consists of two ring expressways and three highways with a total area of approximately 110 square kilometers. Truck GPS data from this region is the basis of this study. The accuracy of the proposed algorithm is verified under working-day, weekend, rainy-day, and accident scenarios. The results show that, compared with the prediction method based on the time series of a single road segment, the prediction accuracy of the proposed algorithm is improved. This can enhance the level of traffic information services in the adjacent region between the highway and urban expressway and help ease traffic congestion.

Data Availability

The data used to support the findings of this study have not been made available because the authors have signed a confidentiality agreement with the data providers.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Key Research and Development Program of China (no. 2018YFB1600900) and National Natural Science Foundation of China (71871011, 71890972/71890970, and 71621001).