Abstract

Traffic prediction aims to predict the future traffic state by mining features from historical traffic information, and it is a crucial component of intelligent transportation systems. However, most existing traffic prediction methods focus on road segment prediction while ignoring fine-grained lane-level traffic prediction. From observations, we found that different lanes on the same road segment have similar but not identical patterns of variation. Lane-level traffic prediction can provide more accurate prediction results for humans or autonomous driving systems to make appropriate and efficient decisions. In traffic prediction, mining spatial features is an important step, and graph-based methods are effective for it. However, most existing graph-based methods construct a static adjacent matrix, which makes it difficult for them to respond to spatio-temporal changes in a timely manner. In this paper, we propose a deep learning model for lane-level traffic prediction. Specifically, we take advantage of the graph convolutional network (GCN) with a data-driven adjacent matrix for spatial feature modeling and treat different lanes of the same road segment as different nodes. The data-driven adjacent matrix consists of the fundamental distance-based adjacent matrix and the dynamic lane correlation matrix. The temporal features are extracted with the gated recurrent unit (GRU). Then, we adaptively fuse spatial and temporal features with the gating mechanism to get the final spatio-temporal features for lane-level traffic prediction. Extensive experiments on a real-world dataset validate the effectiveness of our model.

1. Introduction

Intelligent transportation systems (ITS) include driving behaviour understanding [1], path finding [2], map matching [3], and traffic prediction [4]. Traffic prediction refers to predicting the future state of traffic by analyzing and mining historical traffic information [5]. As a foundation and important part of ITS, accurate traffic prediction can help formulate real-time control strategies, which is of great importance for the scientific planning of traffic management and for people’s safe and efficient travel [6].

Early efforts in this field used statistical learning methods for traffic prediction, such as the autoregressive integrated moving average (ARIMA) model [7], which converts nonstationary sequences into stationary sequences by differencing. Because traffic state information has significant nonlinear and uncertain characteristics, machine learning methods such as K-nearest neighbor (KNN) [8] and support vector regression (SVR) [9] have also been used for traffic prediction. However, they place higher demands on the input features, which often requires complex feature processing. In recent years, deep learning methods have become the mainstream approach to traffic prediction owing to their automatic feature modeling and effective data mining capabilities. For example, recurrent neural network (RNN)-based methods [10] can effectively model the temporal features in traffic flows. Convolutional neural network (CNN)-based methods [11] regard traffic flows as images and model spatial or temporal features in Euclidean space. The road network or road sensor network is naturally a graph and has a typically non-Euclidean structure. Recently, researchers have used graph-based methods for traffic prediction [12]. CNNs and RNNs can only be used on Euclidean data, while graph-based methods can effectively model the non-Euclidean structure of graphs for more accurate predictions. With a graph as input, graph-based methods have achieved superior performance in traffic prediction. The topology of the graph is represented by an adjacent matrix, and graph-based methods are directly affected by the adjacent matrix.

Although past work has proposed many effective algorithms for traffic prediction, some limitations and challenges remain. (1) Previous studies ignore the differences between lanes and mainly focus on road segment prediction. In reality, there is a wide demand for lane-level traffic prediction. For example, automated or human-driven vehicles can select appropriate lanes according to lane-level prediction results, so that traffic congestion can be avoided or alleviated [13, 14]. Besides, lane-level traffic prediction can provide more refined and accurate traffic information and help humans or machines make more appropriate and effective decisions. The traffic states of different lanes follow different but related patterns [15-17]. As shown in Figure 1, there are two lanes in both road section 1 and road section 2. The traffic information of different lanes in the same road section has a similar change pattern, but the patterns still differ in many details. In road segment-level traffic prediction, the road segment is regarded as a whole, and the prediction results are too macroscopic to provide precise information for lane-level decisions. (2) Graph-based methods rely heavily on adjacent matrices, yet most methods build static adjacent matrices, ignoring that the correlation between nodes on the graph may differ from one situation to another. For example, two nodes that are far apart may still have similar change patterns. Besides, the traffic situation of nodes may change across time periods. It is difficult for a static adjacent matrix to respond timely and effectively to such spatio-temporal changes.

To address the aforementioned challenges, we propose a deep learning model for lane-level traffic prediction, which is mainly composed of data-driven GCN and GRU. GCN is used to extract spatial features. To adapt GCN for lane-level traffic prediction, we treat different lanes at the same location as different nodes on the graph. The adjacent matrix of the graph is calculated in a data-driven manner and consists of a traditional distance adjacent matrix and a dynamic lane correlation matrix. GRU is used to extract temporal features. Then, spatio-temporal features are obtained by fusing temporal and spatial features adaptively through the gating mechanism. Finally, lane-level traffic prediction is performed based on the learned spatio-temporal features.

The main contributions of this paper can be summarized as follows:
(1) A data-driven adjacent matrix is proposed, which consists of a distance-based adjacent matrix and a dynamic lane correlation matrix. It can respond effectively to spatio-temporal changes in a timely manner.
(2) We propose a deep learning model for lane-level traffic prediction, which learns spatial features with GCN, learns temporal features with GRU, and obtains fused adaptive spatio-temporal features with the gating mechanism.
(3) Extensive experiments on a real-world dataset validate the effectiveness of the model.

The remainder of this paper is organized as follows: In Section 2, we introduce the related work, which includes general methods and deep learning methods. Section 3 formulates the lane-level traffic prediction task. Section 4 introduces the construction of the data-driven adjacent matrix and the architecture of our model in detail. Comprehensive experimental results on a real-world dataset are presented in Section 5. Finally, we conclude the paper and present future work in Section 6.

2. Related Work

2.1. General Methods

Traditional traffic prediction methods can be divided into parametric methods and nonparametric methods [10]. Parametric methods rely on the assumption of data stationarity and provide explicit formulations for valuable interpretations of traffic characteristics. Classical parametric methods, such as the autoregressive integrated moving average model (ARIMA) and its variants [7, 18, 19], have been proven to be effective in many scenarios. For example, some studies have found that ARIMA can model highway time series data with high precision [20]. Other parametric methods include exponential smoothing [21], multivariate time series models [22], and Kalman filtering models [23]. However, the dependency on stationarity makes it difficult for parametric methods to model the uncertainty and irregular volatility of traffic data effectively. The structure and parameters of nonparametric methods are not fixed, and their data requirements are not as strict as those of parametric methods. Nonparametric methods are better able to deal with complex data such as noisy data and missing data [24]. Typical nonparametric methods include support vector regression [8], K-nearest neighbor [9], the Bayesian network [25], extreme gradient boosting [26], and artificial neural networks (ANN) [24, 27, 28]. Among them, ANN can mine the latent information of traffic data and has nonlinear modeling ability, which makes it one of the most widely used nonparametric methods. Although nonparametric methods have had some success in the field of traffic prediction, they are limited in their ability to predict lane-level traffic. Besides, both parametric and nonparametric methods are mainly used to model temporal features and are weak in modeling spatial features.

2.2. Deep Learning Methods

With the rapid development of high-performance data storage and processing technologies, traffic prediction is moving from nonparametric methods to deep learning methods [10]. An important step in traffic prediction is to extract spatio-temporal features from traffic data. Because the recurrent neural network (RNN) and its variants, such as long short-term memory (LSTM) [29] and the gated recurrent unit (GRU) [30], can effectively utilize temporal data, RNN-based methods [31] play an important role in mining temporal traffic features. Ma et al. [32] first applied LSTM to the prediction of highway traffic speed and flow. Zhao et al. [10] utilized GRU, which has fewer parameters than LSTM, for traffic prediction. Gu et al. [20] built a fusion system to capture temporal features. RNN-based methods [33, 34] have shown promising results in the traffic prediction field, but they are not good at mining spatial features in traffic flow. In terms of spatial traffic features, traffic flows in nearby locations are often strongly correlated [35]. Because of its strength in handling image data, CNN has been used in traffic prediction by treating the traffic flow data as an image. Ke et al. [36] constructed a multichannel CNN model for multilane traffic speed prediction. Liu et al. [37] developed an attention-based CNN structure for traffic speed prediction that uses traffic flow, speed, and occupancy. However, CNN and RNN can only be applied to Euclidean data; they cannot model the topological structure of the road network or the road sensor network. Neither CNN-based methods nor RNN-based methods are ideal for spatio-temporal feature extraction.

The road network or the road sensor network is naturally a graph. Recently, researchers have applied graph neural networks (GNN), especially graph convolutional networks (GCN) [38], to traffic prediction, and these have shown superior performance compared to previous approaches. Because they can model non-Euclidean graph structures, GNNs are well suited to traffic prediction problems. Li et al. [39] treated the traffic flow as a diffusion process and proposed DCRNN, which uses bidirectional random walks on the graph and GRU to capture spatial and temporal features, respectively. Zhao et al. [10] proposed T-GCN, which stacks GCN and GRU for traffic prediction. Yu et al. [40] proposed STGCN to extract spatio-temporal features with complete convolutional structures. Guo et al. [41] established an HGCN model which applies the convolution operation on both micro- and macrotraffic graphs. Zhu et al. [42] employed GCN on multiple graphs to analyze correlations from multiple perspectives. Guo et al. [43] proposed a dynamic GCN for traffic prediction on the basis of Laplace matrix estimation. Cao et al. [44] combined self-attention with GCN for traffic flow prediction. Although there is a lot of excellent work on traffic prediction, most of it is not suitable for lane-level traffic prediction. Besides, most existing works treat the road or sensor network as a static graph. We propose a deep learning model for lane-level traffic prediction with a dynamic adjacent matrix driven by data. As for lane-level works, Gu et al. [20] combined LSTM and GRU for lane-level traffic speed prediction. Ke et al. [36] introduced a two-stream multichannel CNN model. Ma et al. [45] proposed a convolutional LSTM network for multilane short-term traffic forecasting. Lu et al. [46] described a mixed deep learning model for lane-level traffic speed forecasting. Wang et al. [47] presented a heterogeneous graph convolution model for lane-level traffic flow prediction. Existing lane-level traffic prediction methods mostly use RNN or CNN to model spatial features, which has certain limitations.

3. Problem Formulation

In this work, we aim to predict the traffic state of lanes over a period of time on the basis of the historical traffic state information recorded by the road sensors. Traffic state is a general concept that includes traffic speed, traffic flow, and other numerical information related to the road. Specifically, we predict lane-level traffic flow in the experiment section.

Definition: Lane Network $G$. To describe the non-Euclidean structure of the lane network, we define it as graph $G = (V, E, A)$. On graph $G$, $V = \{v_1, v_2, \ldots, v_N\}$ is the set of nodes, where $v_i$ represents the $i$-th lane and $N$ is the number of lanes. Note that we treat different lanes on the same road section as different nodes. $E$ is the set of edges. The edge between lane $v_i$ and lane $v_j$ only exists if their distance is less than a certain threshold and there exists a traffic flow from $v_i$ to $v_j$. To better represent the real situation, we consider the traffic flow between different lanes in the same section of the road to be interconnected. $A \in \mathbb{R}^{N \times N}$ is the adjacent matrix.

Let $X_t \in \mathbb{R}^{N}$ represent the traffic flow of the $N$ lanes at each time stamp $t$. Suppose the traffic flow data $X$ is the graph signal of $G$. Given time $t$ and lane network $G$, the lane-level traffic flow prediction problem in our work can be defined as
$$[X_{t+1}, \ldots, X_{t+T'}] = f\left(G; (X_{t-T+1}, \ldots, X_{t})\right),$$
where $f$ represents the learned mapping function, $T$ is the input sequence length, and $T'$ is the predicted sequence length.

The key symbols used in this paper are summarized in Table 1.

4. The Proposed Approach

In view of the lack of work on lane-level traffic prediction, this paper proposes a lane-level traffic prediction model. The architecture of our model is illustrated in Figure 2. Specifically, we first establish a data-driven adjacent matrix that can respond to spatio-temporal changes based on the geographic location and historical traffic information of the sensor. The data-driven adjacent matrix is fed into the graph convolutional network (GCN) to capture spatial features, and we model the temporal features with a gated recurrent unit (GRU) model. Then, we adaptively fuse the spatial and temporal features with the gating mechanism to get comprehensive spatio-temporal features. Finally, we make multistep lane-level traffic predictions based on the spatio-temporal features.

4.1. Data-Driven Adjacent Matrix

The graph depicts the topological relationship between nodes through the adjacent matrix, and the construction of the adjacent matrix directly affects the expressive power of the graph [48]. However, most GCN-based traffic prediction works only construct a static adjacent matrix with fixed weights, without considering that the relationship between nodes may change in various situations. In particular, it is difficult for a static adjacent matrix to respond to spatio-temporal changes in a timely manner, which makes it hard for the model to achieve accurate prediction. In our work, we propose a data-driven dynamic adjacent matrix, which is composed of the basic distance-based adjacent matrix $A_d$ and the dynamic node correlation matrix $A_c$.

Graphs can be directed or undirected. For undirected graphs such as social networks, the adjacent matrix is symmetric. In the road sensor network, the traffic flows on roads have directions due to the restriction of traffic rules. Graph $G$ is therefore a directed graph, and its adjacent matrix is asymmetric.

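For the basic distance-based adjacent matrix $A_d$, as most works did [49], we calculate one element in $A_d$ with
$$A_d^{ij} = \begin{cases} \exp\left(-\dfrac{d_{ij}^2}{\sigma^2}\right), & d_{ij} < \kappa \text{ and traffic flows from } v_i \text{ to } v_j, \\ 0, & \text{otherwise}, \end{cases}$$
where $A_d^{ij}$ represents the influence degree of lane $v_i$ on lane $v_j$, $d_{ij}$ is the distance between $v_i$ and $v_j$, and $\sigma$ is the standard deviation of the distances $d$. The distance between different lanes on the same road segment is 0. $A_d^{ij}$ has a positive value only if $d_{ij}$ is smaller than the threshold $\kappa$ and there exists a traffic flow from $v_i$ to $v_j$.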
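
The construction described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `distance_adjacency`, the boolean `flows_to` input, and the handling of a zero standard deviation are all hypothetical choices, and a thresholded Gaussian kernel is assumed for the weight.

```python
import math

def distance_adjacency(dist, flows_to, threshold):
    """Thresholded Gaussian-kernel weights from pairwise lane distances.

    dist[i][j]     -- distance between lane i and lane j (0 on the same segment)
    flows_to[i][j] -- True if traffic can flow from lane i to lane j (assumed input)
    threshold      -- distance cutoff (assumed hyperparameter)
    """
    n = len(dist)
    values = [dist[i][j] for i in range(n) for j in range(n)]
    mean = sum(values) / len(values)
    # Population variance of all pairwise distances.
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if dist[i][j] < threshold and flows_to[i][j]:
                # Assumption: if every distance is identical, use weight 1.
                adj[i][j] = math.exp(-dist[i][j] ** 2 / variance) if variance > 0 else 1.0
    return adj

# Lanes 0 and 1 share a segment; lane 2 lies one distance unit downstream.
dist = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
flows_to = [[True, True, True], [True, True, True], [False, False, True]]
a_dist = distance_adjacency(dist, flows_to, threshold=2.0)
```

Because `flows_to` is directional, the resulting matrix is asymmetric: the upstream lanes influence lane 2, but not the other way round.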

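To compensate for the defects caused by the static characteristics of the distance-based adjacent matrix, we further introduce the dynamic correlation matrix $A_c$. $A_c$ is filled with the Pearson correlation coefficients calculated from the observed input data of the lanes. To be specific, $A_c^{ij}$ at time $t$ is calculated with
$$A_c^{ij} = \frac{\sum_{k=t-T+1}^{t} (x_i^k - \bar{x}_i)(x_j^k - \bar{x}_j)}{\sqrt{\sum_{k=t-T+1}^{t} (x_i^k - \bar{x}_i)^2}\,\sqrt{\sum_{k=t-T+1}^{t} (x_j^k - \bar{x}_j)^2}},$$
where $i$ and $j$ are the indices of lane $v_i$ and lane $v_j$, $T$ is the input sequence length, $x_i^k$ is the value of traffic flow on $v_i$ observed at time $k$, and $\bar{x}_i$ and $\bar{x}_j$ are the means of $x_i$ and $x_j$ over the input window, respectively. The closer the absolute value of $A_c^{ij}$ is to 1, the higher the correlation between $v_i$ and $v_j$.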
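
As a concrete illustration of the correlation matrix, the sketch below computes pairwise Pearson coefficients over one input window in plain Python. The names `pearson` and `correlation_matrix` are hypothetical, and returning 0 for a constant series (where the coefficient is undefined) is an assumption rather than something stated in the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    return cov / (norm_x * norm_y)

def correlation_matrix(window):
    """window[i] holds the most recent flow readings of lane i."""
    n = len(window)
    return [[pearson(window[i], window[j]) for j in range(n)] for i in range(n)]

# Three lanes observed over four time stamps (made-up values).
window = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
a_corr = correlation_matrix(window)
```

Recomputing `correlation_matrix` for each new input window is what makes this part of the adjacent matrix dynamic.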

Combining the basic distance-based adjacent matrix $A_d$ and the dynamic correlation matrix $A_c$, we propose the data-driven adjacent matrix $A$,
$$A = A_d + \alpha A_c,$$
where $\alpha$ is a constant that controls how much $A_c$ contributes to $A$. On the one hand, $A_d$ provides geographic relationships that are fundamental and important for spatial feature extraction; on the other hand, $A_c$ can make timely adjustments to the adjacent matrix with reference to changes in the historical information.
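
A minimal sketch of the combination step follows, assuming the two matrices are combined additively with the constant scaling the correlation term. The function name `data_driven_adjacency` and the parameter name `alpha` are hypothetical; the default of 0.1 mirrors the value chosen in the sensitivity analysis.

```python
def data_driven_adjacency(a_dist, a_corr, alpha=0.1):
    """Element-wise a_dist + alpha * a_corr (additive form is an assumption)."""
    n = len(a_dist)
    return [[a_dist[i][j] + alpha * a_corr[i][j] for j in range(n)]
            for i in range(n)]

# Static distance weights plus a correlation snapshot for two lanes (made-up values).
a_dist = [[1.0, 0.0], [0.5, 1.0]]
a_corr = [[1.0, -1.0], [-1.0, 1.0]]
a_data = data_driven_adjacency(a_dist, a_corr, alpha=0.1)
```

With `alpha=0` the result reduces to the static distance-based matrix, which is the baseline case examined in the sensitivity analysis.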

4.2. Spatial Feature Modeling

Spatial features play an important role in traffic prediction because traffic flow sequences at different locations are correlated to some extent. Before graph-based methods were adopted, studies usually extracted the spatial features with multivariate time series models or CNNs [50]. However, limited by their structure, multivariate time series models mostly cannot model the nonlinear relationships between different sequences. Although CNN-based methods can alleviate this problem, the architecture of CNN is bound to Euclidean space, which is not enough for modeling the topological structure of the lane network. Recently, graph-based methods have attracted wide attention for their ability to model non-Euclidean structures. Specifically, we extract the spatial features with GCN. The GCN model builds a filter in the spatial domain, and the spatial features between different nodes on a graph are extracted with this filter. As illustrated in Figure 3, the central node models the topological relationship by aggregating the information of its neighboring nodes in GCN. The topological structure of the graph is encoded to acquire spatial features.

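After the construction of the data-driven adjacent matrix $A$, the GCN model extracts spatial features with
$$H^{(l+1)} = \sigma\left(D^{-\frac{1}{2}} A D^{-\frac{1}{2}} H^{(l)} W^{(l)}\right),$$
where $H^{(l)}$ represents the feature matrix in the $l$-th layer, and $D$ is the degree matrix with $D_{ii} = \sum_j A_{ij}$. $D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$ denotes the normalized adjacent matrix, which helps to keep the distribution of the feature matrix stable during the information transfer process. $W^{(l)}$ is the learned weight matrix in the $l$-th layer and $\sigma(\cdot)$ represents the activation function. Note that $H^{(0)} = X$, where $X$ is the input traffic information matrix.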
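
To make the propagation rule concrete, here is a minimal single-layer sketch in plain Python with list-based matrices. It assumes a ReLU activation and symmetric degree normalization without added self-loops; the helper names and the treatment of non-positive degrees (which can occur once negative correlations enter the adjacent matrix) are assumptions for illustration only.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def normalize_adjacency(adj):
    """Scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    # Assumption: rows with non-positive degree are zeroed out.
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]

def gcn_layer(adj, features, weight):
    """One graph convolution: ReLU(normalized_adj @ features @ weight)."""
    aggregated = matmul(normalize_adjacency(adj), features)
    projected = matmul(aggregated, weight)
    return [[max(0.0, v) for v in row] for row in projected]

# Two fully connected lanes, one input feature, two output channels.
adj = [[1.0, 1.0], [1.0, 1.0]]
features = [[1.0], [3.0]]
weight = [[1.0, -1.0]]
hidden = gcn_layer(adj, features, weight)
```

In this toy case each lane averages its own value with its neighbor's before the linear projection, which is the neighborhood aggregation described above.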

4.3. Temporal Feature Modeling

Traffic data have a significant sequence structure, which means that temporal features are key to traffic prediction. The recurrent neural network (RNN) is widely used in the processing of sequence data. However, RNNs suffer from vanishing or exploding gradients as the sequence length grows. To solve this problem, researchers have proposed many variants of RNN such as LSTM and GRU. These variants have proved effective in sequence data modeling. Both LSTM and GRU control the dissemination and update of information with the gating mechanism. Compared with LSTM, GRU has a shorter training time, fewer parameters, and a relatively simpler structure. Therefore, we employ GRU to extract the temporal features of the traffic data. There are two gates in GRU, which are
$$z_t = \sigma(W_z [h_{t-1}, x_t] + b_z),$$
$$r_t = \sigma(W_r [h_{t-1}, x_t] + b_r),$$
where $z_t$ represents the update gate and controls how much history information the current moment retains, and $r_t$ represents the reset gate and controls how much history information needs to be forgotten. $x_t$ is the traffic information at time $t$, and $h_{t-1}$ is the hidden state at time $t-1$. $W$ and $b$ (with subscripts $z$, $r$, and $c$) are learnable parameters. With the gating signals $z_t$ and $r_t$, the cell state $c_t$ and output hidden state $h_t$ can be calculated with
$$c_t = \tanh(W_c [r_t \odot h_{t-1}, x_t] + b_c),$$
$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot c_t,$$
where $c_t$ can be regarded as the history information stored at time $t$, and $\odot$ represents element-wise multiplication.

As shown in Figure 4, GRU models the hidden state $h_t$ at time $t$ by taking the current traffic information $x_t$ and the last hidden state $h_{t-1}$. With this operating mechanism, GRU is capable of retaining the historical information while utilizing current traffic information, and is thus able to model temporal features.
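
The recurrence described above can be written out as a single GRU step in plain Python. This is a sketch under assumptions: the parameter dictionary keys (`W_z`, `b_z`, and so on) are hypothetical names, each gate acts on the concatenation of the previous hidden state and the current input, and the update gate weights the previous hidden state (one common convention).

```python
import math

def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def affine(weight, vector, bias):
    """weight @ vector + bias for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weight, bias)]

def gru_step(x_t, h_prev, params):
    """One GRU update: returns the new hidden state as a list."""
    concat = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    z = [sigmoid(v) for v in affine(params["W_z"], concat, params["b_z"])]
    r = [sigmoid(v) for v in affine(params["W_r"], concat, params["b_r"])]
    reset_concat = [ri * hi for ri, hi in zip(r, h_prev)] + x_t
    c = [math.tanh(v) for v in affine(params["W_c"], reset_concat, params["b_c"])]
    # The update gate decides how much of the previous state is kept.
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h_prev, c)]

# One hidden unit, one input feature, all-zero parameters (toy values).
params = {"W_z": [[0.0, 0.0]], "b_z": [0.0],
          "W_r": [[0.0, 0.0]], "b_r": [0.0],
          "W_c": [[0.0, 0.0]], "b_c": [0.0]}
h_next = gru_step([1.0], [2.0], params)
```

With all-zero parameters the update gate equals 0.5 and the candidate state is 0, so the step simply halves the previous hidden state; a large positive `b_z` would instead keep the previous state almost unchanged.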

4.4. Our Model

With GCN and GRU, we obtain spatial and temporal features, respectively. Our next goal is to fuse the spatial and temporal features into comprehensive spatio-temporal features. A convenient option is to add the two kinds of features directly, but this may weaken the characteristics of the spatial and temporal features themselves. Inspired by the gating mechanism, we introduce a learnable gate $g$ to fuse features. Similar to the role of gates in GRU, $g$ controls the degree to which spatial and temporal features contribute to the final spatio-temporal features $H_{st}$:
$$g = \sigma(W_s H_s + W_t H_t + b),$$
$$H_{st} = g \odot H_s + (1 - g) \odot H_t,$$
where $H_s$ represents the extracted spatial features. Suppose the number of GCN layers is $L$; then $H_s$ equals the output feature matrix of the $L$-th layer. $H_t$ represents the extracted temporal features. We stack the output hidden states of all units to get the final temporal features. $W_s$, $W_t$, and $b$ are learnable parameters.
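
One plausible form of such a gate is sketched below in plain Python: a sigmoid gate computed from both feature sets, followed by a convex combination. The exact parameterization is an assumption, and `gated_fusion`, `w_s`, `w_t`, and `bias` are hypothetical names.

```python
import math

def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def matvec(weight, vector):
    """weight @ vector for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weight]

def gated_fusion(h_s, h_t, w_s, w_t, bias):
    """Fuse per-node spatial (h_s) and temporal (h_t) feature rows."""
    fused = []
    for s_row, t_row in zip(h_s, h_t):
        pre = [a + c + b for a, c, b in
               zip(matvec(w_s, s_row), matvec(w_t, t_row), bias)]
        gate = [sigmoid(p) for p in pre]
        # gate -> 1 favors spatial features, gate -> 0 favors temporal ones.
        fused.append([g * s + (1.0 - g) * t
                      for g, s, t in zip(gate, s_row, t_row)])
    return fused

# One node with two feature channels and all-zero gate parameters.
zeros = [[0.0, 0.0], [0.0, 0.0]]
fused = gated_fusion([[2.0, 4.0]], [[0.0, 0.0]], zeros, zeros, [0.0, 0.0])
```

With zero gate parameters the gate is 0.5 everywhere, so the fusion degenerates to averaging the two feature sets, which is the direct-addition baseline up to a constant factor; learned parameters let the mix vary per node and per channel.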

Let $Y_t$ and $\hat{Y}_t$ denote the real traffic state and the predicted traffic state at time $t$, respectively. In the training process, the target is to minimize the error between $Y_t$ and $\hat{Y}_t$, and the loss function is defined as
$$\mathcal{L} = \lVert Y_t - \hat{Y}_t \rVert_2^2 + \lambda \lVert \Theta \rVert_2^2,$$
where the second term is an L2-regularization term that helps avoid overfitting, $\lambda$ is the regularization parameter, and $\Theta$ represents the weight parameters.
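
The training objective can be illustrated with a short plain-Python sketch. It assumes a squared-error data term and a squared L2 penalty over a flat list of weights; the function name `regularized_loss` and the parameter name `lam` are hypothetical, and the exact error norm used in the paper may differ.

```python
def regularized_loss(y_true, y_pred, weights, lam=1e-3):
    """Squared prediction error plus lam times the squared L2 norm of the weights."""
    error = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    penalty = sum(w * w for w in weights)
    return error + lam * penalty

# Made-up flows for two lanes and two model weights.
loss = regularized_loss([1.0, 2.0], [1.0, 4.0], weights=[3.0, 4.0], lam=0.1)
```

A larger `lam` pushes the optimizer toward smaller weights at the cost of a looser fit, which is how the second term counteracts overfitting.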

In summary, we propose a lane-level traffic prediction model. On the one hand, our model uses GCN to model the topological structure of the lane network and extract spatial features. On the other hand, we mine the dynamic changes of the traffic information with the GRU model to get the temporal features. Finally, we fuse these features to make multistep traffic predictions.

5. Experiment

In this section, we conduct extensive experiments on a real-world dataset to evaluate the proposed method and several baselines, answering the following questions:
(i) RQ1: How does our method perform compared to baselines in lane-level traffic prediction, and do different components of our method show real improvement?
(ii) RQ2: How do various hyperparameters affect the performance of our method?
(iii) RQ3: Does our method really work in real situations?

5.1. Dataset

We perform experiments on a real-world dataset. The dataset is collected from the Sutai Expressway in Zhejiang, China, and is captured by remote microwave traffic sensors. Considering the distance between sensors, we selected the traffic data of 30 sensors for the experiments. Each sensor collects traffic information once every 5 minutes. The collected data mainly include lane-level traffic speed and traffic flow. All traffic information is averaged over each sampling interval at the sensor. Since some sensors have serious data missing in some lanes, the number of valid lanes across the 30 sensors is 53. Each lane has 16,032 records, and the entire dataset includes 849,696 records in total. To make the prediction results more reliable, we filled in missing data with the data from the previous moment.
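
The imputation step can be sketched as a simple forward fill. The function name `forward_fill` and the use of `None` to mark a missing reading are assumptions; a leading gap has no previous value and is left unfilled here.

```python
def forward_fill(series):
    """Replace each missing reading (None) with the previous available value."""
    filled, last = [], None
    for value in series:
        if value is None:
            value = last  # stays None if nothing has been observed yet
        filled.append(value)
        last = value
    return filled

# One lane's 5-minute flow readings with gaps (made-up values).
readings = [None, 5, None, None, 7, None]
cleaned = forward_fill(readings)

# Consistency check on the reported dataset size: lanes x records per lane.
total_records = 53 * 16032
```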

5.2. Baselines

To validate the effectiveness of our model for lane-level traffic prediction, we compare the proposed model with several commonly used baselines:
(1) MLP, which contains three fully connected layers. We concatenate the traffic information of all lanes, flatten it, and feed the flattened information into the MLP. With the power of nonlinear modeling, MLP can model the spatio-temporal features to some extent.
(2) LSTM [29], which is a variant of RNN. It achieves efficient modeling of sequence data with the gating mechanism. It can only capture temporal features for traffic prediction.
(3) GRU [30], which is also a variant of RNN and is good at modeling sequence data as well. Compared to LSTM, it has a simpler structure. See Section 4.3 for more details. It can only capture temporal features for traffic prediction.
(4) GCN [38], which can aggregate neighbor information via convolution operations on the graph. See Section 4.2 for more details. It can only capture the spatial features for traffic prediction.
(5) T-GCN [10], which captures spatial features with GCN and captures temporal features with GRU. Compared to our method, T-GCN constructs a static adjacent matrix and feeds the learned spatial features into the GRU for spatio-temporal feature modeling. It can model the spatio-temporal features for traffic prediction.

5.3. Implementation and Metrics

We implement our method with PyTorch and optimize it using the Adam optimizer. The history sequence length for prediction is 12, which means we use the last 60 minutes of traffic data for traffic prediction. The batch size is set to 32. The number of GCN layers is set to 1. We perform searches for the learning rate in , and finally choose the learning rate to be . In the experiments, the data in all lanes are divided into two parts: the first 80% of the data is used for training and the remaining 20% for testing.
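
The data preparation described above, a chronological 80/20 split followed by sliding windows of 12 input steps, can be sketched as follows. The function names and the default horizon of 3 steps (15 minutes) are assumptions for illustration, and the paper does not state whether windows are built before or after the split.

```python
def split_chronologically(records, train_ratio=0.8):
    """First train_ratio of the records for training, the rest for testing."""
    cut = int(len(records) * train_ratio)
    return records[:cut], records[cut:]

def make_windows(series, input_len=12, horizon=3):
    """Pair each run of input_len readings with the next horizon readings."""
    samples = []
    for start in range(len(series) - input_len - horizon + 1):
        inputs = series[start:start + input_len]
        targets = series[start + input_len:start + input_len + horizon]
        samples.append((inputs, targets))
    return samples

# Twenty consecutive 5-minute readings of one lane (toy values).
series = list(range(20))
train, test = split_chronologically(series)
train_samples = make_windows(train)
```

Splitting before windowing, as done here, keeps every test target strictly later than all training data, so no future reading leaks into training.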

To compare the performance of our method and other baselines, we evaluated the prediction results with three widely used metrics.
(i) Mean absolute error (MAE):
$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
(ii) Root mean squared error (RMSE):
$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$
(iii) Explained variance score (VAR):
$$\text{VAR} = 1 - \frac{\text{Var}(y - \hat{y})}{\text{Var}(y)}$$
where $\hat{y}$ is the predicted value, $y$ is the real value, and $n$ is the number of samples. $\text{Var}(\cdot)$ denotes the variance. For MAE and RMSE, the smaller the value, the better the performance, since they directly measure the error between the predicted value and the real value. VAR measures the proportion of the variance in the real values that is explained by the predictions. The closer the value of VAR is to 1, the better.
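
For reference, the three metrics can be computed with the standard definitions in a few lines of plain Python. The function names are hypothetical, and the population variance is assumed for the explained variance score.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true); assumes y_true is not constant."""
    residuals = [a - b for a, b in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / variance(y_true)

# Made-up real and predicted flows.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), explained_variance(y_true, y_pred))
```

Note that the explained variance score ignores a constant bias: predictions that are uniformly one unit too high have an MAE of 1 but a score of exactly 1, which is why it is reported alongside MAE and RMSE rather than on its own.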

5.4. RQ1: Experiment Results

Table 2 shows the performance of different models when predicting traffic flow 5, 10, and 15 minutes ahead using the last 60 minutes of traffic flow information, where “ours w/o gate” means that spatial and temporal features are directly added in the feature fusion step without using the gating mechanism in our model. From the results, we have the following observations.

5.4.1. Observations on Our Method

Our model consistently works the best compared to all the baselines, which illustrates the superiority of our method for lane-level traffic prediction. (1) Our model performs better than LSTM and GRU. The main reason is that LSTM and GRU can only model temporal features and are incapable of capturing spatial features. The results validate the importance of spatial feature modeling in traffic prediction tasks. (2) Our model obtains better results than GCN. Similar to the reason that our model is better than LSTM and GRU, GCN models only the spatial features and ignores the fact that traffic data have a significant sequence structure. This means that GCN lacks the ability to model temporal features. The results verify the importance of temporal feature modeling. (3) Our model outperforms T-GCN. One possible reason is that T-GCN is designed for road-level traffic prediction. When faced with complex application scenarios such as lane-level traffic prediction, it is difficult for T-GCN to adapt to such changes, which also verifies the rationality of the changes our model makes for lane-level traffic prediction. Besides, the dataset only contains 53 lanes’ traffic information, which means the constructed graph only has 53 nodes. T-GCN performs poorly on such a graph while our model performs well, which further verifies the adaptability of our model to special scenarios. (4) Our model also outperforms MLP. The major reason is that although MLP can model spatio-temporal features to some extent, it mines them only roughly, whereas our model can capture rich and effective spatio-temporal features for traffic prediction. (5) Our method without the gating mechanism performs worse than the full method. The results validate the effectiveness of using the gating mechanism for feature fusion. However, we find that the gating mechanism brings only a slight improvement. One possible reason is that, due to the special characteristics of the dataset, spatial features are the main component in the final spatio-temporal features, so different feature fusion methods have little effect on their composition. (6) Our model consistently achieves the best performance as the prediction horizon ranges from 5 minutes to 15 minutes. The results indicate the robustness of our method. In addition to short-term lane-level traffic prediction, it may also be used for long-term lane-level traffic prediction tasks.

5.4.2. Other Observations

(1) The performance of LSTM and GRU differs little on all prediction horizons, although GRU has a simpler structure than LSTM. With limited resources, it is more reasonable to choose GRU instead of LSTM to capture temporal features. (2) GRU obtains better results than GCN. The main reason is that, limited by the number of lanes in the dataset, the ability of GCN to mine spatial features is restricted, and the spatial features captured by GCN are coarse. Also, traffic data are presented in the form of a sequence, which is naturally more suitable for temporal feature mining. (3) T-GCN outperforms GCN. The reason is that T-GCN can capture not only the spatial features but also the temporal features, while GCN can only capture the spatial features. The results further verify that both spatial and temporal features play an important role in traffic prediction tasks. (4) GRU performs better than T-GCN, though it mines temporal features only. One possible reason is that the mechanism by which T-GCN obtains spatio-temporal features is not perfect. When T-GCN is used for lane-level traffic prediction, the extracted spatial features are coarse and instead inhibit the ability of T-GCN.

5.5. RQ2: Parameter Sensitivity Analysis

In this part, we conduct a sensitivity analysis of two important hyperparameters in our model. The two hyperparameters are the input sequence length $T$ and the parameter $\alpha$ that controls the degree to which the dynamic correlation matrix $A_c$ contributes to the data-driven adjacent matrix $A$.

5.5.1. Sensitivity of Input Sequence Length

The input sequence length $T$ determines the amount of historical traffic information used in traffic prediction, which is constructed as the initial input features of the model. It is therefore necessary to control the amount of historical traffic information for traffic prediction tasks. Specifically, we run our model with the input sequence length ranging from 12 to 72. The results of the experiment are shown in Figure 5. The performance of our method decreases when the input sequence length increases from 12 to 36 and then increases again as $T$ keeps growing. Overall, the performance remains relatively stable. From the results, we find that using longer sequences as input does not bring additional benefit and can even lead to a drop. There are two possible reasons. (1) Longer sequences may introduce more noise, which limits the predictive power of the model. (2) We mainly perform short-term traffic prediction, and the short-term traffic state fluctuates more than the long-term traffic state. Shorter input sequences may better help to extract dynamic changes in the data. Such results illustrate that the input sequence length for traffic prediction must be chosen properly. We therefore consider an input sequence of length 12 to contain sufficient traffic information.

5.5.2. Sensitivity of the Weighting Parameter

The data-driven adjacent matrix is composed of the distance-based adjacent matrix and the dynamic correlation matrix. We use a weighting parameter to control how much the dynamic correlation matrix contributes to the data-driven adjacent matrix, so it is necessary to examine the relationship between the performance of our method and this parameter. The input sequence length is set to 12. As shown in Figure 6, our method performs best when the parameter is 0.1. As the parameter increases from 0 to 0.1, the performance improves. The main reason is that when the parameter is too small, the dynamic correlation matrix has little effect on the data-driven adjacent matrix. As the parameter increases, the influence of the dynamic correlation matrix is gradually released to compensate for the static defect of the distance-based adjacent matrix, and the performance of the model improves. The results verify the necessity and validity of modeling a data-driven adjacent matrix. When the parameter is larger than 0.1, the performance starts to drop. A possible reason is that a larger value makes the model focus more on the dynamic part of the adjacent matrix and starts to undermine the fundamental role of the static distance-based adjacent matrix. Finally, we set the parameter to 0.1.
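The role of the weighting parameter can be sketched as follows. This example assumes an additive combination (distance-based matrix plus the weighted correlation matrix) and the symmetric normalization with self-loops that is commonly used with GCN; the exact formulation is the one defined in the method section, and the names `combine_adjacency`, `normalize_adjacency`, and the 3-lane matrices below are hypothetical.

```python
import numpy as np

def combine_adjacency(a_dist, corr, weight):
    """Assumed additive form: static distance matrix + weight * dynamic correlation."""
    return a_dist + weight * corr

def normalize_adjacency(a):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, common in GCN layers."""
    a_hat = a + np.eye(a.shape[0])          # add self-loops
    deg = a_hat.sum(axis=1)                 # node degrees (positive here)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Hypothetical 3-lane example: lanes 0 and 1 are close, lane 2 is far away.
a_dist = np.array([[0.0, 0.9, 0.0],
                   [0.9, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
# Hypothetical correlation: lane 2 behaves like lane 0 despite the distance.
corr = np.array([[0.0, 0.8, 0.7],
                 [0.8, 0.0, 0.2],
                 [0.7, 0.2, 0.0]])

for weight in (0.0, 0.1, 0.5):
    a_data = combine_adjacency(a_dist, corr, weight)
    print(weight)
    print(np.round(normalize_adjacency(a_data), 3))
```

With a weight of 0 the far-away lane stays disconnected, while larger weights let the correlation entries increasingly dominate the distance-based entries, which mirrors the trade-off discussed above.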

5.6. RQ3: Model Interpretation

In this part, we aim to answer RQ3 by visualizing the adjacent matrix and model prediction results.

5.6.1. Visualization of Adjacent Matrix

To figure out how the data-driven adjacent matrix affects the prediction results of the model, we visualize the distance-based adjacent matrix and the lane correlation matrix in the morning, afternoon, and evening of a random day. As shown in Figure 7, the distance-based adjacent matrix and the correlation matrix show different relationship patterns between lanes in different situations. More specifically, the distance-based adjacent matrix is static, whereas the correlation matrix presents different data distributions at different times. This helps answer RQ3: the construction of the dynamic correlation matrix plays a positive role in the model's predictions. The reason is that traffic information changes dynamically over time, and it is difficult for the static distance-based adjacent matrix to respond to such changes in a timely and effective manner. With the help of the dynamic correlation matrix, the data-driven adjacent matrix acquires the ability to model dynamically changing traffic information, which in turn improves the performance of the model.
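To make the contrast between a static and a time-varying matrix concrete, the sketch below computes a lane correlation matrix from two different time windows. It assumes Pearson correlation as the correlation measure, which may differ from the measure used in our model, and the synthetic "morning" and "evening" windows are invented for illustration only.

```python
import numpy as np

def lane_correlation(window):
    """Pearson correlation between lanes for a (time, lanes) window."""
    return np.corrcoef(window, rowvar=False)

t = np.linspace(0.0, 2.0 * np.pi, 48)

# Synthetic windows with 3 lanes each.
# Morning: lanes 0 and 1 rise and fall together.
morning = np.column_stack([np.sin(t), np.sin(t), np.cos(t)])
# Evening: lanes 0 and 2 move together, lane 1 moves in the opposite direction.
evening = np.column_stack([np.sin(t), -np.sin(t), np.sin(t)])

corr_morning = lane_correlation(morning)
corr_evening = lane_correlation(evening)

print(np.round(corr_morning, 2))
print(np.round(corr_evening, 2))
```

A distance-based matrix for the same three lanes would be identical in both windows, whereas the correlation between lanes 0 and 1 flips sign between the two synthetic windows.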

5.6.2. Visualization of Prediction Result

To better understand our proposed method, we randomly select 4 lanes in different sections of the road and visualize the ground truth of the test set and the prediction results. As shown in Figure 8, the results show the following:

(1) Similar patterns exist between the ground truths of different lanes. The geographic distance between some lanes may be too large, in which case the distance-based adjacent matrix ignores the similar pattern. As a result, lanes that are too far apart cannot cooperate with each other to improve the final prediction quality. Our proposed data-driven adjacent matrix effectively alleviates this problem on the basis of the distance-based adjacent matrix.

(2) There is a certain error between the prediction results of the model and the ground truth. The reasons are threefold: (a) GCN defines a smooth filter and models the spatial features with this filter in the spatial domain. This process of aggregating neighbor information results in smooth predictions. (b) Although the data-driven adjacent matrix helps to improve the model's performance, the correlation matrix enables GCN to aggregate information from more neighbors, which makes the prediction results even smoother. (c) When the ground truth is small, the same gap causes a larger relative error.

(3) Our model can capture the traffic trend at the lane level. This property can help formulate effective and detailed traffic control strategies in real time and realize scientific traffic management planning.
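Two of the error sources listed above, the smoothing caused by neighbor aggregation and the larger relative error on small ground-truth values, can be illustrated with a small numerical sketch. The lane values, the chain-shaped graph, and the helper `relative_error` are hypothetical, and the single propagation step omits the learned weights and nonlinearity of an actual GCN layer.

```python
import numpy as np

# Hypothetical lane values at one time step; lane 2 has a sharp peak.
x = np.array([10.0, 12.0, 40.0, 11.0])

# Chain graph over 4 lanes with self-loops.
a = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])
# Row-normalize so that each lane averages itself and its neighbors.
a_norm = a / a.sum(axis=1, keepdims=True)

# One aggregation step (no learned weights, no activation).
smoothed = a_norm @ x
print("spread before:", x.max() - x.min())
print("spread after: ", smoothed.max() - smoothed.min())

def relative_error(truth, pred):
    """Absolute error divided by the ground-truth value."""
    return abs(pred - truth) / truth

# The same absolute gap of 5 on a large and on a small ground-truth value.
print(relative_error(100.0, 95.0))
print(relative_error(10.0, 5.0))
```

Averaging over neighbors pulls the peak on lane 2 toward the values of adjacent lanes, and adding more neighbors through a correlation matrix would widen this averaging further.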

In addition, Figure 9 displays the ground truth and prediction results for a randomly chosen weekend day and a randomly chosen weekday. As depicted in Figure 9, our model can capture trends of traffic flow throughout the day. Whether on weekdays or weekends, at peak or off-peak times, traffic flow trends are well modeled, allowing our model to produce stable and reliable predictions.

6. Conclusion

In this paper, we propose a deep learning model for lane-level traffic prediction tasks. Specifically, we capture spatial features with GCN. To adapt to lane-level traffic prediction, we treat different lanes of the same road segment as different nodes on the graph. Furthermore, considering that most existing graph-based methods build a static adjacent matrix, we construct a data-driven adjacent matrix, which consists of a static distance-based adjacent matrix and a dynamic lane correlation matrix. Additionally, we utilize GRU to capture temporal features. We then adaptively fuse spatial and temporal features through the gating mechanism to obtain spatio-temporal features for multi-step lane-level traffic prediction. Experiments on a real-world dataset verify the effectiveness of our model for lane-level traffic prediction.

In the future, we plan to consider more auxiliary features, such as utilizing speed information in traffic flow prediction. The current model can also be improved by considering more detailed and realistic lane relationships for more than two lanes. In addition, we plan to combine our model with advanced techniques such as attention networks to achieve more accurate and reliable lane-level traffic prediction.

Data Availability

The data used during the study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the National Natural Science Foundation of China under Grant no. 62202131, the Natural Science Foundation of Zhejiang Province under no. LQ20F020015, and the Key Science and Technology Project of Zhejiang Province under no. 2020C01165.