#### Abstract

Travel time prediction plays an increasingly important part in advanced traveler information systems (ATIS) and is of great significance for alleviating urban traffic congestion. Although graph convolutional networks have been widely used in road network traffic prediction, spatiotemporal dynamic modeling of urban traffic remains a challenging task. In this study, we propose an improved graph convolutional network (IGC-Net) for travel time prediction. Specifically, we design a modified adjacency matrix by fusing a distance matrix and a correlation matrix with the original adjacency matrix to capture spatial dynamic features. We then establish three components based on temporal properties to capture recent, daily-periodic, and weekly-periodic correlations. Comparison experiments with baseline models and variants are conducted on a real-world dataset from Beijing. The results show that IGC-Net outperforms the baseline models at different prediction horizons and is more robust for dynamic traffic prediction.

#### 1. Introduction

In recent years, urban traffic congestion has become increasingly serious. Commuting duration directly determines people’s travel experience; it is not only an important index of the efficiency of urban operation but also an important factor affecting residents’ quality of life. Road travel time directly reflects the congestion state of a road segment and is an important basis for the development of intelligent transportation systems (ITS) [1, 2]. Travel time prediction is therefore of great significance and application value to both traffic users and traffic managers.

Traffic forecasting has been studied for several decades. Initially, researchers used statistical methods, including the historical average (HA) and the autoregressive integrated moving average (ARIMA) [3, 4], to predict temporal traffic parameters. However, simple statistical models struggle to capture the nonlinear relationships in traffic data. Thanks to progress in computer technology and the diversification of data acquisition systems, neural network modeling and multisource information fusion are now widely used in traffic prediction research, which has greatly improved prediction accuracy. Wu et al. [5] used the support vector regression (SVR) model for traffic prediction. Tian and Pan [6] utilized a long short-term memory (LSTM) network to predict traffic flow, which demonstrated the effectiveness of recurrent neural networks (RNNs) for forecasting time-series data. However, traditional machine learning models still have deficiencies in capturing periodic features and selecting model parameters. Some researchers further proposed hybrid models to predict traffic parameters [7–9]. For example, Li et al. [10] established a model based on ensemble empirical mode decomposition and a random vector functional link network for travel time prediction on a highway network. They also proposed another model based on a deep belief network optimized by the multiobjective particle swarm algorithm [11]. Moreover, traffic prediction research has been extended to the spatial scope. Researchers first analyzed the characteristics of spatiotemporal data. For instance, Zhang et al. [12] divided the city into a grid to explore the spatial distribution and correlation of cellular traffic and analyzed the temporal dynamics between different cells using the autocorrelation coefficient. Based on the analysis of temporal and spatial characteristics, the convolutional neural network (CNN) was widely applied [13, 14] to capture traffic features as images. However, because real traffic networks are complex, standard convolution on a Euclidean grid is not suitable for general graphs.

In recent years, two approaches have been explored for generalizing convolutional neural networks to structured data. One is to extend the definition of convolution in the spatial domain [15], and the other is to process the data by the graph Fourier transform in the spectral domain [16]. The former directly defines the convolution operation on the connections of each node, which is more similar to the traditional convolutional neural network. The latter realizes the convolution operation on the topological graph with the help of spectral graph theory. The graph neural network (GNN) [17] has been widely used in traffic prediction. In addition, deep learning techniques can automatically extract features from multisource data [18] and model more complex spatial and temporal traffic patterns in various traffic scenarios. The sequence-to-sequence (Seq2Seq) model with an encoder-decoder structure [19, 20], combined with the graph convolutional network (GCN), has been widely used to construct spatiotemporal prediction models. For instance, Yu et al. [21] combined spectral graph convolution with a gated recurrent neural network to obtain spatiotemporal features. Guo et al. [22] proposed an attention-based spatial-temporal graph convolutional network (ASTGCN) model for traffic flow forecasting. Nevertheless, owing to the complexity and dynamics of the actual traffic network, the traditional adjacency matrix cannot effectively capture time-varying spatial dynamic characteristics.

To solve the above problems, this paper proposes an improved graph convolutional network to improve the accuracy of travel time prediction. The primary contributions of this paper are as follows:

(i) We propose a modified adjacency matrix that better captures spatial features by integrating dynamic weight information between road segments. The main idea is to construct a distance weight matrix and a correlation weight matrix based on geographical location attributes and dynamic traffic information, respectively.

(ii) According to the temporal properties of the data, we establish recent, daily, and weekly components to model the temporal dependencies. Furthermore, we use the same improved graph convolutional network in the three components to capture spatiotemporal characteristics.

(iii) Using real-world datasets, we conduct baseline model comparisons and ablation experiments to evaluate our model. The prediction results demonstrate the superior performance of our proposed model.

The remainder of this paper is organized as follows. In Section 2, we present the basic concepts and problem formulation and describe the concrete modeling process. In Section 3, we introduce the experimental environment and settings. The experimental results are analyzed in detail in Section 4. Finally, Section 5 summarizes the study and outlines directions for future work.

#### 2. Methodology

##### 2.1. Preliminaries

In this section, we first describe the notations of variables and formalize the traffic prediction problem.

###### 2.1.1. Traffic Network Topology

The road network is defined as a graph *G* = (*V*, *E*, **A**), where *V* is the set of *N* vertices (i.e., road segments) and *E* is the set of edges between different vertices, $(v_i, v_j) \in E$. The adjacency matrix $\mathbf{A} \in \mathbb{R}^{N \times N}$ reflects the connectivity between road segments: $A_{ij} = 1$ if nodes $v_i$ and $v_j$ are accessible; otherwise, $A_{ij} = 0$.
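As a minimal sketch of this definition, the adjacency matrix can be assembled from a list of accessible segment pairs. The function name `build_adjacency`, the toy edge list, and the choice to treat accessibility as symmetric are illustrative assumptions, not details given in this paper.

```python
def build_adjacency(num_nodes, edges, symmetric=True):
    """Return an N x N 0/1 adjacency matrix as a list of lists.

    edges: iterable of (i, j) index pairs meaning segment j is
    accessible from segment i. Treating the relation as symmetric
    is an assumption made only for this illustration.
    """
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = 1
        if symmetric:
            adj[j][i] = 1
    return adj


# Hypothetical toy network: four road segments in a chain 0-1-2-3.
A = build_adjacency(4, [(0, 1), (1, 2), (2, 3)])
```

For a directed road network, passing `symmetric=False` keeps one-way accessibility.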

###### 2.1.2. Travel Time

The travel time of each road segment is normalized to travel time per unit length (s/m) to account for unequal segment lengths. The travel time of the whole traffic network *G* at the *t*th time slot is defined as $X_t \in \mathbb{R}^{N \times F}$, where *F* denotes the number of features of each vertex. Therefore, $\mathcal{X} = (X_{t-p+1}, \ldots, X_t) \in \mathbb{R}^{p \times N \times F}$ denotes all node values over the first *p* time steps of input.
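The length normalization can be illustrated with a one-line helper. The name `unit_travel_time` and the sample values are hypothetical; only the unit (s/m) comes from the text.

```python
def unit_travel_time(travel_time_s, length_m):
    """Travel time per metre (s/m) for one road segment."""
    if length_m <= 0:
        raise ValueError("segment length must be positive")
    return travel_time_s / length_m


# Hypothetical reading: 60 s to traverse a 500 m segment.
value = unit_travel_time(60.0, 500.0)
```

Dividing by length makes long and short segments comparable, so a congested short link and a congested long link yield similar values.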

###### 2.1.3. Meta Features

In this study, we aggregate traffic readings into 5-minute windows. As shown in Figure 1, time-series traffic data have specific temporal characteristics, including recent, daily-periodic, and weekly-periodic correlations. We set the time label vectors *l*_{1} = {0, 1,…, 287}, *l*_{2} = {0, 1,…, 6}, and *l*_{3} = {0, 1} to represent time-of-day, day-of-week, and weekday-or-weekend, respectively. For instance, 23:55 is labeled as *l*_{1} = 287, Friday as *l*_{2} = 4, and weekdays as *l*_{3} = 0. Hence, at each time slot *t*, the exogenous variable *E*_{t}, which is an additional input to the model, is composed of the above time attributes.
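The labeling scheme can be sketched with the standard library. The helper names `time_labels` and `one_hot` are hypothetical; the Monday = 0 convention is chosen because it reproduces the Friday = 4 example in the text.

```python
from datetime import datetime


def time_labels(ts, slot_minutes=5):
    """Return (l1, l2, l3) for a timestamp.

    l1: time-of-day slot index (0..287 for 5-minute slots)
    l2: day-of-week with Monday = 0 (so Friday = 4)
    l3: 0 for weekdays, 1 for weekends
    """
    l1 = (ts.hour * 60 + ts.minute) // slot_minutes
    l2 = ts.weekday()
    l3 = 1 if l2 >= 5 else 0
    return l1, l2, l3


def one_hot(index, size):
    """Binary vector with a single 1 at position `index`."""
    return [1 if k == index else 0 for k in range(size)]


# 2019-12-20 was a Friday; 23:55 is the last 5-minute slot of the day.
labels = time_labels(datetime(2019, 12, 20, 23, 55))
```

The `one_hot` helper shows how such labels could be turned into binary vectors before being fed to a fully connected layer.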

###### 2.1.4. Historical Average Travel Time

Historical traffic information can reflect the trend of daily traffic conditions. We therefore compute the historical average travel time of each road segment *i* at each time slot *t*.
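One way to compute such averages is sketched here. It assumes that readings are averaged over all available days for each (segment, time-of-day slot) pair; the paper does not say whether weekdays and weekends are averaged separately, and the record format is hypothetical.

```python
from collections import defaultdict


def historical_average(records):
    """Average travel time per (segment, time-of-day slot).

    records: iterable of (segment_id, slot_index, travel_time) tuples
    collected over several days. Returns {(segment_id, slot): mean}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for seg, slot, value in records:
        totals[(seg, slot)] += value
        counts[(seg, slot)] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Hypothetical readings (s/m): segment 0 at slot 96 (08:00) on three days.
ha = historical_average(
    [(0, 96, 0.10), (0, 96, 0.14), (0, 96, 0.12), (1, 96, 0.30)]
)
```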

###### 2.1.5. Problem Definition

The task of travel time forecasting is to use past traffic observations to predict the future value of each road segment over a certain period. Given the traffic network graph *G*, the prediction problem is formulated as a conditional probability function that yields the predicted value at the *q*th time step given the historical values of the first *p* time steps.
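In practice, this formulation is usually turned into supervised samples with a sliding window. The following is a sketch under that assumption; `make_samples` is a hypothetical helper, and the toy series stands in for per-slot network readings.

```python
def make_samples(series, p, q):
    """Split a travel-time series into (history, target) pairs.

    series: list of per-slot values (or per-slot lists of node values).
    Each sample uses p consecutive slots as input and the value q
    slots after the last input slot as the target.
    """
    samples = []
    for end in range(p, len(series) - q + 1):
        history = series[end - p:end]
        target = series[end + q - 1]
        samples.append((history, target))
    return samples


# Toy series 0..19 with p = 12 (60 min of input) and q = 6 (30 min ahead).
samples = make_samples(list(range(20)), p=12, q=6)
```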

##### 2.2. Improved Graph Convolutional Network

In this study, the urban traffic network is regarded as a graph structure, and the features of each node are the signals on the graph [23]. In spectral convolution, the graph is represented by its corresponding Laplacian matrix.

###### 2.2.1. Spectral Graph Convolution

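As it is difficult to express a meaningful translation operator in the node domain [17], the spectral convolution on graphs is defined as the multiplication of a signal $x \in \mathbb{R}^{N}$ (a scalar for every node at a time slot) with a filter $g_\theta$ parameterized by $\theta$ in the Fourier domain [24]. According to the convolution theorem and the Fourier transform of graphs, the spectral graph convolution is defined as

$$g_\theta *_G x = g_\theta(\mathbf{L})\,x = g_\theta(\mathbf{U}\mathbf{\Lambda}\mathbf{U}^{T})\,x = \mathbf{U}\,g_\theta(\mathbf{\Lambda})\,\mathbf{U}^{T}x, \tag{2}$$

where $*_G$ represents a graph convolution operation, $\mathbf{U}$ is the Fourier basis composed of the eigenvectors of the graph Laplacian $\mathbf{L}$, and $\mathbf{\Lambda}$ is the diagonal matrix of its eigenvalues.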

However, when the graph is large, computing the eigenvalue decomposition of the Laplacian matrix in equation (2) is computationally expensive. Hammond et al. [25] used Chebyshev polynomials to circumvent this problem. Furthermore, a layerwise linear model can be built by stacking multiple localized graph convolution layers with a first-order approximation of the graph Laplacian [26].
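For reference, these two approximations are commonly written as follows in the cited literature [25, 26]. The notation ($K$, $\theta_k$, $\tilde{\mathbf{L}}$, $\lambda_{\max}$, $\tilde{\mathbf{A}}$, $\tilde{\mathbf{D}}$) is the conventional one and is assumed here rather than taken from this paper.

```latex
% Chebyshev polynomial approximation of the spectral filter
g_\theta *_G x \approx \sum_{k=0}^{K-1} \theta_k T_k(\tilde{\mathbf{L}})\, x,
\qquad \tilde{\mathbf{L}} = \frac{2}{\lambda_{\max}} \mathbf{L} - \mathbf{I}_N,
\qquad T_k(z) = 2 z\, T_{k-1}(z) - T_{k-2}(z),\quad T_0(z) = 1,\quad T_1(z) = z

% First-order approximation with added self-connections
g_\theta *_G x \approx \theta \left( \tilde{\mathbf{D}}^{-1/2} \tilde{\mathbf{A}} \tilde{\mathbf{D}}^{-1/2} \right) x,
\qquad \tilde{\mathbf{A}} = \mathbf{A} + \mathbf{I}_N,
\qquad \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}
```

Both forms avoid the eigendecomposition: the first needs only repeated sparse multiplications by $\tilde{\mathbf{L}}$, and the second needs a single multiplication by a normalized adjacency matrix per layer.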

###### 2.2.2. Modified Adjacency Matrix

The spatial dependence in the actual traffic network is complex and is affected by both geospatial distance and dynamic traffic flow. We therefore construct a distance matrix and a correlation matrix to capture spatial attention.

Distance matrix: we first calculate the shortest distance *d*_{ij} between nodes *i* and *j* using Dijkstra’s algorithm and then use the reciprocal of the distance as the weight between the two vertices, where a parameter set to 1000 controls the sparsity of the matrix.
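The distance-matrix construction can be sketched as follows. This is an illustration under stated assumptions: the value 1000 is read as a distance cutoff beyond which the weight is zero, the diagonal is left at zero, and the graph format and function names are hypothetical.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra's algorithm. graph: {node: [(neighbor, length), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def distance_matrix(graph, nodes, threshold=1000.0):
    """Weight 1/d_ij for pairs within `threshold`, else 0 (assumed form)."""
    index = {n: k for k, n in enumerate(nodes)}
    mat = [[0.0] * len(nodes) for _ in nodes]
    for i in nodes:
        for j, d in shortest_distances(graph, i).items():
            if 0.0 < d <= threshold:
                mat[index[i]][index[j]] = 1.0 / d
    return mat


# Hypothetical segments with link lengths in metres.
roads = {
    "a": [("b", 400.0)],
    "b": [("a", 400.0), ("c", 500.0)],
    "c": [("b", 500.0)],
}
D = distance_matrix(roads, ["a", "b", "c"])
```

Lowering `threshold` zeroes out more distant pairs, which is how such a cutoff would control sparsity.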

Correlation matrix: owing to daily commuting patterns, urban traffic has significant periodic characteristics. Therefore, using the Pearson correlation coefficient, we use the historical average values to obtain the dynamic correlation *r*_{ij} between road segments, where a correlation coefficient *r*_{ij} < 0.2 means that the two vertices are almost uncorrelated.
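A sketch of the correlation-matrix step is given next. It assumes that coefficients below 0.2 are set to zero and that the diagonal is left at zero; neither detail is confirmed by the text, and the profiles are made-up historical-average curves.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant series: treat as uncorrelated
    return cov / (sx * sy)


def correlation_matrix(profiles, threshold=0.2):
    """Keep r_ij when r_ij >= threshold, otherwise 0 (assumed form)."""
    n = len(profiles)
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                r = pearson(profiles[i], profiles[j])
                mat[i][j] = r if r >= threshold else 0.0
    return mat


# Hypothetical historical-average profiles for three segments.
profiles = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.5], [4.0, 3.0, 2.0, 1.0]]
C = correlation_matrix(profiles)
```

Here segments 0 and 1 rise together and keep a weight close to 1, while segment 2 moves in the opposite direction and is dropped.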

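Based on the above, the modified adjacency matrix is defined by fusing the distance matrix and the correlation matrix with the original adjacency matrix using the Hadamard (element-wise) product operator.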
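The Hadamard product itself is straightforward to illustrate. The `fuse` function is purely hypothetical: it shows one possible way of combining the three matrices and should not be read as the exact fusion formula of this paper.

```python
def hadamard(x, y):
    """Element-wise (Hadamard) product of two equally sized matrices."""
    return [[a * b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def fuse(adj, dist_w, corr_w):
    """Hypothetical fusion: weight each existing link by both matrices."""
    return hadamard(adj, hadamard(dist_w, corr_w))


# Made-up 2 x 2 matrices for illustration only.
A = [[0, 1], [1, 0]]
D = [[0.0, 0.5], [0.5, 0.0]]
C = [[0.0, 0.8], [0.8, 0.0]]
A_mod = fuse(A, D, C)
```

Because the product is element-wise, any entry that is zero in one matrix stays zero in the result, so the fused matrix is at least as sparse as its sparsest factor.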

###### 2.2.3. Improved Graph Convolution (IGC)

With the modified adjacency matrix, features are propagated layer by layer through multiple graph convolutional layers.

To capture a larger range of spatial correlations, we apply residual learning to our model, which has been shown to achieve better results in deep network training [21]. In each residual unit, the graph convolution uses the modified adjacency matrix with added self-connections, and each layer *l* has an input, an output, and a trainable weight matrix. The output passes through a nonlinear activation function, e.g., ReLU.
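A residual graph convolution layer of this kind can be sketched in plain Python. The symmetric normalization follows the common first-order GCN form [26], and adding the layer input after the activation is an assumption; the matrices and helper names are made up for illustration.

```python
import math


def matmul(x, y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def normalize_adjacency(adj):
    """Add self-connections, then apply D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def residual_gcn_layer(a_norm, h, w):
    """H_out = ReLU(A_norm @ H @ W) + H; W must be square so shapes match."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) + h_v for v, h_v in zip(z_row, h_row)]
            for z_row, h_row in zip(z, h)]


# Toy example: 3 nodes, 2 features, identity weights.
A_mod = [[0.0, 0.5, 0.0], [0.5, 0.0, 0.2], [0.0, 0.2, 0.0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
H_next = residual_gcn_layer(normalize_adjacency(A_mod), H, W)
```

The skip connection keeps each node's own features intact while the convolution term mixes in information from neighbors, which is what allows several layers to be stacked.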

##### 2.3. Framework of the Prediction Model

Figure 2 presents an overview of our proposed prediction model, which is mainly composed of four components to model temporal dependencies and spatial correlations. IGC_1, IGC_2, and IGC_3 have the same network structure and capture recent, daily-periodic, and weekly-periodic correlations, respectively. According to the definitions in Section 2.1, the sampling frequency of historical observations is 288 times per day, the current time slot is *t*, and the prediction horizon is *q*. The inputs of the three components are the following segments of the historical observations, where *l*_{r}, *l*_{d}, and *l*_{w} indicate the lengths of the three segments:

(i) Recent: the segment of length *l*_{r}.

(ii) Daily: the segment of length *l*_{d}.

(iii) Weekly: the segment of length *l*_{w}.
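The three inputs can be illustrated by the time-slot indices they would cover. This sketch follows the segment convention commonly used in ASTGCN-style models [22], which is an assumption here: the periodic segments are taken to be the slots that match the prediction window on earlier days or weeks. The weekly length of 6 is a hypothetical value.

```python
SLOTS_PER_DAY = 288  # 5-minute sampling, as stated in the text


def recent_indices(t, l_r):
    """The l_r slots ending at the current slot t."""
    return list(range(t - l_r + 1, t + 1))


def periodic_indices(t, q, length, period):
    """Slots matching the prediction window (t+1..t+q) in earlier periods.

    `length` must be a multiple of q; length // q earlier periods are used.
    """
    if length % q != 0:
        raise ValueError("length must be a multiple of q")
    idx = []
    for k in range(length // q, 0, -1):
        start = t - k * period + 1
        idx.extend(range(start, start + q))
    return idx


t, q = 3000, 6
recent = recent_indices(t, l_r=12)
daily = periodic_indices(t, q, length=18, period=SLOTS_PER_DAY)
# Hypothetical weekly length; the actual setting is not assumed here.
weekly = periodic_indices(t, q, length=6, period=7 * SLOTS_PER_DAY)
```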

Then, the outputs of the three components are combined by means of parametric-matrix-based fusion, in which the output of each component is weighted by its own trainable matrix.

In the rightmost component of Figure 2, the metadata are transformed into a binary vector by one-hot encoding, and a fully connected neural network is used to process the binary features. Finally, the output of this component is integrated with the fused output of the three IGC components, and the predicted value is obtained by means of the Tanh function.
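The fusion and output steps can be sketched together. The element-wise weighting and the additive integration before Tanh follow the usual form of parametric-matrix-based fusion and are assumptions here; all values are made up, and in training the weight matrices would be learned.

```python
import math


def fuse_components(outputs, weights):
    """Sum of element-wise products W_c * Y_c over the components."""
    rows, cols = len(outputs[0]), len(outputs[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for y, w in zip(outputs, weights):
        for i in range(rows):
            for j in range(cols):
                fused[i][j] += w[i][j] * y[i][j]
    return fused


def predict(fused, meta_out):
    """Add the metadata branch output and squash with tanh."""
    return [[math.tanh(f + m) for f, m in zip(fr, mr)]
            for fr, mr in zip(fused, meta_out)]


# Toy shapes: 2 nodes x 1 horizon step; all values are made up.
y_r, y_d, y_w = [[0.2], [0.4]], [[0.1], [0.3]], [[0.0], [0.5]]
w_r, w_d, w_w = [[0.5], [0.5]], [[0.3], [0.3]], [[0.2], [0.2]]
fused = fuse_components([y_r, y_d, y_w], [w_r, w_d, w_w])
y_hat = predict(fused, meta_out=[[0.0], [0.1]])
```

Since Tanh maps to the interval (-1, 1), a model of this form presupposes that the target values are scaled to that range.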

The model is trained by minimizing the mean squared error between the predicted values and the observed values, taken over all learnable parameters in the model.

#### 3. Experiments

##### 3.1. Data

The data used in our experiments include road network geographic information (as shown in Figure 3) and road segment travel time data from October 1 to December 31, 2019, in Beijing, for a total of 13 weeks. For the dataset division, we divide the data of the first ten weeks into nonoverlapping training and validation samples along the timeline and take the data of the last two weeks as the test set.

##### 3.2. Experimental Settings

The modeling process is carried out in Anaconda3 using Python. The experiments are performed on a server with an Intel Core i9-9900KF 3.60 GHz CPU and 32 GB of RAM. Furthermore, a GPU with 8 GB of memory is used to accelerate the model learning process.

In the experiments, the model is trained with the Adam optimizer [27], where the learning rate is set to 0.001 and the batch size is 64. Taking the mean squared error (MSE) as the objective function, early stopping is applied on the validation set to avoid overfitting. For our proposed model, we use the data of the previous 60 minutes to predict single-step (5 min) and multistep (15 min and 30 min) traffic in the future. Taking 30 min as an example, the prediction horizon is *q* = 6, and the lengths of the recent and daily parts are set to *l*_{r} = 12 and *l*_{d} = 18.
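Early stopping on a validation loss can be sketched with a small helper. The class name, the `patience` value, and the loss sequence are hypothetical; the paper does not state which stopping criterion was used.

```python
class EarlyStopping:
    """Stop when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation MSE per epoch.
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, loss in enumerate([0.50, 0.40, 0.41, 0.42, 0.30]):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In this toy run the loop ends before the last epoch is seen, which illustrates the trade-off: a small `patience` saves time but may stop before a later improvement.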

##### 3.3. Comparison Algorithms and Metrics

To verify the performance of the proposed model, we construct six baseline models for comparison, including statistical models (HA and ARIMA), a traditional neural network model (LSTM), and graph convolutional network-based deep learning models (DCRNN, STGCN, and Graph WaveNet). In addition, IGC-Net_T and IGC-Net_E are used as variants of IGC-Net to examine the influence of temporal modeling and exogenous variables, respectively.

(i) HA: we predict the future travel time by the average value of travel time in the corresponding time interval.

(ii) ARIMA: the autoregressive integrated moving average method has been widely used in time-series forecasting.

(iii) LSTM: it is a variant model of RNN [28]. Here, the model has two hidden layers with 64 units.

(iv) DCRNN: we built the diffusion convolutional recurrent neural network according to the open source code provided by [29].

(v) STGCN: it stands for spatial-temporal graph convolutional network. For the sake of fairness, the hyperparameters are set according to the original model [21].

(vi) Graph WaveNet [30]: Graph WaveNet captures spatial-temporal correlations by combining graph convolution with dilated causal convolution.

(vii) IGC-Net_T: it is a variant of our proposed model without the classification modeling of temporal properties. That is, the input data are not distinguished by recent, daily-periodic, and weekly-periodic correlations.

(viii) IGC-Net_E: it is a variant of our proposed model without metadata input.

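Furthermore, the prediction results of the models are evaluated by the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean squared error (RMSE), which are calculated as follows:

$$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right|,$$

$$\mathrm{MAPE} = \frac{100\%}{m}\sum_{i=1}^{m}\left|\frac{y_i - \hat{y}_i}{y_i}\right|,$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^{2}},$$

where *m* is the number of test samples and $y_i$ and $\hat{y}_i$ are the *i*th real and predicted values, respectively.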
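These three metrics, in their standard form, can be computed with a few lines of Python. The sample values are made up, and MAPE is reported in percent.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; y_true must be nonzero."""
    total = sum(abs((a - b) / a) for a, b in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    total = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    return math.sqrt(total / len(y_true))


# Made-up unit travel times (s/m) for three test samples.
y_true = [0.10, 0.20, 0.40]
y_pred = [0.12, 0.18, 0.40]
```

MAE and RMSE are in the unit of the target (s/m), whereas MAPE is scale-free, which makes it easier to compare across road segments with different typical speeds.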

#### 4. Results and Discussion

##### 4.1. Model Comparisons

In this section, we use the data of the previous 60 minutes to predict the travel time of the next 15 minutes and 30 minutes, i.e., the input step is set to 12, and the output step is set to 3 and 6, respectively. Table 1 shows the prediction performance of the different algorithms. It can be seen that our proposed model outperforms the baseline models at different prediction horizons. The prediction accuracy of the statistical models tends to be worse because they cannot model nonlinear and complex traffic. In addition, LSTM models only the temporal dependence without considering spatial correlation, so its prediction performance is also greatly reduced.

We then compare IGC-Net with the graph convolutional network-based models, all of which model both spatial and temporal characteristics. Among them, our model performs the best, for two reasons. On the one hand, we consider the daily-periodic and weekly-periodic correlations of time-series data; on the other hand, we modify the adjacency matrix of the graph convolution.

As can be seen from Figures 4 and 5, the deep learning models have better prediction performance than the traditional neural network model but require longer computation time. DCRNN has the lowest running efficiency because of the time-consuming sequential computation in the recurrent neural network. STGCN has relatively poor prediction performance despite its short running time. Although IGC-Net integrates spatial dynamic features, its running time is only slightly higher than that of Graph WaveNet. As the prediction horizon increases, our model becomes more advantageous.

In addition, IGC-Net_E removes exogenous variables, which reduces prediction performance by 1.13%. The prediction performance of IGC-Net_T decreases more significantly due to not considering the temporal dependencies. However, IGC-Net_T outperforms LSTM, which proves the importance of considering spatial dynamic correlation in traffic prediction.

##### 4.2. Effect of Input Sequence Length and Prediction Horizon

For a time-series model, data acquisition and prediction requirements directly affect the final prediction performance. Therefore, we conduct sensitivity analysis on the input sequence length and prediction horizon of the proposed model.

As shown in Figure 6, for each input sequence length, the prediction performance decreases as the prediction horizon increases. Intuitively, increasing the prediction horizon makes the model require a longer input sequence to capture temporal correlation information. However, the prediction results become complex when we increase the input sequence length in two datasets. We find that an input of 3 steps cannot meet the requirement of long-term prediction, especially for the prediction horizon of 30 min; however, when the input is increased to 12 steps, the prediction result also becomes worse. The reason is that too much historical information causes data redundancy, which weakens the temporal correlation and reduces prediction accuracy.

Therefore, to learn more relevant historical information for accurate prediction, it is beneficial to appropriately increase the input sequence length. However, too much or too little input will reduce model performance.

##### 4.3. Effect of Spatial Dynamic Modeling

To verify the performance of the improved graph convolutional network, we use a variant GC-Net, which only contains the adjacency matrix for spatial modeling without considering the spatial dynamic correlation, to carry out comparative experiments.

Table 2 shows the comparison results of the two models for 5-minute, 15-minute, and 30-minute-ahead forecasting. Furthermore, the comparison of model performance can be analyzed more intuitively from Figure 7. IGC-Net achieves better performance at different prediction horizons, which demonstrates the effectiveness of the modified adjacency matrix in spatial dynamic modeling.

Further, to analyze the prediction performance of different road segments more intuitively, the results on December 20-21 are taken as an example. As shown in Figure 8, the GC-Net has a poorer ability to capture the dynamic change of travel time, especially in peak periods.

#### 5. Conclusions

In this study, we propose an improved graph convolutional network (called IGC-Net) for travel time prediction. We construct a distance weight matrix and a correlation weight matrix to modify the adjacency matrix of the traditional GCN. Furthermore, we establish recent, daily, and weekly components to model the temporal dependencies according to the temporal properties of the data. Our proposed model can not only capture the static spatiotemporal characteristics but also model spatial dynamic correlation. The comparison experiments are carried out using Beijing road network traffic data. The results show that our proposed model outperforms the baseline models and that the modified adjacency matrix can significantly improve the model’s prediction accuracy.

For future work, we will investigate the applicability of our model to other urban traffic forecasting tasks and further explore the method of dynamic spatial modeling in graph convolutional network.

#### Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by a grant from the National Key Research and Development Program of China (no. 2018YFB1601600).