Abstract

With the advent of the data-driven era, deep learning approaches have been gradually introduced to short-term traffic flow prediction, which plays a vital role in the Intelligent Transportation System (ITS). A hybrid predicting model based on deep learning is proposed in this paper, including three steps. Firstly, an improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) method is applied to decompose the nonlinear time series of highway traffic flow to obtain the intrinsic mode function (IMF). The fuzzy entropy (FE) is then calculated to recombine subsequences, highlighting traffic flow dynamics in different frequencies and improving prediction efficiency. Finally, the Temporal Convolutional Network (TCN) is adopted to predict the recombined subsequences, and the final prediction result is reconstructed. Two sensors of US101-S on the main road and on-ramp were selected to measure the prediction effect. The results show that the prediction error of the proposed model on two sensors is notably decreased on single-step and multistep prediction, compared with the original TCN model. Furthermore, the proposed improved CEEMDAN-FE-X framework can be combined with prevailing prediction methods to increase the prediction accuracy, among which the improved CEEMDAN-FE-TCN model has the best performance and strong robustness.

1. Introduction

With the development of the social economy, the existing transportation supply has gradually become unable to meet the increasing traffic demand. Urban traffic congestion is continuously aggravated, resulting in economic losses, environmental pollution, and energy waste [1]. The Intelligent Transportation System (ITS), as an essential part of traffic management, combines advanced communication, information, and artificial intelligence technologies. ITS aims to deliver accurate real-time traffic information to help travelers plan their routes. At the same time, it improves the ability to identify traffic evolution trends and particular traffic situations, supports the traffic management department in giving early warning and commanding emergency response, and effectively reduces casualties and economic losses [2–4]. Specifically, one of the critical technologies of ITS is short-term traffic prediction, which is the core of the active control of urban traffic systems [5]. By mining big data in depth, the inherent evolution law of traffic flow can be learned to achieve accurate and real-time prediction, which provides precise travel information for travelers and policy suggestions that let managers intervene in advance. Traffic prediction over different horizons has its own application value. Short-term traffic flow prediction is essential for both the traffic management department and travelers. For the traffic management department, short-term traffic flow prediction can help identify the evolution of traffic flow so that short-term traffic control measures such as lane closure and ramp control can be formulated in advance, effectively alleviating potential traffic congestion. Besides, it can help travelers better understand the operating condition of the road network and plan their paths accordingly [6–9]. Therefore, short-term traffic flow prediction is of practical significance and worth studying.

Two families of methods have dominated traffic forecasting research in the existing literature: statistical methods and machine learning methods [10]. Statistical methods based on linear statistics include the ARIMA method [11], the Kalman filter method [12], and the Markov chain method [13], which are more suitable for road sections with stable traffic conditions. However, the nonlinearity of traffic flow becomes prominent as the prediction interval becomes smaller, resulting in low accuracy. Because of the fluctuation of traffic flow, prediction methods based on machine learning have drawn increasing attention; they mine the inherent law of traffic data to capture the dynamics of traffic flow. For example, Wu et al. applied Support Vector Regression (SVR) to predict travel time and mapped the data to a high-dimensional space for regression, achieving good prediction results [14]. The SVR model is robust to noisy data and is more suitable for a small sample size. Cai et al. introduced the K-Nearest Neighbor (KNN) model to realize multistep prediction in space and time, but the time complexity of the calculation was high [15]. Besides, Csikos et al. constructed an Artificial Neural Network (ANN) that learned traffic speed dynamics from one month of traffic speed samples for prediction [16]. In recent years, with big data acquisition, deep learning models can capture more complex traffic features and have promising applications [17, 18]. As one of the most typical methods, the Recurrent Neural Network (RNN) has a recurrent structure, unlike ANN. By feeding back the hidden layer information of the last moment to the input of the current moment, the temporal correlation of the traffic flow can be captured [19]. Traditional RNN mainly includes three structures: Elman Neural Network [20], Time-Delay Neural Network (TDNN) [21], and Nonlinear Autoregressive with Exogenous inputs Neural Network (NARX NN) [22]. Unfortunately, they all suffer from gradient vanishing and explosion problems, making it challenging to capture long-term information.

Nevertheless, traffic events that occurred in earlier periods usually affect the traffic state at the predicted time, so RNN forecasting methods need to be further improved. Ma et al. first applied the Long Short-Term Memory Neural Network (LSTM NN) to predict traffic speed; its gate units retain useful information over both short and long time spans and overcome the defect of traditional RNN. The results showed that the prediction performance was significantly better than other prevailing methods [23]. As a variant of LSTM, the Gated Recurrent Unit (GRU) simplifies the structure, improving the prediction efficiency. Gao et al. combined GRU with MFD to forecast the traffic speed [24]. Other improved models such as Attention-Based LSTM [25, 26] and BiLSTM [27, 28] achieved high accuracy on traffic prediction.

However, RNN models process sequences serially, meaning that later timesteps must wait for their predecessors to complete. To capture long-term sequence features, RNNs use much memory to store the partial results of their multiple cell gates. The Convolutional Neural Network (CNN) can extract the information of a long-term sequence in parallel because of the shared weights of the kernel [29]. However, as the length of the sequence increases, the network must be deepened to learn the features, making it challenging to train. With causal and dilated convolution, the Temporal Convolutional Network (TCN) achieves a flexible receptive field size, capturing long-term historical information with a simple structure [30]. Zhao et al. improved the residual block of TCN for faster training speed and applied it to traffic flow prediction [31]. Zhang et al. used the genetic algorithm to optimize the hyperparameters of TCN. The results showed that the prediction performance was significantly better than other prevailing methods [32].

The inherent changing law of traffic flow is complex, consisting of various dynamics on different temporal scales. Although deep learning models can capture long-term historical information, they need a deep network and consume much training time and memory. Thus, it is necessary to decompose the traffic flow time series, which simplifies the structure of prediction models and extracts features thoroughly and effectively. Huang et al. proposed Empirical Mode Decomposition (EMD), which successively separates the trends or fluctuations of different scales in a signal to generate a series of intrinsic mode functions (IMF) with different frequencies [33, 34]. Unlike the wavelet transform, it is an adaptive, data-driven method that requires no predefined wavelet basis. Theoretically, signals with nonlinearity and randomness can be decomposed. However, conventional EMD decomposes the signal incompletely, causing mode mixing and false modes. Thus, several improved models were proposed to solve these problems [35–38]. In recent years, EMD-related methods have been gradually introduced to traffic prediction. For instance, Wei et al. combined EMD with Backpropagation Neural Network (BPNN) to predict the subway passenger flow, which showed notable performance. Only modes highly correlated with the original data were selected to improve the prediction efficiency [39]. Likewise, Chen et al. applied Ensemble Empirical Mode Decomposition (EEMD) to decompose the traffic flow time series, removed the high-frequency mode, and introduced LSTM NN to predict the remaining reconstructed modes [40]. However, each IMF plays an essential part in the time series, and discarding some modes outright loses detailed information on traffic flow features. Lu et al. applied the XGBoost method to predict the traffic flow IMF of each lane after Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) [41]. Wang et al. combined CEEMDAN with LSSVM to predict highway traffic flow [42]. Huang et al. introduced K-means to cluster the traffic flow IMF decomposed by CEEMDAN and predicted the clusters by BiLSTM [43]. However, the value of K was not chosen on a theoretical basis, and the BiLSTM may consume much memory. Though mode mixing was solved to some extent, residual noise and spurious modes remained. Also, predicting every IMF separately resulted in poor efficiency. Moreover, the in-depth change features of traffic flow may not be captured because of the small training data size or high memory usage.

The existing research on the decomposition prediction of traffic flow time series is insufficient and remains preliminary. Problems such as incomplete decomposition, low prediction efficiency, high memory usage, and the deep capture of traffic flow dynamics need further investigation. Therefore, in this paper, an improved CEEMDAN-FE-TCN model is proposed to forecast highway traffic flow. First, the improved CEEMDAN method decomposes the nonlinear highway traffic flow into IMF and a residual with different frequencies. Next, the fuzzy entropy (FE) of each mode is calculated. IMF and residual with similar chaos are recombined, highlighting the traffic dynamics. Finally, the TCN is applied to predict the different recombined subsequences. After reconstructing the output of the TCN submodels, the predicted traffic flow is obtained. The contributions of the paper can be summarized as follows:

(i) The improved CEEMDAN method is first used for highway traffic flow decomposition. The changing features are decomposed to different temporal scales, allowing TCN to extract the dynamics thoroughly.

(ii) The FE difference of different modes decomposed from the original data is calculated. On this basis, the modes are recombined as subsequences, which highlights the primary trend of traffic flow changes and retains specific fluctuations. The computational complexity is reduced, and the forecasting efficiency and accuracy are further improved.

(iii) The proposed improved CEEMDAN-FE-X framework can be applied to notably decrease the prediction error of prevailing models. Moreover, the improved CEEMDAN-FE-TCN model outperforms the other models compared in this paper and shows strong robustness.

The rest of the paper is arranged as follows: In Section 2, an improved CEEMDAN-FE-TCN model is proposed for traffic flow prediction. Sections 3, 4, and 5 introduce the improved CEEMDAN, FE, and TCN, respectively. The prediction effects of the proposed model are verified on two sensors in Section 6. Finally, Section 7 summarizes the conclusions and future directions.

2. The Improved CEEMDAN-FE-TCN Model

In this paper, an improved CEEMDAN-FE-TCN model is constructed for highway traffic flow prediction, which contains three modules: improved CEEMDAN decomposition, FE calculation, and TCN prediction.

TCN is applied as the core module to predict the highway traffic flow. As a new neural network with a convolutional structure, TCN has the advantages of large-scale parallel processing of CNN and integrates the modeling ability of sequential tasks, which makes up for the long-term dependence problem of RNN [44]. The RNN variants like LSTM and GRU memorize part of the information through the gated unit, while TCN can capture all the historical information with better prediction and faster training speed [30].

However, the traffic flow time series consists of different temporal scaled changing features, causing fluctuation and nonlinearity. It is challenging for TCN to extract the mixed dynamics thoroughly. So, the improved CEEMDAN model is adopted to decompose the sequence to IMF and residual, making TCN capable of capturing the features on every single temporal scale.

The modes decomposed by improved CEEMDAN have physical significance. Nevertheless, from the traffic point of view, a single IMF may be only part of the traffic flow dynamics on a specific time scale. Besides, each IMF needs a corresponding TCN submodel for training and predicting, which makes the computation expensive. Therefore, FE is introduced to measure the complexity of every IMF decomposed from the traffic flow time series. Sequences with close FE have similar temporal scales and stationarity, indicating that TCN will have the same feature-extracting ability on the recombined sequence as on every single sequence. The recombination highlights the changing features of traffic flow and avoids accumulating error over the predictions of multiple similar sequences. Thus, the modes with similar FE are recombined as the input of TCN, reducing calculation complexity and improving prediction efficiency and accuracy.

The output of every TCN submodel is the predicted traffic flow on different time scales. After reconstruction, the final predicted traffic flow is obtained.

The framework of the proposed model is shown in Figure 1.

The specific procedure is as follows:

Step 1: the improved CEEMDAN method is introduced to decompose the original traffic flow time series to obtain k IMF and a residual with different frequencies.

Step 2: the FE of each mode is calculated. According to the FE difference between the modes, the IMF and residual with similar chaos are recombined into subsequences (RS).

Step 3: the TCN submodules are adopted to train and predict RS (1)-RS (n), respectively; then the prediction results are reconstructed to obtain the predicted highway traffic flow.
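The three steps can be sketched end to end in Python. This is only an illustrative skeleton under stated assumptions, not the implementation used in this paper: `toy_decompose` (a trailing moving average plus its remainder) is a hypothetical stand-in for improved CEEMDAN, `persistence` is a hypothetical stand-in for a trained TCN submodel, and the group indices are made up. What the sketch does show is the additive structure: the modes sum to the original series, and the final forecast is the sum of the forecasts of the recombined subsequences.

```python
from typing import Callable, List, Sequence

Series = List[float]


def toy_decompose(x: Sequence[float], window: int = 3) -> List[Series]:
    """Stand-in for improved CEEMDAN (assumption): split x into a fluctuation
    part and a trailing-moving-average trend that sum exactly to x."""
    trend = []
    for t in range(len(x)):
        past = x[max(0, t - window + 1): t + 1]
        trend.append(sum(past) / len(past))
    fluctuation = [xi - ti for xi, ti in zip(x, trend)]
    return [fluctuation, trend]


def recombine(modes: List[Series], groups: List[List[int]]) -> List[Series]:
    """Sum the modes listed in each group into one recombined subsequence (RS)."""
    length = len(modes[0])
    return [[sum(modes[i][t] for i in group) for t in range(length)]
            for group in groups]


def persistence(history: Series) -> float:
    """Stand-in for a trained TCN submodel (assumption): repeat the last value."""
    return history[-1]


def predict_next(x: Sequence[float], groups: List[List[int]],
                 submodel: Callable[[Series], float] = persistence) -> float:
    """Decompose, recombine, predict each RS, and sum the predictions."""
    modes = toy_decompose(x)
    subsequences = recombine(modes, groups)
    return sum(submodel(rs) for rs in subsequences)


flow = [10.0, 12.0, 11.0, 15.0, 14.0]
forecast = predict_next(flow, groups=[[0], [1]])
```

Swapping in a real decomposition and trained submodels changes only `toy_decompose` and `submodel`; the recombination and reconstruction steps stay plain sums.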

3. Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise

3.1. CEEMDAN Algorithm

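The CEEMDAN algorithm can eliminate mode mixing to some extent. Each IMF is calculated from the residual signal by adaptively adding white noise during the decomposition, which reduces the reconstruction error. The method has good integrity and reduces the number of integrations. The specific steps are as follows [37]:

Step 1: a series of Gaussian white noise is added adaptively to the original signal $x$:

$$x^{(i)} = x + \varepsilon_0 w^{(i)}, \quad i = 1, 2, \ldots, I,$$

where $x^{(i)}$ denotes the time series after adding white noise for the ith time; $\varepsilon_0$ denotes the noise factor; $w^{(i)}$ denotes the white noise added for the ith time; $I$ denotes the number of integrations.

Step 2: the EMD algorithm is used to decompose $x^{(i)}$, and the first EMD mode is averaged to calculate the first CEEMDAN mode as follows:

$$\mathrm{IMF}_1 = \frac{1}{I}\sum_{i=1}^{I} E_1\left(x^{(i)}\right).$$

Remove $\mathrm{IMF}_1$ from $x$ to obtain the first residue:

$$r_1 = x - \mathrm{IMF}_1.$$

Step 3: decompose $r_1 + \varepsilon_1 E_1\left(w^{(i)}\right)$ by the EMD algorithm to obtain the second CEEMDAN mode:

$$\mathrm{IMF}_2 = \frac{1}{I}\sum_{i=1}^{I} E_1\left(r_1 + \varepsilon_1 E_1\left(w^{(i)}\right)\right),$$

where $E_k(\cdot)$ denotes the kth mode decomposed by the EMD algorithm.

Step 4: repeat the following process for $k = 2, \ldots, K-1$ to calculate the remaining modes until the remaining residual cannot be decomposed further:

$$r_k = r_{k-1} - \mathrm{IMF}_k, \qquad \mathrm{IMF}_{k+1} = \frac{1}{I}\sum_{i=1}^{I} E_1\left(r_k + \varepsilon_k E_k\left(w^{(i)}\right)\right),$$

where $K$ denotes the number of the CEEMDAN modes.

The final residual is calculated as

$$R = x - \sum_{k=1}^{K} \mathrm{IMF}_k.$$

The original $x$ can be expressed as

$$x = \sum_{k=1}^{K} \mathrm{IMF}_k + R.$$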

3.2. Improvements on CEEMDAN

Although the CEEMDAN method largely overcomes mode mixing, residual noise and spurious modes remain. On this basis, the improved CEEMDAN algorithm was proposed, which makes two refinements. The first is to estimate the local mean of the signal plus noise and to define the difference between the current residue and the average of its local means as the mode, which reduces the residual noise in the decomposed modes. The second is to extract the kth mode using $E_k(w^{(i)})$, the kth EMD mode of the white noise, in place of the white noise itself, which reduces mode overlap. Therefore, the improved CEEMDAN method is adopted to decompose the original traffic flow time series. The steps can be described as follows [38]:

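Define operator $E_k(\cdot)$ as the kth mode decomposed by EMD, operator $M(\cdot)$ as the local mean of the signal it is applied to, and operator $\langle \cdot \rangle$ as the mean operation over the noise realizations $i = 1, \ldots, I$. Then, $E_1(x) = x - M(x)$. Here $w^{(i)}$ denotes the ith white noise realization and $\beta_k$ the noise coefficient at stage k.

Step 1: $x^{(i)} = x + \beta_0 E_1\left(w^{(i)}\right)$ is constructed to calculate the first residue:

$$r_1 = \left\langle M\left(x^{(i)}\right) \right\rangle.$$

Step 2: the first mode can be calculated as

$$\mathrm{IMF}_1 = x - r_1.$$

Step 3: the second residue $r_2$ is estimated as the mean of the local means of a series of $r_1 + \beta_1 E_2\left(w^{(i)}\right)$, and the second mode is defined as

$$\mathrm{IMF}_2 = r_1 - r_2 = r_1 - \left\langle M\left(r_1 + \beta_1 E_2\left(w^{(i)}\right)\right) \right\rangle.$$

Step 4: for $k = 3, \ldots, K$, the kth residue is expressed as

$$r_k = \left\langle M\left(r_{k-1} + \beta_{k-1} E_k\left(w^{(i)}\right)\right) \right\rangle.$$

Step 5: the kth mode of the improved CEEMDAN can be obtained:

$$\mathrm{IMF}_k = r_{k-1} - r_k.$$

Step 6: go to Step 4 for the next k.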
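A short derivation may help show why the decomposition is complete. It assumes the notation $\mathrm{IMF}_k$ for the kth mode and $r_k$ for the kth residue, with the convention $r_0 = x$ (these symbols are assumed here for illustration). Because every mode is the difference of two consecutive residues, the sum of the modes telescopes:

```latex
\begin{aligned}
\mathrm{IMF}_k &= r_{k-1} - r_k, \qquad k = 1, \ldots, K, \quad r_0 = x, \\
\sum_{k=1}^{K} \mathrm{IMF}_k
  &= (r_0 - r_1) + (r_1 - r_2) + \cdots + (r_{K-1} - r_K)
   = x - r_K, \\
x &= \sum_{k=1}^{K} \mathrm{IMF}_k + r_K .
\end{aligned}
```

So the modes plus the last residue reproduce the original series exactly, which is what allows the forecasts of the components to be summed back into a forecast of the traffic flow.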

4. Fuzzy Entropy

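Fuzzy entropy (FE) measures the complexity of a time series and the probability of generating new patterns when the dimension changes. The higher the time series complexity, the higher the entropy [45]. The fuzzy membership function makes the fuzzy entropy vary continuously and smoothly with its parameters, reducing the sensitivity to parameters and stabilizing the statistical results [46]. The process of FE calculation is as follows [47]:

Step 1: the dimension $m$ is set for the IMF of the traffic flow time series $\{u(i): 1 \le i \le N\}$, and the m-dimension vector is constructed as follows:

$$X_i^m = \{u(i), u(i+1), \ldots, u(i+m-1)\} - u_0(i), \quad i = 1, 2, \ldots, N-m+1;$$

then, $u_0(i)$ can be expressed as

$$u_0(i) = \frac{1}{m}\sum_{j=0}^{m-1} u(i+j).$$

Step 2: the distance $d_{ij}^m$ of vectors $X_i^m$ and $X_j^m$ is calculated as

$$d_{ij}^m = \max_{k \in \{0, \ldots, m-1\}} \left| \left(u(i+k) - u_0(i)\right) - \left(u(j+k) - u_0(j)\right) \right|,$$

$i, j = 1, 2, \ldots, N-m+1$, and $i \ne j$.

Step 3: introduce the fuzzy membership function $A(\cdot)$ with similarity tolerance $r$. Here $r$ denotes the similarity tolerance parameter, which is $R$ times the standard deviation (SD) of the original one-dimensional time series, namely, $r = R \cdot \mathrm{SD}$. The similarity between vectors $X_i^m$ and $X_j^m$ is defined as

$$A_{ij}^m = A\left(d_{ij}^m\right).$$

Step 4: define function

$$C_i^m(r) = \frac{1}{N-m}\sum_{j=1,\, j \ne i}^{N-m+1} A_{ij}^m.$$

Then,

$$\Phi^m(r) = \frac{1}{N-m+1}\sum_{i=1}^{N-m+1} C_i^m(r).$$

Step 5: repeat Steps 1 to 4 for dimension $m+1$ to obtain $\Phi^{m+1}(r)$.

Step 6: the fuzzy entropy of the traffic flow time series can be expressed as

$$\mathrm{FE}(m, r, N) = \ln \Phi^m(r) - \ln \Phi^{m+1}(r).$$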
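The calculation can be sketched in plain Python as follows. This is a minimal illustration, not the code used in this paper, and it rests on assumptions: the membership function is taken to be the exponential form $\exp[-\ln 2 \cdot (d/r)^2]$, which is one common choice; the defaults $m = 2$ and $R = 0.2$ are conventional values rather than values reported here; and the normalization averages over all ordered pairs of distinct vectors.

```python
import math
import random
from typing import Sequence


def fuzzy_entropy(u: Sequence[float], m: int = 2, big_r: float = 0.2) -> float:
    """Fuzzy entropy ln(phi_m) - ln(phi_{m+1}) with tolerance r = R * SD.

    The exponential membership function below is an assumed choice.
    """
    n = len(u)
    mean = sum(u) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in u) / n)
    r = big_r * sd  # similarity tolerance

    def phi(dim: int) -> float:
        count = n - dim + 1  # number of embedded vectors of this dimension
        vectors = []
        for i in range(count):
            segment = u[i:i + dim]
            baseline = sum(segment) / dim  # local mean u0(i), removed below
            vectors.append([v - baseline for v in segment])
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i == j:
                    continue
                # Chebyshev distance between the two baseline-removed vectors
                d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                total += math.exp(-math.log(2) * (d / r) ** 2)
        return total / (count * (count - 1))

    return math.log(phi(m)) - math.log(phi(m + 1))


random.seed(0)
regular = [math.sin(2 * math.pi * t / 50) for t in range(300)]
noisy = [random.gauss(0.0, 1.0) for _ in range(300)]
fe_regular = fuzzy_entropy(regular)
fe_noisy = fuzzy_entropy(noisy)
```

A smooth periodic series should score lower than white noise, matching the statement that higher complexity gives higher entropy. The double loop is quadratic in the series length, so a vectorized version would be preferable for a full month of 5-minute data.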

5. Temporal Convolutional Network

TCN combines the advantages of CNN and RNN: it captures global sequence information and processes it in parallel. It contains three main modules: causal convolution, dilated convolution, and residual block.

5.1. Causal Convolution

When processing sequential tasks, TCN needs to generate outputs with the same length as the input. All data in causal convolution strictly follow the causal relationship in time order, meaning that the value at time t only depends on the information before time t. Because of the strict time-constrained nature of causal convolution, TCN ensures causality and prevents future data leakage.

5.2. Dilated Convolution

With the increasing length of the sequence, the network must be deepened to extract more features of the historical time steps, making it hard to train. In order to simplify the network structure, dilated convolution is adopted, which enables an exponentially large receptive field. For a 1D sequence input $x \in \mathbb{R}^n$ and a filter $f: \{0, \ldots, k-1\} \to \mathbb{R}$, the dilated convolution output $F$ at time $s$ is defined as

$$F(s) = (x *_d f)(s) = \sum_{i=0}^{k-1} f(i) \cdot x_{s - d \cdot i},$$

where $d$ is the dilation factor, $k$ is the filter size, and $s - d \cdot i$ indicates the past direction. The structure of the causal and dilated convolution is shown in Figure 2. With dilated convolution, the receptive field size of TCN is flexible, making it easy to capture the features of a long sequence globally with a few hidden layers.
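The dilated causal convolution can be illustrated with a few lines of plain Python. The sketch assumes left zero padding (inputs before the start of the sequence count as 0) so that the output keeps the input length, and the receptive field helper assumes one convolution per dilation level; both function names are illustrative.

```python
from typing import List, Sequence


def dilated_causal_conv(x: Sequence[float], f: Sequence[float], d: int) -> List[float]:
    """F(s) = sum_i f[i] * x[s - d*i], treating x[t] as 0 for t < 0."""
    out = []
    for s in range(len(x)):
        acc = 0.0
        for i, weight in enumerate(f):
            t = s - d * i  # step back d*i positions into the past
            if t >= 0:
                acc += weight * x[t]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stacked dilated causal convolutions,
    assuming one convolution layer per dilation factor."""
    return 1 + (kernel_size - 1) * sum(dilations)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = dilated_causal_conv(x, f=[1.0, 1.0], d=2)   # y[s] = x[s] + x[s - 2]
rf = receptive_field(kernel_size=2, dilations=[1, 2, 4])
```

Because each output only reads indices at or before `s`, changing a later input never alters an earlier output, which is the causality property of Section 5.1. Doubling the dilation at each layer lets the receptive field grow exponentially with depth.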

5.3. Residual Block

By learning the identity mapping function, residual connection enables the network to transfer information in a cross-layer way, increasing network depth, improving accuracy, and simplifying network training.

Let $X$ be the input of the residual module and $F(\cdot)$ the cross-layer mapping learned by the module. The result $F(X)$ is added to the input $X$, so the output $o$ of the residual module can be expressed as

$$o = \mathrm{Activation}\left(X + F(X)\right).$$

The structure of a residual block is shown in Figure 3.
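As a minimal sketch of the residual connection, assuming the output is an activation applied to the sum of the input and the transformed input, with ReLU chosen as the activation (an assumption made here for illustration), and with `transform` standing in for the convolutional layers inside the block:

```python
from typing import Callable, List, Sequence


def relu(values: Sequence[float]) -> List[float]:
    """Elementwise ReLU."""
    return [max(0.0, v) for v in values]


def residual_block(x: Sequence[float],
                   transform: Callable[[Sequence[float]], Sequence[float]]) -> List[float]:
    """Return Activation(x + F(x)); F must preserve the input length."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must keep the sequence length for the skip connection")
    return relu([a + b for a, b in zip(x, fx)])


# If F outputs zeros, the block reduces to relu(x): the skip path passes x through.
identity_like = residual_block([1.0, -2.0, 3.0], lambda v: [0.0] * len(v))
doubled = residual_block([1.0, -2.0, 3.0], lambda v: [2.0 * a for a in v])
```

The first call shows why residual learning eases training: the layers only need to learn a correction to the identity mapping.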

6. Empirical Study

6.1. Data Description

In this paper, two sections on US101-S in California were selected as examples to verify the effectiveness of the proposed model. The section where VDS No. 717490 locates is on the mainline, and the section where VDS No. 718462 locates is on the on-ramp. The locations are shown in Figure 4, and the detailed information of the two sensors is shown in Table 1.

The datasets were collected from Caltrans PeMS (https://pems.dot.ca.gov/) for the period from 2018/8/1 to 2018/8/31. The flow of all lanes was aggregated into 5-minute intervals to reduce the volatility of the data while still supporting real-time prediction. There were 8928 samples in each dataset. The rate of erroneous and missing data was less than 2%, so the data were suitable for training and testing. The training and testing datasets were split at 2018/8/27. There were 7488 training samples and 1440 testing samples in each dataset, as shown in Figure 5.
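The sample counts follow directly from the 5-minute aggregation and can be checked with a few lines of Python. The only assumption is that the split day, 2018/8/27, is the first day of the test set, which is the reading consistent with the reported counts.

```python
from datetime import date

INTERVAL_MIN = 5
samples_per_day = 24 * 60 // INTERVAL_MIN      # 288 five-minute intervals per day

start, split, end = date(2018, 8, 1), date(2018, 8, 27), date(2018, 8, 31)
total_days = (end - start).days + 1            # Aug 1 to Aug 31 inclusive
train_days = (split - start).days              # Aug 1 to Aug 26
test_days = total_days - train_days            # Aug 27 to Aug 31

n_total = total_days * samples_per_day
n_train = train_days * samples_per_day
n_test = test_days * samples_per_day
```

The value 288 is also the length of one daily cycle in samples, which is why daily patterns repeat every 288 data points.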

The autocorrelation of the traffic flow data obtained by VDS No. 717490 and VDS No. 718462 is shown in Figure 6. As the time lag increases, the autocorrelation of both sequences decreases slowly. Therefore, they are nonstationary time series with nonlinear changes which should be smoothed. In addition, when the time lag increases to 40, the autocorrelation is still over 0.3, indicating that the sequences have a long-time dependence, so TCN is suitable for the traffic flow prediction.
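For reference, the sample autocorrelation at a given lag can be computed as below. This is a generic sketch using the standard biased estimator, not the plotting code behind Figure 6, and the synthetic daily sine wave is made-up data used only to exercise the function.

```python
import math
from typing import Sequence


def autocorrelation(x: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of x at the given non-negative lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Synthetic series with a 288-sample (one day at 5-minute resolution) cycle.
daily = [math.sin(2 * math.pi * t / 288) for t in range(288 * 4)]
acf_full_day = autocorrelation(daily, 288)
acf_half_day = autocorrelation(daily, 144)
```

A periodic series is strongly positively correlated at a lag of one full cycle and negatively correlated at half a cycle, which is the kind of slowly decaying, oscillating pattern that signals long-range dependence.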

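The datasets were processed with TensorFlow 2.0.0 and Keras 2.3.1 in Python 3.6. Four indexes were introduced to measure the prediction accuracy: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), R-squared, and the Geoffrey E. Havers (GEH) statistic. They are calculated as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|,$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2},$$

$$\mathrm{GEH}_i = \sqrt{\frac{2\left(\hat{y}_i - y_i\right)^2}{\hat{y}_i + y_i}},$$

where $y_i$ is the actual traffic flow of sample $i$, $\hat{y}_i$ is the predicted traffic flow, $\bar{y}$ is the mean of the actual traffic flow, and $n$ is the number of samples.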
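The four indexes can be written as small Python functions. This is a plain-Python sketch using the standard definitions; applying GEH directly to each sample and returning 0 when both values are 0 are assumptions made here, and the three-sample example uses made-up numbers.

```python
import math
from typing import List, Sequence


def mae(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))


def r_squared(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def geh(y: Sequence[float], y_hat: Sequence[float]) -> List[float]:
    """Per-sample GEH statistic sqrt(2 (p - a)^2 / (p + a)); 0 if p + a == 0."""
    values = []
    for a, p in zip(y, y_hat):
        total = a + p
        values.append(math.sqrt(2.0 * (p - a) ** 2 / total) if total > 0 else 0.0)
    return values


actual = [100.0, 200.0, 300.0]        # made-up flows for illustration
predicted = [110.0, 190.0, 300.0]
scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "R2": r_squared(actual, predicted),
    "GEH_mean": sum(geh(actual, predicted)) / len(actual),
}
```

Unlike MAE and RMSE, GEH scales the error by the flow level, so the same absolute error counts for more on a low-volume ramp than on a busy mainline.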

6.2. Traffic Flow Sequence Decomposition and Recombination

The improved CEEMDAN algorithm was adopted to decompose the traffic flow time series obtained by VDS No. 717490 and VDS No. 718462, respectively. Each series was decomposed into 11 IMF and one residual, arranged from high to low frequency, as shown in Figure 7.

The IMF and residual with similar FE were recombined to reduce the calculation complexity and increase the forecasting efficiency and accuracy. The mode FE and the FE difference between different modes of the traffic flow sequences obtained from the two sensors were calculated, as shown in Table 2, and the changing trend of FE is shown in Figure 8. For VDS No. 717490, modes with FE differences less than 0.1 were recombined. Similarly, for VDS No. 718462, 0.05 was the difference threshold of recombination. The recombined subsequences of the two sensors are plotted in Figure 9.
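The recombination rule can be sketched as follows. The sketch assumes that "FE difference" means the difference between consecutive modes, ordered from IMF1 to the residual, and that a mode joins the current group when that difference is below the threshold. The FE values are hypothetical and are not taken from Table 2; they were chosen only so that a threshold of 0.1 reproduces the grouping described for VDS No. 717490.

```python
from typing import List, Sequence


def group_by_fe(fe: Sequence[float], threshold: float) -> List[List[int]]:
    """Group consecutive modes whose FE differs from the previous mode
    by less than the threshold. Returns lists of 0-based mode indices."""
    groups = [[0]]
    for k in range(1, len(fe)):
        if abs(fe[k] - fe[k - 1]) < threshold:
            groups[-1].append(k)      # similar complexity: same subsequence
        else:
            groups.append([k])        # start a new recombined subsequence
    return groups


# Hypothetical FE values for IMF1..IMF11 and the residual (index 11).
fe_values = [1.90, 1.60, 1.20, 0.80, 0.75, 0.70, 0.45,
             0.10, 0.06, 0.03, 0.02, 0.01]
groups = group_by_fe(fe_values, threshold=0.1)
```

With these values the 12 modes collapse into 6 subsequences, so only 6 submodels would need training instead of 12.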

Figure 9 shows that each recombined subsequence reflects part of the traffic flow dynamics. For VDS No. 717490, IMF1, IMF2, and IMF3 are high-frequency modes with high FE and chaos. Although they are noise-like with poor predictability, reflecting the randomness and nonlinearity of traffic flow, they contain detailed information. Therefore, each of them needs to be predicted separately.

The FE of IMF4 is close to that of IMF5 and IMF6. The recombined subsequence reflects the specific daily change characteristics of traffic flow. There are two peaks every 288 data points, representing the morning and evening peaks of traffic flow which have apparent differences. It is shown that the morning peak flow is higher than that of the evening on weekdays, while on weekends, the two peaks are similar and lower than those on weekdays.

The FE of IMF7 is quite different from that of IMF6 and IMF8, so IMF7 is kept as a separate subsequence; it reflects the overall daily trend of traffic flow, which first increases and then decreases. Together with IMF4, IMF5, and IMF6, it belongs to the medium-frequency modes, which have solid predictability and are the core of time series prediction.

IMF8, IMF9, IMF10, and IMF11, together with the residual, constitute the trend mode, reflecting the weekly traffic flow dynamics. It is shown that the traffic flow on weekdays is relatively stable and higher than that on weekends. The FE and the chaos of the trend mode are low, and the predictability is strong. The trend mode is an essential component of time series prediction. It is worth noting that the data on 2018/8/21 were unstable and fluctuating, so the corresponding subsequences of IMF7 and IMF8 + IMF9 + IMF10 + IMF11 + Residual changed noticeably on that day, disturbing the original cycle.

The traffic flow obtained from VDS No. 718462 shows similar changing characteristics to VDS No. 717490. However, because the on-ramp only has one lane with more unstable traffic flow, the fluctuation frequency is higher than that of the mainline, resulting in weaker periodicity.

6.3. Highway Traffic Flow Prediction
6.3.1. Hyperparameter Optimization

The accuracy of each prediction model is affected by various hyperparameters, which should be optimized before the prediction. For TCN, the number of filters, time lag, kernel size, and dilation factors are the crucial hyperparameters affecting the performance. The number of filters determines whether feature extraction is complete, the others affect the size of the receptive field, and all hyperparameters jointly influence the prediction accuracy of TCN. GridSearchCV from Scikit-learn was used to score the different hyperparameter combinations of each prediction model and to search for the best combination by 10-fold cross-validation. The search range of each hyperparameter of the TCN modules is shown in Table 3.

6.3.2. Results and Comparison

The prediction effect can be divided into vertical and horizontal comparisons. The vertical comparison measures the effect of different variants of the TCN model, while the horizontal comparison compares different baseline models for traffic flow prediction. Both comparisons were analyzed to verify the superiority of the proposed model. In addition, our experiment platform is a personal computer with a Core (TM) i3-8100 CPU @ 3.60 GHz and 8 GB RAM. Python 3.6, TensorFlow 2.0.0, and Keras 2.3.1 are used to realize the models.

(1) Vertical Comparison of Different Models Based on TCN. The modes and the recombined subsequences decomposed from the VDS No. 717490 and VDS No. 718462 traffic flow time series were predicted. The errors and training times are shown in Tables 4 and 5, respectively.

The results show that the recombined subsequences have higher accuracy and less training time than the single IMF. With recombination, the number of training models is reduced, and the computational complexity is decreased. Besides the improved CEEMDAN algorithm, other EMD-based methods were also combined with TCN for comparison. The results are shown in Table 6 and Figure 10.

As shown in Figure 10, compared with the direct prediction, the accuracy of decomposition prediction is notably increased. As the EMD variants improve, the forecasting performance improves as well. Specifically, for VDS No. 717490, the error of ICEEMDAN-TCN is reduced by 69% compared with TCN. For VDS No. 718462, it is decreased by 59%. Furthermore, recombining similar modes according to FE further improves efficiency and accuracy with less calculation complexity. In terms of prediction efficiency, the training time of the model after the recombination is reduced by 34%–49%. From the perspective of prediction accuracy, for VDS No. 717490, the error of ICEEMDAN-FE-TCN is further reduced by 3% compared with ICEEMDAN-TCN, and for VDS No. 718462, it is decreased by 5%. Overall, the proposed model has the lowest error and the least training time (except for the original TCN) on both sensors, indicating the best goodness of fit. Aug 31, 2018, was taken as an example to visualize the prediction performance of each model, as shown in Figure 11. Since there is little visual difference between X-TCN and X-FE-TCN, only X-FE-TCN is shown as representative.

The prediction of the original TCN model approximately fits the actual data but has an apparent time delay. The reason is that the traffic flow time series consists of changing features with multiple frequencies. TCN cannot accurately capture the multiscale dynamics, causing prediction error. The EMD-based models decompose the sequence into different IMF, making it easier for TCN to learn the characteristics of every scale, so that the prediction performs better than the original model and the lag is effectively eliminated. Among all the models compared, the improved CEEMDAN-FE-TCN achieves the best performance because of its more complete decomposition. Besides, although the fluctuation of ramp flow is stronger than that of the mainline flow, the proposed model also performs well on the ramp traffic flow prediction, which indicates strong robustness.

(2) Horizontal Comparison of Eight Different Models. The traffic flow of VDS No. 717490 and VDS No. 718462 was predicted single-step and multistep ahead by TCN, LSTM, GRU, SVR, and HA algorithms and their improved models under the framework proposed in this paper. The hyperparameters optimization method was mentioned in 6.3.1. The prediction error of each model is shown in Table 7 and Figure 12. The visualization is shown in Figures 13 and 14.

The results above show that the traffic flow predicted single-step and multistep ahead by different algorithms approximately fits with the original data, and the prediction accuracy obtained by decomposition forecasting is significantly improved compared with the direct prediction.

From the perspective of one-step-ahead prediction, for VDS No. 717490, the prediction accuracy of TCN is higher than that of LSTM, GRU, SVR, and HA. Under the framework of the improved CEEMDAN-FE-X, the prediction error of TCN, LSTM, GRU, and SVR is sharply decreased by 69%, 64%, 67%, and 44%, respectively. Among all the models compared, the improved CEEMDAN-FE-TCN model obtains the lowest MAE, RMSE, GEH average at 7.36, 10.34, and 0.39, respectively, and the highest R-square at 0.997. For VDS No. 718462, the prediction accuracy of TCN is higher than that of LSTM, GRU, SVR, and HA. Under the framework of the improved CEEMDAN-FE-X, the prediction error of TCN, LSTM, GRU, and SVR is sharply decreased by 59%, 55%, 54%, and 51%, respectively. Among all the models compared, the improved CEEMDAN-FE-TCN model obtains the lowest MAE, RMSE, GEH average at 2.10, 2.96, and 0.39, respectively, and the highest R-square at 0.968. Though the error of all models increases with the prediction step prolonging, the proposed model performs best on two-step and three-step ahead predictions, indicating its goodness of fit on long- and short-term predictions.
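To make the distinction between single-step and multistep prediction concrete, the sketch below builds supervised samples from a series with a sliding window. It assumes a direct strategy, in which a window of `lag` past values is mapped straight to the value `horizon` steps ahead; the text does not state whether a direct or a recursive strategy was used, and `make_windows` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], lag: int,
                 horizon: int) -> List[Tuple[List[float], float]]:
    """Pair each window of `lag` consecutive values with the value
    `horizon` steps after the end of the window (horizon=1 is one step ahead)."""
    samples = []
    for end in range(lag, len(series) - horizon + 1):
        window = list(series[end - lag:end])
        target = series[end + horizon - 1]
        samples.append((window, target))
    return samples


series = [float(v) for v in range(10)]
one_step = make_windows(series, lag=3, horizon=1)
three_step = make_windows(series, lag=3, horizon=3)
```

With 5-minute data, `horizon=1`, `2`, and `3` correspond to forecasts 5, 10, and 15 minutes ahead; the longer the horizon, the fewer usable samples and the weaker the link between window and target, which is consistent with errors growing as the step increases.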

TCN, LSTM, and GRU all retain long-term changing features. Therefore, given extensive training samples, they extract deeper traffic dynamics and are more accurate than the SVR and HA models. However, unlike the RNN models, TCN can capture the features of the whole long-term sequence by convolving in parallel. So, it uses less memory and avoids forgetting information, which lets it learn the global time series characteristics thoroughly and achieve higher accuracy than RNN. Furthermore, under the framework of the improved CEEMDAN-FE-X, the TCN, RNN, and SVR models all perform better than the direct prediction, which means the decomposition prediction is applicable across different prediction models. It should be noted that the proposed framework improves the TCN model the most and improves the RNN models more than the SVR model for both the mainline and the ramp flow. In conclusion, for the reasons mentioned above, the improved CEEMDAN-FE-TCN outperforms the other models compared in this paper on the highway mainline and ramp traffic flow prediction.

7. Conclusions

In this paper, an improved CEEMDAN-FE-TCN model is proposed to forecast highway traffic flow. First, the improved CEEMDAN method decomposes the nonlinear highway traffic flow into IMF and a residual with different frequencies. Then, the FE of each mode is calculated, and the modes with similar chaos are recombined as subsequences, highlighting the traffic dynamics. Finally, the TCN is applied to predict the different recombined subsequences. After reconstructing the output of the TCN submodels, the predicted traffic flow data is obtained. The data of two sensors on US101-S, VDS No. 717490 and VDS No. 718462, collected from PeMS were tested. Compared with other models, the following conclusions are drawn:

(1) The improved CEEMDAN algorithm can decompose the traffic flow time series into modes with different frequencies. The accuracy of time series prediction after decomposition and reconstruction is notably higher than that of direct prediction. Compared with conventional EMD-based models, the improved CEEMDAN-FE-TCN obtains the lowest prediction error.

(2) The FE algorithm can measure the chaos of the modes decomposed from the original data. By recombining the modes with similar FE, the main dynamics of traffic flow are highlighted while the details of fluctuations are retained. The prediction efficiency and accuracy are further improved.

(3) The improved CEEMDAN-FE-X framework has remarkable effects on single-step and multistep traffic flow prediction. Under this framework, the prediction accuracy of the TCN, LSTM, GRU, and SVR models is significantly increased. The proposed model outperforms the other models in this paper on the highway mainline and ramp traffic flow prediction, confirming its robustness.

Studies can be combined with other aspects in future work, such as adding spatial factors into the time series prediction. Decomposing the spatiotemporal graph and selecting suitable models for the subgraphs of different frequencies to make predictions may improve accuracy.

Data Availability

All of the data related to this paper are available on Caltrans PeMS (https://pems.dot.ca.gov/).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation Item (Grant no. 52072143).