Abstract
Correct lane changing plays a crucial role in traffic safety. Predicting the lane-changing behavior of a driver can significantly improve driving safety. In this paper, a hybrid neural network prediction model based on a recurrent neural network (RNN) and a fully connected neural network (FC) is proposed to predict lane-changing behavior accurately and to extend the prospective time of the prediction. A dynamic time window is proposed to extract the lane-changing features, which include driver physiological data, vehicle kinematics data, and driver kinematics data. The effectiveness of the proposed model is validated through experiments in real traffic scenarios. In addition, the proposed model is compared with five prediction models, and the results show that it predicts lane-changing behavior more accurately and earlier than the other models. The proposed model achieves a prediction accuracy of 93.5% and improves the prospective time of prediction by about 2.1 s on average.
1. Introduction
Driver lane-changing behavior is a key factor in driving safety. An improper lane change may cause a vehicle collision [1, 2] or even a traffic accident [3–5]. In [6], it was indicated that nearly 18% of all traffic accidents were caused by improper lane changing. Using a prediction model in Advanced Driver Assistance Systems (ADASs) [7–9] could reduce the risk of accidents. Therefore, a model that accurately predicts driver lane-changing behavior using multisource data fusion is needed. Substantial research on lane-changing prediction has been conducted. At present, there are two mainstream groups of methods for lane-changing prediction models, namely, mathematical methods and artificial intelligence approaches. Lane-changing prediction models based on mathematical methods were introduced in [10, 11]. Also, in [12], the logistic regression method was used in a lane-changing prediction model, wherein the distances to the front and adjacent rear vehicles, forward time-to-collision (TTC), and turn signal were taken into account, and the results showed that this model performed well in certain circumstances. Baumann et al. [13] improved a cognitive model to characterize driver behavior in an automotive environment and demonstrated the correlation between drivers' cognitive processes and their driving movements. Salvucci [14] introduced an adaptive control of rational cognitive structures to monitor the lane-changing process in a multilane highway environment, and this model demonstrated how cognitive architectures can facilitate the understanding of driver behavior. However, some of the mentioned studies were conducted in specific traffic scenarios such as highway entrances and ramps; moreover, some of the traffic scenarios were simulated rather than real. Therefore, those studies may not capture lane-changing behavior in real traffic scenarios.
Moreover, the above studies achieved prediction accuracies ranging from 80% to 85%, leaving considerable room for improvement.
With the aim of building a more intelligent lane-changing prediction model, researchers have adopted machine learning. A lane-changing behavior prediction model based on the support vector machine (SVM) classifier and Bayesian filtering (BF) was proposed in [15], and it was shown that this model could predict driver lane-changing behavior 1.3 s in advance. In addition, a lane-changing behavior and trajectory prediction model based on the Hidden Markov Model (HMM) was presented in [16], and the results showed that this model predicted the lane-changing trajectories very well, which makes it suitable for the prediction of human-like lane-changing maneuvers. Hou et al. [17] proposed a lane-changing prediction model based on fuzzy logic, developed for the case of a forced lane change under lane-descending conditions. The prediction accuracy of this model for nonmerging and merging behavior was 86.3% and 87.5%, respectively. Furthermore, a novel lane-changing intention recognition algorithm combining the HMM and BF models was proposed in [18], where the model input consisted of three signals from the CAN bus (steering angle, lateral acceleration, and yaw rate), and the output was the behavior classification. The results revealed that the HMM-BF could achieve an average recognition accuracy of 91.9%. Zheng et al. [19] proposed a machine learning-based segmentation and classification algorithm consisting of three stages. The first stage includes data preprocessing and prefiltering, and its function is to reduce noise and remove clear left and right turning events. The second stage employs a spectral time-frequency analysis segmentation approach to generate all potential time-variant lane-changing and lane-keeping candidates. The third stage includes two possible classifications: lane changing and lane keeping. The results showed that the average accuracy of this three-stage algorithm exceeded 83.22%.
Furthermore, in [20], a dynamic Bayesian network (DBN) was used to predict lane-changing maneuvers, and a test using real data showed that the lane changing was detected 1 s in advance. However, machine learning methods are not well suited for multisource data fusion, and even single-input data may result in lower accuracy.
In recent years, deep learning algorithms have been widely used in lane-changing prediction because of their powerful high-dimensional data processing and autonomous learning capabilities, which are in sharp contrast with conventional mathematical methods. Xie et al. [21] proposed a deep learning-based method to predict the future trajectory of vehicles and achieved good results. It was demonstrated that the deep learning model can mine potential features from high-dimensional data, which also indicated the feasibility of deep learning in lane-changing research. In [22], a backpropagation (BP) neural network was used as the controller of an automatic vehicle system, and a camera image was used as the neural network input to construct a lane-changing model. The results showed that one hidden layer was enough to provide good performance for a time-varying nonlinear dynamic system. Tomar et al. [23] proposed a method based on a multilayer perceptron to predict a lane-changing trajectory accurately. The prediction results showed that this model was able to predict the future path accurately only for discrete patches of a trajectory, but not for the complete trajectory. Ding et al. [24] developed a BP neural network-based model to predict lane-changing trajectories and compared its prediction results with those of the Elman network. It was found that the BP neural network-based model achieved better prediction performance under different sections and generated more reliable simulation results than the Elman network-based model. In [25], a fully connected neural network was applied to predict the lane-changing behavior of drivers; notably, the network model input consisted of multivehicle data, and a prediction accuracy of more than 90% was achieved.
Moreover, a multifeature fusion neural network [26] that takes into account physiological factors such as the driver's head rotation was proposed to predict driver lane-changing behavior; the prediction accuracy exceeded 85%, while the prospective time was 1.5 s. Dou et al. [27] introduced a prediction model based on the SVM and BP neural network, which combined the results of the SVM and BP neural network to improve the prediction accuracy, and the results showed that the average combined accuracy exceeded 92%. Furthermore, an MTS-DeepNet, using a convolution kernel to process the multivariate time series data and a fully connected neural network to classify the lane-changing behavior, was designed to predict lane changing in [28], and the accuracy of this model exceeded 91.0%. Considering the driver's driving style, Li et al. [29] proposed a lane-changing intention estimation model based on a Bayesian network and a Gaussian mixture model, which achieved a good prospective time, but its accuracy was low. However, the lane-changing process is determined by both the driver and the traffic environment, and the aforementioned studies did not consider all the factors affecting the driver's lane-changing behavior. Also, using a fully connected network for lane-changing prediction may cause data loss, which further reduces the model performance.
In summary, the existing studies have the limitations of low prediction accuracy and short prospective time. Two reasons for these limitations are the "data problem" and the "prediction model structure problem." Aiming at these two problems, a hybrid neural network driven by multiple types of data is proposed. The first level of the hybrid network is composed of a Seq2Seq, a variant of the RNN [30, 31], which is mainly used for time series data processing to reduce invisible data loss. The second level consists of a fully connected neural network for data fusion and lane-changing classification. There are three contributions of this study. (1) Three different types of data, including vehicle kinematics data, driver kinematics data, and driver physiological data, are collected and used. (2) A hybrid network with a two-level training model is proposed to deepen the number of network layers while avoiding the problem of gradient dispersion. (3) A dynamic time window algorithm is proposed to ensure the consistency and homogeneity of the model input data and to extend the prediction prospective time.
The remainder of the paper is organized as follows. Section 2 describes the data source and introduces the data processing method. Section 3 elaborates the working principle and structure of the proposed Seq2Seq-FC neural network, as well as its mathematical relationships. Section 4 validates the model generalization ability. Lastly, Section 5 concludes the paper.
2. Data Collection and Processing
Since the data used in this study were derived from real traffic scenarios and different sensing equipment, the original data were first filtered and segmented; then, timestamp alignment was conducted, and the data with the same label were extracted using a time window.
2.1. Data Collection
In the data collection process, the speed and acceleration of the vehicle were collected by the Cohda Wireless dedicated short-range communication (DSRC) device installed on the vehicle. The steering angle and angular velocity of the steering wheel were obtained by a corner tester mounted at the steering wheel. The electroencephalogram (EEG) and the driver's head movement data were obtained by a brain wave analyzer. The electrocardiogram (ECG) was obtained from a heart rate tester. The number of the driver's head rotations in the horizontal direction was determined using the driving video obtained by a driving recorder. The equipment used for data acquisition is shown in Figure 1. Also, different data were used for model training and validation. The data were collected in the same way but on different traffic routes. The route used for the collection of training data is presented in Figure 2; the route started at Chang'an University, passed through the main roads in Xi'an, and ended at the Xi'an Cheng'nan Passenger Transport Center. During the data collection, the procedure was repeated several times. Since all data collection equipment was connected to the same PC, a timestamp based on Beijing time was added to the header of each data record to synchronize the data collected by the different equipment. In addition, the purchased brainwave analyzer and DSRC equipment have supporting data-receiving software, which can denoise and filter the data; the other self-designed equipment receives data using the serial communication protocol, and the data are denoised using the period-average padding method and the PauTa criterion built into the hardware. Therefore, the data output by the different equipment had already been denoised and could be used directly.
It should be noted that, during the research, we found that the impact of other factors on the subject vehicle (such as surrounding environmental factors, traffic environmental factors, and driving purpose) is ultimately reflected in the driver's control of the vehicle. For example, if the number of vehicles around the subject vehicle increases, the speed of the subject vehicle will decrease. If the subject vehicle is driving on a slippery road, its overall speed will be significantly lower than on a normal road. Therefore, only the subject vehicle's own data and the driver's own data were collected.
2.2. Features’ Extraction
The time window is the basis for data processing, and the data in a time window should cover the entire lane-changing process, not only a part of it. As shown in Figure 3, during the lane-changing process, the β bands of the brain wave signal change greatly at the beginning of the lane changing, and the vehicle's steering angle tends to become stable at the end of the lane-changing process. Therefore, the point where the β bands change drastically is recorded as the starting point of the time window, and the point where the steering angle stabilizes after the lane changing is completed is recorded as the end point of the time window. Because the time taken for each lane change is different, each lane-changing behavior has its corresponding time window. Considering the prospective time of the prediction model, the method of time window shrinking is adopted in this work. As shown in Figure 3, without considering the prediction accuracy, the endpoint of time window 2 provides a longer prospective time than the endpoint of time window 1. Therefore, in the process of model training, the model is verified by shrinking the time window while comparing the accuracy. During data processing, the data length of the current time window is important because the RNN-Seq2Seq used in this work requires input data of a fixed dimension. However, since a dynamic time window is used, a data extraction method is needed that not only ensures the consistency of the input data but also minimizes data loss. Besides, the maximum speed of the driver's head in three-dimensional space and the number of the driver's head rotations in the horizontal direction within a time window are fed directly to the input of the fully connected neural network without being processed by the RNN-Seq2Seq.
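The window-boundary rule above (start at the β-band surge, end when the steering angle stabilizes) can be sketched as follows. The thresholds, baseline length, and hold time are illustrative assumptions for exposition, not the paper's calibrated values:

```python
import numpy as np

def detect_lane_change_window(beta_power, steering_angle, fs,
                              beta_ratio=3.0, steer_tol=1.0, hold_s=1.0):
    """Sketch of the dynamic time window described above.

    Start: first sample where the beta-band power exceeds `beta_ratio`
    times a baseline estimated from the first second of data.
    End: first point after the start where the steering angle stays within
    `steer_tol` degrees of its final resting value for `hold_s` seconds.
    """
    beta_power = np.asarray(beta_power, dtype=float)
    steering_angle = np.asarray(steering_angle, dtype=float)

    baseline = np.mean(beta_power[: int(fs)])          # pre-maneuver level
    start_idx = int(np.argmax(beta_power > beta_ratio * baseline))

    rest = steering_angle[-1]                          # post-maneuver value
    hold = int(hold_s * fs)
    stable = np.abs(steering_angle - rest) < steer_tol
    for i in range(start_idx, len(stable) - hold):
        if stable[i : i + hold].all():                 # held stable long enough
            return start_idx, i + hold                 # window [start, end)
    return start_idx, len(steering_angle)
```

Because the end point depends on when the steering settles, every lane change yields a window of its own length, which is why the fixed-dimension feature extraction of Section 2.3 is needed afterwards.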
2.3. Data Processing
Vehicle kinematics data processing. The length of the time window was adjusted dynamically. To ensure that the dimension of the data extracted from each time window was equal, for each time series in a time window, the maximum, minimum, average, and variance values were used during the vehicle kinematics data processing. Table 1 shows the characteristics of the data within a time window which were used for further data processing. When the time window was gradually shrunk, the label data of the Seq2Seq was the caudal data of the original time window, but the processing method stayed the same. As can be seen in Table 1, 15 data features were extracted and used. There are two reasons for selecting these 15 features. The first is that such a feature extraction method can ensure the same dimension and consistency of the model input. The second is that the selected 15 features can reflect the vehicle movement situation in a time window no matter how long the time window is, which is also the meaning of those 15 features.
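The per-window statistics described above can be sketched as follows. The signal names are placeholders; the exact grouping that yields the 15 features of Table 1 follows the table, and the point of the sketch is that the feature dimension is independent of the window length:

```python
import numpy as np

def window_stats(series):
    """Max, min, mean, and variance of one time series inside a window --
    the four statistics used for the vehicle kinematics data. The output
    dimension is fixed regardless of how many samples the window holds."""
    s = np.asarray(series, dtype=float)
    return np.array([s.max(), s.min(), s.mean(), s.var()])

def vehicle_features(window):
    """`window` maps signal names (e.g. speed, acceleration, steering
    angle, steering angular velocity) to their samples in the current
    time window; concatenating the per-signal statistics reproduces a
    fixed-length feature vector whatever the window length."""
    return np.concatenate([window_stats(v) for v in window.values()])
```

Shrinking the time window only shortens each input series; the feature vector handed to the model keeps the same dimension.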
Driver's physiological and kinematics data processing. The EEG data were filtered by a band-pass filter, namely, a Chebyshev type II filter with a lower cutoff frequency of 4 Hz and an upper cutoff frequency of 30 Hz. The δ bands of 1–3 Hz in the EEG, with an amplitude of 20–200 μV, were removed because the δ wave appears in the human infant stage, during immature mental development, or when an adult is under extreme fatigue, lethargy, or anesthesia. Therefore, these bands do not change significantly when the driver considers the lane changing. In addition, the θ bands with a frequency of 4–7 Hz and an amplitude of 5–20 μV, the α bands with a frequency of 8–13 Hz and an amplitude of 20–100 μV, and the β bands with a frequency of 14–30 Hz and an amplitude of 100–150 μV were retained. The waveforms corresponding to these three band types change significantly when people are in the depressed, normal, and excited states, respectively, so they should be used as input to the prediction model because they are changeable and important indicators of people's emotions and states. Similarly, the average, maximum, and minimum values and the variance of the θ, α, and β bands of the EEG data within a time window were included. Six frequency-domain features were extracted from each of the three EEG channels: θ/(θ + α), α/(θ + α), (θ + α)/β, α/β, (θ + α)/(θ + β), and θ/β. The reason for selecting these six features is that Martensson's research [32] shows that they can reflect the driver's brain activity at a certain time as well as the driver's thinking while driving. Like the vehicle kinematics features, the selected 18 brainwave-related features can reflect the driver's thinking when performing a lane change.
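The 4–30 Hz Chebyshev type II band-pass and the six band-ratio features can be sketched as below. The filter order and stop-band attenuation are illustrative assumptions (the paper specifies only the cutoff frequencies), and band powers are estimated here from the FFT spectrum:

```python
import numpy as np
from scipy.signal import cheby2, filtfilt

def preprocess_eeg(x, fs, order=4, atten_db=40):
    """Chebyshev type II band-pass (4-30 Hz) applied forward-backward;
    `order` and `atten_db` are illustrative, not the paper's values."""
    b, a = cheby2(order, atten_db, [4, 30], btype='bandpass', fs=fs)
    return filtfilt(b, a, x)

def band_power(x, fs, lo, hi):
    """Power of `x` in the [lo, hi] Hz band, estimated from the FFT."""
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return spec[(freqs >= lo) & (freqs <= hi)].sum()

def eeg_ratios(x, fs):
    """The six frequency-domain features of one EEG channel."""
    theta = band_power(x, fs, 4, 7)
    alpha = band_power(x, fs, 8, 13)
    beta = band_power(x, fs, 14, 30)
    return np.array([theta / (theta + alpha), alpha / (theta + alpha),
                     (theta + alpha) / beta, alpha / beta,
                     (theta + alpha) / (theta + beta), theta / beta])
```

Note that the first two ratios sum to one by construction, so they jointly encode the θ/α balance of the channel.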
The heart rate signal was processed similarly to the other data: the maximum, minimum, and average values and the variance of the heart rate in a time window were taken as input data. Regarding the motion data of the driver's head in a time window, the maximum speed in three-dimensional space and the number of rotations of the driver's head in the horizontal direction were used. These four features were not processed by the RNN; instead, they were fed directly to the input of the fully connected network because, to a certain extent, the maximum speed of head movement can reflect the urgency of the driver to change lanes. For example, if the driver frequently turns his head during driving and the head speed is high, the driver has a higher probability of performing a lane change. In summary, 26 data features were extracted and used as input parameters for data processing. The extracted features and their labels are shown in Table 2. The 26 data features in Table 2 reflect the driving situation in two aspects (vehicle and driver) when used as model input. This also ensures that the dimensions of the input data are not affected by the length of the time window.
After the original data were processed as described above, the dimension of the input data is the same in every time window regardless of its length. In this way, data consistency during model training and testing was ensured. In Figures 4 and 5, the vehicle kinematics data and the brain wave during the lane-changing process are presented, respectively. The reason why the steering angle shown in Figure 4 does not have negative values is that, when designing the steering wheel angle measuring equipment, we set its equilibrium state (middle value) to 0 and designed the output as a positive value regardless of whether the wheel turns to the right or to the left.
3. Proposed Model
As already mentioned, the proposed model is based on an RNN-Seq2Seq network and a fully connected neural network. The RNN-Seq2Seq network is used to extract the features of the original data and predict the value at the next time point, which is then used as an input for the following fully connected neural network. In this work, 41 features are extracted and used for data processing. Except for the data related to the driver's head rotation, all other features are subjected to the RNN-Seq2Seq processing. Therefore, the Seq2Seq has 37 inputs and 37 outputs. In addition, the fully connected neural network has 41 inputs and 2 outputs. The data processing procedure is shown in Figure 6. It should be noted that, when the entire prediction model is trained, the accuracy and prospective time of the model are compared by shrinking the time window.
3.1. Seq2Seq Layer
An RNN is a kind of neural network that can model data sequences. The main advantage of this neural network type is that it can process time series data well. Since the lane-changing behavior often lasts for a long period of time, if the time series data are directly processed by a fully connected neural network, there will be inevitable data loss, which will decrease the model accuracy. Additionally, when processing time series data, an RNN considers the correlation between data points in the sequence, so the use of the data in a time window can be maximized.
Since our goal is to develop a prediction model whose input consists of time sequence data, an RNN structural variant, the Seq2Seq structure, is chosen. It represents an enhanced version of a normal RNN and consists of an encoder and a decoder. The computational kernel of both the encoder and the decoder is an LSTM (long short-term memory) unit or a GRU (gated recurrent unit).
The typical Seq2Seq structure follows the encoder-decoder framework shown in Figure 7. The Seq2Seq structure uses the encoder to map the input data to a semantic space to obtain a decoding vector that represents the semantics and then uses the decoder to produce the required output.
As shown in Figure 7, the encoder-decoder framework has two inputs: one is X = (x_1, x_2, ..., x_T), which represents the encoder input, and the other is Y = (y_1, y_2, ..., y_{T'}), which represents the decoder input. The inputs X and Y are sequentially passed to the network in the respective order.
Assuming that the input sequence of the encoder is X = (x_1, x_2, ..., x_T), then, according to the RNN characteristics, the hidden state h_t at time t in the input process is a function of the current input x_t and the hidden layer output h_{t-1} at the previous time, given as

h_t = f(x_t, h_{t-1}),

where f is a nonlinear activation function and the decoding vector c is the last state of the encoder, c = h_T. Finally, a fixed-length decoding vector is obtained. The decoder can be regarded as another RNN. When the decoding vector is sequentially fed to the decoder, the decoder hidden state h'_t at time t can be expressed as

h'_t = f(h'_{t-1}, y_{t-1}, c).
Then, the conditional probability of y_t at time t is given by

P(y_t | y_{t-1}, ..., y_1, c) = g(h'_t, y_{t-1}, c),

where f and g are the given activation functions, and g must produce a valid probability. The two components of the encoder-decoder structure are jointly trained to maximize the conditional log-likelihood, which is given by

max_θ (1/N) Σ_{n=1}^{N} log p_θ(y_n | x_n),

where θ is the set of model parameters and (x_n, y_n) denotes the nth data pair in the training dataset.
Thus, in the Seq2Seq training, the decoder input Y participates in the loss calculation and node operations, unlike in a general RNN, where it is used only for loss supervision. Assuming the encoder input at time t is x_t, the calculation process of the GRU kernel is as follows:

r_t = σ(W_r · [h_{t-1}, x_t]),
z_t = σ(W_z · [h_{t-1}, x_t]),
h̃_t = tanh(W_h · [r_t ⊙ h_{t-1}, x_t]),
h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t.

In the above formulas, r_t, z_t, and h̃_t represent the intermediate variables (the reset gate, the update gate, and the candidate state), W_r, W_z, and W_h are the training parameters, and σ is the activation function. When the hidden state of the encoding process is completed, the decoding vector is given by

c = tanh(V h_T),

where h_T is the final value of the encoder hidden state after T steps and V is a training parameter. After the decoding vector c is obtained, the decoder starts the decoding process by initializing its initial hidden state, which is given by

h'_0 = tanh(V' c).
The hidden state of the decoder at time t is given by

h'_t = f(h'_{t-1}, y_{t-1}, c),

where f denotes the same GRU calculation as in the encoder and W' and its deformations are variable training parameters. After obtaining the last hidden state, the conditional probability is calculated by

P(y_{t,j} | y_{t-1}, ..., y_1, c) = exp(w_j s_t) / Σ_{j'=1}^{K} exp(w_{j'} s_t),

where K is the output dimension. Also, it holds that

s_t = W_s h'_t,

where w_j and W_s are variable training parameters.
Once the Seq2Seq finishes processing a given data sequence, its output is directly fed to the input of the fully connected neural network to perform the lane-changing classification.
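The GRU encoder-decoder equations above can be sketched as a minimal NumPy forward pass. Random weights stand in for trained parameters, the hidden size and the way the decoder consumes the decoding vector c are simplifying assumptions, and no training loop is shown:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """One GRU step following the r_t / z_t / h~_t / h_t equations above."""
    def __init__(self, n_in, n_hidden, rng):
        k = n_in + n_hidden
        self.Wr = rng.normal(0, 0.1, (n_hidden, k))   # reset gate weights
        self.Wz = rng.normal(0, 0.1, (n_hidden, k))   # update gate weights
        self.Wh = rng.normal(0, 0.1, (n_hidden, k))   # candidate weights

    def step(self, x, h):
        xh = np.concatenate([x, h])
        r = sigmoid(self.Wr @ xh)                     # r_t
        z = sigmoid(self.Wz @ xh)                     # z_t
        h_tilde = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1 - z) * h + z * h_tilde              # h_t

class Seq2Seq:
    """Encoder compresses the window into c; decoder, conditioned on c
    and the previous output, predicts the next-step feature vectors."""
    def __init__(self, n_feat, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.enc = GRUCell(n_feat, n_hidden, rng)
        self.dec = GRUCell(n_feat + n_hidden, n_hidden, rng)  # sees [y, c]
        self.V = rng.normal(0, 0.1, (n_hidden, n_hidden))     # c = tanh(V h_T)
        self.Wo = rng.normal(0, 0.1, (n_feat, n_hidden))      # output proj.
        self.n_hidden = n_hidden

    def predict(self, xs, steps):
        h = np.zeros(self.n_hidden)
        for x in xs:                                  # encoder pass
            h = self.enc.step(x, h)
        c = np.tanh(self.V @ h)                       # decoding vector
        h, y, outs = np.tanh(c), xs[-1], []           # h'_0 initialised from c
        for _ in range(steps):                        # decoder pass
            h = self.dec.step(np.concatenate([y, c]), h)
            y = self.Wo @ h                           # next-step features
            outs.append(y)
        return np.array(outs)
```

With 37 feature inputs, as in Section 3, the decoder emits one 37-dimensional vector per predicted time point, matching the 37-input/37-output description of the Seq2Seq layer.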
3.2. Fully Connected Layer
A fully connected network is the most basic and simple neural network, yet it performs well in multiparameter fusion, so it is generally used in complex nonlinear classification tasks. Since the lane-changing classification is a nonlinear task with a large amount of input data, a fully connected network is used as the classification network.
In Figure 8, x_1, x_2, ..., x_n are the fully connected layer inputs, which come from the Seq2Seq layer output, and w_{ji}^{(l)} represents the weight of the ith synapse of the jth neuron in the lth layer. The induced local field of the jth neuron in the lth layer is

v_j^{(l)}(n) = Σ_i w_{ji}^{(l)}(n) y_i^{(l-1)}(n),

where y_i^{(l-1)}(n) is the output of the ith neuron in the previous layer (i.e., the (l-1)th layer) after n iterations. For i = 0, it holds that y_0^{(l-1)}(n) = +1 and w_{j0}^{(l)}(n) = b_j^{(l)}(n), where b_j^{(l)}(n) denotes the bias of the jth neuron in the lth layer. Using the SoftMax function as an activation function φ_j, the output of the jth neuron in the lth layer is given by

y_j^{(l)}(n) = φ_j(v_j^{(l)}(n)).
If neuron j is in the first hidden layer, then it holds that y_j^{(0)}(n) = x_j(n), where x_j(n) is the jth element of the input vector x(n).

Besides, if neuron j is in the output layer (l = L, where L is the depth of the network), then it holds that y_j^{(L)}(n) = o_j(n).

Therefore, the error is given by

e_j(n) = d_j(n) − o_j(n),

where d_j(n) is the jth element of the expected response vector d(n).
After the forward propagation is completed, the backpropagation is performed to complete the weight optimization. The local gradient is given by

δ_j^{(l)}(n) = φ'_j(v_j^{(l)}(n)) Σ_k δ_k^{(l+1)}(n) w_{kj}^{(l+1)}(n),

where δ_j^{(l)}(n) is the local gradient and φ'_j denotes the differentiation with respect to the independent variable; for the output layer, δ_j^{(L)}(n) = e_j(n) φ'_j(v_j^{(L)}(n)). After the local gradient is obtained, the weight updating process is performed. The weight update in the nth iteration is given by

w_{ji}^{(l)}(n+1) = w_{ji}^{(l)}(n) + α [w_{ji}^{(l)}(n) − w_{ji}^{(l)}(n−1)] + η δ_j^{(l)}(n) y_i^{(l-1)}(n),

where α is the momentum constant and η is the learning rate.
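The forward pass, error, and momentum weight update above can be sketched as a small NumPy classifier. The hidden size, tanh hidden activation, and hyperparameter values are illustrative assumptions; the output layer uses SoftMax as in the text:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())                     # shift for numerical safety
    return e / e.sum()

class FCClassifier:
    """Sketch of the fully connected classifier of Section 3.2: one hidden
    layer, SoftMax output, and the momentum update
    w(n+1) = w(n) + alpha*[w(n) - w(n-1)] + eta*delta*y."""

    def __init__(self, n_in, n_hidden, n_out, eta=0.1, alpha=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.1, (n_hidden, n_in + 1))   # +1: bias input
        self.W2 = rng.normal(0, 0.1, (n_out, n_hidden + 1))
        self.dW1 = np.zeros_like(self.W1)       # previous update (momentum)
        self.dW2 = np.zeros_like(self.W2)
        self.eta, self.alpha = eta, alpha

    def forward(self, x):
        self.y0 = np.append(1.0, x)             # y_0 = +1 carries the bias
        self.y1 = np.append(1.0, np.tanh(self.W1 @ self.y0))
        self.o = softmax(self.W2 @ self.y1)
        return self.o

    def backward(self, d):
        """One backprop step toward the one-hot target vector d."""
        delta2 = d - self.o                     # e_j(n) for SoftMax output
        delta1 = (1 - self.y1[1:] ** 2) * (self.W2[:, 1:].T @ delta2)
        up2 = self.alpha * self.dW2 + self.eta * np.outer(delta2, self.y1)
        up1 = self.alpha * self.dW1 + self.eta * np.outer(delta1, self.y0)
        self.W2 += up2
        self.W1 += up1
        self.dW2, self.dW1 = up2, up1           # remember for the momentum term
```

In the paper's configuration, the network would have 41 inputs (the 37 Seq2Seq outputs plus the 4 head-motion features) and 2 outputs (lane changing vs. lane keeping).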
3.3. Model Training
According to the survey statistics [33, 34], most drivers are more inclined to car-following behavior than to lane changing, so lane-changing data are not abundant. Therefore, we had to collect the data needed for the prediction model training and testing ourselves. During the data collection process, to avoid the situation where drivers deliberately change lanes, drivers were not told the true purpose of their trip before the driving started. In addition, uncertain factors such as the driver's driving skills, driving style, and travel purpose affect the predictive performance of the model. Therefore, in the process of data collection, the research team provided the driver with some brief driving information (such as recreational driving or emergency driving) when inviting the driver, to make the data more generalizable. Different drivers were invited to participate in the data collection process in order to collect as many diverse data as possible under different conditions. After data collection, 7000 features of the lane-changing behavior were extracted from the collected dataset. Since the amount of data was not large, the model was trained and validated by the 10-fold cross-validation method, whose pseudocode is shown in Algorithm 1.
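The 10-fold cross-validation of Algorithm 1 can be sketched as follows. The `train_fn`/`eval_fn` callables are placeholders standing in for the Seq2Seq-FC training and test routines, not the paper's actual code:

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k disjoint folds."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(idx, k)

def cross_validate(train_fn, eval_fn, data, labels, k=10):
    """Train k models, each holding one fold out for validation, and
    return the mean validation accuracy. `train_fn(X, y)` returns a
    model; `eval_fn(model, X, y)` returns an accuracy."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for i in range(k):
        val = folds[i]                                    # held-out fold
        trn = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = train_fn(data[trn], labels[trn])
        scores.append(eval_fn(model, data[val], labels[val]))
    return float(np.mean(scores))
```

Each of the 7000 extracted samples thus appears in exactly one validation fold, which makes the most of a dataset that is small by deep learning standards.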

We used Google's open-source deep learning library TensorFlow to build the hybrid neural network described above. The computational kernel of the Seq2Seq was the GRU; during the backpropagation training, the Adam optimizer in TensorFlow was used, which automatically adjusts the learning rate during training to avoid overfitting. The initial weights of the entire network were filled with several sets of values obeying the normal distribution, and the initial learning rate was 0.005.
The model training process and its discussion items are shown in Figure 9. As the time window shrinks, the model training process advances. The first step in training is to use the original time window; since the time window is not shrunk, there is no label data for the Seq2Seq. The fully connected neural network is added after the Seq2Seq, and since the lane-changing signs are the label of the entire network, only the accuracy is shown in this step. The second step is shrinking the time window to 2/5 of the original one, and then the model's prospective time and accuracy are determined. Similarly, the third step shrinks the time window to 1/5 of the original size and then determines the new model's prospective time and accuracy. After shrinking the time window, the new prediction performance of the Seq2Seq is also displayed. This time window shrinking mechanism enables the model to predict lane-changing behavior using only a small part of the header data of the entire lane-changing record, instead of using all the lane-changing data for lane-changing recognition. At the same time, when processing the data, we found that, within the first 1/5 of the data of the original window, lane-changing behavior had not yet occurred from a macroperspective. Therefore, it is reasonable to use this part of the data to predict whether the vehicle will change lanes in the future.
When there is no intermediate label in the model, only the accuracy is discussed. The proposed model structure is presented in Figure 10, and the loss and the accuracy over the iterations are displayed in Figure 11.
As presented in Figure 11, after iterating 20,000 times, the model achieves the convergence state with an accuracy of 0.9858 and a loss of 0.11, which shows that the proposed model can recognize the driver's lane-changing behavior accurately.
The training process of the entire model after shrinking the time window is given in Figure 9. The Seq2Seq uses the front data of a time window to predict the caudal data. Because there are series of different input and output data, and the Seq2Seq layer predicts the driver and vehicle status at the next time point, the Euclidean distance is used to estimate how closely the prediction result regresses to the real result. The calculation formula is

d = (1/N) Σ_{i=1}^{N} ||ŷ_i − y_i||_2,

where ŷ_i and y_i are the predicted and real feature vectors of the ith sample, respectively, and N is the number of samples.
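The average Euclidean distance used to score the Seq2Seq regression can be computed as below (a direct sketch of the metric, with hypothetical example vectors):

```python
import numpy as np

def mean_euclidean_distance(pred, real):
    """Average Euclidean distance between the predicted and real feature
    vectors -- the regression metric used for the Seq2Seq output above."""
    pred = np.asarray(pred, dtype=float)
    real = np.asarray(real, dtype=float)
    return float(np.linalg.norm(pred - real, axis=1).mean())
```

A decreasing value over training iterations, as in Figure 12, indicates that the predicted caudal data are converging to the real caudal data.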
In Figure 12, the average Euclidean distance during model training for the once- and twice-shrunk time windows is presented. It can be noticed that, after 200 iterations, the average deviation began to converge and fluctuated within a small range, which shows that the Seq2Seq achieved a good prediction performance. Next, as already explained, the Seq2Seq prediction results were fed to the input of the fully connected network and subjected to the 10-fold cross-validation method. The obtained prediction accuracy and loss for the once-shrunk time window are presented in Figure 13. When the number of iterations reached 20,000, the model converged with an accuracy of 0.935 and a loss of 0.12. The model accuracy was reduced due to the data loss caused by the time window shrinking, but the prospective time was extended. The accuracy and the loss for the twice-shrunk time window are presented in Figure 14, where the accuracy increased to 0.938 and the loss was reduced to 0.11.
In Figures 15 and 16, the prospective time for the once- and twice-shrunk time windows is, respectively, presented. The marked intervals in these figures denote the time needed for a vehicle to change a lane, the data extraction windows for the Seq2Seq training and testing, the label extraction windows for the Seq2Seq training and testing, and the time for predicting the lane changing, which is also the prospective time. For the same lane-changing duration, the prediction method presented in Figure 16 has a longer prospective time than that presented in Figure 15.
In the model verification, the method presented in Figure 15 achieved a prediction accuracy of 0.935, and the method presented in Figure 16 achieved a prediction accuracy of 0.938. Thus, the latter method had better prediction performance. Besides, after a statistical analysis of the data used in the paper, we obtained the following mathematical relationship:
The time cost and the corresponding number of lane changings are presented in Table 3, where the lane changes conducted in the interval of 3–4 s accounted for 47% of the total number of lane changes, which means the prospective time was 1.8–2.4 s. Therefore, the average prospective time of the model is 2.1 s. This result indicates that the model can predict the lane changing well and achieve a high prediction accuracy. The comparison between the time cost of lane changing and the prediction time cost, after sampling 50 times across all lane changes, is presented in Figure 17, and the result shows that the prediction time was much shorter than the lane-changing time.
4. Model Validation
After the model was trained, all the weight parameters in the prediction model were optimized. In order to test the generalization ability of the developed model, different data were used for model training and testing. Different drivers were invited to drive vehicles on another route to collect the test data. The route used for validation data acquisition is presented in Figure 18. This route also started at Chang'an University but ended at the Xi'an Chengbei Passenger Station, and the route direction was different from that of the route used for the collection of training data presented in Figure 2. In addition, according to the data scale, the detailed structure and parameters of the cascade model were determined, as shown in Table 4.
The model was validated using the same data processing method as that used for model training, except that the validation used only input data, without labels: the data were fed to the network, and the focus was on comparing the lane changing predicted by the network with the real data. Cross-validation was performed using the data of 3000 lane changings. The accuracy of each validation batch and the average accuracy are presented in Figure 19; the model accuracy decreased in the first few validation batches, but the overall performance was good, and the average accuracy of the test exceeded 93.5%. Moreover, the prospective time was the same as that in training: for 50 randomly extracted adjacent lane-changing samples, the prospective time was 1/5 of the entire lane-changing time, as shown in Figure 20.
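The batch-wise validation procedure can be sketched as follows, assuming binary lane-changing labels (1 = change, 0 = keep); the function names and toy batches are illustrative.

```python
# Per-batch validation accuracy and its average, as plotted in a figure like
# Figure 19. Predictions and ground truth are 0/1 lane-changing labels.

def batch_accuracy(pred, truth):
    """Fraction of samples where the predicted label matches the real one."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def validate(batches):
    """batches: list of (predicted_labels, true_labels) pairs."""
    accs = [batch_accuracy(p, t) for p, t in batches]
    return accs, sum(accs) / len(accs)

# Toy example: three validation batches of four samples each.
batches = [
    ([1, 0, 1, 1], [1, 0, 1, 0]),   # 3/4 correct
    ([0, 0, 1, 1], [0, 0, 1, 1]),   # 4/4 correct
    ([1, 1, 0, 1], [1, 1, 0, 1]),   # 4/4 correct
]
per_batch, average = validate(batches)
print(per_batch, round(average, 3))
```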
The proposed model was compared with the five most commonly used prediction models. The comparison results are presented in Figure 21 and Table 5. The performance parameters of MTSDeepNet all come from the literature [27], so Table 5 does not include the computation time of MTSDeepNet. The prediction accuracies of the dynamic Bayesian network, the decision tree, and the SVM were 75.0%, 84.0%, and 91.1%, respectively, and those of the BP neural network and MTSDeepNet were 91.6% and 92.0%, respectively. The accuracy of the proposed Seq2Seq-FC structure was 93.5%, which was better than that of the other five algorithms. Furthermore, although the computation time of the proposed model is higher than that of several other methods, the microsecond-level increase does not affect applications, such as vehicle anti-collision, that are built on it.
To better illustrate the advantages of the model proposed in this paper, the data and algorithms used in the six compared models need to be described. Table 6 presents this detailed information for the six models as used in the experiments.
The word “other” means that the algorithms and data used in a comparison model come from other studies; conversely, “this paper” indicates that the algorithms and data are all derived from this paper. The accuracy of MTSDeepNet comes from [28], while the BP neural network, SVM, DBN, and decision tree use the data extracted in this paper with algorithms from other studies. In the experiment, the 41 values obtained from the second shrinking of the time window were input into the different classification algorithms to obtain the above accuracies. The comparison results show that the proposed approach has certain advantages in terms of both the data and the prediction model.
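The comparison protocol, in which the same 41-value feature vectors are fed to each classifier and accuracy is measured on a shared test set, can be sketched as follows. The two toy classifiers below (nearest centroid and a majority-class baseline) merely stand in for the SVM, decision tree, and other models used in the paper, and the synthetic data is an assumption for illustration.

```python
# Shared-protocol classifier comparison on synthetic 41-dimensional features.
import random

random.seed(0)

def make_sample(label):
    # 41 features per sample, whose mean is shifted by the class label.
    return [random.gauss(label, 0.5) for _ in range(41)], label

train = [make_sample(i % 2) for i in range(100)]
test = [make_sample(i % 2) for i in range(40)]

def nearest_centroid(train, test):
    """Classify each test sample by the closest per-class mean feature vector."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append(x)
    centroids = {y: [sum(col) / len(xs) for col in zip(*xs)]
                 for y, xs in groups.items()}
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    correct = sum(min(centroids, key=lambda y: dist(x, centroids[y])) == t
                  for x, t in test)
    return correct / len(test)

def majority_class(train, test):
    """Baseline: always predict the most frequent training label."""
    labels = [y for _, y in train]
    maj = max(set(labels), key=labels.count)
    return sum(t == maj for _, t in test) / len(test)

results = {name: clf(train, test)
           for name, clf in [("nearest centroid", nearest_centroid),
                             ("majority class", majority_class)]}
print(results)
```

Holding the feature extraction fixed while swapping classifiers, as above, is what makes the accuracy comparison in Table 5 attributable to the models rather than to the data.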
5. Conclusion
In this paper, a Seq2Seq-FC neural network for the prediction of driver lane-changing behavior is introduced. The proposed model has two levels: the first level is a Seq2Seq network whose function is to process the time-series data, and the second level is a fully connected neural network that works as a nonlinear classifier. In the proposed prediction model, the vehicle kinematics data (VKA), the drivers’ kinematics data (DKA), and the drivers’ physiology data (DPA) are used as input data. In addition, a dynamic time window is proposed to extract the features of the lane-changing process, and time-window shrinking is applied successively to train the prediction model and improve the prospective time. Moreover, the model was tested on data different from those used for model training to evaluate its generalization ability. The test results showed that the proposed prediction model achieved good performance.
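The two-level cascade can be illustrated with a minimal forward-pass sketch: a recurrent encoder summarizes the feature window, and a fully connected softmax layer classifies it. The plain RNN cell, layer sizes, and random weights below are illustrative stand-ins, not the paper's actual Seq2Seq-FC configuration.

```python
# Minimal numpy sketch of the cascade: recurrent encoder -> FC classifier.
import numpy as np

rng = np.random.default_rng(0)

feat, hidden, classes = 6, 16, 2          # e.g., keep-lane vs. change-lane
Wx = rng.normal(0, 0.1, (hidden, feat))   # input-to-hidden weights
Wh = rng.normal(0, 0.1, (hidden, hidden)) # hidden-to-hidden weights
Wo = rng.normal(0, 0.1, (classes, hidden))  # FC classifier weights

def encode(sequence):
    """Level 1: plain RNN over the feature window (stand-in for Seq2Seq)."""
    h = np.zeros(hidden)
    for x in sequence:
        h = np.tanh(Wx @ x + Wh @ h)
    return h

def classify(h):
    """Level 2: fully connected softmax classifier over the encoding."""
    logits = Wo @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()

# A 40-step window of fused VKA/DKA/DPA features (synthetic here).
window = rng.normal(size=(40, feat))
probs = classify(encode(window))
print(probs.shape)
```

In the actual model the encoder is trained end to end with the classifier; this sketch only shows how the time series is reduced to a fixed-size vector before classification.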
In the data collection process, 35 drivers participated, and different routes were used for the training and test data. The collected data consisted of 10,000 lane-changing samples, of which 7000 were used for model training, and the rest were used for model validation. The validation results proved the effectiveness and stability of the proposed model. Moreover, the proposed Seq2Seq-FC model was compared with five common prediction models: the BP neural network, SVM, dynamic Bayesian network, decision tree, and MTSDeepNet. The comparison results showed that the Seq2Seq-FC network achieved higher prediction accuracy and a longer prospective time than the other models. The results presented in this study can help improve the practical effect of ADASs and enhance lane-changing safety.
In our future research, we will improve the proposed model in several aspects. First, many researchers have demonstrated that a driver’s decision to change a lane is also affected by the vehicle type and the driver’s driving skills [36, 37]; for instance, a car has different lane-changing characteristics than a bus. However, this study did not consider the vehicle type because vehicles of the same type were used; therefore, in our future work, we will take different types of vehicles into account. Second, although many different road conditions were included in the traffic route, some road types, such as rugged mountain roads and country roads, were not; they will be considered in our future research. Third, since the neural network models the fuzzy relationship between input and output, input data redundancy can arise, and the model calculation speed can be reduced; sensitivity analysis can be used to eliminate variables that have little effect on the model classification, which may make the model more optimized and concise.
Data Availability
The data used to support the findings of the study are available from the first author ChengWei ([email protected]) upon reasonable request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by National Key R&D Program of China (Grant no. 2018YFB1600604).