Abstract

An accurate prediction of future trajectories of surrounding vehicles can ensure safe and reasonable interaction between intelligent vehicles and other types of vehicles. Vehicle trajectories are not only constrained by a priori knowledge about road structure, traffic signs, and traffic rules but also affected by posterior knowledge about different driving styles of drivers. The existing prediction models cannot fully combine the prior and posterior knowledge in the driving scene and perform well only in a specific traffic scenario. This paper presents a long short-term memory (LSTM) neural network driven by knowledge. First, a driving knowledge base is constructed to describe the prior knowledge about a driving scenario. Then, the prediction reference baseline (PRB) based on driving knowledge base is determined by using the rule-based online reasoning system. Finally, the future trajectory of the target vehicle is predicted by an LSTM neural network based on the prediction reference baseline, while the predicted trajectory considers both posterior and prior knowledge without increasing the computation complexity. The experimental results show that the proposed trajectory prediction model can adapt to different driving scenarios and predict trajectories with high accuracy due to the unique combination of the prior and posterior knowledge in the driving scene.

1. Introduction

Since the 1980s, autonomous vehicles have been regarded as effective solutions to the problems of road safety, traffic congestion, and energy crisis. However, autonomous vehicles still face many driving difficulties in the real urban traffic environment. A major problem is how to interact safely and reasonably with other types of vehicles in a driving scene. Experienced human drivers can predict the future trajectory of other vehicles in a driving scene, thereby making safe, reasonable, and efficient decisions. Accurately predicting the future trajectory of a vehicle not only can reduce or eliminate the collision risk when autonomous vehicles perform complex driving maneuvers, such as merge, lane change, and overtaking, but also can improve the driving efficiency and comfort of autonomous vehicles [1]. In a real urban traffic scenario, the vehicle’s driving trajectory is not only constrained by prior knowledge, such as that about the road structure, traffic signs, and traffic rules, but also by uncertain posterior knowledge, including subjective driving intentions of the driver. The influence of driving knowledge on vehicle trajectory is shown in Figure 1, where it can be seen that when the road structure constraints are not considered, the predicted future trajectory, denoted as the red curve, is incorrect. As shown in Figure 1(b), there is a large slow-moving truck in front of the target vehicle. In such a case, based on human driving experience, the target vehicle is likely to adopt a lane change strategy. Therefore, how to fully combine the prior and posterior knowledge in a driving scene in the prediction process is crucial for accuracy improvement of the long-term trajectory prediction and safe interaction with other vehicles.

According to the specific prediction process, the existing prediction models can be roughly divided into three categories: physics-based models, maneuver-based models, and learning-based models [2]. The physics-based models use vehicle kinematic and dynamic models to predict the future position of a target vehicle, and they include the constant turn rate and acceleration model [3], switching Kalman filters [4], and Monte Carlo simulation [5]. However, these models ignore the prior and posterior knowledge about a driving scenario, such as road structure, traffic rules, and the driver’s subjective intentions, which limits them to short-term prediction (less than 1 s) [6].

Maneuver-based models divide the prediction process into two parts. First, the driving intention is estimated according to the physical state of a vehicle, information about the road network, and driver behavior; then, the predicted trajectory is fitted based on the driving intention. For maneuver classification in more complex scenarios, discriminative learning algorithms, including multilayer perceptrons (MLPs) [7], logistic regression [8], relevance vector machines (RVMs) [9], and support vector machines (SVMs) [10], have been very popular. Complex vehicle motion is decomposed into predefined driving action sequences, which makes the driving intention easier to identify and classify. As a result, the prediction is more stable and accurate than that of the physics-based models, and the prediction horizon is longer. However, in complex traffic scenarios, the traditional algorithms, such as finite vector machines and conditional random fields, have the problem of low scene adaptability, while Bayesian networks and Markov models can solve the problem of driving maneuver classification in uncertain environments. In addition, the state space of the above models is extremely large, so these models are prone to the “curse of dimensionality” and are unable to perform real-time prediction. The approach of [11] predicts the fictive collision probabilities stemming from the execution of each intention by updating a prior intention distribution based on the postulation that drivers do not perform maneuvers with high collision risks, but this assumption prevents the detection of specific dangerous maneuvers that are conducive to driving efficiency. Recently, artificial neural networks have been used to classify vehicle driving actions, but the existing high-quality calibrated datasets are limited and cannot cover all possible driving scenarios (data sparsity) [12], which makes network training difficult and results in low scene adaptability.

Learning-based models skip the step of maneuver recognition and perform trajectory prediction directly based on the historical observation of a target vehicle, so the posterior knowledge in driving scenarios can be effectively learned, and incorrect driving motion recognition can be avoided. Recently, artificial neural networks have been used to predict future trajectories of vehicles, bicycles, and pedestrians [13-15]. As a type of recurrent neural network (RNN), the long short-term memory (LSTM) neural network has been proven to be very effective in solving time-series problems, and thus has been widely used in pedestrian trajectory prediction, intersection vehicle destination prediction, and highway vehicle trajectory prediction. However, in previous works, specific-scenario models, such as lane change models for nonintersection sections and left-/right-turn models for intersection areas, have been proposed [10, 16, 17], and the training data needed manual annotation, which increased the training difficulty of the model. In [18], an encoder-decoder LSTM model is proposed for predicting vehicle trajectory by using an occupancy grid map, and the maximum prediction horizon of this model is two seconds, which is not sufficient for applications.

When an intelligent vehicle is driving in a real urban environment, the driving scene changes dynamically over time, which means that the prediction model should automatically adapt to a driving scene. In order to solve the problem of adaptability to the driving scene, many studies have combined the maneuver-based and learning-based models. In [19], two LSTMs were used to identify high-level driver intentions and analyze low-level complex vehicle motion dynamics. This method adapts better to different geographic locations than the traditional LSTM networks. An LSTM model for interaction-aware motion prediction of surrounding vehicles on freeways was presented in [20]. This model assigns confidence values to maneuvers being performed by vehicles and outputs a multimodal distribution over future motion based on these values. The mentioned methods predict the multimodal trajectory based on maneuver classes, which improves the road adaptability, but the prior knowledge in driving scenarios is not used. In [21], an LSTM network was employed to anticipate the driving policy of a vehicle (such as forward, yield, turn left, and turn right) using its sequential history observations. The policy was then used to guide a low-level optimization-based context reasoning process. This method incorporates the prior knowledge in the driving scene and constructs a cost map to perform a second optimization of the previously obtained driving intention to generate the final predicted trajectory, but the upper-level driving intention estimation does not utilize the prior knowledge of the driving scene, and the weights of the cost function cannot be adjusted adaptively to a driving scenario. Deo and Trivedi [22] adopt a convolutional social pooling LSTM-based model. This approach predicts a maneuver-dependent distribution of the future vehicle trajectory, but it ignores the impact of the interaction of the road users. Dai et al. [23] proposed a spatiotemporal LSTM-based model, which considers the spatial interactions of the surrounding vehicles, but the constraints of other prior knowledge such as road structure, traffic rules, and driving experience are not considered. The dual learning model (DLM), which takes information from two different inputs to predict vehicle trajectory, was presented in [24]. This model embeds the occupancy map and risk map into the trajectory model to consider a comprehensive definition of risk in the traffic scene, but the computational complexity usually grows exponentially as the dimensionality of the feature space increases. Thus, it becomes difficult to meet the online requirement.

In this article, an integrated trajectory prediction model, which combines knowledge reasoning and LSTM neural networks, is proposed. The contributions of this study can be summarized as follows:
(1) In order to consider the constraints of the prior knowledge, the prediction reference baseline obtained by knowledge reasoning is introduced into the LSTM network, so the proposed model can effectively incorporate the prior knowledge without increasing the computational complexity.
(2) In order to learn the spatial interactions of the surrounding vehicles and solve the combinatorial explosion problem caused by a large number of condition attributes, a method of deterministic scene evaluation is employed to classify and analyze the main conditions that affect the future trajectory of a vehicle from the perspectives of safety, legitimacy, and reasonableness, which simplifies modeling of the spatial interactions.
(3) In order to improve the adaptability of the proposed model, the Frenet coordinates based on the PRB are used to train the LSTM network, and it is not necessary to annotate the training dataset manually according to the specific driving scenario. The results of the field test demonstrate the adaptive performance of the proposed model.
(4) The performance of the proposed model is compared with state-of-the-art methods on a naturalistic highway driving dataset (NGSIM), and the results show that the proposed model outperforms the state-of-the-art methods.

The rest of the paper is organized as follows. The prediction reference baseline determination method and the proposed LSTM network are presented in Section 2. The proposed prediction model is evaluated by both simulations and real-traffic urban roadway experiments, and the obtained results are presented and discussed in Section 3. Finally, the main conclusions, limitations, and future work are presented in Section 4.

2. Materials and Methods

2.1. Problem Formulation and Method Overview
2.1.1. Problem Formulation

The proposed trajectory prediction model is divided into two layers. The first layer determines the PRB of a target vehicle, and the second layer predicts the future trajectory based on the PRB. PRB is a trajectory that indicates the driving intention of the target vehicle based on prior driving knowledge, which connects the online reasoning system and the LSTM network.

The process of driving intention prediction for a target vehicle at time $t$ is presented in Figure 2. First, it is necessary to understand and evaluate the driving scene $S_t$ and generate the scene evaluation parameters $E_t$ of the target vehicle, including the safety assessment $E_t^{\mathrm{s}}$, legitimacy assessment $E_t^{\mathrm{l}}$, and reasonableness assessment $E_t^{\mathrm{r}}$, which is expressed as

$$E_t = \{E_t^{\mathrm{s}},\; E_t^{\mathrm{l}},\; E_t^{\mathrm{r}}\}.$$

According to the prior knowledge of driving scenarios, such as traffic rules and driving experience, the driving intention of the target vehicle is inferred based on the Prolog online reasoning system. A maneuver is classified by the lateral movement of the vehicle, which is expressed by a finite set $\mathcal{M}$:

$$\mathcal{M} = \{\mathrm{LK},\; \mathrm{LCL},\; \mathrm{LCR},\; \mathrm{TR},\; \mathrm{TL},\; \mathrm{GS},\; \mathrm{SS}\}.$$

The finite set $\mathcal{M}$ includes the following maneuvers: lane keeping (LK), lane change to left (LCL), lane change to right (LCR), turn right (TR), turn left (TL), go straight at intersection (GS), and stop before the stop line (SS).

Finally, driving intention is fitted to the PRB by the cubic Bezier curves.

The second layer predicts the future vehicle trajectory. First, the coordinate transformation is performed on the historical trajectory of the target vehicle based on the PRB, as shown in Figure 3. In Figure 3, $s$ denotes the distance the target vehicle has traveled along the PRB, and $l$ is the transverse distance between the target vehicle and the PRB. The absolute position, denoted as $(x, y)$, is transformed to the Frenet coordinates, denoted as $(s, l)$. The set of observation vectors, denoted as $X$, is used for trajectory prediction of the target vehicle, and it is given by

$$X = \{\mathbf{x}_{t-M+1}, \ldots, \mathbf{x}_{t}\}, \qquad \mathbf{x}_i = (s_i,\; l_i,\; \kappa_i,\; \theta_i,\; v_i,\; a_i),$$

where $(s_i, l_i)$ denotes the Frenet coordinates, $\kappa_i$ denotes the curvature, $\theta_i$ represents the vehicle heading, $v_i$ denotes the vehicle speed, $a_i$ denotes the vehicle acceleration, and M is the input step of the network.

The network output is expressed as

$$Y = \{\mathbf{y}_{t+1}, \ldots, \mathbf{y}_{t+K}\}, \qquad \mathbf{y}_j = (s_j,\; l_j,\; v_j),$$

where K denotes the output step of the network.

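Finally, the predicted trajectory, denoted as $\hat{T}$, is obtained by transforming the reference coordinates to the absolute coordinates represented by the latitude and longitude, which is expressed as

$$\hat{T} = \{(\hat{x}_{t+k}, \hat{y}_{t+k})\}_{k=1}^{K}, \qquad (\hat{x}_{t+k}, \hat{y}_{t+k}) = \mathcal{F}^{-1}(\hat{s}_{t+k}, \hat{l}_{t+k}),$$

where $(\hat{s}_{t+k}, \hat{l}_{t+k})$ are the predicted Frenet coordinates and $\mathcal{F}^{-1}$ denotes the inverse of the PRB-based Frenet transformation.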
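The input and output steps M and K described above can be illustrated with a minimal sketch of how (observation, target) pairs might be cut from one Frenet-frame track. The function name `make_windows`, the six-element feature layout, and the toy values of M and K are illustrative assumptions, not details given in the paper.

```python
from typing import List, Sequence, Tuple

# Assumed sample layout: (s, l, curvature, heading, speed, acceleration).
Sample = Sequence[float]
Target = Tuple[float, float, float]  # assumed target layout: (s, l, speed)


def make_windows(track: List[Sample], m: int, k: int) -> List[Tuple[List[tuple], List[Target]]]:
    """Slice one track into (input, target) pairs.

    Input: the m most recent samples before step t.
    Target: (s, l, speed) for the following k steps.
    """
    pairs = []
    for t in range(m, len(track) - k + 1):
        history = [tuple(x) for x in track[t - m:t]]
        future = [(x[0], x[1], x[4]) for x in track[t:t + k]]
        pairs.append((history, future))
    return pairs


# Toy track: constant 10 m/s along the baseline with no lateral offset.
track = [(1.0 * i, 0.0, 0.0, 0.0, 10.0, 0.0) for i in range(10)]
pairs = make_windows(track, m=3, k=2)
```

Each pair holds M past observation vectors and K future targets, so a track of N samples yields N - M - K + 1 training pairs.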

2.1.2. Overview of the Proposed Approach

This paper proposes a trajectory prediction model based on knowledge reasoning and LSTM neural network. The architecture of the proposed model is shown in Figure 4, where it can be seen that the proposed model consists of two phases: PRB determination phase and trajectory prediction phase. During the PRB determination phase, by analyzing the relationship between “human-vehicle-road” in the driving scene and extracting the knowledge of road network, traffic participants, and road traffic facilities, the conceptual ontology model of the driving scene is established. The main conditional attributes that affect the behavior decision-making process are classified and analyzed from the perspectives of safety, legitimacy, and reasonableness using the proposed deterministic situation assessment method, and situation parameters in the horizontal and vertical directions are obtained. The behavior prediction rule base is constructed using the situation parameters, traffic rules, and driving experience. Based on the prolog online reasoning system, the behavioral prediction rules are matched with the factual knowledge obtained by the conceptual ontology model, and the driving intentions are inferred. Finally, a third-order Bezier curve is used to fit the driving intention to a PRB. The trajectory prediction phase uses the LSTM network to learn the continuous features of the historical trajectory of a target vehicle on the basis of the PRB and generates the final predicted trajectory.

2.2. Prediction Reference Baseline Determination

The architecture of the proposed PRB determination method is presented in Figure 5, where it can be seen that this method consists of online and offline phases. The offline phase establishes the conceptual ontology model (Tbox) of a driving scene and extracts the behavioral prediction rules based on traffic rules and driving experience. According to the conceptual ontology model, the road network and real-time environment perception information are used to instantiate the entities and related relationships in the driving scene (Abox). The entities and entity relationships in the driving scene are classified by the deterministic scene assessment method and analyzed from the perspectives of safety, legitimacy, and reasonableness. The scene evaluation parameters in both horizontal and vertical directions are generated, and the behavioral prediction rules are matched with the scene evaluation parameters by using the prolog online reasoning system. The driving intentions are inferred, and finally, the third-order Bezier curve is used to fit the driving intention to the PRB.

2.2.1. Semantic Modeling of Driving Scene

In an urban driving scenario, there are various road element entities, such as traffic participants, road networks, and road traffic facilities. The environment perception system can provide only the spatial location of each entity; it can neither describe the correlation between entities nor make full use of prior information, such as traffic rules and driving experience, which is crucial for improving the adaptability of the prediction model to the driving scene. Ontology, as a form of knowledge expression, is used to model the concepts of specific domains and the relationships between concepts, so it can be used to model driving scenarios effectively [25-27].

The conceptual ontology model is divided into two module types: entities and attributes. This study takes the target vehicle as a perspective and summarizes five entity types on the basis of [28]:
(1) Target vehicle. The target vehicle entity describes the vehicle to be predicted.
(2) Behavior. The behavior entity is a collection of driving maneuvers of a vehicle. Three behavior types are designed: LongtiBehavior, LatiBehavior, and AdvancedBehavior. The LongtiBehavior represents basic longitudinal driving behavior and includes four behaviors: accelerate, decelerate, keep, and stop. The LatiBehavior represents basic lateral driving behavior and includes three behaviors: ChangeToLeft, ChangeToRight, and KeepLane. The AdvancedBehavior represents advanced driving behavior and includes two behaviors: Overtake and Merge.
(3) Obstacle. The obstacle entity represents a collection of obstacle entities encountered by a vehicle during driving. This work divides obstacles into two categories according to their behavior characteristics in driving scenarios: StaticObstacle and DynamicObstacle.
(4) Road network. The road network entity represents the topological connection of roads by intersecting points and lines. RoadType includes different road types. RoadPart describes the components of the road network and is divided into AreaEntities and PointEntities. AreaEntities refers to road entities that can be abstracted into lines and areas, such as lane, sidewalk, junction, and segment, while PointEntities refers to road entities that can be abstracted into points, such as road signs, traffic signs, and traffic lights.
(5) Driving scenario. The driving scene entity refers to a collection of road entity elements encountered when a vehicle travels in different road areas. In this work, driving scenarios are divided into three categories: InSpecialAreascenario (special area driving scenario), OnRoadscenario (road driving scenario), and NearSpecialAreascenario (near special region driving scenario). The InSpecialAreascenario category can be further divided into IntersectionScenario (intersection scene), TunnelScenario (tunnel scene), BridgeScenario (elevated scene), and UturnScenario (U-turn scene).

The object attribute is used to describe the relationship between concept classes. This attribute restricts the described relationship regarding the domain and range. The data attribute restricts the described relationship through the definition and value domains. The definition domain is a class type.

The described ontology modeling process of driving scenario is equivalent to filling the background knowledge of the TBox that constitutes the ontology knowledge base, but the situational knowledge in the ABox is still lacking. According to the road elements of a real driving scenario, the driving scenario needs to be re-expressed using the conceptual model of the TBox, which is an instantiation of the ontology model. A real-traffic scenario is displayed in Figure 6(a); a concrete driving scene that includes instances of defined classes is presented in Figure 6(b), and its semantic description is presented in Figure 6(c). The instances of RoadNetwork are added to the ABox as prior knowledge, and instances of the obstacle are asserted in real time.

2.2.2. Situation Assessment

After obtaining a semantic description of a driving scene, it is necessary to determine and evaluate condition attributes that affect the driving intention in the driving scene, so as to estimate the driving intention of a target vehicle. In order to solve the problem of combinatorial explosion due to numerous condition attributes [28], a deterministic scenario assessment method is adopted to classify and analyze the key attributes that affect driving intentions from the perspectives of safety, legitimacy, and reasonableness.

Deterministic scenario assessment methods use threat assessment indicators, such as TTC (time to collision), THW (time headway), TTB (time to brake), DST (deceleration to safety time), and MSM (minimal safety margin), in rule-based systems, and the probability of collision is estimated as a binary value. For instance, Glaser et al. [29] used the TTC and TIV (intervehicular time) indicators to evaluate the possibility of collision. Noh et al. [30] proposed a distributed reasoning method by dividing the current and adjacent lanes into the front and rear areas, and the TTB and MSM indicators were used to evaluate the possibility of collision in the front area, while the TTC and MSM indicators were used to evaluate the possibility of collision in the rear area.

The proposed deterministic scenario assessment method consists of two parts. First, the driving scenario is determined by querying the knowledge base with the current vehicle position. Then, a reasoning structure of an obstacle is constructed in eight regions of interest to make safety assessment, and a binary result (safe or dangerous) is calculated for each region using critical indicators TTC and TIV. Finally, legitimacy and reasonableness assessments are made to predict the maneuver of the target vehicle.

(1) Safety Assessment. Safety primarily refers to whether the surrounding obstacles pose a threat to a vehicle, especially in the area ahead, but it also refers to whether the left or right lane can provide a safe lane change. This paper constructs an eight-direction obstacle inference model. For each area, the TTC and TIV indicators are used for safety assessment. The TTC indicator is the time remaining until two vehicles collide if they continue on the same trajectory at their current speeds, and it is defined by

$$TTC = \frac{d_r}{v_f - v_l},$$

where $d_r$ denotes the relative distance between the following vehicle and the followed vehicle, $v_f$ denotes the speed of the following vehicle, and $v_l$ is the speed of the followed vehicle.

The threshold value $TTC_{th}$ is used to judge whether a vehicle is dangerous in high-speed scenarios. The risk assessment formula is as follows:

$$R_{TTC} = \begin{cases} 0, & TTC > TTC_{th}, \\ 1, & \text{otherwise}. \end{cases}$$

When the calculated collision time between the following vehicle and the followed vehicle is greater than $TTC_{th}$, the current scene is considered to be safe; otherwise, it is considered to be dangerous.

The TIV indicator is used to detect low-speed difference scenarios. When the speeds of two vehicles are similar in value, the TIV indicator is used to judge the degree of danger, and it is calculated by

$$TIV = \frac{d_r}{v_f}.$$

The threshold $TIV_{th}$ is used to distinguish between safe and dangerous scenes when the inter-vehicle distance is small. The TIV risk assessment formula is as follows:

$$R_{TIV} = \begin{cases} 0, & TIV > TIV_{th}, \\ 1, & \text{otherwise}. \end{cases}$$

When the calculated intervehicular time between the following vehicle and the followed vehicle is greater than $TIV_{th}$, the current scene is considered to be safe; otherwise, it is considered to be dangerous, and in that case, the following car needs to perform a certain action to avoid a possible collision.

A region is considered to be safe only when both $R_{TTC}$ and $R_{TIV}$ indicate safety at the same time. The risk assessment of a region is determined as

$$R_i = R_{TTC,i} \lor R_{TIV,i},$$

where $i$ represents one of the eight regions, as shown in Figure 7; when $R_i$ has a value of zero, the area is considered to be safe, and when $R_i$ has a value of one, the area is considered to be dangerous.
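The per-region safety check can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the handling of a non-closing gap (treated as an infinite TTC), and the default threshold values of 3.0 s and 1.0 s are hypothetical and are not taken from the paper.

```python
import math


def ttc(gap: float, v_follow: float, v_lead: float) -> float:
    """Time to collision; infinite when the follower is not closing in."""
    closing_speed = v_follow - v_lead
    return gap / closing_speed if closing_speed > 0 else math.inf


def tiv(gap: float, v_follow: float) -> float:
    """Intervehicular time; infinite for a stationary follower."""
    return gap / v_follow if v_follow > 0 else math.inf


def region_risk(gap: float, v_follow: float, v_lead: float,
                ttc_th: float = 3.0, tiv_th: float = 1.0) -> int:
    """Return 0 (safe) only if both indicators exceed their thresholds, else 1."""
    safe = ttc(gap, v_follow, v_lead) > ttc_th and tiv(gap, v_follow) > tiv_th
    return 0 if safe else 1


# Large gap, moderate closing speed: both indicators are above threshold.
risk_far = region_risk(gap=40.0, v_follow=20.0, v_lead=15.0)
# Equal speeds but a short gap: TTC is infinite, yet TIV flags the danger.
risk_close = region_risk(gap=15.0, v_follow=20.0, v_lead=20.0)
```

The second call shows why the two indicators are combined: with equal speeds the TTC alone never reports danger, whereas the TIV still reacts to a short gap.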

The degree of danger in the area ahead is calculated byand when the calculated values of the TTC and TIV indicators in the current area are greater than the predefined threshold, the following result message is obtained: safeToGo (targetVehicle, keep); when the calculated values of TTC and TIV indicators in the current area are both less than the predefined deceleration threshold, but greater than the corresponding parking threshold, the result message is safeToGo (targetVehicle, dec); in this case, a lane change can be performed to improve driving efficiency; otherwise, the result message is safeToGo (targetVehicle, stop).

The safety assessment of adjacent lanes is conducted using the same assessment formula as that of the area ahead. If the left lane is taken as an example, then the degree of danger is expressed asand when there are vehicles in the left area, the left lane can be considered to be dangerous; otherwise, the safety of the left front and left rear areas are, respectively, evaluated by

Therefore, the safety assessment of the left lane is as follows:

If the left lane is safe, will be generated; otherwise, the scene evaluation parameters will be instantiated as .

(2) Legitimacy Assessment. The legitimacy assessment includes three assumptions. First, when a vehicle is driving on the road, it cannot exceed the maximum speed limit of the road; second, when the vehicle is approaching an intersection, it must observe the traffic lights and obey the traffic rules; third, when a vehicle is about to change lanes, lane changes into the adjacent lane must be permitted.

(3) Reasonableness Assessment. Reasonableness assessment generally refers to whether lane changing and other driving behaviors affect the current goal of a target vehicle. Based on the current lane of the target vehicle, the specific road section or lane to be driven can be known. For instance, if the next area to be driven by the target vehicle is an intersection and the distance between the target vehicle and the stop line is less than a predefined threshold distance, then a lane change is not recommended. Conversely, if the distance between the target vehicle and the stop line is greater than this threshold, a lane change can be performed. Reasonableness assessment introduces a situation parameter set, which is defined as data properties in the ontology model.

2.2.3. Rule-Based Reasoning

The driving intention is determined based on traffic rules and driving experience. The traffic rules are mainly used to limit the driving behavior, while the driving experience summarizes the understanding and cognition of human drivers in different scenes and yields rules that are not specific traffic rules but are conducive to reasonable driving. According to the different driving scenarios defined in the driving knowledge base, the rules stored in the driving knowledge base are divided into several categories. Different scenarios have different key road entities. For instance, unlike OnRoadscenario, in IntersectionScenario, traffic lights are considered. Also, the classification of traffic rules can reduce the rule search space and reasoning time.

In order to save computing resources and reduce reasoning time, SWI-Prolog language is used to write rule knowledge, which is represented as a set of driving scene-driving behavior mapping pairs, where driving behavior is described as a rule head, and the driving scene is described as a rule body. On the basis of [28], this paper adds the scene assessment as an intermediate link of mapping driving scene to the driving behavior and reorganizes 57 rules. Some of the prediction rules are presented in Table 1.

The online reasoning process can be described as follows. First, the real-time facts related to the scene are used as input, and each rule statement is matched. If all the facts of the corresponding rule are matched, the matched prediction result will be obtained, and the next rule statement will be matched until each rule is matched. When all matched results are obtained, the final result denotes the predicted driving intention.
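The matching loop described above can be sketched in a few lines. The paper writes its rules in SWI-Prolog; the Python version below is a stand-in that only illustrates the matching logic, and the three rules and the fact names are hypothetical examples rather than entries from Table 1.

```python
from typing import FrozenSet, List, Set, Tuple

# Each rule pairs a head (predicted behavior) with a body (facts that must all hold).
Rule = Tuple[str, FrozenSet[str]]

# Hypothetical rules for illustration only.
RULES: List[Rule] = [
    ("LCL", frozenset({"onRoad", "front:dec", "left:safe", "left:legal", "laneChange:reasonable"})),
    ("LK", frozenset({"onRoad", "front:keep"})),
    ("SS", frozenset({"nearIntersection", "light:red"})),
]


def infer(facts: Set[str], rules: List[Rule] = RULES) -> List[str]:
    """Return the heads of all rules whose body is fully contained in the facts."""
    return [head for head, body in rules if body <= facts]


# Slow vehicle ahead, left lane safe, legal, and reasonable: predict a left lane change.
lane_change = infer({"onRoad", "front:dec", "left:safe", "left:legal", "laneChange:reasonable"})
# Red light near an intersection: predict a stop before the stop line.
stop = infer({"nearIntersection", "light:red", "front:keep"})
```

Grouping the rules by scenario, as the text describes, would amount to passing only the relevant subset as `rules`, which shrinks the search space.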

2.2.4. Prediction Reference Baseline Fitting

After obtaining the driving intention of the target vehicle, the driving intention is converted into the prediction reference baseline using a cubic Bezier curve. As shown in Figure 8, first, a target lane is selected based on the driving intention and road network, where $P_0$ represents the current position of the target vehicle, and $P_3$ is selected from the centerline of the target lane at distance $D$ from $P_0$, where $D$ is obtained based on the driving experience. The prediction reference baseline is divided into three parts by $P_0$ and $P_3$: the predicted extension, the historical extension, and the intention segment. In addition, the lengths of the historical and predicted extensions are determined by the input and output steps of the LSTM neural network, respectively.

As shown in Figure 8(c), the cubic Bezier curve constructed by four control points $P_0$, $P_1$, $P_2$, and $P_3$ is used to generate the intention segment, which is expressed as

$$P(u) = \sum_{i=0}^{3} B_{i,3}(u)\, P_i, \qquad u \in [0, 1], \tag{15}$$

where $B_{i,3}(u)$ is the Bernstein polynomial and it is given by

$$B_{i,3}(u) = \binom{3}{i} u^{i} (1 - u)^{3 - i}. \tag{16}$$

The coordinate system ($x$, $y$) is built with the origin at the vehicle center. The x-axis direction is the vehicle’s initial heading, the terminal state will be the end point $P_3$, and $P_1$ and $P_2$ are obtained by moving forward for distance d along the vehicle’s initial heading direction from the start point $P_0$ and backward for distance d along the terminal heading from the end point $P_3$, respectively. The position of the control points in the above coordinate system is expressed as

$$P_0 = (0, 0), \quad P_1 = (d, 0), \quad P_2 = (\Delta x - d\cos\theta,\; \Delta y - d\sin\theta), \quad P_3 = (\Delta x, \Delta y), \tag{17}$$

where $\Delta y$ and $\Delta x$ are the lateral and longitudinal offsets of the terminal state to $P_0$, respectively; $\theta$ is the angle between the terminal heading and the direction of the x-axis. The terminal heading is defined as the tangential direction of the closest point on the PRB to $P_3$.

Equations (15) and (16) can be rewritten by applying (17), so the Bezier curve can be represented as

$$x(u) = 3u(1-u)^2 d + 3u^2(1-u)(\Delta x - d\cos\theta) + u^3 \Delta x,$$
$$y(u) = 3u^2(1-u)(\Delta y - d\sin\theta) + u^3 \Delta y.$$

Besides, the curvature of the generated path can be derived by applying (19) and (20), which leads to

$$\kappa(u) = \frac{x'(u)\, y''(u) - y'(u)\, x''(u)}{\left(x'(u)^2 + y'(u)^2\right)^{3/2}}.$$

The maximum of the curvature should satisfy the condition given by equation (21) to meet the vehicle’s nonholonomic constraint:

$$\max_{u \in [0, 1]} |\kappa(u)| \le \frac{\tan \delta_{\max}}{L}. \tag{21}$$

In equation (21), L denotes the vehicle wheelbase and $\delta_{\max}$ denotes the maximum steering angle of the vehicle.

The maximum of the curvature is a function of d. A suitable value of d that satisfies the vehicle’s nonholonomic constraint can be found by a brute-force search over candidate values bounded by the distance between the start point and the end point. The processing time can be reduced by building a look-up table that matches a given set of terminal-state parameters with the corresponding maximum curvature of the Bezier curve.
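The search for d can be sketched as follows, assuming the standard cubic Bezier derivatives and the usual bicycle-model bound tan(max steering angle) / wheelbase on the curvature. The function names, the 101 curvature samples, the 50 search steps, and the search range (0, distance between start and end point] are illustrative assumptions, not values from the paper.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def control_points(dx: float, dy: float, theta: float, d: float) -> List[Point]:
    """Control points in the vehicle frame (x-axis along the initial heading)."""
    return [(0.0, 0.0), (d, 0.0),
            (dx - d * math.cos(theta), dy - d * math.sin(theta)), (dx, dy)]


def curvature(p: List[Point], u: float) -> float:
    """Signed curvature of the cubic Bezier curve at parameter u in [0, 1]."""
    w1 = (3 * (1 - u) ** 2, 6 * u * (1 - u), 3 * u ** 2)  # first-derivative weights
    w2 = (6 * (1 - u), 6 * u)                              # second-derivative weights
    d1 = [sum(w * (p[i + 1][k] - p[i][k]) for i, w in enumerate(w1)) for k in (0, 1)]
    d2 = [sum(w * (p[i + 2][k] - 2 * p[i + 1][k] + p[i][k]) for i, w in enumerate(w2))
          for k in (0, 1)]
    speed = math.hypot(d1[0], d1[1])
    if speed == 0.0:
        return math.inf  # degenerate point (cusp): treat as infeasible
    return (d1[0] * d2[1] - d1[1] * d2[0]) / speed ** 3


def max_abs_curvature(p: List[Point], samples: int = 101) -> float:
    """Largest absolute curvature over evenly spaced parameter samples."""
    return max(abs(curvature(p, i / (samples - 1))) for i in range(samples))


def find_d(dx: float, dy: float, theta: float, wheelbase: float,
           max_steer: float, steps: int = 50) -> Optional[float]:
    """Brute-force search for the smallest sampled d that meets the curvature bound."""
    limit = math.tan(max_steer) / wheelbase
    span = math.hypot(dx, dy)
    for j in range(1, steps + 1):
        d = span * j / steps
        if max_abs_curvature(control_points(dx, dy, theta, d)) <= limit:
            return d
    return None


# Lane change: 30 m ahead, 3.5 m to the left, terminal heading parallel to the start.
d_lane_change = find_d(30.0, 3.5, 0.0, wheelbase=2.7, max_steer=0.5)
```

A look-up table, as mentioned in the text, would cache the result of `max_abs_curvature` for a grid of terminal states so that the inner sampling loop does not run online.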

2.3. LSTM Network Driven by Knowledge

Since different drivers have different driving styles, in order to accurately predict the future trajectory of a vehicle, in this work, an LSTM neural network is employed to learn the continuous features of the historical trajectory. The LSTM is an RNN type that can effectively overcome the vanishing gradient problem [31]. The LSTM is composed of a memory cell that stores the previous input sequence information and a gating mechanism that controls the information flow between input, output, and the memory cell. There are three gates in the core design of the LSTM network, namely, the input gate, the forget gate, and the output gate. The specific network structure is shown in Figure 9. The forget gate is used to control how much information of $c_{t-1}$ is retained in $c_t$. The input gate determines how much information of $x_t$ remains in $c_t$, and finally, the output gate determines how much information of the memory cell $c_t$ is output to $h_t$. The work of the LSTM is described by the following recursive equations:

$$\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f), \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i), \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o), \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \\
h_t &= o_t \odot \tanh(c_t),
\end{aligned}$$

where $x_t$ denotes the input vector, $\sigma$ denotes the activation function, $W$ denotes the linear transformation matrix, $b$ denotes the offset vector, $f_t$, $i_t$, and $o_t$ are gate vectors, $c_t$ represents the amount of cell memory, and lastly, $h_t$ denotes the output.
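To make the gate interactions concrete, the following is a minimal pure-Python sketch of one step of the standard LSTM recursion (forget, input, and output gates acting on the concatenation of the previous output and the current input). The function names and the parameter dictionary layout are assumptions made for illustration; a practical model would use a deep learning framework instead.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[Vector]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(w: Matrix, b: Vector, v: Vector) -> Vector:
    """Compute W v + b for a list-of-rows matrix."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi for row, bi in zip(w, b)]


def lstm_step(x: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict[str, Tuple[Matrix, Vector]]) -> Tuple[Vector, Vector]:
    """One LSTM step; params maps 'f', 'i', 'o', 'c' to (W, b) acting on [h_prev, x]."""
    z = h_prev + x  # list concatenation [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(*params["f"], z)]          # forget gate
    i = [sigmoid(a) for a in affine(*params["i"], z)]          # input gate
    o = [sigmoid(a) for a in affine(*params["o"], z)]          # output gate
    c_tilde = [math.tanh(a) for a in affine(*params["c"], z)]  # candidate memory
    c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]
    h = [ot * math.tanh(ci) for ot, ci in zip(o, c)]
    return h, c


# With all-zero weights every gate equals 0.5 and the candidate memory is 0,
# so the cell state is simply halved.
zero_params = {k: ([[0.0, 0.0]], [0.0]) for k in "fioc"}
h, c = lstm_step([1.0], [0.0], [2.0], zero_params)
```

The zero-weight case is a convenient sanity check: the new cell state is half the previous one, and the output is half of its hyperbolic tangent.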

In this work, the network presented in Figure 10 is used as a reference structure. This network has two layers consisting of 256 LSTM cells, followed by one time-distributed layer consisting of 128 neurons, and the final dense output layer containing as many cells as the number of outputs. The network input is a tensor of track histories of a vehicle. The network output consists of the future coordinates and velocity of the vehicle. Since the prior knowledge about the driving scene is expressed by prediction reference baseline, the network can learn the posterior knowledge about the driving scene only from the relative relationship between the historical trajectory and the prediction reference baseline. Compared with the existing prediction models based on the LSTM network, the proposed prediction model reduces the network training difficulty and decreases demand for the computing performance of the vehicle platform.
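To give a rough sense of the model size implied by this structure, the sketch below counts trainable parameters with the standard formulas for LSTM and dense layers. Several values are assumptions made for illustration only: 256 units in each of the two LSTM layers, six input features per time step (the two Frenet coordinates plus curvature, velocity, acceleration, and heading described in Section 3.1.2), and three outputs per time step (two coordinates and velocity).

```python
def lstm_params(n_in, n_units):
    """Trainable weights of one LSTM layer: four gate blocks, each with an
    input matrix, a recurrent matrix, and a bias vector."""
    return 4 * (n_in * n_units + n_units * n_units + n_units)

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def network_params(n_features, n_outputs, lstm_units=256, dense_units=128):
    """Two stacked LSTM layers, one time-distributed dense layer, one output layer."""
    total = lstm_params(n_features, lstm_units)
    total += lstm_params(lstm_units, lstm_units)
    total += dense_params(lstm_units, dense_units)
    total += dense_params(dense_units, n_outputs)
    return total

# Hypothetical configuration: 6 input features, 3 outputs per time step.
print(network_params(n_features=6, n_outputs=3))
```

Under these assumptions, the input dimension enters only through the first LSTM layer, so a compact observation vector keeps that layer small.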

3. Results and Discussion

3.1. Data Preparation and Model Training
3.1.1. The Training Dataset

The Next Generation Simulation (NGSIM) dataset for the I-80 and US101 sections is used for model training and testing [32]. This dataset, provided by the US Federal Highway Administration, is currently the largest public naturalistic driving data source and has been widely used in the literature [33, 34]. The layouts and top-down views of the US101 and I-80 sections are shown in Figure 11. Each data frame includes many vehicle parameters, including the position, velocity, yaw rate, size, and others. The sampling frequency of the dataset is 10 Hz; therefore, in this work, the sampling interval is set to 0.1 s.

3.1.2. Data Preparation

The vehicle positioning data in the NGSIM dataset are obtained by video analysis, so the recorded trajectories contain considerable noise [26]. Therefore, the vehicle kinematics model and the road geometry are used to filter the original data, which is expressed as

The vehicle position is transformed to Frenet coordinates based on the centerline. As shown in Figure 12(b), the centerline of each lane is extracted and fitted using the shapefile, and the centerline of the lane in which the target vehicle was initially driving is selected as the reference baseline. For each original coordinate point, the corresponding mapping point on the reference baseline is determined. The Frenet coordinates then consist of the Euclidean distance between the original point and its mapping point and the length along the reference trajectory from its starting point to the mapping point.
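The mapping described above can be sketched as a projection onto a polyline. This is a minimal sketch under stated assumptions, not the authors' code: the reference baseline is taken to be a list of (x, y) vertices, the function name `to_frenet` is hypothetical, and a sign is attached to the lateral distance (positive to the left of the direction of travel) so that the two sides of the baseline can be told apart.

```python
import math

def to_frenet(point, baseline):
    """Project `point` onto a polyline `baseline` (list of (x, y) vertices).
    Returns (s, d): arc length from the baseline start to the mapping point
    and signed lateral offset (positive to the left of travel)."""
    best = None
    s_start = 0.0
    for (x0, y0), (x1, y1) in zip(baseline, baseline[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue
        # Parameter of the closest point on this segment, clamped to [0, 1].
        u = ((point[0] - x0) * dx + (point[1] - y0) * dy) / seg_len ** 2
        u = min(1.0, max(0.0, u))
        px, py = x0 + u * dx, y0 + u * dy
        dist = math.hypot(point[0] - px, point[1] - py)
        if best is None or dist < best[0]:
            # The cross product sign tells which side of the baseline we are on.
            cross = dx * (point[1] - py) - dy * (point[0] - px)
            sign = 1.0 if cross >= 0 else -1.0
            best = (dist, s_start + u * seg_len, sign * dist)
        s_start += seg_len
    if best is None:
        raise ValueError("baseline needs at least one non-degenerate segment")
    return best[1], best[2]
```

For example, with the baseline `[(0, 0), (10, 0), (10, 10)]`, the point `(4, 1.5)` lies 4 m along the baseline and 1.5 m to its left, while `(11, 5)` lies 15 m along and 1 m to its right.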

In addition, four other features (curvature, velocity, acceleration, and heading) are selected and combined with the Frenet coordinates to compose the observation vector.

3.1.3. Training Details

There were 8311 filtered trajectories; 80% of them were selected as the training set, 10% as the test set, and the remaining 10% as the validation set, which was used to check whether the model overfits.
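A split of this kind can be sketched as follows. The paper does not state how trajectories were assigned to the three sets, so the random shuffle, the seed, and the function name `split_indices` are assumptions for illustration.

```python
import random

def split_indices(n_items, train=0.8, test=0.1, seed=0):
    """Shuffle item indices and cut them into train / test / validation parts.
    Whatever remains after the train and test cuts becomes the validation set."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    n_train = int(n_items * train)
    n_test = int(n_items * test)
    return order[:n_train], order[n_train:n_train + n_test], order[n_train + n_test:]

train_ids, test_ids, val_ids = split_indices(8311)
print(len(train_ids), len(test_ids), len(val_ids))
```

Splitting by whole trajectory, rather than by individual window, keeps overlapping windows of the same vehicle from appearing in both the training and the evaluation sets.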

The network was trained using minibatches of size 64. Because of the limited sensor measurement range and the noise in practical application scenarios, it is difficult to track dynamic vehicles stably for a long time, so the network was trained using windows of 30 inputs, representing a total of 3 s of past observations. The Adam optimizer was used with a learning rate of 0.0005, together with a ReLU activation whose parameter was set to 0.1. The loss function was the mean square error (MSE) between the predicted sequence and the ground truth sequence. The code used to generate the model was written in Keras, and the training was performed on an NVIDIA GTX 2080 s GPU using the TensorFlow backend. The model was trained for 16 epochs, and the average training time for each epoch over the full training set was around 2300 seconds.
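The windowing and the loss can be illustrated with plain Python. This is a sketch under stated assumptions: the 30-frame history follows the text, whereas the 50-frame prediction horizon, the stride of one frame, and the function names are hypothetical choices made for the example.

```python
def make_windows(track, history=30, horizon=50):
    """Cut one trajectory (a list of per-frame feature vectors) into
    (past, future) pairs using a stride of one frame."""
    pairs = []
    for start in range(len(track) - history - horizon + 1):
        past = track[start:start + history]
        future = track[start + history:start + history + horizon]
        pairs.append((past, future))
    return pairs

def mse(predicted, target):
    """Mean squared error over all time steps and features."""
    total, count = 0.0, 0
    for p_row, t_row in zip(predicted, target):
        for p, t in zip(p_row, t_row):
            total += (p - t) ** 2
            count += 1
    return total / count
```

A 100-frame track yields 21 training pairs with these settings, and tracks shorter than 80 frames yield none.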

3.2. Testing Results and Discussion
3.2.1. The Impact of the Prediction Reference Baseline

To investigate the impact of the prediction reference baseline on the accuracy of the proposed method, we evaluate the RMSE of three variants of the proposed model on the NGSIM dataset.

In the first experiment, the system is trained and tested with absolute coordinates; in the second, the centerline of the lane in which the target vehicle was initially driving is selected as the PRB; and in the third, the PRB of the target vehicle is determined by the method in Section 2. All other attributes of the three models are unchanged. Figure 13 shows the accuracy of the trajectory prediction for different time horizons; adding the PRB decreases the RMSE for both the lateral and longitudinal trajectories.
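The per-horizon evaluation can be sketched as follows. The layout of the trajectories as lists of (longitudinal, lateral) pairs, the 1 s to 5 s horizons, and the function name are assumptions for illustration; the paper's exact evaluation script is not given.

```python
import math

def rmse_by_horizon(predictions, truths, step=0.1, horizons=(1, 2, 3, 4, 5)):
    """predictions/truths: lists of trajectories, each a list of
    (longitudinal, lateral) pairs sampled every `step` seconds.
    Returns {horizon_s: (longitudinal_rmse, lateral_rmse)}."""
    result = {}
    for h in horizons:
        idx = round(h / step) - 1  # index of the frame at the end of the horizon
        sq_lon = sq_lat = 0.0
        n = 0
        for pred, true in zip(predictions, truths):
            if idx < len(pred) and idx < len(true):
                sq_lon += (pred[idx][0] - true[idx][0]) ** 2
                sq_lat += (pred[idx][1] - true[idx][1]) ** 2
                n += 1
        if n:
            result[h] = (math.sqrt(sq_lon / n), math.sqrt(sq_lat / n))
    return result
```

Reporting the longitudinal and lateral errors separately makes it possible to see in which direction a reference baseline helps most.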

3.2.2. Comparative Study

To evaluate the proposed approach, we pursue a direct comparison with state-of-the-art vehicle trajectory prediction models using the same dataset (i.e., NGSIM). The results show that the proposed method outperforms the state-of-the-art models and decreases the overall RMSE by 10 percent on average. Table 2 summarizes the RMSE values of the proposed method and the baseline trajectory prediction models in the literature [20, 22–24, 35].

The comparison results show that the proposed knowledge-driven LSTM network achieves a lower RMSE on NGSIM. Note that, compared with the baseline [24], the prediction accuracy improves only slightly, but the proposed method enhances the real-time performance and substantially reduces the computational complexity owing to the reduced dimension of the feature space.

3.3. Simulation Results and Discussion
3.3.1. Simulation Experimental Platform

The simulation experiments were based on JAC’s automated driving hardware-in-the-loop test platform, which is presented in Figure 14. The platform included the real vehicle braking system, steering system, sensor system, and network communication system, with dSPACE (MATLAB/Simulink) at its core. A rapid control prototyping platform was built on this basis: the PreScan software provided the virtual reality interfaces and environment-perception sensor modules, and the CarSim software ran the vehicle dynamics model, so that automated driving algorithms could be verified quickly.

3.3.2. Simulation Results

The simulation scenario shown in Figure 15 was established according to a real urban traffic scenario. Two typical traffic scenarios were selected to verify the adjustment effect of the PRB on the predicted trajectory. Figure 16(a) shows the driving scene on the road, and Figure 16(b) shows the driving scene at the intersection. In the first scene, three PRB intent segments were fitted for lane keeping (LK), lane change left (LCL), and lane change right (LCR), as shown by the blue curve in Figure 16(a). The historical observation vector of the target vehicle served as the network input and was kept unchanged; the network output vector was converted according to the three PRBs to obtain three predicted trajectories, as shown by the green curve in Figure 16(c). In the intersection scenario of Figure 16(b), two PRB intent segments, go straight (GS) and turn right (TR), were fitted; the converted network outputs are shown by the green curve in Figure 16(d). Since the network learns the relative relationship between the historical trajectory and the PRB, the predicted trajectory is affected by the PRB even for the same network input. The experimental results show that prior knowledge about the driving scene can be used to adjust the predicted trajectory effectively through the prediction reference baseline.
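The conversion step can be illustrated with a small sketch in which the same Frenet-frame output is mapped back to Cartesian coordinates on two different baselines. The polyline representation of the PRB, the sign convention (lateral offset positive to the left of travel), the function name `from_frenet`, and all numeric values are assumptions for illustration.

```python
import math

def from_frenet(s, d, baseline):
    """Map Frenet (s, d) back to Cartesian (x, y) on a polyline baseline.
    d is positive to the left of the direction of travel."""
    travelled = 0.0
    segments = list(zip(baseline, baseline[1:]))
    for k, ((x0, y0), (x1, y1)) in enumerate(segments):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len == 0.0:
            continue
        is_last = k == len(segments) - 1
        if s <= travelled + seg_len or is_last:
            tx, ty = (x1 - x0) / seg_len, (y1 - y0) / seg_len
            along = s - travelled  # may exceed seg_len on the last segment
            # Move along the tangent, then sideways along the left normal.
            return (x0 + along * tx - d * ty, y0 + along * ty + d * tx)
        travelled += seg_len
    raise ValueError("baseline needs at least one non-degenerate segment")

# Hypothetical PRBs: lane keeping and a lane change to the left.
lane_keep = [(0.0, 0.0), (40.0, 0.0)]
lane_change_left = [(0.0, 0.0), (10.0, 0.0), (30.0, 3.5), (40.0, 3.5)]
network_output = [(5.0, 0.2), (15.0, 0.1), (35.0, 0.0)]  # made-up (s, d) pairs

for name, prb in (("LK", lane_keep), ("LCL", lane_change_left)):
    print(name, [from_frenet(s, d, prb) for s, d in network_output])
```

The last output point ends at y = 0 on the lane-keeping baseline and at y = 3.5 on the lane-change baseline, which is the adjustment effect described above.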

3.4. Real-World Urban Traffic Scenarios
3.4.1. Experimental Platform Construction

To verify whether the simulation results obtained in Section 3.3 agree with real-world results, an instrumented vehicle was used to collect data, as shown in Figure 17(a). The on-board sensors included an IBEO four-layer laser scanner, a Velodyne HDL-64E lidar, two high-resolution cameras, and a differential GPS/INS (SPAN-CPT) system. The sensor configuration of the vehicle and its sensing range are shown in Figure 17(b). The differential GPS module provided the position, speed, and heading of the ego vehicle. Based on our previous work [36], moving obstacles were detected and tracked by the four-layer laser scanner, which was located at the front of the vehicle. From the space-time relationship between the moving obstacles, such as pedestrians and vehicles, and the experimental vehicle, the position, speed, size, and type of the moving vehicles could be measured. We conducted a real vehicle experiment in Hefei, Anhui Province, China. The test road is shown in Figure 18(a). The prediction model was implemented on an NVIDIA Xavier platform using the C++ programming language.

Before conducting the actual vehicle experiment, high-resolution map data were collected using our experimental vehicle. The map contained more than 1,140 road entities, including stop signs, lane markings, and lane lines, and covered approximately 8 km of roadway (see Figure 18).

3.4.2. Field Test and Discussion

The experimental driving route was located on a typical urban roadway, with a total length of about 4.7 km. It included multiple intersections, Y-shaped intersections, T-shaped intersections, and other common urban road scenarios. Because the route was long and many scenes were encountered, it is impractical to present the prediction process for each scene. Therefore, two typical scenes were selected for a detailed analysis of the trajectory prediction process.

In Scenario 1, vehicle054 was on the road. The input conditions of the scene evaluation module are shown in Figure 19(b). In front of the target vehicle, there was a large truck, denoted as vehicle055, that was moving at a speed of 5 km/h. At that time, the speed of vehicle054 was 24 km/h. The calculated value of the TIV was less than the deceleration threshold. Therefore, it was judged that the target vehicle would have an intention to change lanes. Vehicle054 was driving on lane00055, which was a straight lane. Through an associated search in the conceptual ontology model of the driving scene, it was learned that the right lane was also a straight lane and that the lane line was a white dotted line; the prolog rule of legality is expressed as follows:

legalToRight(target, true) :- targetVehicle(target), isOnLane(target, Lane), hasRightLine(Lane, Line), hasLineType(Line, "dotted_white").

The next road section on the target vehicle’s route was an intersection, and the distance to the intersection was greater than 30 m, so the lane change was reasonable in that it did not affect the current goal of the target vehicle; the prolog rule of reasonableness is as follows:

reasonableToRight(target, true) :- targetVehicle(target), currentRoadState(target, "ApprJunction"), isOnSegment(target, Seg), connectToJunction(Seg, Junc), intersection(Junc), connectToStopLine(Seg, SL), distToStopLine(SL, DL), DL >= 30.

The autonomous vehicle was driving in the right-back region of the target vehicle, and the TTC and TIV values were both less than the corresponding acceleration thresholds but greater than the corresponding parking thresholds; the prolog rule of safety is as follows:

safeToRight(target, true) :- targetVehicle(target), hasRightObstacle(target, null), hasRightFrontObstacle(target, null), hasRightBackObstacle(target, egovehicle).

The final prolog rule is as follows:

canChangeToRight(target, true) :- safeToRight(target, true), reasonableToRight(target, true), legalToRight(target, true).
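The way the three conditions combine can be paraphrased in plain Python. This is only an illustrative sketch, not the authors' Prolog knowledge base: the scene is reduced to a flat dictionary whose keys are hypothetical, and the 45 m distance to the stop line is a made-up value consistent with the statement that the distance exceeded 30 m.

```python
def legal_to_right(scene):
    # Legality: the line on the right of the current lane may be crossed.
    return scene["right_line_type"] == "dotted_white"

def reasonable_to_right(scene):
    # Reasonableness: approaching a junction with at least 30 m to the stop line.
    return scene["road_state"] == "ApprJunction" and scene["dist_to_stop_line_m"] >= 30

def safe_to_right(scene):
    # Safety: right and right-front regions are free; the ego vehicle is right-back.
    return (scene["right_obstacle"] is None
            and scene["right_front_obstacle"] is None
            and scene["right_back_obstacle"] == "egovehicle")

def can_change_to_right(scene):
    # The intention is inferred only if all three conditions hold.
    return safe_to_right(scene) and reasonable_to_right(scene) and legal_to_right(scene)

# Hypothetical facts for vehicle054 in Scenario 1.
vehicle054 = {
    "right_line_type": "dotted_white",
    "road_state": "ApprJunction",
    "dist_to_stop_line_m": 45.0,
    "right_obstacle": None,
    "right_front_obstacle": None,
    "right_back_obstacle": "egovehicle",
}
print(can_change_to_right(vehicle054))
```

Because the final rule is a conjunction, changing any single fact, for example a solid instead of a dotted lane line, is enough to withdraw the lane change intention.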

After the driving intention of the target vehicle was obtained, the prediction reference baseline (the blue curve in Figure 19(a)) was fitted, and the historical trajectory was transformed into Frenet coordinates based on the prediction reference baseline and then fed to the LSTM network; the predicted trajectory is shown by the green dotted line in Figure 19(a). The trajectory prediction for vehicle054 could effectively reduce the reaction time of the autonomous vehicle and help avoid collisions caused by vehicle054 cutting in.

In Scenario 2, vehicle121 was in the preintersection scenario. The input conditions of the scenario evaluation module are shown in Figure 20(b). The current speed of vehicle121 was 24 km/h, and it was traveling in lane000103, which was a right-turn lane. In this scenario, the main factor affecting the target vehicle’s driving intention was the traffic light. The prolog rule of legality is as follows:

legalToTurnRight(target, true) :- targetVehicle(target), currentRoadState(target, "ApprJunction"), isOnSegment(target, Seg), connectToJunction(Seg, Junc), intersection(Junc), hasTrafficLight(Junc, TR), hasLightColor(TR, "green"), connectToStopLine(Seg, SL), distToStopLine(SL, DL), DL =< 0.

The target vehicle was in the right-turn lane, and the prolog rule of reasonableness is as follows:

reasonableToTurnRight(target, true) :- targetVehicle(target), currentRoadState(target, "ApprJunction"), isOnSegment(target, Seg), connectToJunction(Seg, Junc), intersection(Junc), connectToStopLine(Seg, SL), distToStopLine(SL, DL), DL >= 30.

The prolog rule of safety is as follows:

safeToTurnRight(target, true) :- targetVehicle(target), hasFrontObstacle(target, null), hasRightFrontObstacle(target, null), hasRightBackObstacle(target, null).

The final rule is expressed as follows:

canTurnRight(target, true) :- safeToTurnRight(target, true), reasonableToTurnRight(target, true), legalToTurnRight(target, true).

The blue curve in Figure 20(a) represents the prediction reference baseline fitted based on the turn right driving intention. The predicted trajectory is shown by the green dotted line in Figure 20(a).

The experimental results of Scenarios 1 and 2 verify that the proposed LSTM network can effectively combine the prior and posterior knowledge in the driving scene and that the lane change behavior can be predicted before the vehicle kinematics change. The proposed LSTM network can be iteratively adapted to a driving scenario without manual annotation during the network training, which significantly reduces the training complexity and solves the sparse data problem. Owing to its combination with knowledge reasoning, the proposed prediction model can make a more accurate and reasonable estimation of the future trajectories of surrounding vehicles than the existing models in different environments.

4. Conclusions

This paper combines maneuver-based and learning-based trajectory prediction models and proposes an improved trajectory prediction model based on an LSTM neural network driven by driving knowledge. To make better use of the prior driving knowledge in driving scenarios and to solve the problem of combinatorial explosion caused by a large number of conditional attributes, the multisource and heterogeneous information of the driving scenario is modeled based on an ontology, and a driving knowledge base, including the driving experience and traffic rules, is constructed. Then, the conditional attributes that affect driving intentions are classified and analyzed from the perspectives of safety, legality, and reasonableness, and situation parameters in the horizontal and vertical directions are generated by the deterministic scene evaluation method. Finally, using the obtained situation parameters and the driving knowledge base, the driving intention is inferred by the prolog online reasoning system. To make the prediction results effectively incorporate the posterior knowledge and to solve the problem of insufficient adaptability of the existing learning-based prediction models, this paper converts the driving intention of the target vehicle into a prediction reference baseline, and the Frenet coordinates based on the prediction reference baseline are used as the coordinate frame for the LSTM neural network. The prior driving knowledge in the driving scene can thus be used to adjust the predicted trajectory in the form of the prediction reference baseline without increasing the network complexity, which ensures efficient operation of the proposed model on an embedded platform.

The proposed prediction model was verified by simulations and experiments. The simulation results showed that the prediction reference baseline could effectively adjust the output of the LSTM neural network, making the predicted trajectory meet the constraints of the prior knowledge in a driving scenario. The real-world experimental results showed that the proposed prediction model can significantly reduce the computing performance requirements while ensuring real-time performance on the embedded platform. In addition, owing to the full combination of prior and posterior knowledge in the driving scene, the target vehicle’s lane-changing behavior can be predicted on average 2.05 s (for LCL) or 2.71 s (for LCR) in advance, the precision of long-term predictions can be improved by 12.5%, and the model is more robust, flexible, and adaptive in complex traffic scenarios.

Even though the proposed model has advantages in trajectory prediction, it still has some limitations; for example, the indexes used for scenario assessment are not comprehensive enough. To overcome this limitation, future work will consider evaluation indexes from the perspectives of human-vehicle interaction and multivehicle interaction. Furthermore, data from more complex scenarios will be collected and used to verify the proposed prediction model.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

The authors gratefully acknowledge the help of the team members, whose contributions were essential for the development of the intelligent vehicle. The authors would also like to thank the Institute of Applied Technology, Hefei Institute of Physical Science, and the Academy of Sciences of China for supporting this study. This work was supported by the National Key Research and Development Program of China (nos. 2016YFD0701401, 2017YFD0700303, and 2018YFD0700602), Youth Innovation Promotion Association of the Chinese Academy of Sciences (grant no. 2017488), Key Supported Project in the Thirteenth Five-Year Plan of Hefei Institutes of Physical Science, Chinese Academy of Sciences (grant no. KP 2019 16), Equipment Pre-Research Program (grant no. 301060603), Natural Science Foundation of Anhui Province (grant no. 1508085MF133), and Technological Innovation Project for New Energy and Intelligent Networked Automobile Industry of Anhui Province.