Intention-Aware Autonomous Driving Decision-Making in an Uncontrolled Intersection

Song, Weilong; Xiong, Guangming; Chen, Huiyan

doi:https://doi.org/10.1155/2016/1025349

Mathematical Problems in Engineering


Special Issue

Modeling and Control Problems in Sustainable Transportation and Power Systems


Research Article | Open Access

Volume 2016 | Article ID 1025349 | https://doi.org/10.1155/2016/1025349


Weilong Song,¹ Guangming Xiong,¹ and Huiyan Chen¹

Academic Editor: Dongsuk Kum

Received 03 Sept 2015

Revised 18 Feb 2016

Accepted 23 Mar 2016

Published 28 Apr 2016

Abstract

Autonomous vehicles need to perform socially accepted behaviors in complex urban scenarios that include human-driven vehicles with uncertain intentions. This leads to many difficult decision-making problems, such as deciding on a lane change maneuver and generating policies to pass through intersections. In this paper, we propose an intention-aware decision-making algorithm to solve this problem in an uncontrolled intersection scenario. To account for uncertain intentions, we first develop a continuous hidden Markov model to predict both the high-level motion intention (e.g., turn right, turn left, and go straight) and the low-level interaction intention (e.g., the yield status for related vehicles). Then a partially observable Markov decision process (POMDP) is built to model the general decision-making framework. Because solving a POMDP is difficult, we use suitable assumptions and approximations to simplify the problem. A human-like policy generation mechanism is used to generate the candidate policies. A future motion model of human-driven vehicles is applied in the state transition process, and the intention is updated at each prediction time step. The reward function, which considers driving safety, traffic laws, time efficiency, and so forth, is designed to calculate the optimal policy. Finally, our method is evaluated in simulation with PreScan software and a driving simulator. The experiments show that our method can lead an autonomous vehicle through uncontrolled intersections safely and efficiently.

1. Introduction

Autonomous driving technology has developed rapidly in the last decade. In the DARPA Urban Challenge [1], autonomous vehicles showed their ability to interact in typical scenarios such as Tee intersections and lane driving. In 2011, Google released its autonomous driving platforms. Each vehicle completed over 10,000 miles of autonomous driving under various traffic conditions [2]. In addition, many large automobile companies plan to launch autonomous driving products in the next several years. With this significant progress, autonomous vehicles have shown their potential to reduce the number of traffic accidents and relieve traffic congestion.

One key challenge for autonomous vehicles driving in the real world is how to deal with uncertainties, such as inaccurate perception and unclear motion intentions. With the development of intelligent transportation systems (ITS), the perception uncertainty could be addressed through vehicle2X technology, and the interactions between autonomous vehicles can be handled by centralized or decentralized cooperative control algorithms. However, human-driven vehicles will remain predominant in the near term, and the uncertainty of their driving intentions will persist because there is no “intention sensor.” Human drivers anticipate potential conflicts, continuously make decisions, and adjust their driving behaviors, which are often not rational. Therefore, autonomous vehicles need to understand human drivers’ driving intentions and choose proper actions to behave cooperatively.

In this paper, we focus on solving this problem in an uncontrolled intersection scenario. The uncontrolled intersection is a complex scenario with a high accident rate. In the US, stop signs can be used to regulate the passing sequence of vehicles. However, such signs are rarely used in China, and the right-first traffic rule is often broken by aggressive drivers. Human drivers are also prone to perception failures, misunderstandings, and wrong decisions. In such cases, even with stop signs, the “first come, first served” rule is likely to be broken. Besides, human driving behaviors are likely to change over time. Given these uncertainties, the specific layout, and the traffic rules, an autonomous vehicle approaching an intersection should be able to recognize the behavior of other vehicles and respond suitably, considering the future evolution of the traffic scenario (see Figure 1).

Figure 1

A motivating example. Autonomous vehicle B is going straight, while human-driven vehicle A has three potential driving directions: going straight, turning right, or turning left. If vehicle A turns right, it will not affect the normal driving of autonomous vehicle B. The other maneuvers, turning left and going straight, lead to a passing sequence problem. If there is a potential conflict, autonomous vehicle B simulates the trajectories of vehicle A over a prediction horizon and selects the best actions for the current scenario. The vehicles drawn with dashed lines are the predicted future positions. The red dashed lines show the virtual lane assumption used in this paper, which means that the vehicles are considered to drive inside the lane. The dark blue area is the potential collision region for these two cars.

With these requirements, we propose an intention-aware decision-making algorithm for autonomous driving in an uncontrolled intersection in this paper. Specifically, we first use easily observed features (e.g., velocity and position) and a continuous hidden Markov model (HMM) [3] to build the intention prediction model, which outputs the lateral intentions (e.g., turn right, turn left, and go straight) for human-driven vehicles and the longitudinal behavior (e.g., the yielding status) for related vehicles. Then, a generative partially observable Markov decision process (POMDP) framework is built to model the autonomous driving decision-making process. This framework is able to deal with the uncertainties in the environment, including human-driven vehicles’ driving intentions. However, computing the optimal policy for a general POMDP is intractable due to its complexity. We make reasonable approximations and assumptions to solve this problem at low computational cost. A human-like policy generation mechanism is used to compute the candidate policy set. A scenario prediction mechanism is used to simulate the future actions of human-driven vehicles based on their lateral and longitudinal intentions, and reward functions are designed to evaluate each strategy. Traffic time, safety, and laws are all considered in the final reward equations. The proposed method has been evaluated in simulation. The main contributions of this paper are as follows:

(i) Modeling a generative autonomous driving decision-making framework that considers uncertainties (e.g., human drivers’ intentions) in the environment.

(ii) Building an intention prediction model using easily observed parameters (e.g., velocity and position) for recognizing the realistic lateral and longitudinal behaviors of human-driven vehicles.

(iii) Using reasonable approximations and assumptions to build an efficient solver based on the specific layout of an uncontrolled intersection area.

The structure of this paper is as follows. Section 2 reviews the related work, and the two-layer HMM-based intention prediction algorithm is discussed in Section 3. Section 4 models the general autonomous driving decision-making process as a POMDP, while the approximations and the simplified solver are described in Section 5. In Section 6, we evaluate our algorithm in a simulated uncontrolled intersection scenario with PreScan software and a driving simulator. Finally, the conclusion and future work are discussed in Section 7.

2. Related Work

The decision-making module is one of the most important components of autonomous vehicles, connecting environment perception and vehicle control. Thus, numerous studies have addressed the autonomous driving decision-making problem in the last decade. The most common method is to manually define specific driving rules corresponding to situations. Both finite state machines (FSMs) and hierarchical state machines (HSMs) are used to evaluate situations and make decisions [4–6]. In the DARPA Urban Challenge (DUC), the winner Boss used a rule-based behavior generation mechanism to obey predefined driving rules based on the obstacle vehicles’ metrics [1, 6]. Boss was able to check the vehicle’s acceleration abilities and the available spaces to decide whether merging into a new lane or passing an intersection was safe. Similarly, the decision-making system of “Junior” [7], which ranked second in the DUC, was based on an HSM with 13 manually defined states. Because it is simple to implement and traceable, this framework is widely used in many autonomous driving platforms. However, these approaches typically assume constant velocity and do not consider the surrounding vehicles’ future reactions to the host vehicle’s actions. Without this ability, the driving decisions could carry potential risks [8].

To account for the evolution of the future scenario, planning and utility-based approaches have been proposed for decision-making. Bahram et al. proposed a prediction-based reactive strategy to generate autonomous driving strategies [9]. A Bayesian classifier is used to predict the future motion of obstacle vehicles, and a tree-based search mechanism is designed to find the optimal driving strategy using multilevel cost functions. However, the surrounding vehicles’ reactions to the autonomous vehicle’s actions are not considered in their framework. Wei et al. proposed a comprehensive approach for an autonomous driver model that emulates human driving behavior [10]. The human-driven vehicles are assumed to follow a proper social behavior model, and the best velocity profiles are generated in autonomous freeway driving applications. Nonetheless, their method does not consider the motion intention of human-driven vehicles and only targets in-lane driving. In their subsequent work, Wei et al. modeled traffic interactions and realized autonomous vehicle social behavior at a highway entrance ramp [11]. The human-driven vehicles’ motion intentions are modeled by a Bayesian model, and the human-driven vehicles’ future reactions are introduced based on the yielding/not-yielding intentions at the first prediction step. Autonomous vehicles could perform socially cooperative behavior using their framework. However, they do not consider the intention uncertainty over the prediction time steps.

POMDPs provide a mathematical framework for solving decision-making problems with uncertainties. Bai et al. proposed an intention-aware approach for autonomous driving in scenarios with many pedestrians (e.g., on campus) [12]. In their framework, the hybrid A* algorithm is used to generate the global path, while a POMDP planner, solved by the online POMDP solver DESPOT [13], controls the velocity of the autonomous vehicle. Brechtel et al. presented a probabilistic decision-making algorithm using a continuous POMDP [14]. They focus on dealing with the uncertainties of incomplete and inaccurate perception in the intersection area, while our goal is to deal with the uncertain intentions of human-driven vehicles. However, online POMDP solvers need large computational resources and considerable time [15, 16], which limits their use in real-world autonomous driving platforms. Ulbrich and Maurer designed a two-step decision-making algorithm to reduce the complexity of the POMDP in a lane change scenario [17]. Eight POMDP states are manually defined to simplify the problem in their framework. Cunningham et al. proposed a multipolicy decision-making method for lane changing and merging scenarios [18]. POMDPs are used to model the decision-making problem in their paper, while a multivehicle simulation mechanism is used to generate the optimal high-level policy for the autonomous vehicle to execute. However, the motion intentions are not considered.

Overall, the autonomous driving decision-making problem with uncertain driving intention is still a challenging problem. It is necessary to build an effective behavior prediction model for human-driven vehicles. Besides, it is essential to incorporate human-driven vehicles’ intentions and behaviors into autonomous vehicle decision-making system and generate suitable actions to ensure autonomous vehicles drive safely and efficiently. This work addresses this problem by first building a HMM-based intention prediction model, then modeling human-driven vehicle’s intentions in a POMDP framework, and finally solving it in an approximate method.

3. HMM-Based Intention Prediction

In order to pass through an uncontrolled intersection, autonomous vehicles should have the ability to predict the driving intentions of human-driven vehicles. Estimating driver’s behavior is very difficult, because the state of a vehicle driver is in some high-dimensional feature space. Instead of using driver related features (e.g., gas pedal, brake pedal, and drivers’ vision), easily observed parameters are used to build the intention prediction model in this paper.

The vehicle motion intention considered in this paper is divided into two aspects, lateral intention $I_{lat}$ (i.e., turn right, turn left, go straight, and stop) and longitudinal intention $I_{lon}$ (i.e., yield and not yield). The lateral intention is a high-level driving maneuver, which is determined by the human driver’s long-term decision-making process. This intention does not usually change during the driving process and determines the future trajectory of the human-driven vehicle. In particular, the intention to stop is treated as a lateral intention in our model because it can be predicted using only data from the human-driven vehicle itself. In contrast, the longitudinal intention is a cooperative behavior that occurs only when the vehicle interacts with other vehicles. We will first describe the HMM and then formulate our intention prediction model in this section.

3.1. HMM

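A HMM consists of a set of $N$ finite “hidden” states $S = \{S_1, \ldots, S_N\}$ and a set of $M$ observable symbols per state. The state transition probabilities are defined as $A = \{a_{ij}\}$, where $q_t$ denotes the hidden state at time $t$ and

$$a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \quad 1 \le i, j \le N. \tag{1}$$

The initial state distribution is denoted as $\pi = \{\pi_i\}$, where

$$\pi_i = P(q_1 = S_i), \quad 1 \le i \le N. \tag{2}$$

Because the observation symbols are continuous parameters, we use Gaussian Mixture Model (GMM) [19] to represent their probability distribution functions (pdf):

$$b_j(o) = \sum_{k=1}^{K} c_{jk}\, \mathcal{N}(o; \mu_{jk}, \Sigma_{jk}), \quad 1 \le j \le N, \tag{3}$$

where $c_{jk}$ represents the mixture coefficient in the $j$th state for the $k$th mixture and $K$ is the number of mixtures. $\mathcal{N}$ is the pdf of a Gaussian distribution with mean $\mu_{jk}$ and covariance $\Sigma_{jk}$ measured from observation $o$. Mixture coefficient $c_{jk}$ satisfies the following constraints:

$$\sum_{k=1}^{K} c_{jk} = 1, \tag{4}$$

where $c_{jk} \ge 0$, $1 \le j \le N$, $1 \le k \le K$.

And

$$\int_{-\infty}^{\infty} b_j(o)\, do = 1, \quad 1 \le j \le N. \tag{5}$$

Then a HMM could be completely defined by hidden states $S$ and the probability tuples $\lambda = (\pi, A, c, \mu, \Sigma)$.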

In the training process, we use the Baum-Welch method [20] to estimate the model parameters $\lambda_I$ for each driver intention $I$. Once the model parameters corresponding to each driver intention have been trained, we can estimate the driver’s intention in the recognition process. The prediction process for lateral intentions can be seen in Figure 2.
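The recognition step can be illustrated by scoring an observation sequence under one model per intention and keeping the most likely label. The following is a minimal sketch, assuming one-dimensional observations (yaw rate only), already trained parameters, and maximum-likelihood selection; the function names, the two toy models, and all numbers are hypothetical, and Baum-Welch training is not shown.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gmm_pdf(x, components):
    """Mixture density; components is a list of (weight, mean, var), weights sum to 1."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

def log_likelihood(obs, pi, A, emissions):
    """Scaled forward algorithm: log P(obs | model).

    pi: initial state probabilities, A: transition matrix (rows sum to 1),
    emissions: one GMM (list of components) per hidden state.
    """
    n = len(pi)
    alpha = [pi[i] * gmm_pdf(obs[0], emissions[i]) for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * gmm_pdf(obs[t], emissions[j])
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

def classify_intention(obs, models):
    """Return the intention label whose HMM gives the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical two-state models over yaw rate (rad/s); numbers are illustrative only.
PI = [1.0, 0.0]
TRANS = [[0.9, 0.1], [0.0, 1.0]]
MODELS = {
    "go_straight": (PI, TRANS, [[(1.0, 0.0, 0.01)], [(1.0, 0.0, 0.01)]]),
    "turn_left": (PI, TRANS, [[(1.0, 0.0, 0.01)], [(1.0, 0.3, 0.01)]]),
}

label = classify_intention([0.0, 0.05, 0.28, 0.31], MODELS)
```

The forward variables are rescaled at every step so that long sequences do not underflow; the log-likelihood is recovered as the sum of the log scale factors.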

3.2. HMM-Based Intention Prediction Process

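Given a continuous HMM, the intention prediction process is divided into two steps. The first step focuses on the lateral intention. The training inputs of each vehicle’s lateral intention model at time $t$ are defined as $B_{lat}(t) = \{d_I, v, a, \dot{\theta}\}$, where $d_I$ is the distance to the intersection, $v$ is the longitudinal velocity, $a$ is the longitudinal acceleration, and $\dot{\theta}$ is the yaw rate, while the output of this model is the motion intention $I_{lat} \in \{I_{TR}, I_{TL}, I_{GS}, I_{S}\}$ (turn right, turn left, go straight, and stop). The corresponding HMMs can be trained, including $\lambda_{TR}$, $\lambda_{TL}$, $\lambda_{GS}$, and $\lambda_{S}$.

The next step is about the longitudinal intention $I_{lon}$. Its probability could be decomposed based on the total probability formula:

$$P(I_{lon} \mid B) = \sum_{I_{lat}} P(I_{lon} \mid I_{lat}, B)\, P(I_{lat} \mid B), \tag{6}$$

where $B$ is the behavior data including the lateral features $B_{lat}$ and the longitudinal features $B_{lon}$.

In this process, we assume that the lateral behavior is predicted correctly by a deterministic HMM in the first step, and therefore $P(I_{lat} \mid B)$ is determined by the lateral prediction result $\hat{I}_{lat}$, where $P(I_{lat} = \hat{I}_{lat} \mid B) = 1$ and $P(I_{lat} \ne \hat{I}_{lat} \mid B) = 0$. And (6) is reformulated by

$$P(I_{lon} \mid B) = P(I_{lon} \mid \hat{I}_{lat}, B). \tag{7}$$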

The problem is changed to model . The features used in longitudinal intention prediction are , where , , and . means the distance to the potential collision area. The output of the longitudinal intention prediction model is longitudinal motion intention .

Instead of building a generative model, we use a deterministic approach to restrict as 0 or 1. Thus, two types of HMMs named , are trained where . Two test examples for lateral and longitudinal intention prediction are shown in Figures 3 and 4. Through these two figures, we can find that our approach can recognize human-driven vehicle’s lateral and longitudinal intention successfully.


4. Modeling Autonomous Driving Decision-Making in a POMDP Framework

For the decision-making process, the key problem is how to design a policy that performs the optimal actions under uncertainty. This requires not only obeying traffic laws but also considering the driving uncertainties of human-driven vehicles. Facing potential conflicts, human-driven vehicles yield to autonomous vehicles only with some uncertain probability, and some aggressive drivers may violate the traffic laws. Such elements should be incorporated into a powerful decision-making framework. As a result, we model the autonomous driving decision-making problem in a general POMDP framework in this section.

4.1. POMDP Preliminaries

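A POMDP model can be formulated as a tuple $\{\mathcal{S}, \mathcal{A}, T, \mathcal{Z}, O, R, \gamma\}$, where $\mathcal{S}$ is a set of states, $\mathcal{A}$ is the action space, and $\mathcal{Z}$ denotes a set of observations. The conditional function $T(s', a, s) = \Pr(s' \mid s, a)$ models transition probabilities to state $s' \in \mathcal{S}$, when the system takes an action $a \in \mathcal{A}$ in the state $s \in \mathcal{S}$. The observation function $O(z, s', a) = \Pr(z \mid s', a)$ models the probability of observing $z \in \mathcal{Z}$, when an action $a$ is taken and the end state is $s'$. The reward function $R(s, a)$ calculates an immediate reward when taking an action $a$ in state $s$. $\gamma \in [0, 1]$ is the discount factor in order to balance the immediate and the future rewards.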

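Because the system contains partially observed state such as intentions, a belief $b$ is maintained. A belief update function $\tau$ is defined as $b' = \tau(b, a, z)$. If the agent takes action $a$ and gets observation $z$, the new belief $b'$ is obtained through the Bayes’ rule:

$$b'(s') = \eta\, O(z, s', a) \sum_{s} T(s', a, s)\, b(s), \tag{8}$$

where $T$ and $O$ are the transition and observation functions, the sum runs over all states $s$, and $\eta = 1 / \Pr(z \mid b, a)$ is a normalizing constant.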
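The Bayes update of a discrete belief can be sketched as follows. This is a minimal illustration of the general rule in this subsection, not the simplified solver used later in the paper; the two-state intention example, the function names, and the 80%/20% observation likelihoods are hypothetical.

```python
def belief_update(belief, action, observation, transition, observe):
    """Discrete Bayes filter: predict with the transition model, weight by the
    observation likelihood, then normalize.

    belief: dict mapping state -> probability
    transition(s, a, s_next) and observe(s_next, a, z) return probabilities.
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        predicted = sum(transition(s, action, s_next) * belief[s] for s in states)
        unnormalized[s_next] = observe(s_next, action, observation) * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}

def static_intention(s, a, s_next):
    """Assumption: the intention does not change within one step."""
    return 1.0 if s == s_next else 0.0

def decel_likelihood(s_next, a, z):
    """Hypothetical sensor model: yielding drivers decelerate 80% of the time."""
    p_decel = 0.8 if s_next == "yield" else 0.2
    return p_decel if z == "decelerating" else 1.0 - p_decel

prior = {"yield": 0.5, "not_yield": 0.5}
posterior = belief_update(prior, "keep_speed", "decelerating",
                          static_intention, decel_likelihood)
```

Starting from a uniform prior, one observed deceleration shifts the belief toward the yielding intention, and repeated observations sharpen it further.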

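A key concept in POMDP planning is a policy $\pi$, a mapping that specifies the action $a = \pi(b)$ at belief $b$. To solve the POMDP, an optimal policy $\pi^*$ should be designed to maximize the total reward:

$$\pi^* = \arg\max_{\pi} E\left[\sum_{t=0}^{\infty} \gamma^t R(s_t, \pi(b_t)) \,\middle|\, b_0, \pi\right], \tag{9}$$

where $b_0$ is marked as the initial belief, $R$ is the reward function, and $\gamma$ is the discount factor.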

4.2. State Space

Because of the Markov property, sufficient information should be contained in the state space for the decision-making process [14]. The state space includes the vehicle pose $[x, y, \theta]$, velocity $v$, the average yaw rate $\dot{\theta}$, and acceleration $a$ in the last planning period for all the vehicles. For the human-driven vehicles, the lateral and longitudinal intentions $[I_{lat}, I_{lon}]$ also need to be contained for state transition modeling. However, the road context knowledge is static reference information, so it is not added to the state space.

The joint state could be denoted as $s = [s_{host}, s_1, s_2, \ldots, s_n]^T$, where $s_{host}$ is the state of the host vehicle (autonomous vehicle), $s_i$, $i \in \{1, 2, \ldots, n\}$, is the state of a human-driven vehicle, and $n$ is the number of human-driven vehicles involved. Let us define the metric state $X = [x, y, \theta, v, a, \dot{\theta}]^T$, including the vehicle position, heading, velocity, acceleration, and yaw rate. Thus, the state of the host vehicle can be defined as $s_{host} = X_{host}$, while the human-driven vehicle state is $s_i = [X_i, I_{lat,i}, I_{lon,i}]^T$. With the advanced perception system and V2V communication technology, we assume that the metric state $X$ can be observed. Because the sensor noise is small and hardly affects the decision-making process, we do not model observation noise for the metric state. However, the intention state cannot be directly observed, so it is the partially observable variable in our paper. The intention state should be inferred from observation data and the predictive model over time.

4.3. Action Space

In our autonomous vehicle system, the decision-making system is used to select suitable tactical maneuvers. Specifically, in the intersection area autonomous vehicles should follow a global reference path generated by the path planning module. The decision-making module only needs to generate acceleration/deceleration commands for the control layer. As the reference path may not be straight, the steering control module can adjust the front wheel angle to follow the reference path. Therefore, the action space could be defined as a discrete set $\mathcal{A}$, which contains commands including acceleration, deceleration, and maintaining current velocity.

4.4. Observation Space

Similar to the joint state space, the observation is denoted as $z = [z_{host}, z_1, z_2, \ldots, z_n]^T$, where $z_{host}$ and $z_i$ are the host vehicle and human-driven vehicle’s observations, respectively, and $n$ is the number of human-driven vehicles. The acceleration and yaw rate can be approximately calculated from the speed and heading in consecutive states.

4.5. State Transition Model

In state transition process, we need to model transition probability . This probability is determined by each targeted element in the scenario. So the transition model can be calculated by the following probabilistic equation:

In the decision-making layer, we do not need to consider complex vehicle dynamic model. Thus, the host vehicle’s motion can be simply represented by the following equations given action :
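A simple motion update of this kind can be sketched as a constant-acceleration point-mass step along the reference path. This is a minimal sketch under stated assumptions, not necessarily the paper's exact equations: the `MetricState` fields, the `step_host` name, and the way the heading increment is supplied are all hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class MetricState:
    x: float         # position (m)
    y: float         # position (m)
    heading: float   # rad
    v: float         # speed (m/s)
    a: float         # acceleration over the last step (m/s^2)
    yaw_rate: float  # average yaw rate over the last step (rad/s)

def step_host(state, accel, dt, heading_change=0.0):
    """Advance the host vehicle one planning step with constant acceleration.

    heading_change is the heading increment needed to follow the reference
    path during this step (assumed to come from the path follower).
    Speed is clamped at zero so braking never produces reverse motion.
    """
    v_next = max(0.0, state.v + accel * dt)
    # Trapezoid rule: exact for constant acceleration, approximate when the
    # zero-speed clamp is active.
    distance = 0.5 * (state.v + v_next) * dt
    heading = state.heading + heading_change
    return MetricState(
        x=state.x + distance * math.cos(heading),
        y=state.y + distance * math.sin(heading),
        heading=heading,
        v=v_next,
        a=accel,
        yaw_rate=heading_change / dt,
    )

start = MetricState(x=0.0, y=0.0, heading=0.0, v=10.0, a=0.0, yaw_rate=0.0)
after = step_host(start, accel=2.0, dt=0.5)
```

A full vehicle dynamics model is deliberately avoided here, in line with the statement that the decision-making layer does not need one.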

Thus, the key problem is converted to compute , the state transition probability of human-driven vehicles. Based on the total probability formula, this probability can be factorized as a sum in whole action space:

With this equation, we only need to calculate the state transition probability given a specific action and the probability of selecting this action under current state .

Because the human-driven vehicles’ state , the probability can be calculated as

With a certain action , is equal to . The lateral behavior is considered to be a goal-directed driving intention which will not be changed in the driving process. So is equal to given a reference path corresponding to the intention of . Using (11), can be well solved.

The remaining problem for calculating is to deal with . The lateral intention is assumed stable through the above explanation. And the longitudinal intention is assumed to be not updated in this process. But it will be updated with new inputs in observation space.

Now is well modeled and the remaining problem is to compute the probabilities of human-driven vehicles’ future actions:

Because is determined by the designed policy, could be calculated by (11) given an action . The probability means the distribution of human-driven vehicles’ actions given the new state of host vehicle, the current state of itself, and its intentions. Instead of building a complex probability model, we designed a deterministic mechanism to calculate the most possible action given , , and .

In this prediction process, the host vehicle is assumed to be maintaining the current actions in the next time step and the action will be leading human-driven vehicle passing through the potential collision area either in advance of host vehicle under the intention or behind the host vehicle under the intention to keep a safe distance . In the case with the intention of , we can calculate the low boundary of through the above process and determine the upper one using the largest comfort value . If , will be used as the human-driven vehicle’s action. If not, we consider the targeted following a normal distribution with mean value between and . To simplify our model, we use the mean value of these two boundaries to represent human-driven vehicle’s action . Similarly, the case with the intention of can be analyzed in the same process.

After these steps, the transition probability is well formulized and the autonomous vehicle could have the ability to understand the future motion of the scenario through this model.

4.6. Observation Model

The observation model is built to simulate the measurement process. The motion intention is updated in this process. The measurements of human-driven vehicles are modeled with conditional independent assumption. Thus, the observation model can be calculated as

The host vehicle’s observation function is denoted as

But in this paper, due to the use of V2V communication sensor, the observation error almost does not affect the decision-making result. The variance matrix is set as zero.

The human-driven vehicle’s observation will follow the vehicle’s motion intentions. Because we do not consider the observation error, the value in metric state will be the same as the state transition results. But the longitudinal intention of human-driven vehicles in the state space will be updated using the new observations and HMM mentioned in Section 3. The new observation space will be confirmed with the above step.

4.7. Reward Function

The candidate policies have to satisfy several evaluation criteria. Autonomous vehicles should be driven safely and comfortably. At the same time, they should follow the traffic rules and reach the destination as soon as possible. As a result, we design objective function (17) considering three aspects including safety, time efficiency, and traffic laws, where $\mu_1$, $\mu_2$, and $\mu_3$ are the weight coefficients:

$$R(s, a) = \mu_1 R_{safety}(s, a) + \mu_2 R_{time}(s, a) + \mu_3 R_{law}(s, a). \tag{17}$$

The detailed information will be discussed in the following subsections. In addition, the factor of comfort will be considered and discussed in policy generation part (Section 5.1).

4.7.1. Safety Reward

The safety reward function is based on the potential conflict status. In our strategy, safety reward is defined as a penalty. If there are no potential conflicts, the safety reward will be set as 0. A large penalty will be assigned due to the risk of collision status.

In an uncontrolled intersection, the four approaching directions are defined as (Figure 5). The driver’s lateral intentions are defined as . So the driving trajectory for each vehicle in the intersection can be generally represented by and , and we marked it as , , . The function is used to judge the potential collision status, which is denoted aswhere and are vehicles’ maneuver .

can be calculated through relative direction between two cars, which is shown in Table 1.

The safety reward is based on the following items:(i)If is equal to 0, then the safety reward is equal to 0 due to the noncollision status.(ii)If potential collision occurs, there will be a large penalty.(iii)If , there is a penalty depending on and .

4.7.2. Traffic Law Reward

Autonomous vehicles should follow traffic laws to interact with human-driven vehicles. Traffic law is modeled as a function for each two vehicles and where and are vehicles’ maneuver . This function is formulized as shown in Algorithm 1.




if , then

else if , then

if , then
if and <> , then

else if and or , then

else if and , then

end if
end if
end if
return

If a behavior would break the law, a large penalty is applied; a behavior that obeys the traffic laws receives a zero reward.

4.7.3. Time Reward

The time cost is based on the time to the destination for the targeted vehicles in the intersection area:

is the distance to the driving goal. In addition, we also need to consider the speed limit, which is discussed in policy generation part in Section 5.
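The three reward terms of this section can be combined as in the following minimal sketch. It assumes a linear weighted sum, a time cost equal to the negative of distance divided by speed, and arbitrary penalty magnitudes; all names and numbers are hypothetical, and the state-dependent safety penalty (the third safety item) is omitted because its form is not recoverable here.

```python
from dataclasses import dataclass

COLLISION_PENALTY = -1000.0  # hypothetical magnitude
LAW_PENALTY = -100.0         # hypothetical magnitude

@dataclass
class RewardWeights:
    safety: float = 1.0
    time: float = 1.0
    law: float = 1.0

def safety_reward(has_conflict, collides):
    """Zero without a potential conflict; large penalty on a predicted collision."""
    if not has_conflict:
        return 0.0
    return COLLISION_PENALTY if collides else 0.0

def law_reward(breaks_law):
    """Large penalty for breaking a traffic law, zero reward otherwise."""
    return LAW_PENALTY if breaks_law else 0.0

def time_reward(distance_to_goal, speed, min_speed=0.1):
    """Negative estimated time to the goal (s); min_speed avoids division by zero."""
    return -distance_to_goal / max(speed, min_speed)

def total_reward(w, has_conflict, collides, breaks_law, distance_to_goal, speed):
    """Weighted sum of the safety, time, and traffic law terms."""
    return (w.safety * safety_reward(has_conflict, collides)
            + w.time * time_reward(distance_to_goal, speed)
            + w.law * law_reward(breaks_law))

free_road = total_reward(RewardWeights(), False, False, False, 50.0, 10.0)
```

Expressing safety and law terms as penalties means that, with no conflict and no violation, the policy ranking is decided by the time term alone.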

5. Approximations on Solving POMDP Problem

Solving a POMDP is quite difficult. The complexity of searching the total belief space is $O(|\mathcal{A}|^H |\mathcal{Z}|^H)$ [12], where $H$ is the prediction horizon, $|\mathcal{A}|$ is the number of actions, and $|\mathcal{Z}|$ is the number of observations. In this paper, we model the intention recognition process as a deterministic model and use communication sensors to ignore the perception error, and thus the size of $|\mathcal{Z}|$ is reduced to 1 in the simplified problem. To solve this problem, we first generate suitable candidate policies according to the properties of the driving task and then select a reasonable prediction time step and total horizon. After that, the approximately optimal policy can be found by searching all candidate policies for the one with the maximum total reward. The policy selection process is shown in Algorithm 2, and detailed explanations are given in the subsections.

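Algorithm 2: Policy selection process.
Input: prediction horizon, time step, current states
for each candidate policy, do
    for each prediction step up to the prediction horizon, do
        predict the next states, update the intentions, and accumulate the reward
    end
end
return the policy with the maximum total reward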
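The exhaustive search over candidate policies can be sketched as below. This is a minimal illustration assuming an undiscounted sum of step rewards; `simulate_step` and `reward` are hypothetical callbacks standing in for the state transition, intention update, and reward function of Section 4, and the toy example tracks only the host speed.

```python
def select_policy(policies, initial_state, simulate_step, reward, dt):
    """Evaluate every candidate policy by forward simulation and return the best.

    policies: iterable of acceleration sequences, one value per time step
    simulate_step(state, accel, dt) -> next state (host plus predicted others)
    reward(state, accel) -> immediate reward for the reached state
    """
    best_policy, best_total = None, float("-inf")
    for policy in policies:
        state, total = initial_state, 0.0
        for accel in policy:
            state = simulate_step(state, accel, dt)
            total += reward(state, accel)
        if total > best_total:
            best_policy, best_total = policy, total
    return best_policy, best_total

# Toy usage: the state is just the host speed, and the reward prefers 10 m/s.
def toy_step(speed, accel, dt):
    return speed + accel * dt

def toy_reward(speed, accel):
    return -abs(speed - 10.0)

candidates = [(0.0, 0.0), (2.0, 2.0), (-2.0, -2.0)]
best, value = select_policy(candidates, 8.0, toy_step, toy_reward, dt=0.5)
```

Because the observation set collapses to a single outcome in the simplified problem, each policy needs only one rollout, so the cost grows linearly with the number of candidate policies and prediction steps.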

5.1. Policy Generation

For autonomous driving near an intersection, the desired velocity curves need to satisfy several constraints. Firstly, except for emergency braking, acceleration constraints are applied to ensure comfort. Secondly, speed limit constraints should be enforced. We aim to avoid acceleration commands when the autonomous vehicle is reaching the maximum speed limit. Thirdly, for comfort, the acceleration command should not change frequently. In other words, we need to minimize the jerk.

Similar to [11], the candidate policies are divided into three time segments. The first two segments each keep a constant acceleration or deceleration, while the third segment keeps a constant velocity. We use $t_1$, $t_2$, and $t_3$ to represent the time periods of these three segments. To guarantee comfort, the acceleration is limited to the range from −4 m/s² to 2 m/s² and discretized with a fixed increment. The action space can then be represented by a discretized acceleration set. Then, we can set the values of $t_1$, $t_2$, and $t_3$ and the prediction period of a single step. An example of policy generation is shown in Figure 6.


Figure 6

An example of the policy generation process. (a) shows the generated policies and (b) the corresponding speed profiles. The interval of each prediction step is 0.5 s, the current speed is 12 m/s, and the speed limit is 20 m/s. The bold black line is one policy. In the first 3 seconds, the autonomous vehicle decelerates at −3.5 m/s², then accelerates at 2 m/s² for 4 seconds, and finally holds a constant velocity for the last second. In this case, 109 policies were generated, which is few enough for fast replanning.
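The three-segment policy structure can be sketched as an enumeration over pairs of acceleration levels. This is a minimal sketch under stated assumptions: the 0.5 m/s² increment, the fixed segment durations, and the feasibility filter (speed kept between zero and the speed limit) are hypothetical choices, so the number of policies it returns is not expected to match the 109 reported for Figure 6.

```python
def generate_policies(v0, v_max, t1, t2, horizon=8.0, dt=0.5,
                      a_min=-4.0, a_max=2.0, a_step=0.5):
    """Enumerate three-segment acceleration profiles.

    Segment 1 holds a1 for t1 seconds, segment 2 holds a2 for t2 seconds,
    and the rest of the horizon holds zero acceleration (constant velocity).
    Profiles whose speed leaves [0, v_max] at any step are discarded.
    """
    n1, n2, n_total = round(t1 / dt), round(t2 / dt), round(horizon / dt)
    if n1 + n2 > n_total:
        raise ValueError("t1 + t2 exceeds the planning horizon")
    n_levels = round((a_max - a_min) / a_step) + 1
    levels = [a_min + k * a_step for k in range(n_levels)]
    policies = []
    for a1 in levels:
        for a2 in levels:
            profile = [a1] * n1 + [a2] * n2 + [0.0] * (n_total - n1 - n2)
            v, feasible = v0, True
            for a in profile:
                v += a * dt
                if v < -1e-9 or v > v_max + 1e-9:
                    feasible = False
                    break
            if feasible:
                policies.append(tuple(profile))
    return policies

# Settings taken from the Figure 6 caption: 12 m/s start, 20 m/s limit,
# 3 s first segment, 4 s second segment, 8 s horizon, 0.5 s step.
policies = generate_policies(v0=12.0, v_max=20.0, t1=3.0, t2=4.0)
```

Holding each acceleration over a whole segment keeps the jerk low and the candidate set small, which is what makes an exhaustive evaluation of the policies affordable.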

5.2. Planning Horizon Selection

After building the policy generation model, the next problem is to select a suitable planning horizon. A longer horizon can lead to a better solution but consumes more computing resources. However, as our purpose is to deal with the interaction problem in the uncontrolled intersection, we only need to consider the situation before the autonomous vehicle gets through. In our algorithm, we set the prediction horizon to 8 seconds. In addition, when updating the future state of each vehicle under each policy, the car-following mode is used after the autonomous vehicle passes through the intersection area.

5.3. Time Step Selection

Another problem is the prediction time step. The intention prediction algorithm and the POMDP are computed in each step. If the time step is $\Delta t$ and the prediction horizon is $H$, the total number of computations will be $H/\Delta t$. Thus, a smaller time step leads to more computation time. To solve this problem, we use a simple adaptive time step calculation mechanism to give a final value. The time step is selected based on the time to collision (TTC) of the autonomous vehicle. If the host vehicle is far away from the intersection, we can use a very large time step. But if the TTC is quite small, a small $\Delta t$ is applied to ensure safety.
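One way to realize such an adaptive time step is a clamped linear interpolation on the TTC. The sketch below is only an illustration: the thresholds, the step bounds, and the linear rule are hypothetical, since the exact mechanism is not specified here.

```python
import math

def select_time_step(ttc, dt_min=0.5, dt_max=2.0, ttc_near=3.0, ttc_far=10.0):
    """Pick a prediction time step (s) from the time to collision (s).

    Below ttc_near the finest step is used, above ttc_far the coarsest,
    and in between the step grows linearly. All thresholds are hypothetical.
    """
    if ttc <= ttc_near:
        return dt_min
    if ttc >= ttc_far:
        return dt_max
    frac = (ttc - ttc_near) / (ttc_far - ttc_near)
    return dt_min + frac * (dt_max - dt_min)

def num_prediction_steps(horizon, dt):
    """Number of prediction steps needed to cover the horizon."""
    return math.ceil(horizon / dt)

steps_near = num_prediction_steps(8.0, select_time_step(1.0))
steps_far = num_prediction_steps(8.0, select_time_step(20.0))
```

With these illustrative values, an 8 s horizon needs four times fewer prediction steps when the vehicle is far from the intersection than when it is close.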

6. Experiment and Results

6.1. Settings

In this paper, we evaluate our approach in PreScan 7.1.0 [21], a simulation tool for autonomous driving and connected vehicles. Using this software, we can build the testing scenarios (Figure 7) and add vehicles with dynamic models. To obtain a realistic scenario with social interaction, a driving simulator is added to our experiment (Figure 8). The human-driven vehicle is driven by several people during the experiment, and the autonomous vehicle makes decisions based on the human-driven vehicle’s driving behavior. The reference trajectory for the autonomous vehicle is generated by the path planning module, and the human-driven vehicle’s data (e.g., position, velocity, and heading) are transferred through a V2V communication sensor. The decision-making module sends the desired velocity command to the PID controller to follow the reference path. All policies in the experiments use a planning horizon of 8 s, which is discretized into time steps of 0.5 s.

6.2. Results

It is difficult to compare different approaches in the same scenario because the environment is dynamic and never exactly the same. However, we select two typical situations and special settings to make the comparison possible. The same initial conditions, including position, orientation, and velocity for each vehicle, are used in the different tests. Two typical situations, in which the human-driven vehicle gets through before or after the autonomous vehicle, are compared in this section. With the same initial state, the different methods produce different reactions. We compare our approach with the reactive-based method [6]. The key difference between the two methods is that our approach considers the human-driven vehicle’s driving intention.

In the first experiment, the human-driven vehicle tries to yield to the autonomous vehicle during the interaction. The results are shown in Figures 9 and 10. Figure 9 gives a visual comparison of the two approaches. From almost the same initial state (e.g., position and velocity), our approach leads the autonomous vehicle to pass through the intersection more quickly and more reasonably.

Figure 10

Case test 1. In this case, the human-driven vehicle passes through the intersection after the autonomous vehicle. (a), (c), and (e) show the performance of our method, while (b), (d), and (f) show the strategy that does not consider the driving intention. (a) and (b) are the velocity profiles and the corresponding driving intentions. For the longitudinal intention, label 1 means yielding and label 2 means not yielding. For the lateral intention, 1 means turning left, 2 means turning right, 3 means going straight, and 4 means stopping. The intentions in (b) are not used by that method and are shown only for detailed analysis. (c) and (d) show the distances to the collision area for the autonomous vehicle and the human-driven vehicle. (e) and (f) are the predicted and true motions of the human-driven vehicle at time 1.5 s with a prediction length of 8 s. The red curves in these subfigures belong to the autonomous vehicle, while the blue lines belong to the human-driven vehicle. The green lines in (e) and (f) are the predicted velocity curves of the human-driven vehicle.

Figure 10 gives a more detailed explanation. In the first 1.2 s in Figures 10(a) and 10(c), the autonomous vehicle maintains its speed and infers that the human-driven vehicle will not yield. Then, the autonomous vehicle recognizes the yielding intention of the human-driven vehicle and infers that its lateral intention is to go straight. Based on the candidate policies, the autonomous vehicle selects the acceleration strategy with the maximum reward and finally crosses the intersection. This process shows clearly that the autonomous vehicle understands the human-driven vehicle’s yielding intention. Figure 10(e) is an example of predicting the human-driven vehicle’s behavior from the ego vehicle’s future actions at a specific time. Although the predicted and true velocity curves diverge after 1 s, this does not affect the performance of our method, because we use a deterministic model in the prediction process and the predicted value stays inside two boundaries to ensure safety. In addition, the actions of the autonomous vehicle throughout this process help the human-driven vehicle to understand the autonomous vehicle’s not-yielding intention. In this case, both vehicles perform cooperative driving behaviors.

However, if the intention is not considered, we obtain the results in Figures 10(b), 10(d), and 10(f). After 2 s in Figure 10(b), although the human-driven vehicle shows a yielding intention, the autonomous vehicle does not recognize it and instead finds a potential collision under the constant-velocity assumption. It therefore decreases its speed, but the human-driven vehicle also slows down. This confused behavior leads both vehicles to slow down near the intersection. Finally, the human-driven vehicle stops at the stop line, and only then can the autonomous vehicle pass the intersection. In this strategy, the human-driven vehicle’s future motion is assumed to have constant velocity (Figure 10(f)). Without an understanding of the human-driven vehicle’s intentions, this strategy can increase congestion.
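To make the constant-velocity assumption concrete, the sketch below flags a potential collision when both vehicles, extrapolated at their current speeds, would reach the collision area within a short time of each other. It is not the implementation of the reactive method in [6]; the function names and the `margin_s` threshold are hypothetical.

```python
import math


def arrival_time(distance_m, speed_mps):
    """Time (s) to reach the collision area at constant speed."""
    if speed_mps <= 0.0:
        # A stopped vehicle never arrives under this assumption.
        return math.inf
    return distance_m / speed_mps


def constant_velocity_conflict(d_av_m, v_av_mps, d_hv_m, v_hv_mps, margin_s=2.0):
    """Return True if the two arrival times differ by less than margin_s.

    margin_s is an illustrative safety margin, not a value from the paper.
    """
    t_av = arrival_time(d_av_m, v_av_mps)
    t_hv = arrival_time(d_hv_m, v_hv_mps)
    if math.isinf(t_av) or math.isinf(t_hv):
        return False
    return abs(t_av - t_hv) < margin_s


if __name__ == "__main__":
    # Both vehicles 30 m away at 10 m/s: arrival times coincide.
    print(constant_velocity_conflict(30.0, 10.0, 30.0, 10.0))
    # The human-driven vehicle has slowed to 5 m/s: the gap grows to 3 s.
    print(constant_velocity_conflict(30.0, 10.0, 30.0, 5.0))
```

Because such a check uses only the current speeds, a driver who has just begun to slow down in order to yield can still be flagged as a conflict, which is consistent with the mutual slowdown described above.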

In the second experiment, the human-driven vehicle tries to get through the intersection first. The results are shown in Figures 11 and 12. This case is typical because many real-world traffic accidents happen in this situation: if one vehicle tries to cross an intersection in violation of the law, another vehicle is in great danger if it does not understand that behavior. From the visualized performance in Figure 11, our method is somewhat safer than the other approach, which comes close to a collision in Figure 11(b). Figure 12(a) shows that our strategy performs deceleration actions once the not-yielding intention is recognized at 0.8 s. Without understanding the human-driven vehicle’s motion intention, however, the response is delayed by 1 second, which may be quite dangerous. In addition, our method predicts the human-driven vehicle’s future motion well (Figure 12(e)).

The results of these two cases demonstrate that our algorithm can deal with typical scenarios and performs better than a traditional reactive controller. With our strategy, the autonomous vehicle can be driven more safely, quickly, and comfortably.

7. Conclusion and Future Work

In this paper, we proposed an autonomous driving decision-making algorithm that considers the uncertain intentions of human-driven vehicles in an uncontrolled intersection. The lateral and longitudinal intentions are recognized by a continuous HMM. Based on the HMM and a POMDP, we model the general decision-making process and then use an approximate approach to solve this complex problem. Finally, we use PreScan software and a driving simulator to emulate the social interaction process. The experimental results show that autonomous vehicles using our approach can pass through uncontrolled intersections more safely and efficiently than with a strategy that does not consider human-driven vehicles’ driving intentions.

In the near future, we aim to implement our approach on a real autonomous vehicle and perform real-world experiments. In addition, we aim to develop a more precise intention recognition algorithm; methods such as probabilistic graphical models can be used to obtain a distribution over the intentions. Finally, designing online POMDP planning algorithms is also valuable.

Competing Interests

The authors declare that there are no competing interests regarding the publication of this paper.

Acknowledgments

This study is supported by the National Natural Science Foundation of China (no. 91420203).

References

[1] C. Urmson, J. Anhalt, D. Bagnell et al., “Autonomous driving in urban environments: Boss and the Urban Challenge,” Journal of Field Robotics, vol. 25, no. 8, pp. 425–466, 2008.
[2] J. Markoff, “Google cars drive themselves, in traffic,” New York Times, vol. 9, 2010.
[3] L. R. Rabiner and B.-H. Juang, “An introduction to hidden Markov models,” IEEE ASSP Magazine, vol. 3, no. 1, pp. 4–16, 1986.
[4] M. Buehler, K. Iagnemma, and S. Singh, The DARPA Urban Challenge: Autonomous Vehicles in City Traffic, vol. 56, Springer, 2009.
[5] S. Kammel, J. Ziegler, B. Pitzer et al., “Team AnnieWAY's autonomous system for the 2007 DARPA Urban Challenge,” Journal of Field Robotics, vol. 25, no. 9, pp. 615–639, 2008.
[6] C. R. Baker and J. M. Dolan, “Traffic interaction in the urban challenge: putting Boss on its best behavior,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '08), pp. 1752–1758, IEEE, Nice, France, September 2008.
[7] M. Montemerlo, J. Becker, S. Bhat et al., “Junior: the Stanford entry in the urban challenge,” Journal of Field Robotics, vol. 25, no. 9, pp. 569–597, 2008.
[8] L. Fletcher, S. Teller, E. Olson et al., “The MIT-Cornell collision and why it happened,” Journal of Field Robotics, vol. 25, no. 10, pp. 775–807, 2008.
[9] M. Bahram, A. Wolf, M. Aeberhard, and D. Wollherr, “A prediction-based reactive driving strategy for highly automated driving function on freeways,” in Proceedings of the 25th IEEE Intelligent Vehicles Symposium, pp. 400–406, IEEE, Dearborn, Mich, USA, June 2014.
[10] J. Wei, J. M. Dolan, and B. Litkouhi, “A prediction- and cost function-based algorithm for robust autonomous freeway driving,” in Proceedings of the IEEE Intelligent Vehicles Symposium (IV '10), pp. 512–517, San Diego, Calif, USA, June 2010.
[11] J. Wei, J. M. Dolan, and B. Litkouhi, “Autonomous vehicle social behavior for highway entrance ramp management,” in Proceedings of the IEEE Intelligent Vehicles Symposium (IV '13), pp. 201–207, IEEE, Gold Coast, Australia, June 2013.
[12] H. Bai, S. Cai, N. Ye, D. Hsu, and W. S. Lee, “Intention-aware online POMDP planning for autonomous driving in a crowd,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '15), pp. 454–460, IEEE, Seattle, Wash, USA, May 2015.
[13] A. Somani, N. Ye, D. Hsu, and W. S. Lee, “DESPOT: online POMDP planning with regularization,” in Advances in Neural Information Processing Systems, pp. 1772–1780, 2013.
[14] S. Brechtel, T. Gindele, and R. Dillmann, “Probabilistic decision-making under uncertainty for autonomous driving using continuous POMDPs,” in Proceedings of the 17th IEEE International Conference on Intelligent Transportation Systems (ITSC '14), pp. 392–399, Qingdao, China, October 2014.
[15] C. H. Papadimitriou and J. N. Tsitsiklis, “The complexity of Markov decision processes,” Mathematics of Operations Research, vol. 12, no. 3, pp. 441–450, 1987.
[16] O. Madani, S. Hanks, and A. Condon, “On the undecidability of probabilistic planning and related stochastic optimization problems,” Artificial Intelligence, vol. 147, no. 1-2, pp. 5–34, 2003.
[17] S. Ulbrich and M. Maurer, “Probabilistic online POMDP decision making for lane changes in fully automated driving,” in Proceedings of the 16th International IEEE Conference on Intelligent Transportation Systems (ITSC '13), pp. 2063–2067, The Hague, The Netherlands, October 2013.
[18] A. G. Cunningham, E. Galceran, R. M. Eustice, and E. Olson, “MPDM: multipolicy decision-making in dynamic, uncertain environments for autonomous driving,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '15), pp. 1670–1677, Seattle, Wash, USA, May 2015.
[19] D. A. Reynolds and R. C. Rose, “Robust text-independent speaker identification using Gaussian mixture speaker models,” IEEE Transactions on Speech and Audio Processing, vol. 3, no. 1, pp. 72–83, 1995.
[20] L. E. Baum and T. Petrie, “Statistical inference for probabilistic functions of finite state Markov chains,” The Annals of Mathematical Statistics, vol. 37, pp. 1554–1563, 1966.
[21] M. Tideman and M. Van Noort, “A simulation tool suite for developing connected vehicle systems,” in Proceedings of the IEEE Intelligent Vehicles Symposium (IEEE IV '13), pp. 713–718, Queensland, Australia, June 2013.

Copyright

Copyright © 2016 Weilong Song et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
