Abstract

The emergence of intelligent connected vehicles (ICVs) is expected to contribute to resolving traffic congestion and safety problems; however, it is inevitable that ICV safety issues in mixed traffic (involving ICVs and human driven vehicles) will be a critical challenge. The numerical simulation of scenarios involving a mix of different driving profiles is expected to be an important safety assessment tool in the process of testing and validating ICVs, especially regarding extreme scenarios, including car collisions, which are rarely captured in real-world datasets. In this study, we propose a novel approach for car collision generation in numerical simulations based on the assumption that car collision occurrences are mostly associated with certain specific driver profiles. Using a dataset provided by the Next Generation Simulation (NGSIM) project, NGSIM 101 dataset, we identify three different driver profiles: aggressive, inattentive, and normal drivers. We then replicate car collision occurrences by varying the percentages of these three driver profiles in the simulated environment, allowing us to establish a relationship between driver profiles and car collision occurrences. We also investigate the severity of car collisions and classify them with respect to the driver profiles of the cars involved in the collisions. Our approach of replicating car collision occurrences in numerical simulations will facilitate the testing and validation of ICVs in the future, especially regarding the testing of ICV functionalities in dealing with traffic accidents.

1. Introduction

With the development of information and telecommunication technologies, along with the rapid growth of new energy vehicles (NEVs), intelligent connected vehicles (ICVs) have become an increasingly active research topic. ICVs are expected to reshape future mobility and contribute to mitigating road traffic congestion and safety problems [1]. However, a major challenge for ICVs is communicating with other vehicles and accurately recognizing the patterns of human driving behavior in mixed traffic [2]. To demonstrate the safety of ICVs, they would need to be driven for hundreds of millions of miles without failure. Moreover, physical tests of ICVs are not only time consuming but can sometimes be dangerous as well. As an alternative to physical tests, numerical traffic simulation platforms can be used to create realistic traffic situations. In addition, traffic accidents can be generated in traffic simulations in order to test how ICVs deal with them.

It is well known that traffic numerical simulation is established by using mathematical models to reconstruct road traffic [3] and can be performed on many scales (i.e., microscopic, mesoscopic, and macroscopic). Microscopic traffic simulators are particularly promising in the validation of ICV behavioral decision systems and work by replicating individually each vehicle behavior in detail in order to create an entire traffic environment. Various microscopic traffic simulators have been developed in recent years, such as SUMO [4], VISSIM [5], and AIMSUN [6]. However, among these simulators, vehicle collision generation mechanisms have not yet been described completely. In this study, we propose a novel approach for car collision generation in numerical simulations based on the assumption that car collision occurrences are mostly associated with certain specific driver profiles.

As is well known, drivers have different driving patterns, and different driver profiles can be distinguished. Meiring and Myburgh [7] reviewed various methods of classification of driving styles, such as “normal,” “aggressive,” and “inattentive.” Aggressive drivers are often characterized by risky behaviors with regards to velocity (such as abrupt instantaneous speed change, driving over the speed limit, excessive acceleration, or deceleration) and frequent lane changing behavior [8]. Inattentive drivers can be characterized by long reaction time and forced sudden lane changes with an abrupt deviation from normal behavior [7]. In our work, we focus on driver profiles only in terms of how they react when directly behind another car (the car-following behavior).

Several open datasets on vehicular trajectories are currently available and can be applied to studying and analyzing human driving behavior. Next Generation Simulation (NGSIM) has published four vehicle trajectory datasets (NGSIM [9]), which have been widely used in transportation research [10]. More recently, datasets on motion trajectory of traffic objects detected by autonomous vehicle sensors have been published (e.g., Waymo data [11], Argoverse [12], and nuScenes [13]). However, in these datasets, vehicle collisions are rarely observed. Nevertheless, the Strategic Highway Research Program 2 Naturalistic Driving Study (SHRP2-NDS) [14] provides a dataset where several accident situations are captured, yet seem unsatisfactory in terms of quantification of vehicle collision causes and impacts. Therefore, there is a lack of data on real-world vehicle collisions connected to human driving behavior, including a lack of both video data and vehicle trajectories in collision situations.

On the other hand, traditional road safety analyses are based on statistical methods which aim to understand the importance of geometric (road shape) and track (road type) characteristics for safety or to locate collision black-spots, based on crash datasets [15]. Crash datasets as mentioned in [16] provide only vehicle collision statistics, without involving car trajectories and driver profiles. A large body of the literature on road traffic safety explores the link between ambient traffic conditions and collision occurrences [17] and/or vehicle collision injury severity [18]. On highways specifically, the authors of [19] provide evidence that vehicle collisions have a specific relationship with traffic flow characteristics (volume, density, and speed). In [20], the authors identified the impact of general geometric parameters, weather, and traffic flow on different types of collision (rear-end, sideswipe, and multiple vehicle involved) based on French traffic and vehicle statistical data. Some other studies employed statistical and machine learning approaches, such as multivariate probit models [20] or support vector machine [21] to investigate the relationship between influencing traffic factors and vehicle collision occurrences.

Regarding vehicular collision involving unexpected human driving behavior, several previous works use traffic simulations. In [22], based on their own proposed models for three types of vehicle collision depending on vehicle interactions and maneuvers, the authors estimated the parameters of the models via simulation. In [23], the authors studied the impact of driving violations (driving over the speed limit, slow driving, and abrupt hard braking) on car collision occurrence for different traffic conditions through simulation. Moreover, in some other related works, the traffic conflicts technique (TCT) using surrogate safety measures based on vehicle trajectories has been proposed to assess traffic safety [24]. These indicators can be used for predicting potential collisions [15, 25, 26]. Nevertheless, most of the literature addresses road traffic safety without considering driving profiles. In our proposed approach, we would like to represent different driver profiles in traffic simulations and then increase the number of drivers with extreme driving profiles (aggressive and inattentive). Therefore, by the method of increasing the percentage of drivers with extreme profiles, car collisions could be generated in the traffic simulations.

In this study, we propose a novel approach for car collision (collision between cars) generation in numerical simulation by varying the percentages of different driver profiles in the traffic, aiming at establishing a relationship between driving profiles and car collision occurrences. The profiles are extracted from the NGSIM database and integrated in the traffic simulator SUMO [4, 27, 28] using the calibrated intelligent driver model (IDM), which is one of the most human-like car-following models [29, 30]. For the simulation, we used the IDM with an extension of driver reaction times. Therefore, our main goals in this study are to extract driver profiles from real driving data and replicate them using a microscopic traffic model, establish a method of reproducing realistic traffic simulation based on microscopic road traffic modeling in the SUMO simulator, generate car collisions by appropriately varying the percentages of the driver profiles, characterize the relationship between the car collision occurrences with the driver profiles, and observe the severity of the generated car collisions.

2. Materials and Methods

In this section, we present the relevant materials and the proposed method. The approach structure is shown in Figure 1. In this work, we study the driver profiles from car trajectories and propose the definition of an aggressive driver profile as a driver who always leaves short time headways with respect to the leading vehicle and an inattentive driver profile as a driver with particularly long reaction times compared to other drivers. We classify all drivers with intermediate values of reaction time and time headway as the normal profile. The selection and thresholds for the three driver profiles were based on the NGSIM 101 dataset [9]. After the classification of the driver profiles, the specific driver profiles (aggressive and inattentive) are represented using the calibrated IDM model (the existing IDM model with an extension of reaction time), and the road traffic is then simulated using SUMO (“Simulation of Urban MObility”) [28]. The IDM model calibration is performed using a genetic algorithm to find the optimal set of IDM parameters with an objective of minimizing the predefined error between real driver trajectory and the output of the IDM model. In essence, we artificially increase the percentage of drivers with extreme profiles (aggressive or inattentive) and then count and analyze the car collisions generated by traffic simulation. In the numerical simulation experiments, we propose 4 different experiments using different combinations of driver groups.

The remaining sections of this study are structured as follows: in the rest of this section, from Subsection 2.1 to 2.4, we present the necessary elements we predefined for our method of vehicular collision generation, including (1) the presentation of the NGSIM 101 dataset, (2) the proposition of different driver profiles, (3) the SUMO traffic simulator and the choice of car-following model (IDM model), and (4) the method of IDM model calibration to represent different driver profiles. In Sections 3–5, we present the specific driver profiles in the NGSIM 101 dataset, the calibration of the IDM model for the specific drivers, and the description of the simulation experiments. Section 6 presents the results obtained from numerical simulations and the investigation of the relationship between generated car collisions and driver profiles. Section 7 presents the validation of the approach with a different part of the NGSIM 101 dataset. Section 8 deals with the severity of simulated car collisions with respect to different driving profiles. Finally, Section 9 summarizes major findings and provides recommendations for further research.

2.1. NGSIM 101 Dataset Description

NGSIM 101 [9] is an open dataset released by the US Federal Highway Administration (FHWA). It covers a 640-meter highway section in Los Angeles, California; all vehicle trajectories are provided at a rate of 10 Hz, and all data were collected during the rush hour from 7:50 a.m. to 8:35 a.m. This section has 5 normal lanes and one auxiliary lane connecting an on-ramp and an off-ramp. In our study, we focused on the car trajectories only in the 5 main lanes. In addition, the whole NGSIM 101 dataset is divided into three subsets of 15 minutes, which cover the traffic from 7:50 a.m. to 8:05 a.m., from 8:05 a.m. to 8:20 a.m., and from 8:20 a.m. to 8:35 a.m. In Figure 2, we present traffic volumes and speeds during each of the three time periods. It can be observed that at the beginning of the first 15 minutes, the mean speed is between 10 and 15 m/s. After that, traffic becomes denser, resulting in a sudden fall of the mean speed. In the second and third 15-minute time periods, traffic is more congested and the mean traffic speed varies between 5 and 10 m/s.

In this work, we applied the proposed approach to the first 15-minute dataset. The second 15-minute dataset was then used for method validation. In our approach, we are only interested in understanding driving profiles in terms of car-following behavior, without considering lane changes or other behaviors. After processing the data of the first 15-minute time frame, we selected approximately 1500 car trajectories (drivers) from a total population of 1993 vehicle trajectories to extract specific driver profiles. The selection was based on the following condition: each car must always have a leading car, and its trajectory must be continuous for at least 40 s. The same data processing was applied to the second 15 minutes of data and resulted in the selection of 1300 car trajectories out of a total of 1495.

2.2. Classification of Driver Profiles

A driver profile can be defined as the average driving behavior of a given driving class [31]. Driving behavior is related to a driver’s skills, sociodemographic status (age, gender, and occupation) and current psychological state (fatigue and distraction). From vehicle trajectory data, a driver’s characteristics can be illustrated by the speed, acceleration, jerk, and some other relevant traffic indicators, such as time headway (THW) and time to collision (TTC) [31]. THW and TTC are critical safety indicators for car-following behavior [32]. THW measures the time passed before reaching the leading vehicle’s position while running at current speed, while TTC is usually used for judging the moment to start braking and in the control of braking [33]. Several studies show that the distribution of THW is related to driving speed and also to traffic flow conditions. According to [34], a negative correlation between the car speed and THW can exist. In [35], it was shown that the speed and THW follow different distribution patterns under different traffic density levels. The THW can illustrate better the driver’s characteristics in all conditions of car-following behavior than the TTC, which is only important for the braking behavior.
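The two indicators discussed above can be illustrated with a minimal Python sketch. It assumes that the distance to the leading vehicle and both speeds are already available from the trajectory data; the function and argument names are hypothetical and are not taken from the NGSIM tooling.

```python
def time_headway(distance_m, follower_speed_mps):
    """THW: time needed to reach the leader's current position at the current speed."""
    if follower_speed_mps <= 0.0:
        return float("inf")  # a stopped follower never reaches the leader's position
    return distance_m / follower_speed_mps


def time_to_collision(distance_m, follower_speed_mps, leader_speed_mps):
    """TTC: time until contact if both vehicles keep their current speeds."""
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0.0:
        return float("inf")  # the follower is not closing in on the leader
    return distance_m / closing_speed
```

The sketch also shows why THW is informative in every car-following state, whereas TTC is finite only while the follower is faster than the leader.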

In this work, the classification of different driver profiles begins with the observations from the NGSIM data. Figure 3 shows the distribution of each car driver’s maximum THW, mean THW, minimum THW, and standard deviation of THW, for the 1993 vehicle trajectories of the first 15 min NGSIM 101 dataset.

Figure 3 shows evidence of the heterogeneity of human driving profiles: the mean THW ranges from near 0 s to 5 s, and the minimum THW ranges from near 0 s to more than 3 s. Based on these preliminary findings, we propose the definition of three profiles: (i) aggressive: shorter time headway, (ii) inattentive: longer reaction time, and (iii) normal: intermediate values of reaction time and time headway. Since the reaction time cannot be observed directly in real driving data, we estimate it: we assume that the reaction time is approximately equal to the minimum THW value, which is the minimum safe time gap that the driver judges should be maintained with the leading vehicle over the whole trajectory (longer than 400 time steps, as explained in Section 2.1). The definitions of the two extreme driver profiles (aggressive and inattentive) are formalized as follows.

(1) Aggressive driver profile: a driver is considered to be aggressive with respect to a threshold $\theta_a$ on the time headway if

$$\overline{\mathrm{THW}} < \theta_a, \tag{1}$$

where $\overline{\mathrm{THW}}$ denotes the driver's mean THW over the trajectory. We will explain in Section 3 how we fixed the threshold in our approach.

(2) Inattentive driver profile (drivers with long reaction time): a driver is considered to be inattentive (with a long reaction time) with respect to a threshold $\theta_i$ on the time headway if

$$\mathrm{THW}_{\min} > \theta_i, \tag{2}$$

where $\mathrm{THW}_{\min}$ denotes the driver's minimum THW over the trajectory. We will explain in Section 3 how we fixed the threshold in our approach.

(3) Normal driver profile: the drivers whose profiles are neither aggressive nor inattentive are called normal. They have intermediate values of reaction time and time headway.
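The classification rule can be sketched in Python as follows. This is an illustration only: the function name and the threshold arguments are hypothetical, and the actual threshold values are those fixed in Section 3, not the ones a caller might pass here.

```python
from statistics import mean


def classify_driver(thw_series, aggressive_threshold, inattentive_threshold):
    """Label one driver from the THW values (in seconds) along the trajectory.

    aggressive_threshold: upper limit on the mean THW (assumed input).
    inattentive_threshold: lower limit on the minimum THW, which is used
    as a proxy for the reaction time (assumed input).
    """
    if mean(thw_series) < aggressive_threshold:
        return "aggressive"
    if min(thw_series) > inattentive_threshold:
        return "inattentive"
    return "normal"
```

Because the minimum THW never exceeds the mean THW, a driver cannot satisfy both conditions as long as the aggressive threshold is below the inattentive one.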

2.3. SUMO Traffic Simulator and IDM Car-Following Model

SUMO (“Simulation of Urban MObility”) is a microscopic open-source traffic simulator [4, 27, 28], widely used for traffic research. Microscopic traffic models generally include two driver behavior models: (a) car-following, corresponding to the behavior of a driver in reaction to the actions or stimulus of the leading vehicle; (b) lane change, corresponding to lane-changing behavior, which includes overtaking maneuvers as well as insertion into a target lane. We limit our work to the car-following component. In addition, during the simulation, SUMO can detect physical collisions (front and back bumpers meet or overlap) [36], and a collision is recorded when a following vehicle collides with the rear end of the preceding vehicle. The collision counts become available at the end of each simulation.

Many interesting reviews on car-following can be found in the literature [37–39]. We briefly present here some of the most commonly used models. The GHR model [40] is a stimulus-response model in which drivers accelerate or decelerate depending on car speeds, relative car speeds with respect to leading vehicles, intervehicle distances, and drivers’ reaction time. Wiedemann’s (1974) car-following model is used in the VISSIM simulator and describes the psychophysiological aspects of driving behavior in terms of four discrete driving regimes. This model considers different modes of operation, divided into “no reaction zone” (free-road), “closing in,” “must decelerate,” and “car-following” by human perceptual thresholds [38]. The Gipps model [41] and Krauss model [42] are based on safe distance. In the Gipps model, drivers update their car speeds so as to keep a minimum safe distance between themselves and the leading vehicle, in order to avoid collisions and in anticipation of the extreme case of a leading car braking suddenly. The Krauss model [42] is the default car-following model in the SUMO simulator. It extends the Gipps model by modeling the imperfection of human drivers with stochastic terms of car speed. This property makes (or is supposed to make) the model more realistic. The IDM car-following model was first published in [43], which improved the initial results produced by Gipps’ model. The acceleration is calculated as a function of the desired speed and the desired space headway. The IDM model is suitable for both free flow and congestion phases [38]. This attribute makes the model more efficient and facilitates its calibration with real driving data.

We chose here the IDM car-following model [43], which is already implemented in SUMO; several research works have shown that the IDM is the most human-like car-following model among those compared [29, 30]. We briefly recall the model:

$$\dot{v} = a\left[1 - \left(\frac{v}{v_0}\right)^{\delta} - \left(\frac{s^{*}(v,\Delta v)}{s}\right)^{2}\right], \tag{3}$$

where

$$s^{*}(v,\Delta v) = s_0 + vT + \frac{v\,\Delta v}{2\sqrt{ab}}, \tag{4}$$

where $\dot{v}$ is the vehicle acceleration, $v$ is the vehicle speed, $s$ is the intervehicular distance, $\Delta v$ is the relative speed with respect to the leading vehicle, $a$ is the maximum acceleration, $b$ is the maximum deceleration, $v_0$ is the desired speed, $\delta$ is the exponential parameter of speed (we fixed it at 4 referring to [43]), $s_0$ is the minimal intervehicular distance to the leading vehicle, and $T$ is the desired time headway (desired THW).

The output of the IDM model is the acceleration, given as a function of the influencing factors (the inputs of the model): the vehicle's own speed $v$, the intervehicular distance $s$, and the relative speed $\Delta v$ with respect to the leading vehicle.
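As an illustration of the model's inputs and output, the standard IDM acceleration can be written as a short Python function. This is a minimal sketch of the published IDM formula, not SUMO's implementation; the function and argument names are hypothetical, and `dv` is taken as the follower speed minus the leader speed.

```python
import math


def idm_acceleration(v, s, dv, a_max, b, v0, s0, T, delta=4.0):
    """Return the IDM acceleration in m/s^2.

    v: own speed (m/s); s: intervehicular distance (m), must be positive;
    dv: own speed minus leader speed (m/s); a_max, b: maximum acceleration
    and deceleration; v0: desired speed; s0: minimal distance; T: desired THW.
    """
    if s <= 0.0:
        raise ValueError("intervehicular distance must be positive")
    # Desired dynamic gap: standstill distance, time-headway term, braking term.
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / s) ** 2)
```

On an empty road (very large `s`) and at standstill the function returns approximately `a_max`, and it becomes strongly negative when the actual gap is much smaller than the desired gap.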

In addition, we note that the original IDM car-following model does not include reaction time as a parameter [39]. Nevertheless, in a recent development of the SUMO simulator [28], the authors improved the modeling of reaction time. They indicated that reaction time could be introduced into the driving model (car-following and lane change models) as an additional parameter. In the work we present here, we introduced a reaction time parameter into the IDM car-following model. In SUMO, the simulation step length, denoted here by $\Delta t$, is configurable, and the driver reaction time is a delay in the decisions to update the acceleration and to change lanes, given the present state of traffic. This decision delay (reaction time) is expressed through an additional parameter, denoted here by $\tau$, which we determine by calibration.

2.4. Method of Calibration of the IDM Model

In order to simulate real driver behavior and driver profiles, the microscopic traffic model must be calibrated with real driving data. Two main types of calibration of car-following models exist: (1) estimation of driving model parameters accounting for the physical meanings of each parameter [44], in which parameters (such as speed maximum and acceleration maximum) can be extracted directly from vehicle trajectory data, such as the maximal value of speed in the trajectory and maximal value of acceleration in the trajectory, and (2) calibration of driving models, which can be constructed as an optimization problem. In the second approach, the objective function and the optimization method need to be selected, and the problem is performed to minimize the distance between the simulated vehicle trajectories by the model and real vehicle trajectories. Therefore, by this method, the optimal set of parameters can be found. In the methods of calibration by optimization, several mathematical optimization methods and algorithms such as Newton, Gauss-Newton, gradient descent, and Levenberg-Marquardt are presented for car-following model parameters optimization in [45]. The authors of [45] proved that genetic algorithms (GAs) are also effective to solve optimization problems for car-following model calibration. Many of the recent works on car-following model calibration used GA methods [29, 30], and their works show that the GA optimization method is efficient for car-following model calibration.

In this work, the parameters which need to be calibrated in the IDM model are as follows: $a$ (maximal acceleration), $b$ (maximal deceleration), $v_0$ (desired speed), $s_0$ (minimal intervehicular distance to the leading vehicle), and $T$ (desired THW), as well as the additional parameter for the reaction time, $\tau$. In addition, the parameter $\delta$ (exponential parameter of speed) is fixed at 4, as proposed in [43].

In our approach, the identification of drivers with extreme profiles is performed by calibrating the parameters of the IDM model using a genetic algorithm, following the work in [30]. The GA method used is the default one in Matlab 2020b [46]; the algorithm is described in the book [47]. In general, the mean square error (MSE) is widely used for the calibration of car-following models. Here, another error metric is used, as in [30]: the root mean square percentage error (RMSPE),

$$\mathrm{RMSPE}(\beta) = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\frac{x^{\mathrm{sim}}_{k}(\beta) - x^{\mathrm{data}}_{k}}{x^{\mathrm{data}}_{k}}\right)^{2}}, \tag{5}$$

where $x^{\mathrm{data}}$ (respectively, $x^{\mathrm{sim}}$) denotes the car trajectory (positions in time) from data (respectively, from simulation), $N$ is the number of time steps in the trajectory, and $\beta$ represents all the parameters of the considered model. The car longitudinal position is chosen as the target variable in the objective function.
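For illustration, the error metric can be computed as in the following Python sketch. It assumes the common definition of the RMSPE (relative errors squared, averaged, then square-rooted); the exact variant used in [30] may differ, and the function name is hypothetical.

```python
import math


def rmspe(simulated, observed):
    """Root mean square percentage error between two position series.

    Both series must have the same length, and observed positions must be
    non-zero because each error is taken relative to the observation.
    """
    if len(simulated) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(((sim - obs) / obs) ** 2 for sim, obs in zip(simulated, observed))
    return math.sqrt(total / len(observed))
```

A value of 0.01 corresponds to an error of 1% on the longitudinal position.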

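The IDM model calculates the car acceleration, as described in (6), with the driver’s reaction time $\tau$. The reaction time is the decision delay with which the driver updates the acceleration in response to the current situation relative to the leading vehicle. The car’s speed and position can then be calculated as in (7) and (8), based on the Euler method:

$$\dot{v}(t) = a\left[1 - \left(\frac{v(t-\tau)}{v_0}\right)^{\delta} - \left(\frac{s^{*}\big(v(t-\tau),\Delta v(t-\tau)\big)}{s(t-\tau)}\right)^{2}\right], \tag{6}$$

$$v(t+\Delta t) = v(t) + \dot{v}(t)\,\Delta t, \tag{7}$$

$$x(t+\Delta t) = x(t) + v(t)\,\Delta t, \tag{8}$$

where $x$ is the car longitudinal position and $\Delta t$ is the simulation time step.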
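The integration scheme can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure implemented in SUMO or in the authors' calibration code: the reaction time is modeled as a whole number of steps (`delay_steps`), the leader length of 5 m is an arbitrary placeholder, and all function names are hypothetical.

```python
import math


def idm_acceleration(v, s, dv, a_max, b, v0, s0, T, delta=4.0):
    """Standard IDM acceleration; dv is follower speed minus leader speed."""
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / s) ** 2)


def simulate_follower(leader_pos, leader_speed, x_init, v_init, params,
                      dt=0.1, delay_steps=0, leader_length=5.0):
    """Integrate a follower trajectory with the explicit Euler method.

    The acceleration applied at step k is computed from the state observed
    delay_steps earlier, which plays the role of the reaction time.
    """
    xs, vs = [x_init], [v_init]
    for k in range(len(leader_pos) - 1):
        j = max(0, k - delay_steps)  # index of the delayed observation
        # Keep the gap strictly positive so the IDM formula stays defined.
        gap = max(leader_pos[j] - xs[j] - leader_length, 0.01)
        acc = idm_acceleration(vs[j], gap, vs[j] - leader_speed[j], **params)
        vs.append(max(vs[k] + acc * dt, 0.0))  # Euler speed update, no reversing
        xs.append(xs[k] + vs[k] * dt)          # Euler position update
    return xs, vs
```

The simulated positions `xs` are the series that would be compared with the observed positions in the calibration objective.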

In addition, drivers with the normal profile are simulated using an average profile. Their IDM parameters are not calibrated by optimization: they are extracted directly from the dataset according to their physical meanings and combined with the default values in the SUMO simulator.

3. Specific Driver Profiles in the NGSIM Dataset

In a traffic dataset with a large number of trajectories, it is always possible to distinguish good drivers from risky ones. Following the definition of the two extreme driver profiles of Section 2.2, we consider the following four driver profile groups in the NGSIM dataset: group 1: the 2.5% most aggressive drivers, group 2: the second 2.5% most aggressive drivers (drivers ranked between 2.5% and 5% of mean THWs), group 3: the 2.5% most inattentive drivers, and group 4: the second 2.5% most inattentive drivers (drivers ranked between 2.5% and 5% in terms of inattentiveness). Mathematically, the four groups are defined as follows.

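Let us denote by $S(\theta)$ the set of drivers whose mean THW, $\overline{\mathrm{THW}}_i$ for driver $i$, is less than $\theta$:

$$S(\theta) = \left\{\, i : \overline{\mathrm{THW}}_i < \theta \,\right\}.$$

Now, we denote by $N$ the total number of drivers of the considered dataset and by $|S(\theta)|$ the cardinal (number of elements) of $S(\theta)$.

Finally, we denote by $p(\theta)$ the proportion of the number of drivers in $S(\theta)$ with respect to the total number of drivers $N$:

$$p(\theta) = \frac{|S(\theta)|}{N}.$$

Let us now define $\theta_1$ and $\theta_2$ as follows:

$$\theta_1 = \max\{\theta : p(\theta) \le 2.5\%\}, \qquad \theta_2 = \max\{\theta : p(\theta) \le 5\%\}.$$

Groups 1 and 2 are then defined as follows:

$$G_1 = S(\theta_1), \qquad G_2 = S(\theta_2) \setminus S(\theta_1).$$

Groups 3 and 4 are defined similarly, using the minimum THW, $\mathrm{THW}_{\min,i}$ for driver $i$, and the drivers with the largest values; that is, with $S'(\theta) = \{\, i : \mathrm{THW}_{\min,i} > \theta \,\}$ and $p'(\theta) = |S'(\theta)|/N$,

$$\theta_3 = \min\{\theta : p'(\theta) \le 2.5\%\}, \qquad \theta_4 = \min\{\theta : p'(\theta) \le 5\%\}, \qquad G_3 = S'(\theta_3), \qquad G_4 = S'(\theta_4) \setminus S'(\theta_3).$$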

Applying the definitions of groups 1, 2, 3, and 4 above, we obtain the thresholds of Table 1 on the mean THW for groups 1 and 2 and on the minimum THW for groups 3 and 4 (Figure 4).
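In practice, the four groups amount to ranking the drivers and cutting the ranking at 2.5% and 5%. The following Python sketch illustrates this under simple assumptions: per-driver scores (mean THW or minimum THW) are already computed and stored in a dictionary, group sizes are rounded to whole drivers, and the function name is hypothetical.

```python
def split_extreme_groups(scores, most_extreme="low",
                         first_share=0.025, second_share=0.05):
    """Return the most extreme 2.5% of drivers and the next 2.5%.

    scores: dict mapping driver id to mean THW (use most_extreme="low" for
    aggressive drivers) or to minimum THW (use most_extreme="high" for
    inattentive drivers).
    """
    ranked = sorted(scores, key=scores.get, reverse=(most_extreme == "high"))
    n_first = round(first_share * len(ranked))
    n_second = round(second_share * len(ranked))
    return ranked[:n_first], ranked[n_first:n_second]
```

For example, calling the function on the mean THW values with `most_extreme="low"` yields groups 1 and 2, and calling it on the minimum THW values with `most_extreme="high"` yields groups 3 and 4.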

4. IDM Model Calibration for Extreme Driver Profiles

As mentioned above, the calibration is formulated as an optimization problem whose aim is to minimize the error between the car trajectories simulated by the IDM model and the real car trajectories from the dataset. We chose here to apply a genetic algorithm (the default algorithm in Matlab [48]) to search for the globally optimal parameters of the selected drivers. Moreover, we repeated the genetic algorithm 10 times for every driver trajectory and kept the best solution, in order to increase the chance of finding a global optimum for the calibrated parameters.

In the application of the genetic algorithm, upper and lower bounds need to be set for all parameters in the IDM model. The bounds for the reaction time and the desired THW need to be set differently for the aggressive and inattentive driver profiles because the real distributions of these parameters differ between the two profiles (Figure 4). Thus, for aggressive drivers, the desired THW ranges from 0.1 to 4 s and the reaction time from 0.1 to 2 s; for inattentive drivers, the desired THW ranges from 1 to 4 s, while the reaction time is bounded within an interval defined relative to the driver's minimum THW (see Section 2.2, where we approximate the driver’s reaction time by the minimum THW). For the other parameters, the bounds are the same for both aggressive and inattentive drivers, as in [30]. Thus, the desired speed ranges from 10 to 40 m/s, the minimal intervehicular distance ranges from 0.1 to 10 m, and the maximum acceleration and the maximum deceleration range from 0.1 to 5 m/s².
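To make the role of the bounds and of the repeated runs concrete, the following Python sketch shows a generic bounded genetic algorithm with restarts. It is not Matlab's `ga` and does not reproduce its operators or default settings; the population size, mutation scale, and all function names are arbitrary assumptions chosen for illustration.

```python
import random


def genetic_minimize(cost, bounds, pop_size=40, generations=80, seed=None):
    """Minimize cost(params) over box bounds with a very small GA."""
    rng = random.Random(seed)
    names = list(bounds)

    def tournament(scored):
        first, second = rng.sample(scored, 2)
        return first[1] if first[0] <= second[0] else second[1]

    population = [{n: rng.uniform(*bounds[n]) for n in names}
                  for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(((cost(ind), ind) for ind in population),
                        key=lambda pair: pair[0])
        next_pop = [scored[0][1], scored[1][1]]  # elitism: keep the two best
        while len(next_pop) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            child = {}
            for n in names:
                lo, hi = bounds[n]
                w = rng.random()
                # Blend crossover followed by Gaussian mutation.
                gene = w * p1[n] + (1.0 - w) * p2[n] + rng.gauss(0.0, 0.05 * (hi - lo))
                child[n] = min(max(gene, lo), hi)  # enforce the bounds
            next_pop.append(child)
        population = next_pop
    best = min(population, key=cost)
    return best, cost(best)


def calibrate_with_restarts(cost, bounds, restarts=10, seed=0):
    """Run the GA several times and keep the run with the lowest cost."""
    runs = [genetic_minimize(cost, bounds, seed=seed + r) for r in range(restarts)]
    return min(runs, key=lambda run: run[1])
```

In a calibration setting, `cost` would map a parameter set to the RMSPE between the simulated and observed trajectories of one driver, and `bounds` would hold the ranges given above, for example `{"T": (0.1, 4.0), "tau": (0.1, 2.0)}` for an aggressive driver.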

The results of calibration for the 4 specific groups of drivers are given in Appendix A. For the calibration on the car trajectory variable (car positions in time), we obtained for all 4 driver profile groups an RMSPE between 1% and 2%. From the calibration results, given in Tables 2–5, we notice that group 1 driver profiles have shorter reaction times and desired time headways than group 2 driver profiles. On the other hand, group 3 driver profiles have longer reaction times and desired time headways than group 4 driver profiles. Moreover, the desired THW (parameter T of formula (4)) of inattentive drivers is much larger than that of aggressive drivers. These calibrated IDM parameters for aggressive and inattentive drivers are consistent with the findings from the dataset (Figure 4).

5. Numerical Simulation Experiments Setup

Before generating the collisions in traffic, we need to ensure that the NGSIM traffic can be simulated realistically. The reproduction of the NGSIM traffic data by numerical simulation consists of the creation of the road section, the selection of the microscopic traffic model (IDM car-following model plus the default lane changing model in SUMO), and the creation of vehicles. The NGSIM road network is drawn manually using NETEDIT [49], provided with the SUMO simulator, and is represented as a 640-meter road with 5 straight lanes. Each vehicle is created from its longitudinal origin position, longitudinal destination position, entering lane, and entering time (in seconds). This information is extracted from the car trajectory data.

As assumed in Section 3, each extreme driver profile accounts for 2.5% of the drivers in the initial traffic simulation. Figure 5 shows the mean car speed from the real NGSIM data and from the numerical simulation in SUMO. The figure shows that the traffic simulation reproduces the state of traffic (car speed) over time well. This first simulation, with the initial distribution of driver profiles, shows that we can simulate the real NGSIM traffic in SUMO. Under this condition, we can begin the experiments for generating collisions. As noted in Section 2.3, SUMO detects physical collisions (front and back bumpers meet or overlap) during the simulation [36], and the collision counts are available at the end of each simulation.

Four experiments of car collision simulation are proposed. Each experiment combines one group of aggressive drivers with one group of inattentive drivers. The driver profiles used in each experiment are given in Table 6, and the different groups of driver profiles are described in Section 3. We assume here that car collisions can be generated in each simulation by increasing the number of aggressive and/or inattentive drivers in the traffic. The percentages of extreme driver profiles are increased artificially and randomly by replacing normal drivers with the chosen extreme driver profiles. In each experiment, we run a series of simulations using different rates of drivers associated with each extreme driver profile (2.5% as in the original dataset, and then 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, and 50%, for aggressive drivers and inattentive drivers, respectively, in each simulation). For each simulation, once the rates of the extreme driver profiles are fixed, all the other drivers are simulated as normal.
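The random assignment of profiles for one simulation can be sketched in Python as follows. This is a simplified illustration: it draws all extreme drivers at random for the requested rates, whereas in the experiments the initial 2.5% are the drivers identified in the dataset. The function name, the seed argument, and the rate list variable are hypothetical.

```python
import random

# Rates tested for each extreme profile, as listed in the text.
RATES = [0.025, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]


def assign_profiles(driver_ids, aggressive_rate, inattentive_rate, seed=None):
    """Randomly label drivers as aggressive, inattentive, or normal."""
    if aggressive_rate + inattentive_rate > 1.0:
        raise ValueError("the two rates cannot exceed 100% in total")
    rng = random.Random(seed)
    shuffled = list(driver_ids)
    rng.shuffle(shuffled)
    n_aggressive = round(aggressive_rate * len(shuffled))
    n_inattentive = round(inattentive_rate * len(shuffled))
    profiles = {driver: "normal" for driver in shuffled}
    for driver in shuffled[:n_aggressive]:
        profiles[driver] = "aggressive"
    for driver in shuffled[n_aggressive:n_aggressive + n_inattentive]:
        profiles[driver] = "inattentive"
    return profiles
```

Looping over every pair of rates in `RATES` and calling `assign_profiles` with a different seed for each repetition would produce the grid of simulations for one experiment.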

6. The Results

In this section, we give the results of the numerical simulations. In Subsection 6.1, we provide the number of car collisions generated by numerical simulation, based on the first 15 minutes data, and, respectively, for the defined four experiments. In Subsection 6.2, we focus on the analysis of the car collisions obtained in the simulations of experiment 1 (E1). In addition, we show that the number of simulated car collisions can be written as a function of the rates of aggressive and inattentive driver profiles in traffic. In Section 7, we use another 15-minute time period data for the validation of our approach.

6.1. Number of Car Collisions by Numerical Simulation

We start with the implementation of experiment E1. We observed that the number of collisions varies from one simulation to another, even when the same rates of specific drivers are used. This is due to the random distribution of the specific (aggressive and inattentive) driver profiles over all the drivers. Furthermore, we observed that the mean number of collisions converged after several simulations for each case. Therefore, the collision counts shown in the following tables are the average values over 10 simulations for each case. The numbers of car collisions obtained by numerical simulation are given in Tables 7–10 for experiments E1, E2, E3, and E4, respectively.

In Table 7, the simulated car collision counts are given for different rates of the specific profiles in E1. The results clearly indicate that the number of intervehicular collisions increases with the number of aggressive and/or inattentive driver profiles. The results for E2, E3, and E4, given in Tables 8–10, respectively, show the same trend. The numbers of car collisions obtained in these three experiments are lower than those simulated in E1 (compare Tables 8–10 with Table 7), and the number obtained with E4 is the lowest of the four experiments. This result shows that a different risk level is associated with each group of specific driver profiles. The first group of aggressive driver profiles could potentially cause more car collisions than the second group of aggressive driver profiles (comparing the results of E1 and E2). Similarly, the first group of inattentive driver profiles could cause more car collisions than the second group of inattentive driver profiles (comparing the results of E3 and E4). It is interesting to note that in all four experiments, no collisions occurred when the percentage of extreme drivers was low. As the rate of each specific driver profile increases, the number of car collisions increases. However, this increase differs from one experiment to another (Figure 6), where each line presents the number of car collisions in the simulations of one experiment under the condition that aggressive drivers and inattentive drivers have the same rate in the traffic. Figure 6 clearly shows that E1, in which car collisions are generated by varying the percentages of the most aggressive and the most inattentive drivers, generates more car collisions than the other three experiments. E4, in which car collisions are generated by varying the percentages of the second group of aggressive drivers and the second group of inattentive drivers, appears to generate the fewest.

6.2. Analysis of Car Collisions Obtained in Experiment 1 (E1)

In this subsection, we further analyze the results obtained in E1 (Table 7). A car collision is observed in SUMO when a following vehicle (veh2) crashes into the rear end of the preceding vehicle (veh1). We can retrieve the driver profiles of the two vehicles involved in each collision. The distribution of rear-end car collisions by driver profile (aggressive, normal, and inattentive) is shown, in percentages, in Figure 7, where the suffixes ag, long, and nor indicate aggressive, inattentive, and normal driver profiles, respectively. For each pair of driver profiles, we give the number of car collisions it caused as a percentage of all generated collisions. Figure 7 shows that collisions between an inattentive driver (veh1) and an aggressive driver (veh2) represent the largest share, at 28.99% of the total number of simulated collisions.

Furthermore, Figure 8 shows the relationship between the number of simulated car collisions and the rates of aggressive and inattentive driver profiles, which is another way of presenting the values given in Table 7. Figure 8 suggests that the sum of the two rates of aggressive and inattentive driver profiles is important in determining the number of car collisions. The contours (the different levels of the number of collisions) vary regularly with the percentages of specific drivers: the gaps between levels are similar, and the contours have a similar shape. However, the contours are not straight lines, which means that the sum of the rates of aggressive and inattentive driver profiles is important but is not the only parameter that determines the number of car collisions.

To better understand this relationship, we use statistical regression, assuming that the car collision count (Y) is proportional to the percentage of aggressive drivers (%) and the percentage of inattentive drivers (%), which can be formulated as .

We first tried linear regression; the result, given in Appendix B, shows that the assumption of linearity is not convincing. Moreover, we observe from Table 7 (red cells) that the number of car collisions generated with 30% of aggressive driver profiles and 30% of inattentive driver profiles is 60. Holding the sum of the two rates at 60%, however, with 25% (also, respectively, 20%, 15%, 10%, 35%, 40%, 45%, and 50%) of aggressive driver profiles and 35% (also, respectively, 40%, 45%, 50%, 25%, 20%, 15%, and 10%) of inattentive driver profiles, the number of generated car collisions is lower. Therefore, it seems that the number of generated car collisions decreases as the absolute difference between the rates of aggressive and inattentive driver profiles increases. In other words, the number of generated car collisions increases with the uniformity of the mix of the two profiles (aggressive and inattentive). Based on this observation, we propose to add a quadratic term to the approximation of the relationship between the number of generated car collisions and the rates of aggressive and inattentive driver profiles, where is the estimation of car collision counts, % is the percentage of aggressive drivers, and % is the percentage of inattentive drivers. We set the lower bound of the function at zero, since the number of car collisions cannot be negative. We obtained the following coefficients: , and , with RMSE .

This result supports two conclusions. First, the fact that confirms the significance of the effect of the absolute difference between the two rates of aggressive and inattentive driver profiles on the number of generated car collisions. Second, the fact that is negative confirms our hypothesis and shows that the number of generated car collisions decreases as the absolute difference between the rates of aggressive and inattentive driver profiles increases.
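As a reading aid, one form consistent with the description above (linear terms in the two rates, a quadratic term in their difference, and a lower bound at zero) could be written as below. The symbols $a$ and $b$ (rates of aggressive and inattentive drivers) and the coefficients $\theta_0, \dots, \theta_3$ are notation assumed for this sketch; they are not the fitted model or its values.

```latex
\hat{Y}(a,b) = \max\bigl(0,\; \theta_0 + \theta_1 a + \theta_2 b + \theta_3 (a-b)^2 \bigr)

% Fix the sum s = a + b and write d = a - b, so a = (s+d)/2 and b = (s-d)/2:
\theta_0 + \theta_1 a + \theta_2 b + \theta_3 (a-b)^2
  = \theta_0 + \tfrac{\theta_1+\theta_2}{2}\, s
    + \tfrac{\theta_1-\theta_2}{2}\, d + \theta_3 d^2
```

Under this assumed form, a negative $\theta_3$ makes the expression concave in $d$ for a fixed sum $s$, with its peak at $d^{\ast} = -(\theta_1-\theta_2)/(4\theta_3)$, which is the even mix $d = 0$ when the two linear coefficients are equal.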

In Figure 9, the number of generated car collisions for each pair of percentages of the two driver profiles is presented, together with the surface obtained by the nonlinear regression function.
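A regression surface of this kind could be fitted along the following lines. This is a sketch under stated assumptions: the model form (two linear terms, a squared-difference term, an intercept), the helper names, and the coefficient values used to generate the synthetic data are all chosen for illustration and are not the paper's. The fit uses ordinary least squares on the unclipped form and applies the zero lower bound only at prediction time, which is a simplification.

```python
import numpy as np


def design_matrix(a, b):
    """Columns: aggressive rate, inattentive rate, squared difference, intercept."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.column_stack([a, b, (a - b) ** 2, np.ones_like(a)])


def fit(a, b, y):
    """Least-squares coefficients for the assumed (unclipped) model."""
    coef, *_ = np.linalg.lstsq(design_matrix(a, b), np.asarray(y, dtype=float), rcond=None)
    return coef


def predict(coef, a, b):
    """Predicted collision count, bounded below by zero."""
    return np.maximum(0.0, design_matrix(a, b) @ coef)


# Grid of rates (in percent) for the two extreme profiles, as in the experiments.
rates = np.array([2.5, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50])
A, B = np.meshgrid(rates, rates)
a, b = A.ravel(), B.ravel()

# Synthetic targets from arbitrary coefficients (NOT the paper's values),
# used only to show that the fit recovers them.
true_coef = np.array([1.2, 1.1, -0.01, -15.0])
y = design_matrix(a, b) @ true_coef

coef = fit(a, b, y)
rmse = float(np.sqrt(np.mean((design_matrix(a, b) @ coef - y) ** 2)))
```

With real collision counts in place of `y`, `predict(coef, A.ravel(), B.ravel()).reshape(A.shape)` would give the surface to plot against the simulated points.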

7. Validation of the Approach on the Second 15-Minute Time Period of Data

In this section, the same approach (Figure 1) for car collision generation is applied to the second 15-min time period of the NGSIM 101 dataset in order to validate the approach. In this period, 1495 vehicles are registered on the 5 normal traffic lanes. The thresholds on (mean THW) and for each driver profile are given in Table 11. The simulation of traffic in SUMO for this period is shown in Figure 10. The resulting simulated car collisions for the 4 experiments (E1–E4) are given in Tables 12–15.

The results for the second 15-minute period are close to those for the first 15-minute period. The thresholds obtained from the four experiments are similar. The simulated car collisions for the second 15-minute period show the same trend as for the first: experiment E1 produces the greatest number of car collisions, while E4 produces the lowest (Tables 12–15). Figure 11 shows the number of generated crashes for the four experiments based on the second 15-minute period, in the same way as Figure 6 does for the first period; each line presents the number of car collisions in the simulations of one experiment. As in Figure 6, experiment 1, in which car collisions are generated by varying the percentages of the most aggressive and the most inattentive drivers, generates more car collisions than the other three experiments, and experiment 4, in which car collisions are generated by varying the percentages of the second group of aggressive drivers and the second group of inattentive drivers, appears to generate the fewest. In addition, the number of car collisions obtained in experiment E1 based on the second 15-min data is given in Table 12. The results shown in Figure 12 are clearly similar to those shown in Figure 8. To further study the relationship between the number of collisions and the percentages of aggressive and inattentive drivers in the whole traffic population, we applied the same regression method as for the first 15-min time period. A nonlinear regression is performed (15), where we get the following coefficients: , and , with an RMSE.

8. Analysis of Crash Severity

All simulated car collisions can be further used to explore car collision severity. Several research efforts investigate the relationship of rear-end car collision severity to car speed. In [50], the authors reported that the critical impact speed was approximately 55 km/h for rear-end car collisions. In addition, in [51], the authors claim that the change of speed before and after car collision is a critical indicator for car collision severity outcomes.

In physics, the kinetic energy (KE) of an object is the energy that it possesses due to its speed (16). In the case of front-rear collision of two moving cars, the kinetic energy is related to the relative speed of the front vehicle and the rear car (17).
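For reference, the standard textbook expressions behind this argument are sketched below. The symbols ($m$ and $v$ for a single object; $m_1$, $v_1$ for the front vehicle and $m_2$, $v_2$ for the rear vehicle) are assumed notation, and the second expression is the usual two-body result for the energy tied to relative motion, which may differ in form from equation (17).

```latex
KE = \tfrac{1}{2}\, m v^2

% Kinetic energy associated with the relative motion of the two vehicles
% (the amount dissipated in a perfectly inelastic collision):
KE_{\mathrm{rel}} = \tfrac{1}{2}\,\frac{m_1 m_2}{m_1 + m_2}\,(v_2 - v_1)^2
```

For two vehicles of equal mass $m$, this reduces to $\tfrac{1}{4} m (v_2 - v_1)^2$, so in this idealized picture the energy available for deformation grows with the square of the relative speed.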

Relative speed could thus be an important surrogate indicator of collision severity. In simulations with 50% of drivers being aggressive and the other 50% being inattentive, 866 car collisions were generated (over 5 simulations). Of these, 220 (25.4%) involved aggressive-aggressive pairs of driver profiles, 408 (47.1%) involved inattentive-aggressive pairs, 107 (12.4%) involved inattentive-inattentive pairs, and 131 (15.1%) involved aggressive-inattentive pairs. Figure 13 shows the distribution of relative speed for these 866 simulated collisions. Car collisions involving inattentive (veh1)-aggressive (veh2) drivers accounted for the largest share of all simulated collisions. All generated collisions have a relative speed below 50 km/h. Furthermore, collisions between two inattentive drivers are the most severe: their mean relative speed is 23.27 km/h, higher than for any other pair of driver profiles. Collisions between two aggressive drivers are much less severe, with a mean relative speed of 15.7 km/h.
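The shares quoted above follow directly from the reported counts, as the short check below shows. The dictionary layout and variable names are assumptions for the example; the counts themselves are those given in the text, keyed as (veh1 profile, veh2 profile).

```python
# Collision counts by (preceding vehicle, following vehicle) profile pair,
# as reported in the text for the 50%/50% simulations.
counts = {
    ("aggressive", "aggressive"): 220,
    ("inattentive", "aggressive"): 408,
    ("inattentive", "inattentive"): 107,
    ("aggressive", "inattentive"): 131,
}

total = sum(counts.values())
# Share of each pair as a percentage of all collisions, to one decimal place.
shares = {pair: round(100 * n / total, 1) for pair, n in counts.items()}
# Pair with the largest number of collisions.
most_frequent = max(counts, key=counts.get)
```

The counts sum to 866, and the most frequent pair is an inattentive driver in the preceding vehicle followed by an aggressive driver.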

SCANeR studio [52, 53] is driving simulation software that includes the Bullet physics engine (free and open-source software for simulating collision detection and soft and rigid body dynamics) and makes it possible to simulate vehicle collisions with a physical vehicle model. For every car collision simulated in SUMO, the trajectories of the two vehicles involved can be recorded, and the collision scene can then be replayed in SCANeR studio. Thus, a car collision simulated in SUMO can be resimulated in SCANeR studio (Figure 14). Figure 15 shows the collision force for a number of replayed car collisions; each point corresponds to one car collision resimulated in SCANeR studio. The force appears to be a superlinear function of the relative speed (Figure 15). However, this observation needs further investigation and will be considered in our future research. The aim of this work will be to couple the SUMO traffic simulator with SCANeR studio (the immersive driving simulator) in order to build a high-performance system to test and validate autonomous vehicles.

9. Conclusion and Scope for Further Work

In this work, the main contributions consist of the following: extracting three proposed driver profiles based on real driving datasets, replicating these profiles in a simulated environment, and establishing a relationship between car collision occurrences and these different driver profiles by varying their percentages in the whole traffic via virtual simulation.

Based on the NGSIM 101 dataset, two specific driver profiles related to car collisions on road networks and a normal driver profile have been classified, defined as (i) aggressive drivers who keep short time headways with cars ahead of them, (ii) inattentive drivers with long reaction times, and (iii) normal drivers with intermediate values of reaction time and time headway. These three driver profiles have been simulated by using the intelligent driver model (IDM) car-following model, with an extension including driver reaction time. In order to represent the real driver profiles, the IDM model is calibrated using a genetic algorithm. Finally, by increasing the percentage of these two extreme driver profiles among all drivers in a virtual traffic simulation, we investigate the effect of these driver profiles on car collision occurrences.

The results of the numerical simulations show that the percentages of the aggressive and inattentive driver profiles over the whole driver population are determinant in the car collision occurrences and in the resulting severity outcomes. One of the important results we obtained in this work is the characterization of the relationship between the ratios of these two driver profiles over the whole driver population and the car collision occurrence count. We have also classified the car collisions that were generated and analyzed their severity, in particular with respect to the relative speed between the cars involved in the collisions. Another important result of this research is that the car collisions involving an inattentive driver in the leading vehicle and an aggressive driver following represent the most frequent collision occurrences, while collisions between two inattentive drivers were the most severe.

The safety validation of intelligent connected vehicles is essential and could be critical for their deployment. In order to complete the demonstration of the reliability of the system’s safety, autonomous vehicles need to be driven for hundreds of millions of miles. However, the huge cost of such physical tests, combined with the inherent danger of testing situations where collisions can happen, makes the numerical simulation of scenarios mixing different driver profiles an important safety assessment tool for ICV testing. Furthermore, during the deployment phase of ICVs, the recognition of drivers’ profiles should be considered in order to avoid collisions. As shown by our experiments, based on our approach, different traffic scenarios can be generated with different driver profiles in traffic simulations. For future research, this work is expected to greatly facilitate future ICV testing and validation for the car manufacturing industry via numerical traffic simulation.

In this work, we used the IDM model and the simulator SUMO to generate car collisions. The method is limited by calibration accuracy and model performance; therefore, the car-following model cannot fully replicate real human driving behavior. Regarding the perspectives for our proposed approach, tests with different car-following models need to be considered in further work. Moreover, this work is an experiment based on the NGSIM 101 dataset, and the profiles are extracted from this dataset. We also intend to work with other datasets in the future in order to confirm the derived driver profiles and/or derive new driver profiles related to car collision occurrences.

Appendix

A. Calibration Result for Specific Drivers

The calibration results for the 4 specific groups of drivers in the 15-minute data of the NGSIM 101 dataset are given as follows: in Table 2 for drivers of group 1, in Table 3 for drivers of group 2, in Table 4 for drivers of group 3, and in Table 5 for drivers of group 4. These tables show the mean, median, standard deviation, 25% quantile, and 75% quantile of each parameter for each group of drivers. The tables also give the bounds used for the genetic algorithm.

B. Linear Regression of the Number of Simulated Car Collisions

In order to find the relationship between the number of simulated car collisions and the rates of specific driver profiles for the results given in Table 7, we obtain , with , and . The obtained root mean square error (RMSE) is 9.7838. In Figure 16, the number of car collisions for each pair of percentages of the two driver profiles is presented, together with the plane obtained by regression. This result allows us to estimate the proportion between aggressive and inattentive driver profiles generating the same number of car collisions:

Although the linear regression is satisfactory, with RMSE = 9.7838 and , we can see in Figure 16 that the approximation still needs to be improved.

Data Availability

This is a public dataset for traffic research. The homepage of this dataset is as follows: https://ops.fhwa.dot.gov/trafficanalysistools/ngsim.htm.

Conflicts of Interest

The authors declare that they have no conflicts of interest.