Abstract

This study investigates crash injury severity from the perspectives of novice and experienced drivers. To this end, a bivariate panel data probit model was proposed to account for the correlation in both time-specific and individual-specific error terms. Geocoded crash data for the Las Vegas metropolitan area from 2014 to 2017 were collected. To estimate two (seemingly unrelated) nonlinear processes and to control for interrelations between the unobservables, a bivariate random-effects probit model was built, in which the injury severity levels of novice and experienced drivers were addressed simultaneously by a bivariate (seemingly unrelated) probit, and the interrelations between the unobservables (i.e., the heterogeneity issue) were accommodated by bivariate random effects. Results revealed that crash type, the vehicle type of the party with minor responsibility, pedestrians, and motorcyclists were potentially significant factors of injury severity for novice drivers, while crash type, the driver condition of the party with minor responsibility, first harm, and highway factor were significant for experienced drivers. The findings provide useful insights for practitioners seeking to improve traffic safety for novice and experienced drivers.

1. Introduction

According to the National Center for Statistics and Analysis (NCSA), an estimated 25,000 people are killed in motor vehicle crashes each year, and the trend points to a 9% increase. Novice drivers account for a large proportion of these crashes; their risk per mile driven is nearly 3 times greater than that of experienced drivers, especially in the first 6 months after licensure [1]. Furthermore, novice drivers are easily influenced by a variety of factors, such as roadside distractions (e.g., neon lights and billboards), smartphones, and online chatting, which lead to more severe injuries than among experienced drivers. A variety of factors affect injury severity, including human factors (drivers, pedestrians, and bicyclists), vehicles, roadway, and environment, but novice and experienced drivers may be associated with different injury severity levels because of their personal characteristics. However, the determinants of injury severity among novice and experienced drivers have not been uniformly established, although various scholars have explored different aspects of novice drivers, experienced drivers, or both. Moreover, there may exist interrelations between the unobservables of the two groups’ injury severity levels; thus, estimating the two (seemingly unrelated) injury severity levels simultaneously while controlling for interrelations between their unobservables may be challenging. Therefore, it is necessary to investigate the influencing factors of injury severity for novice and experienced drivers so that some general consensus can be reached and the interrelations can be addressed.

Since the beginning of this century, crash analysis of novice drivers has been very active. Ferrante et al. [2] explored the relationship among novice drink drivers, recidivism, and crash involvement using multivariate survival analysis. The results showed that if a driver’s first drink-driving offense occurred at a younger age, he/she was significantly more likely to drink, drive, and crash again. Subsequently, Simons-Morton et al. [3] described the effects of the Checkpoints Program on parental limits on novice teen driving through the first six months after licensure. It was found that modest increases in parental restrictions on teen driving could be fostered during the first 6 months of licensure, but the level of restriction was not sufficient to protect against violations and crashes. Continuing with the crashes of novice teenage drivers, Braitman et al. [4] identified the characteristics and contributing factors leading to crashes of novice 16-year-old drivers in Connecticut. The results revealed that three-fourths of the crash-involved teenagers were at fault, while more than half of the at-fault crashes of newly licensed novice drivers involved more than one contributing factor, including speed, loss of control, and slippery roads. From the perspective of self-reported behavior, Ivers et al. [5] explored the risky behaviors and risk perceptions of young novice drivers and sought to determine their relation with crash risk. A detailed questionnaire was administered, and Poisson regression was employed to explore crash risk. The results showed that self-reported risky driving behaviors among novice drivers were associated with a 50% increased risk of a crash and that novice driver policies need to be strengthened. In a related study, McDonald et al. [6] used crash data to develop simulator scenarios for assessing novice driver performance. Chapman et al. [7] evaluated crash and traffic violation rates before and after licensure for novice California drivers subject to different driver licensing requirements. Plots and Poisson regression were employed to compare overall rates and subtypes of crashes and traffic violations among novice drivers. It was found that the highest crash rates of novice 16- and 17-year-old drivers occurred almost immediately after they were licensed, while their peak traffic violation rates were delayed until around the age of 18. A recent study by Curry et al. [8] employed Poisson regression to compare crash rates of older and younger novice drivers. It was found that older novice drivers experienced much less steep crash reductions over the first year of licensure than younger novice drivers. Moreover, early-night crash rates of novice drivers under age 21 declined rapidly, while changes in late-night crashes were much smaller.

Compared to novice drivers, experienced drivers perform much better owing to their personal status, driving experience, decision-making ability, and so on, so few studies concentrate on experienced drivers alone. To verify this, however, a variety of studies have compared novice and experienced drivers. From the ergonomics aspect, Underwood et al. [9] recorded eye fixations while novice and experienced drivers drove along different types of roadways. Differences in sequences of fixations were found between novice and experienced drivers on three types of roads; experienced drivers showed greater sensitivity overall, while novice drivers revealed some stereotypical transitions in visual attention. From the perspective of driving skill, Craen et al. [10] questioned whether novice drivers overestimate their driving skills more than experienced drivers do. Questionnaires were designed, and the results showed that when novice drivers compared themselves to average and peer drivers, they were not as optimistic about their driving skills, but when their self-assessment was compared with actual behavior, they overestimated their driving skills. Mitchell et al. [11] compared the crash circumstances of common crash types for novice and experienced drivers in New South Wales, Australia. Correspondence analysis revealed that the crash characteristics of novice and experienced drivers were similar, but vehicle speed, fatigue, and alcohol were risk factors in novice driver crashes. Crundall [12] tested hazard prediction in isolation to assess whether it discriminates between novice and experienced drivers. Based on the Situation Awareness Global Assessment Technique, the results suggested that experienced drivers found hazard prediction less effortful, while response time measures can discriminate between novice and experienced drivers.

Simulation plays an important role in the crash risk analysis of novice and experienced drivers. By simulating different scenarios, Lee et al. [13] examined the detection of road hazards by novice teen and experienced adult drivers. The results indicated that a large portion of teen drivers failed to disengage from peripheral tasks in the presence of hazards, while adult drivers observed hazards and demonstrated overt recognition of hazards more frequently than teen drivers did. In a similar study, Smith et al. [14] investigated the effect of sleepiness on hazard perception in novice and experienced drivers. Based on a video test, the results indicated that the hazard perception skills of the more experienced drivers were relatively unaffected by mild increases in sleepiness, but the novice drivers were significantly slowed. Ohlhauser et al. [15] compared the driving performance of novice teenage drivers and experienced drivers over the span of six monthly simulator sessions. It was found that novice drivers’ perception response times (PRT) to braking events were significantly longer than those of experienced drivers. Focusing on visual information search in simulated junction negotiation, Scott et al. [16] compared gaze transitions of novice and experienced drivers. The results revealed that when scanning the junction, young experienced drivers distributed their gaze more evenly across all areas, whereas older and novice drivers made more sweeping transitions, bypassing adjacent areas. In a similar study, Alberti et al. [17] examined the impact of a restricted field of view on visual search and hazard perception by comparing novice and experienced driver performance in a driving simulator. The results showed that all drivers were more likely to avoid the hazards when presented with a wide view, but gaze movement recording revealed that only experienced drivers made overt use of wider eccentricities. Seacrist et al. [1] compared crash rates and rear-end striking crashes among novice teens and experienced adults using a driving study. The results identified significantly more crashes and rear-end striking crashes among the teen group than the adult group, which conformed to previous findings.

Another important approach to determining the factors influencing crash rate and injury is econometric modeling. Simons-Morton et al. [18] compared rates of risky driving and of elevated g-force events among novice adolescent drivers and adult drivers using Poisson regression with random effects. The findings revealed that elevated g-force events among novice drivers may have contributed to crash and near-crash rates that remained much higher than adult levels after 18 months of driving. Chapman et al. [7] employed plots and Poisson regression to compare overall rates and subtypes of crashes and traffic violations among novice drivers.

During the last decade, a variety of approaches and perspectives [19–22] have been presented in safety evaluation, among which multivariate regression analysis has been considered a critical method for dealing with two or more dependent variables with correlation and heterogeneity issues. At an early stage, Yamamoto and Shankar [23] developed a bivariate ordered-response probit model of driver’s and passenger’s injury severities in collisions with fixed objects. The results revealed that the driver’s characteristics, vehicle attributes, types of objects, and environmental conditions had an effect on both driver and passenger injury severity. Subsequently, Dong et al. [24] analyzed injury crashes and proposed a random-parameter bivariate zero-inflated negative binomial regression model. A Bayesian approach was employed as the estimation method, and the results showed that the proposed model outperformed the other models investigated. The model provided new insights into how crash occurrences were influenced by risk factors. Focusing on temporary disability and permanent motor injuries, Ayuso et al. [25] introduced a bivariate copula-based regression model for their joint analysis. The findings illustrated that the conditional distribution function of injury severities may be estimated. In a similar study, Wali et al. [26] applied copula-based bivariate ordinal models to investigate the degree of injury severity sustained by drivers involved in head-on collisions. Chen et al. [27] developed a random-parameter bivariate ordered probit model to examine the factors influencing the injury severity of two drivers involved in the same rear-end crash between passenger cars. Taking both within-crash correlation and unobserved heterogeneity into consideration, the proposed model outperformed the individual ordered probit models with fixed parameters, which provides the foundation for this study. A recent study by Besharati et al. [28] extended this line of work to a bivariate spatial negative binomial Bayesian model with random effects for traffic fatalities and injuries across the provinces of Iran. Unobserved heterogeneity and spatial correlation were addressed, and the results helped to prioritize area-wide safety initiatives and programs. Besides bivariate regression models, different multivariate regression models, for example, multivariate tobit analysis [29–32], Bayesian multivariate approaches [33, 34], multivariate spatial and/or temporal models [35–39], and mixtures of the abovementioned models, have been presented to address correlation and unobserved heterogeneity among injury severities.

As summarized from the literature above, there have been various methods and comparisons about the crash risk analysis of novice and experienced drivers. However, most of the studies address the crash risk of novice and experienced drivers separately, and there may exist interrelations between the unobservables, which can be accommodated by multivariate regression models. Therefore, the purpose of this study is to estimate the two (seemingly unrelated) nonlinear injury severity levels and to control for interrelations between their unobservables with bivariate random-effects probit models, which can address the injury severity levels simultaneously and accommodate the interrelations between the unobservables (i.e., heterogeneity issue).

2. Methodology

This study attempts to jointly model the injury severity of novice drivers and experienced drivers. Since the injury severity levels could be interdependent, there may be an interrelation between the unobservable factors influencing the injury severity of the two groups. To address this issue, the bivariate probit model is proposed, in which the injury severity levels of novice and experienced drivers depend on sets of independent variables, and the interrelation between the two error terms is treated as an auxiliary parameter. The bivariate probit model is selected because, whether or not the interrelation is significantly different from zero, it does not require exclusion restrictions to provide meaningful estimates, particularly of the interrelation coefficient [40]. More importantly, the bivariate panel data probit model can estimate two (seemingly unrelated) nonlinear processes and control for interrelations between the unobservables, accounting for the correlation in both time-specific and individual-specific error terms.

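Specifically, the model includes two equations, one for the binary injury severity of novice drivers ($y_{1it}$) with main responsibility, Property Damage Only (PDO) (0) or injury and fatality (1), and the other for the binary injury severity of experienced drivers ($y_{2it}$) with main responsibility, PDO (0) or injury and fatality (1). Therefore, the equations can be expressed as follows:

$$
\begin{aligned}
y_{1it} &= \mathbf{1}\left[x_{1it}'\beta_1 + \varepsilon_{1it} > 0\right],\\
y_{2it} &= \mathbf{1}\left[x_{2it}'\beta_2 + \varepsilon_{2it} > 0\right],
\end{aligned}
\tag{1}
$$

where $i$ represents the panel variable (here, the individual observation) with $i = 1, \ldots, N$, and $t$ denotes the time point (here, the year) with $t = 1, \ldots, T$. The dependent variables $y_{1it}$ and $y_{2it}$ are explained by the independent variables $x_{1it}$ and $x_{2it}$, respectively. $\beta_1$ and $\beta_2$ are coefficients, and $\varepsilon_{jit}$ refers to the process-specific error terms with $j \in \{1, 2\}$. Here, $\varepsilon_{jit}$ includes two parts, an individual-specific time-invariant error term $\alpha_{ji}$ and a time-specific idiosyncratic shock $u_{jit}$; that is, $\varepsilon_{jit} = \alpha_{ji} + u_{jit}$; thus, equation (1) can be described as

$$
\begin{aligned}
y_{1it} &= \mathbf{1}\left[x_{1it}'\beta_1 + \alpha_{1i} + u_{1it} > 0\right],\\
y_{2it} &= \mathbf{1}\left[x_{2it}'\beta_2 + \alpha_{2i} + u_{2it} > 0\right].
\end{aligned}
\tag{2}
$$

Due to the normalization of the error terms, the individual-specific error terms $\alpha_{ji}$ are assumed to be normally distributed with variance $\sigma_{\alpha_j}^2$, and the idiosyncratic shocks $u_{jit}$ are assumed to be standard normally distributed. The ratio of the variance of the time-constant individual-specific error term to that of the composite error term is calculated as

$$
\lambda_j = \frac{\sigma_{\alpha_j}^2}{\sigma_{\alpha_j}^2 + \sigma_{u_j}^2},
\tag{3}
$$

where $\sigma_{u_j}^2 = 1$; for the two equations in the same period, the correlation of the error terms can be calculated as

$$
\rho_u = \frac{\operatorname{Cov}(u_{1it}, u_{2it})}{\sigma_{u_1}\sigma_{u_2}},
\tag{4}
$$

where $\rho_u$ refers to the correlation of the error terms.

The individual likelihood function $L_i$ can be obtained from the product of the joint probability of the observed binary outcome variables and the joint density of the random-effects error terms $f(\alpha_{1i}, \alpha_{2i}; \Sigma_\alpha)$,

$$
L_i = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \prod_{t=1}^{T} P_{it}(\alpha_{1i}, \alpha_{2i})\, f(\alpha_{1i}, \alpha_{2i}; \Sigma_\alpha)\, d\alpha_{1i}\, d\alpha_{2i},
\tag{5}
$$

where $\Sigma_\alpha$ refers to the covariance of the random-effects error terms ($\alpha_{1i}, \alpha_{2i}$). Since the joint density of the random-effects error terms is assumed to follow a bivariate normal distribution, the joint probability of the observed binary outcome variables is expressed as

$$
P_{it}(\alpha_{1i}, \alpha_{2i}) = \Phi_2\left(q_{1it}\left(x_{1it}'\beta_1 + \alpha_{1i}\right),\; q_{2it}\left(x_{2it}'\beta_2 + \alpha_{2i}\right),\; q_{1it}\, q_{2it}\, \rho_u\right),
\tag{6}
$$

where $\Phi_2$ refers to the bivariate normal cumulative distribution function and $q_{jit} = 2y_{jit} - 1$.

According to Greene [41], the bivariate normal cumulative distribution function can be described in the following form:

$$
\Phi_2(x_1, x_2, \rho) = \int_{-\infty}^{x_2}\int_{-\infty}^{x_1} \phi_2(z_1, z_2, \rho)\, dz_1\, dz_2,
\tag{7}
$$

while the density takes the form

$$
\phi_2(z_1, z_2, \rho) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left(-\frac{z_1^2 + z_2^2 - 2\rho z_1 z_2}{2\left(1-\rho^2\right)}\right).
\tag{8}
$$

Thus, the likelihood of the sample can be expressed as follows:

$$
L = \prod_{i=1}^{N} L_i.
\tag{9}
$$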

The model is estimated by maximum simulated likelihood with quasirandom numbers (Halton draws), which allows the correlation between the error terms of both processes to be estimated. For more details about the bivariate probit and random-effects probit models, refer to Yildirim et al. [40] and Plum [42].
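To illustrate how Halton draws enter a simulated likelihood, the following is a minimal Python sketch, not the Stata routine used in this study. It approximates one individual's likelihood contribution by averaging, over quasirandom draws of the two random effects, the product of bivariate normal probabilities across that individual's periods. The function names, the argument layout, and the use of unscrambled two-dimensional Halton points are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm, qmc


def bvn_cdf(a, b, rho):
    """Bivariate standard normal CDF evaluated at (a, b) with correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    return multivariate_normal.cdf([a, b], mean=[0.0, 0.0], cov=cov)


def halton_normal_draws(n_draws):
    """Two-dimensional Halton points mapped to independent N(0, 1) draws."""
    engine = qmc.Halton(d=2, scramble=False)
    engine.fast_forward(1)  # skip the first point, which sits at the origin
    return norm.ppf(engine.random(n_draws))


def simulated_loglik_i(y1, y2, xb1, xb2, sigma1, sigma2, rho_a, rho_u, draws):
    """Simulated log-likelihood contribution of one individual.

    y1, y2         : binary outcomes of the two equations over T periods
    xb1, xb2       : linear indices x'beta for each period and equation
    sigma1, sigma2 : standard deviations of the two random effects
    rho_a          : correlation of the random effects
    rho_u          : correlation of the idiosyncratic shocks
    draws          : (R, 2) array of independent standard normal draws
    """
    # A Cholesky factor turns independent draws into correlated random effects.
    chol = np.array([[sigma1, 0.0],
                     [sigma2 * rho_a, sigma2 * np.sqrt(1.0 - rho_a ** 2)]])
    alphas = draws @ chol.T
    q1 = 2 * np.asarray(y1) - 1  # maps outcomes {0, 1} to signs {-1, +1}
    q2 = 2 * np.asarray(y2) - 1
    total = 0.0
    for a1, a2 in alphas:
        prob = 1.0
        for t in range(len(q1)):
            prob *= bvn_cdf(q1[t] * (xb1[t] + a1),
                            q2[t] * (xb2[t] + a2),
                            q1[t] * q2[t] * rho_u)
        total += prob
    return np.log(total / len(alphas))


draws = halton_normal_draws(50)  # 50 draws, matching the number reported in Section 4
```

Summing `simulated_loglik_i` over individuals would give the objective that a numerical optimizer maximizes; the nested loops are kept for readability rather than speed.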

3. Data Description

Similar to the dataset adopted by Xiao et al. [43], the ArcGIS open data site maintained by the Nevada Department of Transportation (NDOT) was used as the data source, covering 2014 to 2017. “Similar” here denotes that both datasets are drawn from the same NDOT open dataset, but the variables and modeling employed in this study are different; that is, the dataset is a different subset from that of Xiao et al. [43]. Twenty-seven major and minor arterials in the metropolitan Las Vegas area, which includes the City of Las Vegas, the City of North Las Vegas, the City of Henderson, and Clark County, were selected as the target population. Four main aspects were collected and considered: crash status, vehicle features, roadway characteristics, and environment.

As shown in Figure 1, 1999 injury records on the 27 arterials, involving both novice and experienced drivers, were considered. Consistent with Seacrist et al. [1], novice and experienced drivers are defined here as 16–19-year-old and 35–54-year-old drivers, respectively. In Nevada, injury severity is classified into three types: PDO, injury, and fatality. Since fatalities accounted for only 0.5% of the records and the two categories were quite similar, the injury and fatality categories were merged into one, which is not expected to affect the inference. Therefore, the dependent variables in the proposed model were binary injury severity, in which PDO was coded as 0 and injury or fatality was coded as 1, finally forming a binary probit model.
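The binary recoding described above can be sketched as follows. The label strings are hypothetical (the raw NDOT field values are not given in this paper); the counts are those reported in Appendix A.

```python
def to_binary_severity(severity):
    """Map a severity label to the model outcome: PDO -> 0, injury or fatality -> 1.

    The label spellings are assumed for illustration only.
    """
    label = severity.strip().lower()
    if label == "pdo":
        return 0
    if label in {"injury", "fatality"}:
        return 1
    raise ValueError(f"unexpected severity label: {severity!r}")


# Counts from Appendix A; the rounded shares reproduce the reported 31% and 69%.
counts = {"PDO": 620, "Injury and fatality": 1379}
total = sum(counts.values())
shares = {name: round(100 * n / total) for name, n in counts.items()}
```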

In terms of vehicle profiles, the explanatory variables include the total number of vehicles, vehicle type, vehicle direction, vehicle action (e.g., changing lanes, making a U-turn, and passing other vehicles), vehicle condition (e.g., hit and run, mechanical defects, and driving too fast), driver’s age, and driver’s condition (e.g., normal, fatigue, physical impairment, and distracted); pedestrians, pedal cyclists, and motorcyclists are also considered. Following the NDOT classification, when a crash involves two or more vehicles, the vehicle with the main responsibility is designated vehicle 1, and the vehicle with minor responsibility is designated vehicle 2. After the dataset was cleaned, crashes involving two vehicles accounted for 87% of injuries, which supports this classification. The selected injury severity records involve both novice and experienced drivers, so that the same injury can be addressed simultaneously.

The roadway characteristics include the number of vehicle lanes and roadway conditions (e.g., dry, wet, ice, and snow), and the crash environment covers the weather, lighting conditions, and first harm (e.g., median, fence, and pedestrian).

To estimate the proposed models in Stata, the categorical variables are numerically coded, and all the variables collected are summarized in Appendixes A and B for novice and experienced drivers, respectively.
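As a rough illustration of coding a categorical variable against a base category, the sketch below builds indicator columns for every level except the base. This is an assumed, generic scheme rather than the exact Stata coding used here; the example levels are taken from the crash type entries in Appendix A, where "Unknown" is marked as the base.

```python
def dummy_code(values, base):
    """Return the non-base levels and one 0/1 indicator row per observation."""
    levels = sorted(set(values) - {base})
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows


crash_types = ["Angle", "Rear-end", "Unknown", "Angle"]
levels, rows = dummy_code(crash_types, base="Unknown")
# An observation in the base category gets a row of zeros, so each coefficient
# is read as a contrast with the base.
```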

4. Results

Based on the selected variables, the characteristics of the crashes and the correlations among the main factors were examined. Stata was used to analyze the data. A correlation test was conducted to avoid collinearity among the independent variables. Crash type is highly correlated with total vehicle, while vehicle 2 action, vehicle 2 type, and vehicle 2 driver condition are highly correlated with each other; thus, variables with a high correlation may not appear together in the final results.
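A screening step of this kind can be sketched as below. The paper does not state which correlation coefficient or cutoff was used, so the Pearson coefficient and the 0.7 threshold are assumptions for illustration, as are the function and variable names.

```python
import numpy as np


def highly_correlated_pairs(data, names, threshold=0.7):
    """List variable pairs whose absolute Pearson correlation meets the threshold.

    data  : 2D array-like with one column per variable
    names : column names, in the same order as the columns
    """
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


# Toy data: column "b" is an exact multiple of "a"; "c" is uncorrelated with both.
toy = [[1, 2, 1], [2, 4, -1], [3, 6, -1], [4, 8, 1]]
flagged = highly_correlated_pairs(toy, ["a", "b", "c"])
```

Only one variable from each flagged pair would then be kept in a given model specification.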

The bivariate random-effects probit and bivariate probit models are used to assess the likelihood of injury for novice and experienced drivers. The final results, obtained with 50 Halton draws, are presented in Table 1. To allow comparison, the numbers of observations are kept the same.

As shown in Table 1, in the novice driver injury model, crash type, vehicle 2 type, pedestrian, and motorcyclist are significant in both the bivariate probit model and the bivariate random-effects probit model, while in the experienced driver injury model, crash type, vehicle 2 driver condition, first harm, and highway factor are significant. The correlation coefficients ρ of both models are not equal to 0, implying that correlation does exist between the injury severity levels of novice and experienced drivers, although the correlation is lower than 0.5. The log-likelihood values at convergence (−978.069) and at zero (−1894.337) from the bivariate random-effects probit model are slightly smaller than those (−925.747 and −1809.635) from the bivariate probit model, respectively. The goodness of fit of the proposed bivariate random-effects probit model is found to be better than that of the bivariate probit model; thus, the following explanation concentrates on the proposed model.

Table 1 presents the effects on the injury severity of novice and experienced drivers. For novice drivers, crash type and vehicle 2 type are negatively related to injury severity, while pedestrian and motorcyclist notably increase the likelihood of injury. Compared to unknown crash types, the injury severity is reduced as the crash type changes from angle to noncollision, which is understandable. Among all the crash types, angle and rear-end crashes occur most frequently, accounting for about 85% and leading to different injury severities, as verified by Xu et al. [44] and Hosseinpour et al. [45]. As the crash type changes from angle to noncollision, the injury severity of novice drivers is reduced by about 110%.

Vehicle 2 type is negatively related to the injury severity of novice drivers, indicating that injuries involving cars and trucks are less severe than those involving motorcycles, which is in line with the studies by Quddus et al. [46], Zambon and Hasselberg [47], and Chang et al. [48]. Since motorcyclists are exposed, even those with minor responsibility (vehicle 2) may still suffer severe injuries. Based on the marginal effect, the injury severity for cars and trucks is about 4.9% lower than that for motorcycles.
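For readers unfamiliar with marginal effects in probit models, the sketch below shows two standard textbook calculations for a single probit equation: the derivative-based effect for a continuous regressor and the discrete change for a 0/1 indicator. It is a generic illustration with hypothetical index and coefficient values, not a reproduction of the figures in Table 1, and it ignores the random-effects variance.

```python
import math


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_effect_continuous(index, beta):
    """Change in P(y = 1) per unit change of a continuous regressor at x'beta = index."""
    return std_normal_pdf(index) * beta


def discrete_change(index_without, beta):
    """Change in P(y = 1) when a 0/1 indicator with coefficient beta switches on."""
    return std_normal_cdf(index_without + beta) - std_normal_cdf(index_without)


# Hypothetical values: baseline index 0.5 and an indicator coefficient of -0.15.
effect = discrete_change(0.5, -0.15)
```

A negative coefficient yields a negative change in the probability of the more severe outcome, which is how a statement such as "about 4.9% lower" would typically be read.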

Pedestrians have a significant positive effect on the injury severity of novice drivers, meaning that the more pedestrians are involved, the more severe the injury of novice drivers. This finding is consistent with Oh [49], and the possibility may go up to 139% if the number of pedestrians increases onefold. The reason is that the driving skills of novice drivers are inadequate, and they may become nervous when more pedestrians show up; thus, the possibility of injury is raised.

Similarly, motorcyclists have a positive association with the injury severity of novice drivers, implying that more motorcyclists may increase the injury severity. The possibility may rise by 167% if the number of motorcyclists increases onefold, which agrees with common sense. More motorcyclists can easily disorder traffic and cause more conflicts, especially for novice drivers, who are not very skilled and may react erratically, thus leading to more chances of crashes.

For experienced drivers, crash type, first harm, and highway factor are negatively related to injury severity, while vehicle 2 driver condition is positively related to injury severity. Similar to novice drivers, the injury severity is reduced as the crash type changes from angle to noncollision, compared to unknown crash types, and the possibility is reduced by about 5.6%. Both novice and experienced drivers are thus influenced by crash type.

Unlike for novice drivers, the driver condition of vehicle 2 is positively significant for the injury severity of experienced drivers, indicating that, compared to the unknown condition, the apparently normal condition is associated with lower injury severity. This is in line with Weber et al. [50], and the possibility increases by about 1.6% as the driver condition varies from normal to unknown. Although most crashes happen under apparently normal conditions, injuries may be more severe under unknown conditions because the unknown makes the driving condition unpredictable.

The last two negatively significant variables are first harm and highway factor. As first harm varies from cross median/centerline to “no data,” the injury severity decreases. Since first harm mainly comprises motor vehicle in transport, slow/stopped vehicle, and “no data,” the injury severity for motor vehicle in transport is the worst, which makes sense. Because motor vehicles in transport have more chances of running into conflicts, the possibility of injury for the other categories is about 11% lower.

Compared to no highway factor, injury severity in active work zones is the worst, which is broadly consistent with Wong et al. [51] and Sze and Song [52]. In active work zones, speeding and stop-and-go traffic occur frequently, creating more chances of conflicts and leading to more severe injuries.

5. Discussion

So far, there have been various approaches and comparisons about the crash injury analysis of novice and experienced drivers. However, most of the studies address the crash injury severity of novice and experienced drivers separately and there may exist interrelations between the unobservables. In this study, in order to estimate the two (seemingly unrelated) nonlinear injury severity levels and to control for interrelations between their unobservables, the bivariate random-effects probit models are proposed, which can address the injury severity levels simultaneously and accommodate the interrelations between the unobservables (i.e., heterogeneity issue).

As shown in Table 1, closer examination of the estimated results reveals some similarities and differences between novice and experienced drivers. First, among all the influencing variables, crash type is significant for the injury severity of both novice and experienced drivers. This indicates that certain crash types lead to specific injury severities and deserve more attention for both novice and experienced drivers. Second, the significant variables for novice drivers relate more to moving objects, especially vulnerable road users, that is, pedestrians and motorcyclists, since novice drivers’ skills are not yet mature and they need more time to become accustomed to driving situations, while for experienced drivers, injury severity derives more from static facilities and the environment. This implies that, after a certain driving period, experienced drivers have become used to moving objects while paying less attention to static ones.

According to the results, from an empirical point of view, novice drivers need more education and training hours before they are qualified to drive safely on the roadways, while pedestrians and motorcyclists should be given more attention through clear warning/crossing signs and helmets, respectively. As for experienced drivers, more alternative facilities should be designed to avoid the first harm. Because the presence of active work zones increases the injury severity, one way of improving safety is to organize the traffic flow efficiently to avoid conflicts between vehicles, so that the injury severity levels may be decreased.

6. Conclusions

In this study, a bivariate random-effects probit model was proposed to investigate the injury severity among novice and experienced drivers, in which both injury severity levels were addressed simultaneously by a bivariate (seemingly unrelated) probit, and the interrelations between the unobservables (i.e., the heterogeneity issue) were accommodated by the random-effects model. The results showed that crash type, vehicle 2 type, pedestrians, and motorcyclists were potentially significant factors of injury severity for novice drivers, while crash type, vehicle 2 driver condition, first harm, and highway factor were significant for experienced drivers.

Two main findings can be drawn from the results. First, a correlation does exist between the injury severity of novice drivers and that of experienced drivers, although it is not strong. Second, the bivariate random-effects probit model can address the injury severity levels simultaneously and accommodate the interrelations between the unobservables (i.e., the heterogeneity issue), which extends the range of bivariate probit analysis.

Some limitations remain. First, novice and experienced drivers are divided by age, as that is what the dataset provides; the preferred division would rely on the number of years with a valid driver’s license or the number of miles driven, which may better reflect actual driving experience. Moreover, since the results are based on a dataset from Las Vegas, it is worthwhile to test different data sources in future studies to confirm the findings and their transferability. Further study may also try other types of modeling, such as a bivariate random-parameter probit model or a bivariate spatial probit model, so that spatial and temporal issues can be addressed efficiently.

Appendix

A. Summary of the Parameters for Novice Drivers

| Variable | Description | Count (proportion) |
| --- | --- | --- |
| (i) Dependent variables | | |
| Injury severity | 0-PDO | 620 (31%) |
| | 1-Injury and fatality | 1379 (69%) |
| (ii) Categorical variables | | |
| Crash type | 1-Angle | 868 (43.4%) |
| | 2-Backing | 15 (0.8%) |
| | 3-Head-on | 9 (0.5%) |
| | 4-Rear-end | 827 (41.3%) |
| | 5-Sideswipe | 100 (5.0%) |
| | 6-Noncollision | 176 (8.8%) |
| | 7-Unknown (base) | 4 (0.2%) |
| Vehicle 1 type | 1-Car | 1388 (69.4%) |
| | 2-Truck/bus | 284 (14.2%) |
| | 3-Motorcycle (base) | 43 (2.1%) |
| | 4-Others | 44 (2.2%) |
| | 5-Pickup/van | 240 (12.1%) |
| Vehicle 1 action | 1-Backing up | 11 (0.5%) |
| | 2-Changing lanes | 120 (6.0%) |
| | 3-Going straight | 1180 (59.0%) |
| | 4-Making U-turn | 20 (1.0%) |
| | 5-Passing other vehicles/racing | 5 (0.3%) |
| | 6-Stopped | 15 (0.8%) |
| | 7-Turning left | 370 (18.5%) |
| | 8-Turning right | 126 (6.3%) |
| | 9-Others (base) | 7 (0.4%) |
| | 10-Unreported | 138 (7.0%) |
| | 11-Unknown | 7 (0.4%) |
| Vehicle 1 driver condition | 1-Apparently normal | 1614 (80.7%) |
| | 2-Driving under influence (DUI) | 19 (1.0%) |
| | 3-Drowsiness, fatigue, fainted, and so on | 99 (5.0%) |
| | 4-Illness/physical impairment | 4 (0.2%) |
| | 5-Inattention/distracted | 113 (5.6%) |
| | 6-Obstructed view | 4 (0.2%) |
| | 7-Others (base) | 63 (3.2%) |
| | 8-Unknown | 82 (4.1%) |
| Vehicle 1 condition | 1-Disregarded traffic signs/signals/road markings | 144 (7.2%) |
| | 2-Driving too fast | 168 (8.4%) |
| | 3-Failed to yield right of way | 499 (24.9%) |
| | 4-Failure to keep in proper lane or running off road | 140 (7.0%) |
| | 5-Followed too closely | 324 (16.2%) |
| | 6-Hit and run | 62 (3.1%) |
| | 7-Made an improper turn | 48 (2.4%) |
| | 8-Mechanical defects | 8 (0.4%) |
| | 9-Other types of improper driving | 336 (16.8%) |
| | 10-Unknown (base) | 270 (13.5%) |
| Vehicle 2 type | 1-Car | 1033 (51.7%) |
| | 2-Truck/bus | 368 (18.4%) |
| | 3-Motorcycle (base) | 38 (1.9%) |
| | 4-Others | 262 (13.1%) |
| | 5-Pickup/van | 298 (14.9%) |
| Vehicle 2 driver age | 0-Less than 16 years old (base) | 266 (13.3%) |
| | 1-16–19 years old | 96 (4.8%) |
| | 2-20–34 years old | 570 (28.5%) |
| | 3-35–54 years old | 725 (36.3%) |
| | 3-More than 55 years old | 382 (17.1%) |
| Vehicle 2 action | 1-Backing up | 0 (0.0%) |
| | 2-Changing lanes | 7 (0.4%) |
| | 3-Going straight | 955 (47.8%) |
| | 4-Making U-turn | 2 (0.1%) |
| | 5-Passing other vehicles/racing | 2 (0.1%) |
| | 6-Stopped | 674 (33.7%) |
| | 7-Turning left | 92 (4.6%) |
| | 8-Turning right | 38 (1.9%) |
| | 9-Others (base) | 1 (0.01%) |
| | 10-Unreported | 6 (0.03%) |
| | 11-Unknown | 222 (11.1%) |
| Vehicle 2 driver condition | 1-Apparently normal | 1724 (86.2%) |
| | 2-Driving under influence (DUI) | 5 (0.2%) |
| | 3-Drowsiness, fatigue, fainted, and so on | 3 (0.1%) |
| | 4-Illness/physical impairment | 1 (0.05%) |
| | 5-Inattention/distracted | 2 (0.1%) |
| | 6-Obstructed view | 0 (0.0%) |
| | 7-Others | 1 (0.05%) |
| | 8-Unknown (base) | 263 (13.2%) |
| First harm | 1-Cross median/centerline | 9 (0.45%) |
| | 2-Fence/wall | 1 (0.05%) |
| | 3-Light/luminary support | 2 (0.1%) |
| | 4-Motor vehicle in transport | 435 (21.7%) |
| | 5-Pedal cycle/pedestrian | 10 (0.5%) |
| | 6-Ran off road left/right | 20 (1.0%) |
| | 7-Slow/stopped vehicle | 72 (3.6%) |
| | 8-Other fixed/movable objects | 6 (0.3%) |
| | 9-Other noncollisions | 7 (0.35%) |
| | 10-No data (base) | 1437 (71.9%) |
| Road conditions | 1-Dry | 1942 (97.1%) |
| | 2-Wet | 49 (2.5%) |
| | 3-Ice/snow | 3 (0.1%) |
| | 4-Unknown (base) | 5 (0.3%) |
| Weather conditions | 1-Clear | 1707 (85.4%) |
| | 2-Cloudy | 237 (11.9%) |
| | 3-Rain | 44 (2.2%) |
| | 4-Blowing sand, soil, dirt, and snow | 2 (0.1%) |
| | 5-Others | 0 (0.0%) |
| | 6-Unknown (base) | 9 (0.45%) |
| Lighting conditions | 1-Daylight | 1334 (66.7%) |
| | 2-Dark | 605 (30.3%) |
| | 3-Dawn | 21 (1.0%) |
| | 4-Dusk | 36 (1.8%) |
| | 5-Unknown (base) | 3 (0.1%) |
| Highway factor | 1-Active work zone | 17 (0.9%) |
| | 2-Inactive work zone | 17 (0.9%) |
| | 3-Weather | 540 (27%) |
| | 4-Others | 26 (1.3%) |
| | 5-None (base) | 1399 (69.2%) |

| Variable | Description | Mean | S.D. | Min. | Max. |
| --- | --- | --- | --- | --- | --- |
| (iii) Continuous variables | | | | | |
| Total vehicle | Total vehicles involved | 2.06 | 0.59 | 1 | 6 |
| Pedestrian | Pedestrian | 0.04 | | 0 | 1 |
| Motorcyclist | Motorcyclist | 0.02 | | 0 | 1 |

Note. Unknown category in the dataset has no actual data.
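The proportions in this table are shares of the novice-driver sample within each variable. The short sketch below recomputes them from the counts for two variables, as a way to check the arithmetic. The helper name `shares` and the rounding to one decimal place are assumptions for illustration, and recomputed values may differ from the published ones in the last digit.

```python
def shares(counts):
    """Return each category's percentage of the variable total, to 1 decimal."""
    total = sum(counts.values())
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}


# Counts copied from the table above (novice drivers).
injury_severity = {"PDO": 620, "Injury and fatality": 1379}
crash_type = {
    "Angle": 868,
    "Backing": 15,
    "Head-on": 9,
    "Rear-end": 827,
    "Sideswipe": 100,
    "Noncollision": 176,
    "Unknown": 4,
}

if __name__ == "__main__":
    print(sum(injury_severity.values()), shares(injury_severity))
    print(sum(crash_type.values()), shares(crash_type))
```

Both variables sum to the same number of records, which confirms that the shares are taken over the full novice-driver sample.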

B. Summary of the Parameters for Experienced Drivers

| Variable | Description | Count (proportion) |
| --- | --- | --- |
| (i) Dependent variables | | |
| Injury severity | 0-PDO | 2510 (31%) |
| | 1-Injury and fatality | 5594 (69%) |
| (ii) Categorical variables | | |
| Crash type | 1-Angle | 3452 (42.6%) |
| | 2-Backing | 50 (0.6%) |
| | 3-Head-on | 47 (0.5%) |
| | 4-Rear-end | 3290 (40.6%) |
| | 5-Sideswipe | 500 (6.2%) |
| | 6-Noncollision | 731 (9.0%) |
| | 7-Unknown (base) | 34 (0.4%) |
| Vehicle 1 type | 1-Car | 3950 (48.7%) |
| | 2-Truck/bus | 1872 (23.1%) |
| | 3-Motorcycle (base) | 168 (2.1%) |
| | 4-Others | 285 (3.5%) |
| | 5-Pickup/van | 1829 (22.5%) |
| Vehicle 1 action | 1-Backing up | 44 (0.5%) |
| | 2-Changing lanes | 495 (6.1%) |
| | 3-Going straight | 4787 (59.0%) |
| | 4-Making U-turn | 95 (1.1%) |
| | 5-Passing other vehicles/racing | 45 (0.5%) |
| | 6-Stopped | 91 (1.1%) |
| | 7-Turning left | 1327 (16.4%) |
| | 8-Turning right | 608 (7.5%) |
| | 9-Others (base) | 63 (0.8%) |
| | 10-Unreported | 475 (5.8%) |
| | 11-Unknown | 74 (0.9%) |
| Vehicle 1 driver condition | 1-Apparently normal | 6197 (76.5%) |
| | 2-Driving under influence (DUI) | 803 (9.9%) |
| | 3-Drowsiness, fatigue, fainted, and so on | 70 (0.9%) |
| | 4-Illness/physical impairment | 90 (1.1%) |
| | 5-Inattention/distracted | 333 (4.1%) |
| | 6-Obstructed view | 25 (0.3%) |
| | 7-Others (base) | 194 (2.4%) |
| | 8-Unknown | 392 (4.8%) |
| Vehicle 1 condition | 1-Disregarded traffic signs/signals/road markings | 698 (8.6%) |
| | 2-Driving too fast | 410 (5.0%) |
| | 3-Failed to yield right of way | 1686 (20.8%) |
| | 4-Failure to keep in proper lane or running off road | 523 (6.5%) |
| | 5-Followed too closely | 1380 (17.0%) |
| | 6-Hit and run | 237 (2.9%) |
| | 7-Made an improper turn | 197 (2.4%) |
| | 8-Mechanical defects | 21 (0.3%) |
| | 9-Other types of improper driving | 1483 (18.3%) |
| | 10-Unknown (base) | 1469 (18.1%) |
| Vehicle 2 type | 1-Car | 4191 (51.7%) |
| | 2-Truck/bus | 1488 (18.4%) |
| | 3-Motorcycle (base) | 150 (1.8%) |
| | 4-Others | 1064 (13.1%) |
| | 5-Pickup/van | 1211 (14.9%) |
| Vehicle 2 driver age | 0-Less than 16 years old (base) | 1078 (13.3%) |
| | 1-16–19 years old | 389 (4.8%) |
| | 2-20–34 years old | 2310 (28.5%) |
| | 3-35–54 years old | 2942 (36.3%) |
| | 3-More than 55 years old | 1385 (17.1%) |
| Vehicle 2 action | 1-Backing up | 2 (0.02%) |
| | 2-Changing lanes | 34 (0.4%) |
| | 3-Going straight | 3622 (47.8%) |
| | 4-Making U-turn | 17 (0.2%) |
| | 5-Passing other vehicles/racing | 9 (0.1%) |
| | 6-Stopped | 2867 (35.3%) |
| | 7-Turning left | 437 (5.4%) |
| | 8-Turning right | 157 (1.9%) |
| | 9-Others (base) | 9 (0.1%) |
| | 10-Unreported | 35 (0.4%) |
| | 11-Unknown | 913 (11.2%) |
| Vehicle 2 driver condition | 1-Apparently normal | 6994 (86.3%) |
| | 2-Driving under influence (DUI) | 31 (0.3%) |
| | 3-Drowsiness, fatigue, fainted, and so on | 38 (0.4%) |
| | 4-Illness/physical impairment | 4 (0.04%) |
| | 5-Inattention/distracted | 2 (0.02%) |
| | 6-Obstructed view | 0 (0.0%) |
| | 7-Others | 7 (0.07%) |
| | 8-Unknown (base) | 1029 (12.7%) |
| First harm | 1-Cross median/centerline | 9 (0.1%) |
| | 2-Fence/wall | 4 (0.05%) |
| | 3-Light/luminary support | 7 (0.08%) |
| | 4-Motor vehicle in transport | 1646 (20.3%) |
| | 5-Pedal cycle/pedestrian | 64 (0.7%) |
| | 6-Ran off road left/right | 44 (0.5%) |
| | 7-Slow/stopped vehicle | 251 (3.1%) |
| | 8-Other fixed/movable objects | 17 (0.2%) |
| | 9-Other noncollisions | 12 (0.1%) |
| | 10-No data (base) | 6050 (74.6%) |
| Road conditions | 1-Dry | 7887 (97.3%) |
| | 2-Wet | 180 (2.2%) |
| | 3-Ice/snow | 3 (0.03%) |
| | 4-Unknown (base) | 34 (0.4%) |
| Weather conditions | 1-Clear | 6809 (84.0%) |
| | 2-Cloudy | 1107 (13.6%) |
| | 3-Rain | 146 (1.8%) |
| | 4-Blowing sand, soil, dirt, and snow | 7 (0.08%) |
| | 5-Others | 8 (0.09%) |
| | 6-Unknown (base) | 27 (0.3%) |
| Lighting conditions | 1-Daylight | 5616 (69.2%) |
| | 2-Dark | 2223 (27.4%) |
| | 3-Dawn | 88 (1.0%) |
| | 4-Dusk | 167 (2.0%) |
| | 5-Unknown (base) | 10 (0.1%) |
| Highway factor | 1-Active work zone | 95 (1.1%) |
| | 2-Inactive work zone | 107 (1.3%) |
| | 3-Weather | 154 (1.9%) |
| | 4-Others | 2326 (28.7%) |
| | 5-None (base) | 5421 (66.9%) |

| Variable | Description | Mean | S.D. | Min. | Max. |
| --- | --- | --- | --- | --- | --- |
| (iii) Continuous variables | | | | | |
| Total vehicle | Total vehicles involved | 2.07 | 0.62 | 1 | 10 |

Note. Unknown category in the dataset has no actual data.

Data Availability

The data that support the findings of this study are maintained by the Nevada Department of Transportation and can be accessed at the following website: https://data-ndot.opendata.arcgis.com/datasets/

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Yuan Q. contributed to conceptualization and project administration; Xiao D. and Xu X. contributed to methodology, wrote the original draft, and were responsible for funding acquisition; Kang S. provided the software and performed validation and visualization; Xiao D. performed formal analysis, investigation, supervision, and data curation and provided the resources; Yuan Q. and Xu X. reviewed and edited the article.

Acknowledgments

Thanks are due to the Nevada Department of Transportation (NDOT) for providing the dataset. This study was supported by the Fundamental Research Fund for the Central Universities (HUST: 2018KFYYXJJ001) and the National Natural Science Foundation of China (Nos. 71861023 and 52072214).