Traditional surrogate measures of safety (SMoS) either cannot fully consider the crash mechanism or fail to reflect crash probability and crash severity at the same time. In addition, driving risks change constantly with the driver’s personal driving characteristics and environmental factors. Considering the heterogeneity of drivers, studying the impact of behavioral and environmental characteristics on rear-end crash risk is essential to ensure driving safety. In this study, 16,905 car-following events were identified and extracted from the Shanghai Naturalistic Driving Study (SH-NDS). A new SMoS, named the rear-end crash risk index (RCRI), was then proposed to quantify rear-end crash risk. Based on this measure, a comparative risk analysis was conducted to investigate the impact of weather, temporal variables, and traffic conditions. Then, a mixed-effects linear regression model was applied to clarify the relationship between rear-end crash risk and its influencing factors. Results show that the RCRI can reflect the dynamic changes of rear-end crash risk and can be applied to any car-following scenario. The comparative analysis indicates that high traffic density, workdays, and morning peaks lead to higher risks. Moreover, results from the mixed-effects linear regression model suggest that driving characteristics, traffic density, day-of-week (workday vs. holiday), and time-of-day (peak hours vs. off-peak hours) had significant effects on driving risks. This study provides a new surrogate safety measure that identifies rear-end crash risks more reliably and can be applied to real-time crash risk prediction in driver assistance systems. In addition, the results of this study can provide a theoretical basis for formulating traffic management strategies to improve driving safety.

1. Introduction

Statistics from the World Health Organization (WHO) show that road traffic crashes cause about 1.35 million deaths each year, ranking eighth among all causes of death [1]. The serious consequences of traffic crashes have driven researchers to investigate their causes. Among the many causes, driving behavior has been found to be a crucial one. For example, a study conducted by the National Highway Traffic Safety Administration (NHTSA) found that driver-related factors account for 94% of the critical reasons for crashes, and most studies indicate that traffic crashes largely result from risky driving behaviors [2–4]. To reduce casualties and mitigate injuries from traffic crashes, understanding and identifying the crash risk is essential.

Among different types of traffic crashes, rear-end crashes are recognized as one of the most common [5, 6]. Statistics from the NHTSA show that rear-end crashes account for 32.4% of all accident types that cause personal injury [7]. Since most rear-end crashes occur in car-following situations, it has become crucial to identify the rear-end crash risk during the car-following process and explore its influencing factors [8–10].

Despite the efforts on rear-end crash risk identification and analysis, several research gaps still exist. Measures including time-to-collision (TTC), stop distance index (SDI), deceleration rate to avoid crash (DRAC), and others have been proposed to study driving risks [11–13]. However, TTC, which is based on a constant-velocity assumption, ignores the response of the following vehicle (FV) and the changes in the states of the vehicle pair. Therefore, measures that take the mechanism of driver response and the development of the crash into account should be developed to better represent risks during car-following events. Besides, the other traditional surrogate measures of safety (SMoS) cannot fully reflect crash probability and crash severity at the same time. In addition, driving risks change with the driver’s personal characteristics and environmental factors [3, 14–16]. In-depth research on rear-end crash risk and its influencing factors is essential to formulate effective countermeasures.

Judging from previous studies, the traditional SMoS either cannot fully consider the crash mechanism or fail to reflect crash probability and crash severity at the same time. In addition, only a few previous studies have used large-scale naturalistic driving data and comprehensively considered driver heterogeneity, behavioral characteristics, and environmental characteristics to study the influencing factors of rear-end crash risk from the perspective of microscopic car-following behavior.

Therefore, this study aims to propose a reliable measure to quantify the driving risk in the process of car-following and to investigate the impact of various influencing factors on rear-end crash risk considering driver heterogeneity. For that purpose, a new rear-end crash risk index (RCRI) was introduced, which fully considers the crash mechanism and integrates crash probability and crash severity. A total of 16,905 car-following events were extracted from the Shanghai Naturalistic Driving Study (SH-NDS). Crash risks under different influencing factors were analyzed and compared based on the proposed RCRI. Then, a mixed-effects linear regression model was employed to study the impact of behavioral characteristics and environmental factors on rear-end crash risk.

2. Literature Review

2.1. Overview of Studies on Surrogate Measures of Safety (SMoS)

High-risk car-following behaviors, such as driving close to the leading vehicle (LV), may lead to a high probability of an accident [17]. Based on temporal and spatial proximity, various rear-end crash risk indexes have been proposed to evaluate driving risks. Among time-based surrogate measures, time-to-collision (TTC) is widely used in practice [11]. Meanwhile, the risk of rear-end crash also depends on the driver’s crash avoidance behavior: the crash can be avoided if the FV brakes in time. Hence, the deceleration rate to avoid crash (DRAC) was introduced to evaluate the braking requirement during vehicle conflicts and thereby quantify the risk of rear-end crash [13]. Besides, maintaining a safe distance between the LV and the FV is the key to avoiding rear-end crashes. Therefore, the stop distance index (SDI), which is based on the concept of the safe stopping distance of the FV, was proposed by Oh et al. [12]. To mitigate the risk of rear-end crash, the FV should maintain a safe headway distance from the LV.
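As a minimal sketch of two of the measures discussed above, the following Python functions compute TTC and DRAC for a single vehicle pair. The function and argument names are illustrative and do not come from the cited works. The DRAC expression assumes a constant LV speed and uses the kinematic form Δv²/(2·gap); some authors omit the factor of 2.

```python
import math

def ttc(gap_m, v_fv, v_lv):
    """Time-to-collision in seconds under the constant-velocity assumption.

    Returns infinity when the FV is not closing in on the LV.
    """
    closing_speed = v_fv - v_lv
    return gap_m / closing_speed if closing_speed > 0 else math.inf

def drac(gap_m, v_fv, v_lv):
    """Deceleration (m/s^2) the FV needs to match the LV speed within the gap.

    Assumed form: (v_fv - v_lv)^2 / (2 * gap); zero when the FV is not closing in.
    """
    closing_speed = v_fv - v_lv
    return closing_speed ** 2 / (2.0 * gap_m) if closing_speed > 0 else 0.0

# Example: FV at 25 m/s follows an LV at 15 m/s with a 30 m gap.
example_ttc = ttc(30.0, 25.0, 15.0)
example_drac = drac(30.0, 25.0, 15.0)
```

Both functions report "no risk" whenever the FV is slower than the LV, regardless of how small the gap is.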

Although the traditional SMoS are widely used in quantifying the risk of rear-end crash, they also have some limitations. Kuang et al. [18] summarized three main limitations of the SMoS presented above: (i) the driver’s response characteristics when experiencing a conflict are not considered; (ii) due to the boundary conditions, any situation where the speed of the FV is lower than the speed of the LV is considered safe, which may be unreasonable when the two vehicles travel at similar speeds and the headway distance is small; (iii) the arbitrary selection of the index threshold can also make the result inaccurate.

Researchers have also proposed new surrogate measures to address the abovementioned problems. Xie et al. [19] proposed the time-to-collision with disturbance (TTCD) to quantify the risk of rear-end crash. This indicator solves the problem of inaccurate risk identification when the speed of the FV is less than the speed of the LV. Besides, Shi et al. [20] derived a new hybrid indicator named the key risk indicator (KRI), which integrates time-integrated time-to-collision (TIT), crash potential index (CPI), and proportion of stopping distance (PSD). However, these surrogate measures cannot overcome all three limitations above. In view of this, Kuang et al. [18] proposed a tree-structured surrogate measure to evaluate rear-end crash risk, namely, the aggregated crash index (ACI), which considers many major factors in rear-end crashes, including the disturbance of the LV, the driver’s reaction characteristics, and the available braking capacity of the FV. However, its complicated leaf nodes make this method inconvenient to apply. In addition, this measure only addresses the occurrence probability of a rear-end crash event without considering the severity of the crash. Therefore, it is crucial to propose a new SMoS that addresses the above limitations and better quantifies the risk of rear-end crash.

2.2. Overview of Studies on Driving Risk-Related Factors

Driving risks are often considered to be related to multiple factors, including the driver’s individual characteristics and environmental factors [3, 14–16]. Several studies have explored the interaction between driving risks and various influencing factors. Table 1 summarizes these previous studies, which fall into three main categories based on experimental methods: survey-based studies, driving simulation studies, and naturalistic driving studies.

Most of the initial research is based on self-reported survey data. However, the main problem is that the sample size is limited and the responses may be highly subjective. In addition, survey-based research pays more attention to the influence of driver characteristics on driving risks and ignores driving environmental factors. Alternatively, owing to their high safety, controllability, and comprehensive data acquisition, driving simulators are used to study driver behavior characteristics to improve driving safety, especially in safety-critical conditions. However, studies based on driving simulation experiments are mostly limited to specific behaviors in a limited number of safety-critical scenarios and cannot truly reflect the external driving environment [21]. Research based on naturalistic driving data represents real-world driving situations. It is possible to extract significant driving behavior parameters from naturalistic driving data, such as speed, acceleration, relative position with respect to surrounding vehicles, and environmental conditions, to study the influencing factors of driving risk [22]. The collected data are more comprehensive and may provide more valid results.

3. Data Preparation

3.1. Brief Introduction of the Shanghai Naturalistic Driving Study (SH-NDS)

The real-world driving data used in this paper were collected by the SH-NDS, jointly conducted by Tongji University, General Motors (GM), and the Virginia Tech Transportation Institute (VTTI) from 2012 to 2016 [30, 31]. The 60 drivers participating in this naturalistic driving experiment were aged between 35 and 50 years, and all had more than five years of driving experience. Each had driven more than 20,000 kilometers in total before participating in the experiment, with an average daily mileage of no less than 40 kilometers. Each driver drove the assigned experimental vehicle on the open road network, and the driving route was selected according to the driver’s daily travel needs.

The video data of SH-NDS are mainly recorded by 4 cameras, which are installed in hidden locations that are not easy to observe. The video image is shown in Figure 1, which is composed of the front and rear vision of the vehicle, the driver’s facial state, and the hand operation image.

3.2. Car-Following Event Extraction

In this study, the car-following events in the SH-NDS were extracted to analyze the influencing factors of rear-end crash risk. The SH-NDS data cover all daily trips of the participating drivers. In total, data from 18,242 trips were collected. In the SH-NDS, data were automatically collected using the data acquisition system, triggered by the ignition switch of the vehicle. Therefore, the database inevitably contains a large number of trip records that are not related to the research content (vehicle activities including fueling, car washing, maintenance, and other types of short-distance trips were all recorded). In addition, missing values and outliers also exist in the SH-NDS database. Data processing mainly includes the following four steps:
(i) Step 1: eliminate invalid record files. The SH-NDS database contains a large number of short-distance trip records. Considering the distance between adjacent entrances and exits on urban expressways, a valid trip includes at least entering the urban expressway, driving on it, and exiting it, so trip files with a travel time of less than 5 minutes were removed.
(ii) Step 2: remove driving data under discontinuous traffic flow. To extract car-following events under continuous traffic flow, a point-to-point map matching algorithm was used to match the driving trajectory measured by GPS with the electronic map road data and find the trip records on the urban expressway [32]. Then, each trip was verified through the camera video recording to ensure the validity of the driving data selected for analysis. The weather and light conditions were also determined during the verification process.
(iii) Step 3: handle missing values and outliers. Data preprocessing is required to obtain data such as the vehicle’s speed and acceleration, the relative space-time distribution of surrounding vehicles, and the traffic environment conditions during the car-following process. Linear interpolation was applied to deal with missing values. Then, outliers were eliminated based on the Pauta criterion (3σ rule), and the data were smoothed using a moving average filter.
(iv) Step 4: extract data of car-following events. Car-following events were extracted by applying the automatic extraction algorithm proposed by Zhu et al. [33], with the following criteria:
(1) radar target identification number of the LV > 0 and constant: guaranteeing that the FV was preceded by the same LV;
(2) 7 m < longitudinal distance between the FV and the LV < 120 m: eliminating the congested-flow conditions;
(3) lateral distance between the FV and the LV < 2 m: ensuring that the FV and the LV were in the same lane;
(4) duration of the car-following event > 15 s: guaranteeing that each car-following event has enough data for analysis.
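The preprocessing in Step 3 can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: missing samples are represented as `None`, samples flagged by the Pauta (3σ) criterion are re-filled by linear interpolation, the moving-average window of 5 samples is a hypothetical choice, and the first and last samples are assumed to be present.

```python
import statistics

def interpolate_missing(values):
    """Fill None entries by linear interpolation between known neighbours."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            out[i] = out[a] + (out[b] - out[a]) * (i - a) / (b - a)
    return out

def mask_outliers_pauta(values):
    """Replace samples farther than 3 standard deviations from the mean with None."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v if abs(v - mean) <= 3.0 * sd else None for v in values]

def moving_average(values, window=5):
    """Centered moving average; the window shrinks near both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def preprocess(values, window=5):
    """Interpolate gaps, drop 3-sigma outliers, re-interpolate, then smooth."""
    filled = interpolate_missing(values)
    masked = mask_outliers_pauta(filled)
    return moving_average(interpolate_missing(masked), window)
```

Note that the 3σ rule can only flag a point when the series is long enough; with very short series a single extreme value inflates the standard deviation and escapes detection.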

After the application of the above steps, 16,905 car-following events (about 135 h of total event duration) were extracted from 1,197 trips. Figure 2 shows the histogram of the duration of car-following events.
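The four extraction criteria of Step 4 can be illustrated with a simple scan over radar frames. The sketch below is not the algorithm of Zhu et al. [33]; it assumes a hypothetical frame format `(target_id, longitudinal_distance_m, lateral_distance_m)` sampled at 10 Hz and returns half-open index ranges of qualifying events.

```python
def extract_car_following_events(frames, hz=10, min_duration_s=15.0):
    """Return (start, end) frame index ranges that satisfy the four criteria.

    frames: list of (target_id, longitudinal_m, lateral_m) tuples (assumed layout).
    """
    events = []
    start = None
    # A sentinel frame with target_id 0 closes any event still open at the end.
    for i, (target_id, dx, dy) in enumerate(list(frames) + [(0, 0.0, 0.0)]):
        valid = target_id > 0 and 7.0 < dx < 120.0 and abs(dy) < 2.0
        continues = valid and start is not None and target_id == frames[start][0]
        if start is not None and not continues:
            # Criterion (4): keep the run only if it lasted longer than 15 s.
            if (i - start) / hz > min_duration_s:
                events.append((start, i))
            start = None
        if valid and start is None:
            start = i
    return events
```

A change of radar target ID closes the current event and immediately opens a new candidate, so an LV cut-in splits one long following period into two events.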

4. Methodology

The methodology of this research mainly includes three parts: (i) derivation of a new surrogate measure for rear-end crash risk, (ii) identification of influencing factors for rear-end crash risk, and (iii) mixed-effects linear regression for rear-end crash risk modeling and factor analysis.

4.1. Derivation of New Surrogate Measure for Rear-End Crash Risk
4.1.1. Rear-End Crash Mechanism

In the process of car-following maneuver, the driver will choose the appropriate speed and safe headway distance according to the movement status of the LV. The determination of the safe headway distance needs to consider the driver’s reaction time and the vehicle’s deceleration process. Otherwise, it may easily lead to a rear-end crash when the headway distance is too small. In this research, we imposed a hypothetical disturbance on the LV, assuming the LV decelerated at a certain deceleration rate. As can be seen in Figure 3, the FV will take appropriate evasive actions based on the initial driving condition and the disturbance after reaction to avoid the crash. The crash outcome can be identified by evaluating the initial conditions, the disturbance, the driver’s reaction characteristic, and the degree of evasive action [18, 34].

4.1.2. Rear-End Crash Risk Index (RCRI)

Risk is the product of the possibility that a hazard event will occur and the consequence of the event. To address the limitations of SMoS mentioned above, we propose a new SMoS named RCRI, which considers the crash probability and crash severity at the same time, to quantify the risk of rear-end crash. According to the changes in the speed and distance of the LV and the FV before the crash, the process of rear-end crash can be divided into four categories, as shown in Figure 4 [35, 36].

A rear-end crash can be regarded as a plastic (perfectly inelastic) collision; that is, the two vehicles tend to move together after the crash. In this study, the momentum theorem was therefore used to calculate the common speed of the two vehicles after the rear-end crash, that is,

$$m_L v_L + m_F v_F = (m_L + m_F)\,v,$$

where $m_L$ is the mass of the LV, $m_F$ is the mass of the FV, $v_L$ is the speed of the LV when the crash occurs, $v_F$ is the speed of the FV when the crash occurs, and $v$ is the speed of the LV and FV after the crash.

The energy loss of the two vehicles in the rear-end crash is then calculated using the law of conservation of energy, as follows:

$$\Delta E = \tfrac{1}{2} m_L v_L^2 + \tfrac{1}{2} m_F v_F^2 - \tfrac{1}{2}(m_L + m_F)\,v^2 = \frac{m_L m_F}{2\,(m_L + m_F)}\,(v_F - v_L)^2.$$

Therefore, this study uses the square of the absolute speed difference (SASD) at the time of the rear-end crash of two vehicles to express the severity of the rear-end crash.
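The link between energy loss and the squared speed difference can be checked numerically. The sketch below assumes a perfectly plastic collision as described above; the function names and the vehicle masses in the example are illustrative only.

```python
def post_crash_speed(m_lv, m_fv, v_lv, v_fv):
    """Common speed after a perfectly plastic collision (momentum conservation)."""
    return (m_lv * v_lv + m_fv * v_fv) / (m_lv + m_fv)

def energy_loss(m_lv, m_fv, v_lv, v_fv):
    """Kinetic energy before the crash minus kinetic energy after it."""
    v = post_crash_speed(m_lv, m_fv, v_lv, v_fv)
    before = 0.5 * m_lv * v_lv ** 2 + 0.5 * m_fv * v_fv ** 2
    after = 0.5 * (m_lv + m_fv) * v ** 2
    return before - after

def energy_loss_closed_form(m_lv, m_fv, v_lv, v_fv):
    """Equivalent closed form: proportional to the squared speed difference."""
    return m_lv * m_fv / (2.0 * (m_lv + m_fv)) * (v_fv - v_lv) ** 2

# Two 1500 kg vehicles, LV at 7 m/s and FV at 13 m/s at the moment of impact.
loss = energy_loss(1500.0, 1500.0, 7.0, 13.0)
```

For fixed masses the energy loss depends on the speeds only through (v_F − v_L)², which is why the squared speed difference can stand in for severity.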

To simplify the rear-end crash avoidance process, the model adopts two assumptions: (i) the braking process of the vehicle is regarded as a uniform deceleration process and (ii) the FV only adopts braking measures to avoid the crash. Therefore, the braking distances of the two vehicles before the crash can be obtained, as shown in the following formula:

$$d_L = v_{L0}\,t - \tfrac{1}{2} a_L t^2,$$

where $d_L$ is the braking distance of the LV, $v_{L0}$ is the speed of the LV before braking, $t$ is the time of crash, and $a_L$ is the deceleration rate of the LV.

When $t \ge v_{L0}/a_L$, the LV has been completely stopped, and the corresponding braking stop distance of the LV is

$$d_L = \frac{v_{L0}^2}{2 a_L}.$$

During the entire conflict process, the FV maintains a constant speed $v_{F0}$ during the reaction time $\tau$ and then decelerates at a constant deceleration rate $a_F$. When $\tau < t < \tau + v_{F0}/a_F$, the braking distance of the FV can be represented as

$$d_F = v_{F0}\,t - \tfrac{1}{2} a_F (t - \tau)^2,$$

where $d_F$ is the braking distance of the FV.

When $t \ge \tau + v_{F0}/a_F$, the braking stop distance of the FV is

$$d_F = v_{F0}\,\tau + \frac{v_{F0}^2}{2 a_F}.$$

Therefore, if the longitudinal distance between the FV and the LV is reduced to zero before the FV has completely stopped, a crash will definitely occur, namely,

$$d_F - d_L \ge D_0,$$

where $D_0$ is the initial gap between the LV and the FV.
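As an alternative to the case-by-case closed-form solutions, the crash condition can also be checked numerically by stepping the kinematics of both vehicles forward in time. The sketch below is only an illustration under the two assumptions stated above (uniform deceleration, braking only); the function name, argument names, and the time step are hypothetical.

```python
def simulate_rear_end(gap, v_lv, v_fv, a_lv, a_fv, reaction_time, dt=0.01):
    """Return (crash_time, v_lv_at_crash, v_fv_at_crash), or None if no crash.

    The LV brakes at a_lv from t = 0 (the hypothetical disturbance); the FV keeps
    its speed during reaction_time and then brakes at a_fv. Units: m, m/s, m/s^2, s.
    """
    if a_fv <= 0:
        raise ValueError("a_fv must be positive so that the FV eventually stops")
    x_lv, x_fv = gap, 0.0  # positions measured so that x_lv - x_fv is the gap
    step = 0
    while v_fv > 0.0:
        t = step * dt
        new_v_lv = max(v_lv - a_lv * dt, 0.0)
        new_v_fv = max(v_fv - a_fv * dt, 0.0) if t >= reaction_time else v_fv
        # Trapezoidal update of the distance travelled in this step.
        x_lv += 0.5 * (v_lv + new_v_lv) * dt
        x_fv += 0.5 * (v_fv + new_v_fv) * dt
        v_lv, v_fv = new_v_lv, new_v_fv
        step += 1
        if x_fv >= x_lv:  # gap closed before the FV stopped: crash
            return step * dt, v_lv, v_fv
    return None  # the FV stopped without reaching the LV
```

For example, with both vehicles at 20 m/s, equal deceleration rates of 6 m/s², a reaction time of 1 s, and a 10 m gap, the FV travels 20 m further than the LV in total and the simulation reports a crash; with a 60 m gap it returns `None`.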

For scenario 1, where and , when the crash occurs,

The solution is

For scenario 2, where , when the crash occurs,

The solution is

For scenario 3, where and , when the crash occurs,

If , the solution is

If , the solution is

For scenario 4, where and , when the crash occurs,

The solution is

According to previous studies, the deceleration rate taken by the LV follows a shifted gamma distribution (17.315, 0.128, 0.657), which was suggested and calibrated by Kuang et al. [18]. The reaction time of the FV follows a log-normal distribution (0.17, 0.44), and the braking coordination time is 0.175 s [37]. The maximum available deceleration rate (MADR) was assumed to follow a truncated normal distribution with a mean of 8.45 m/s² and a standard deviation of 1.4 m/s², bounded between 4.23 m/s² and 12.68 m/s² [38].

The Monte Carlo simulation method was used to randomly sample the deceleration rate taken by the LV, the reaction time of the driver, and the deceleration rate of the FV from the distribution of each parameter mentioned above. According to the initial states of the LV and FV, the crash time ($t$) and the crash consequence (represented by SASD as discussed before) were calculated based on the above equations. Then, by integrating the possibility and consequences of the crash under the current car-following conditions, a new SMoS named RCRI can be calculated and expressed as below. It should be noted that the SASD needs to be normalized to a value between 0 and 1 before calculating RCRI. Therefore, the calculated RCRI range is also between 0 and 1:

$$\mathrm{RCRI}_t = \frac{1}{N}\sum_{i=1}^{N} \delta_i \cdot \mathrm{SASD}_i^{*},$$

where $\mathrm{RCRI}_t$ represents the risk of rear-end crash at the moment $t$ (evaluated every 0.1 s) in the car-following scenario; $\mathrm{SASD}_i^{*}$ is the normalized SASD of the $i$-th sample; and $N$ = 10,000 is the number of random samples generated by the Monte Carlo simulation. When the crash time $t$ has a solution, that is, a crash occurs, $\delta_i = 1$; otherwise, $\delta_i = 0$.
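The Monte Carlo procedure can be sketched in a few lines of Python. Several points are assumptions made for illustration rather than details confirmed by the text: the gamma parameters are read as (shape, scale, shift), the log-normal parameters as the mean and standard deviation of the log reaction time, the braking coordination time is added to the reaction time, 1.4 m/s² is treated as a standard deviation, the crash is detected by time stepping instead of the closed-form solutions, and SASD is normalized by a hypothetical reference speed `v_ref` and clipped to 1.

```python
import random

def impact_speeds(gap, v_lv, v_fv, a_lv, a_fv, delay, dt=0.02):
    """Return (v_lv, v_fv) at impact, or None if the FV stops without a crash."""
    x_lv, x_fv, step = gap, 0.0, 0
    while v_fv > 0.0:
        new_lv = max(v_lv - a_lv * dt, 0.0)
        new_fv = max(v_fv - a_fv * dt, 0.0) if step * dt >= delay else v_fv
        x_lv += 0.5 * (v_lv + new_lv) * dt
        x_fv += 0.5 * (v_fv + new_fv) * dt
        v_lv, v_fv, step = new_lv, new_fv, step + 1
        if x_fv >= x_lv:
            return v_lv, v_fv
    return None

def truncated_gauss(rng, mu, sigma, low, high):
    """Rejection sampling from a normal distribution truncated to [low, high]."""
    while True:
        x = rng.gauss(mu, sigma)
        if low <= x <= high:
            return x

def rcri(gap, v_lv, v_fv, n=10_000, v_ref=33.3, seed=None):
    """Monte Carlo estimate of RCRI for one instant of a car-following event."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        a_lv = rng.gammavariate(17.315, 0.128) + 0.657       # LV disturbance (assumed parametrization)
        delay = rng.lognormvariate(0.17, 0.44) + 0.175       # reaction + braking coordination time
        a_fv = truncated_gauss(rng, 8.45, 1.4, 4.23, 12.68)  # MADR of the FV
        hit = impact_speeds(gap, v_lv, v_fv, a_lv, a_fv, delay)
        if hit is not None:  # crash indicator equals 1 for this sample
            sasd = (hit[1] - hit[0]) ** 2
            total += min(sasd / v_ref ** 2, 1.0)             # hypothetical normalization
    return total / n
```

Calling `rcri(5.0, 20.0, 20.0, n=500, seed=1)` and `rcri(80.0, 20.0, 20.0, n=500, seed=1)` illustrates that the index stays positive even when the two vehicles travel at equal speeds and that it shrinks as the gap grows.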

It should be noted that SASD is dimensional, so it needs to be normalized before calculation.

4.2. Identification of the Influencing Factors for Rear-End Crash Risk
4.2.1. Definition of the Variables

As discussed in previous studies, the driving risk is mainly affected by the driver’s operational characteristics and the external driving environment [3, 14–16]. In order to quantify the various influencing factors for risks in the car-following process, this study extracted three categories of variables: behavioral variables, temporal variables, and environmental variables, as shown in Table 2.

The relevant variables of the driver’s car-following behavior consider the duration of car-following event, time headway, and the driving speed and acceleration of the LV and FV. In order to eliminate the influence of the absolute value of the speed on the modeling, in terms of the speed indicator, this study adopted the average speed difference (ASD) between LV and FV, which is presented aswhere and are the instantaneous speed of the LV and FV at its -th record, and , T is the duration of car-following event.

The acceleration difference ratio (ADR) refers to the ratio of the standard deviation of the acceleration of the FV to that of the LV. The ADR of one car-following event can be calculated as

$$\mathrm{ADR} = \frac{\sigma_{a,F}}{\sigma_{a,L}},$$

where $\sigma_{a,F}$ and $\sigma_{a,L}$ are the standard deviations of the acceleration of the FV and the LV during one car-following period.
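The two behavioral indicators can be computed per event as sketched below. The ADR follows the definition above. For the ASD, the mean of the absolute instantaneous speed differences is an assumption made for this illustration; the function names are hypothetical.

```python
import statistics

def average_speed_difference(v_lv, v_fv):
    """Mean absolute speed difference between LV and FV over one event (assumed form)."""
    if len(v_lv) != len(v_fv) or not v_lv:
        raise ValueError("speed series must be non-empty and of equal length")
    return sum(abs(l - f) for l, f in zip(v_lv, v_fv)) / len(v_lv)

def acceleration_difference_ratio(a_fv, a_lv):
    """Standard deviation of FV acceleration divided by that of LV acceleration."""
    return statistics.stdev(a_fv) / statistics.stdev(a_lv)

# Toy event with three speed records and five acceleration records.
asd = average_speed_difference([10.0, 12.0, 11.0], [11.0, 11.0, 13.0])
adr = acceleration_difference_ratio([0.0, 1.0, -1.0, 2.0, -2.0],
                                    [0.0, 0.5, -0.5, 1.0, -1.0])
```

An ADR above 1 means the FV's acceleration fluctuates more than the LV's over the event.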

Peak hours increase the likelihood of congestion, resulting in a shortage of driving space, which in turn breeds risky driving behaviors. In order to consider the possible impact of the peak period, in this study, the car-following events within a day are divided into three categories: morning peak, evening peak, and off peak. Among them, the morning peak refers to 7:00 to 9:00, while the evening peak refers to 17:00 to 19:00. Besides, the traffic density was determined based on the speed of the FV and forward camera video recording, as described in Yang et al. [39].
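The time-of-day categories defined above map directly onto a small classifier. In this sketch the interval boundaries are treated as start-inclusive and end-exclusive, which is an assumption; the text does not say how events exactly at 9:00 or 19:00 are assigned.

```python
from datetime import time

def time_of_day_category(t):
    """Classify an event start time as morning peak, evening peak, or off peak."""
    if time(7, 0) <= t < time(9, 0):
        return "morning peak"
    if time(17, 0) <= t < time(19, 0):
        return "evening peak"
    return "off peak"

category = time_of_day_category(time(8, 15))
```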

To examine the effect of these independent variables on rear-end crash risk, the mean RCRI of a single car-following event was used as the dependent variable in this study.

4.2.2. Statistics Description of the Variables

As mentioned, we extracted 16,905 car-following events from 1,197 trips. Table 3 presents the descriptive statistics of the continuous variables and frequency information of the discrete independent variables in this study.

4.3. Mixed-Effects Linear Regression for Rear-End Crash Risk Modeling and Factor Analysis

In this research, all drivers participated in multiple car-following events, and car-following behavior variables were repeatedly collected from each participant. Observations from the same driver are therefore correlated (within-cluster correlation), and this problem can be addressed by applying a mixed-effects linear regression model [30, 40].

The mixed-effects linear regression model is an extension of the linear regression model that includes both fixed effects and random effects. Compared with the ordinary linear regression model, the mixed-effects linear regression model can control for the influence of individual driver characteristics. The results of the model reflect what drivers have in common while accounting for the internal correlation between samples, which makes the model more suitable for the research problem.

In this paper, the fixed effects are the independent variables that the research focuses on. Besides, the drivers were treated as random effects to address the problem of within-cluster correlation.

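Formally, the mixed-effects linear regression model can be written as

$$y = X\beta + Zu + \varepsilon,$$

where $y$ is the dependent variable; $X$ is a matrix of the independent variables; $\beta$ is the vector of coefficients of the fixed effects; $Z$ is the design matrix for the random effects; $u$ is the vector of coefficients of the random effects; and $\varepsilon$ is a column vector of the residuals. To recap,

$$\underbrace{y}_{N \times 1} = \underbrace{X}_{N \times p}\,\underbrace{\beta}_{p \times 1} + \underbrace{Z}_{N \times q}\,\underbrace{u}_{q \times 1} + \underbrace{\varepsilon}_{N \times 1}.$$

To better understand the structure of the model, here we provide an example where 16,905 ($N$) car-following events were collected from 58 ($q$) drivers. Our outcome $y$ is the risk of the car-following event. As mentioned above, we have 11 fixed-effect predictors. Accordingly, $y$ and $\varepsilon$ are 16,905 × 1 column vectors, $Z$ is a 16,905 × 58 indicator matrix that assigns each event to its driver, and $u$ is a 58 × 1 vector with one random intercept per driver.

The random effects $u$ in the regression model form a column vector containing the random intercepts. However, it is not necessary to estimate $u$ in actual regression modeling. Instead, the model assumes that $u$ follows a normal distribution, with a mean of zero and a variance of $G$:

$$u \sim \mathcal{N}(0, G).$$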
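To make the roles of the random-effects design matrix and the random intercepts concrete, the following sketch builds an indicator matrix from driver labels and simulates outcomes from a random-intercept model. It is a toy illustration with made-up driver labels and parameter values, not the estimation procedure used in the study.

```python
import random

def indicator_matrix(event_drivers, drivers):
    """One row per event, one column per driver; 1 marks the driver of the event."""
    column = {d: j for j, d in enumerate(drivers)}
    return [[1 if column[d] == j else 0 for j in range(len(drivers))]
            for d in event_drivers]

def simulate_outcomes(X, beta, Z, sigma_u, sigma_e, rng):
    """Draw random intercepts u ~ N(0, sigma_u^2) and return (y, u) for y = X beta + Z u + e."""
    u = [rng.gauss(0.0, sigma_u) for _ in Z[0]]
    y = []
    for x_row, z_row in zip(X, Z):
        fixed_part = sum(b * x for b, x in zip(beta, x_row))
        random_part = sum(z * uj for z, uj in zip(z_row, u))
        y.append(fixed_part + random_part + rng.gauss(0.0, sigma_e))
    return y, u

# Five events from three hypothetical drivers, intercept-only fixed part.
event_drivers = ["a", "a", "b", "c", "b"]
Z = indicator_matrix(event_drivers, ["a", "b", "c"])
y, u = simulate_outcomes([[1.0]] * 5, [2.0], Z, 0.5, 0.1, random.Random(0))
```

Events from the same driver share one draw of the random intercept, which is what induces the within-cluster correlation the model is meant to absorb.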

Parameters of all components were estimated using the mixed procedure in Stata/MP 16.0. The statistical significance level was set at 0.05.

5. Results and Discussion

5.1. Rear-End Crash Risk Identification Using SH-NDS Data and RCRI

To better illustrate the rear-end crash risk identification measure proposed in this study, TTC, DRAC, SDI, and RCRI were all employed to quantify the risk for one car-following event. According to previous studies, the thresholds of each SMoS are chosen as follows. The threshold of TTC is normally chosen as 3 s [41], and the threshold of DRAC is 3.4 m/s² [42]. In the SDI calculation, it is assumed that the deceleration rate is 3.3 m/s² and the reaction time is 1.0 s [20]. A portion of the vehicle movement data during one car-following event is presented in Table 4. The risks identified at intervals of 0.1 s are shown in Figure 5.
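The threshold-based risk flags can be sketched as below, using the thresholds quoted above. The DRAC form Δv²/(2·gap) and the SDI comparison (LV stopping distance plus gap versus FV reaction distance plus stopping distance) are common formulations assumed here for illustration; they are not necessarily the exact expressions used in the cited works, and the function name is hypothetical.

```python
import math

def traditional_flags(gap, v_fv, v_lv, ttc_limit=3.0, drac_limit=3.4,
                      sdi_decel=3.3, sdi_reaction=1.0):
    """Return binary risk flags (True = risky) for TTC, DRAC, and SDI."""
    closing = v_fv - v_lv
    ttc = gap / closing if closing > 0 else math.inf
    drac = closing ** 2 / (2.0 * gap) if closing > 0 else 0.0
    # Assumed SDI form: unsafe if the FV needs more room to stop than is available.
    lv_room = gap + v_lv ** 2 / (2.0 * sdi_decel)
    fv_need = v_fv * sdi_reaction + v_fv ** 2 / (2.0 * sdi_decel)
    return {"TTC": ttc < ttc_limit, "DRAC": drac > drac_limit, "SDI": fv_need > lv_room}

closing_case = traditional_flags(gap=20.0, v_fv=25.0, v_lv=15.0)
opening_case = traditional_flags(gap=10.0, v_fv=20.0, v_lv=22.0)
```

In the second call the FV follows at a 10 m gap (a 0.5 s headway at 20 m/s) behind a slightly faster LV, and all three flags report no risk under these assumed formulations.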

As mentioned in the methodology, the RCRI is calculated based on the assumed disturbance. This measure can be used to quantify the risk in any scenario, even when the LV’s speed is greater than that of the FV. Besides, the RCRI takes into account the most critical variables in crash mechanisms such as driver reaction characteristics and vehicle braking performance and comprehensively considers the crash probability and consequences. Moreover, the RCRI is a continuous variable, so it has better flexibility to quantify the real-time change process of rear-end crash risk. On the contrary, the risk quantification results based on TTC, DRAC, and SDI are dummy variables. Therefore, the RCRI can more accurately represent the risk and has wider applicability.

5.2. Comparative Analysis of Rear-End Crash Risk under Different Influencing Factors

Before the significance test, the Kolmogorov–Smirnov (K–S) test was employed to verify the distribution of rear-end crash risks under different influencing factors. The results of the K–S test show that rear-end crash risk under different influencing factors meets the requirements of homogeneity of variance and normal distribution. Therefore, the analysis of variance (ANOVA) was then applied to test the significance of difference in rear-end crash risk under different influencing factors. Table 5 provides the summary statistics of the driving risk under these significant influencing factors. It can be seen from Table 5 that the rear-end crash risk under most influencing factors, such as day-of-week, time-of-day, light condition, and traffic density, is significantly different.
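For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed by hand as in the sketch below. The grouping and the numbers are synthetic and serve only to show the calculation; they are not SH-NDS data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples per group."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within

# Synthetic example with three groups of three observations each.
f_stat, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

The resulting F value is compared with the F distribution with (df_between, df_within) degrees of freedom to obtain the p-value.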

Figure 6 shows the comparison results of rear-end crash risk under different influencing factors. Specifically, for day-of-week, the driving risk was higher on workdays than on holidays (by 14.3%, from 0.0028 to 0.0032). In addition, the driving risks are 0.0030, 0.0034, and 0.0032 for off peak, morning peak, and evening peak, respectively. The significant differences in risk across these temporal variables indicate that the driving risks differ with travel purpose. The decrease in driving risk from daytime to nighttime was slight but significant, at 0.0002 (6.3%). The driving risk decreased from 0.0049 in high traffic density to 0.0021 (by 57.1%) in low traffic density, indicating that traffic congestion leads to a decrease in driving safety.
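The relative changes quoted above follow from a simple percent-change calculation on the reported mean RCRI values, as this short check shows (the function name is illustrative).

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

workday_vs_holiday = percent_change(0.0028, 0.0032)    # holiday -> workday
low_vs_high_density = percent_change(0.0049, 0.0021)   # high density -> low density
```

The first value is about +14.3% and the second about −57.1%, matching the figures reported in the text.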

5.3. Results of Mixed-Effects Linear Regression Model

Based on the car-following behavior variables and environmental variables obtained from the SH-NDS, the influencing factors of rear-end crash risk were investigated. Table 6 presents the results of the mixed-effects linear regression model. As shown in Table 6, the chi-squared goodness-of-fit results indicate that the mixed-effects regression model fits well (, Prob ). Except for the morning peak among the temporal variables and the weather conditions, all the variables listed in Table 6 are significant at the 95% confidence level.

As shown in Table 6, all selected behavioral variables affect the driving risk. The longer the duration of the car-following event, the lower the driving risk. This can be understood as the driving risk increasing when the LV changes frequently. As expected, the larger the time headway, the lower the driving risk. This result is consistent with Duan et al. [9], who evaluated risk perception in the car-following process. In addition, based on the SH-NDS data, Zhu et al. [43] concluded that aggressive drivers keep a shorter time gap than conservative drivers. Existing studies found evidence that speed dispersion is also an important factor in determining crash risk [44–46]. A larger speed difference between the LV and the FV is associated with a higher crash rate, which is generally consistent with our findings. In addition to the speed difference, this paper investigates the impact of the acceleration difference on driving risk. The results show that the higher the acceleration difference ratio, the higher the risk, which indicates that the driving risk increases when the FV accelerates and decelerates more frequently than the LV during car-following.

Qin et al. [47] suggested that, due to the different travel purposes (to/from work) of drivers, the probability of a crash differs between working days and nonworking days, and the probability of crashes on working days is higher. This finding is consistent with the results of the regression model in this paper; that is, working days lead to higher driving risks. Furthermore, from the regression coefficients of this study, it can be concluded that the crash risk is higher for the morning peak than for other times of the day, and the evening peak is the lowest-risk period of the day. The results further confirm that there is a significant correlation between driving risk and driving purpose. The driving risk on the way to work is higher than that on the way home from work.

As for the environmental factors, compared with sunny days, the risk of rear-end crash is higher for drivers on rainy days, which is consistent with Das et al. [48] and Jung et al. [49]. From the obtained results, we can conclude that traffic density has a greater impact on rear-end crash risk [50]. The medium-density and low-density variables show negative coefficients ( for medium-density and for low-density), indicating that the driving risk decreases in lower-density traffic. High-density traffic flow increases the uncertainty of traffic flow and thus the driving risk. This result is consistent with Huang et al. [22], who investigated the driving risks under different conditions using a naturalistic driving study and a driver attitude questionnaire.

6. Conclusions

This study proposes a new SMoS to quantify driving risks in car-following situations and investigates the impact of different influencing factors (behavioral factors, temporal factors, and environmental factors) on rear-end crash risk considering driver heterogeneity. A total of 16,905 car-following events were extracted from the SH-NDS database. Risks of rear-end crash under different influencing factors were compared. In addition, a mixed-effects linear regression model was applied to investigate the relationship between rear-end crash risk and various influencing factors. Several key conclusions can be drawn:
(i) Different from TTC, DRAC, SDI, and other indicators, the surrogate measure RCRI was proposed based on the crash mechanism and comprehensively considers the crash probability and consequences. This measure can be applied in any car-following situation, even when the speed of the LV is greater than the speed of the FV. The RCRI proposed in this study is a continuous variable, so it can quantify the risk of rear-end crash more flexibly.
(ii) Among different temporal variables, workday and morning peak hour had the highest mean value of driving risk. For different light conditions, the crash risk increased for daytime compared to nighttime. As for different traffic density, the driving risks corresponding to low-density traffic flows are significantly lower than those corresponding to high-density and medium-density traffic flows.
(iii) The mixed-effects linear regression model performed well in quantitatively evaluating the impact of various influencing factors on rear-end crash risk. The developed models demonstrated that duration of car-following event, mean time headway, average speed difference, acceleration difference ratio, day-of-week, time-of-day, weather condition, and traffic density had significant effects on rear-end crash risk. Workday and morning peak negatively affected driver safety. As for environmental variables, rainy weather and high-density traffic decreased driver safety.

As the main contribution, this paper utilizes a new SMoS and naturalistic driving data to quantify rear-end crash risk and identify the impacts of different influencing factors on the crash risks. Research was conducted based on naturalistic driving data, which objectively reflects the real operation of drivers. The new SMoS can not only be used to investigate the driving safety of drivers under different driving environments but also be used for driving risk evaluation and real-time risk prediction. Results from the mixed-effects linear regression model can be used to improve driving safety by adopting appropriate countermeasures. For example, traffic safety management can be strengthened during working days and morning peak hours to ensure safe driving.

Still, limitations exist in this study. The proposed indicators mainly focus on the risk of rear-end crash and cannot comprehensively consider other types of crash risks. In addition, no crash data were obtained from SH-NDS database that can be used to verify the effectiveness of RCRI. For future work, further validation will be applied to evaluate the effectiveness of RCRI based on crash data. Moreover, the RCRI will be used to predict the real-time change process of driving risk and explore the impact of risky driving.

Data Availability

The data used in this paper are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

This paper was jointly supported by the Chinese National Natural Science Foundation (71871161) and the Science and Technology Commission of Shanxi Province (19-JKCF-02).