Abstract

The aim of this study was to evaluate the effects of driver-related factors on crash involvement of four different types of commercial vehicles—express buses, local buses, taxis, and trucks—and to compare outcomes across types. Previous studies on commercial vehicle crashes have generally been focused on a single type of commercial vehicle; however, the characteristics of drivers as factors affecting crashes vary widely across types of commercial vehicles as well as across study sites. This underscores the need for comparative analysis between different types of commercial vehicles that operate in similar environments. Toward these ends, we analyzed 627,594 commercial vehicle driver records in South Korea using a mixed logit model able to address unobserved heterogeneity in crash-related data. The estimated outcomes showed that driver-related factors have common effects on crash involvement: greater experience had a positive effect (diminished driver crash involvement), while traffic violations, job change, and previous crash involvement had negative effects. However, the magnitude of the effects and heterogeneity varied across different types of commercial vehicles. The findings support the contention that the safety management policy of commercial drivers needs to be set differently according to the vehicle type. Furthermore, the variables in this study can be used as promising predictors to quantify potential crash involvement of commercial vehicles. Using these variables, it is possible to proactively identify groups of accident-prone commercial vehicle drivers and to implement effective measures to reduce their involvement in crashes.

1. Introduction

Commercial vehicles have a high risk of traffic injury because they are driven long distances, which often leads to driver fatigue [1]. Moreover, when such vehicles are involved in traffic crashes, their heavy weight causes greater impact damage to the occupants of other vehicles and to pedestrians [2, 3]. In South Korea in 2018, commercial vehicles (buses, taxis, and trucks) were involved in 43,632 traffic accidents, resulting in 67,262 injuries (20.8% of total traffic injuries) and 692 deaths (18.3% of total traffic deaths) [4]. Considering that commercial vehicles account for only 6.2% of total registered vehicles in South Korea, the proportions of traffic injuries and deaths involving commercial vehicles are substantially higher than those of other vehicle classes. After controlling for driving distance, commercial vehicles had 792.7 traffic accidents and 12.6 traffic deaths per one billion km; both values were 1.5 times greater than those for noncommercial vehicles. In general, a large portion of traffic crashes is caused by driver-related factors [5].

Previous studies have mainly used two approaches to evaluate the effect of driver-related factors on traffic accidents. One approach used questionnaires or interviews to evaluate psychological factors deriving from latent driver characteristics such as personality and attitude [6–12]. The use of self-reported data had the advantage of allowing the collection of enough samples to adopt statistical models, but it raised concerns about so-called “dishonest biases” and about the fact that such variables cannot be observed in practice. Due to these limitations, safety practitioners and scholars turned to the analysis of more measurable traffic-crash data. Such studies generally focused on a single type of commercial vehicle [1, 13–19]. Although the vehicle types differed, driver characteristic variables such as age, gender, education, driving experience, license type, violation history, crash history, and job change history were commonly specified as explanatory variables [1, 10, 14–18, 20]. However, coefficient estimates of those common variables varied widely across types of commercial vehicles, as well as across study sites [21]. This underscores the need for a comparative analysis of different types of commercial vehicles operated in comparable environments. Such research would reveal the differential effects of driver characteristics on commercial vehicle safety while controlling for site-specific effects.

In the field of traffic-crash research, logistic regression has proven to be a popular and reliable way to reveal the relationships between response variables and explanatory variables [22–24]. However, in logistic regression, a single fixed coefficient represents the effect of a particular factor on all individuals. This could lead to bias in the interpretation of the results, given that crash occurrence is inherently heterogeneous across individuals [25]. Therefore, for the analysis of crash-related data, it is necessary to use a model that can explain unobserved heterogeneity. To address this issue, previous researchers have used several statistical and econometric methods that account for unobserved heterogeneity [26–29]. In many recent studies, mixed logit models have been used to account for unobserved heterogeneity of variables by applying individual-specific coefficients to standard logit models [30–33].

In this context, the aim of the present study was to evaluate the effect of driver characteristics on traffic crashes of four different types of commercial vehicles (express buses, local buses, taxis, and trucks) and to compare outcomes across types. Toward these ends, we analyzed commercial vehicle driver records in South Korea using a mixed logit model that could address unobserved heterogeneity in crash-related data. This study is distinct in that long-term driver characteristic variables were used for the crash analysis. A total of 627,594 commercial vehicle driver records were processed, and the crash counts, traffic violations, and job changes accumulated since the driver license acquisition date were used in the model. Furthermore, many previous studies analyzed only one type of commercial vehicle and were therefore limited by site-specific effects; to control for these effects, we carried out a comparative analysis of four vehicle types that operate in the same environment to reveal the differences according to vehicle type. The results of this study can be used to establish traffic safety improvement policies tailored to the characteristics of each commercial vehicle type.

2. Data Description

Records on 627,594 commercial vehicle drivers without identifiable personal information were obtained from the Korea Transportation Safety Authority (KTSA). The dataset contains driver information including date of birth, gender, issue date of commercial driver’s license, violation records, previous job changes, license type (i.e., type of commercial vehicle), and crash records. In this study, only data for male drivers were used due to the insufficient sample size of female commercial vehicle drivers.

The dependent variable is binary: whether the driver was involved in a traffic accident during the most recent three-year period, from July 2014 to June 2017. Independent variables include driver’s age, driving experience (in years), violation rate (number of violations per 10 years), job change rate (number of job changes per year), and previous crash involvement rate (number of crashes per 10 years). Driver age was discretized into several levels for convenience of analysis, while the other variables remained continuous. Descriptive statistics of the variables are provided for each vehicle type in Tables 1–4.
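The variable definitions above can be illustrated with a minimal sketch. The function and field names are hypothetical, and two choices are assumptions rather than statements from the data description: exposure is measured from the license issue date to the start of the observation window (1 July 2014), and the age levels are 21–30, 31–40, 41–50, 51–60, and 61 and older.

```python
from datetime import date


def years_between(start: date, end: date) -> float:
    """Elapsed time in years, using an average year length of 365.25 days."""
    return (end - start).days / 365.25


def age_group(age: int) -> str:
    """Discretize age into levels (the level boundaries are an assumption)."""
    if age <= 30:
        return "21-30"
    if age <= 40:
        return "31-40"
    if age <= 50:
        return "41-50"
    if age <= 60:
        return "51-60"
    return "61+"


def driver_features(birth: date, license_issue: date, n_violations: int,
                    n_job_changes: int, n_prior_crashes: int,
                    ref_date: date = date(2014, 7, 1)) -> dict:
    """Build the explanatory variables for one driver record.

    ref_date is a hypothetical cut-off: counts are assumed to be accumulated
    from license acquisition up to the start of the observation window.
    """
    exposure = years_between(license_issue, ref_date)
    if exposure <= 0:
        raise ValueError("license must be issued before the reference date")
    return {
        "age_group": age_group(int(years_between(birth, ref_date))),
        "experience_years": exposure,
        "violation_rate": n_violations / exposure * 10.0,   # per 10 years
        "job_change_rate": n_job_changes / exposure,        # per year
        "crash_rate": n_prior_crashes / exposure * 10.0,    # per 10 years
    }


if __name__ == "__main__":
    # Made-up driver: licensed for ten years, 3 violations, 2 job changes, 1 crash.
    print(driver_features(date(1970, 1, 1), date(2004, 7, 1), 3, 2, 1))
```

Expressing the counts as rates over license tenure keeps drivers with different lengths of service comparable, which is presumably why the study uses rates rather than raw counts.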

Overall, the proportion of crash involvement was found to vary across types of commercial vehicles. Higher crash involvement was observed for local buses and taxis, presumably because they operate mainly in urban environments. The distribution patterns also differed. The proportion of recent crash involvement ranged from approximately 5% to 10% for trucks and from 15% to 30% for taxis, an almost threefold difference. This trend was similar for other variables. The proportion of crash involvement tended to increase with age; however, taxi and truck drivers showed the highest rate at ages 21–30, and for them there was no clear trend of increasing crash rate with age. The proportions of recent crash involvement of local bus, express bus, and taxi drivers were highest at 5–15 years of driving experience. For truck drivers, the rate was highest at 1–5 years. As the traffic violation rate increased, the proportion of recent crash involvement tended to increase. However, the levels with the highest proportions differed depending on the vehicle type: for local bus and express bus drivers, the proportion was highest at traffic violation rates of 0.15–0.2; for taxi drivers, at 0.1–0.15; and for truck drivers, at rates above 0.2. The higher the job change rate, the higher the proportion of recent crash involvement. A similar tendency was observed for the past crash rate.

3. Methods

In this study, a mixed logit model was used to identify significant factors that affect the crash involvement of commercial vehicle drivers in South Korea. The mixed logit model assumes that the effects of the parameters on the logit model are different for each individual [31]. Thus, the model can account for unobserved factors that are not captured directly through the data. To assess the explanatory power of the mixed logit model, we performed a log-likelihood ratio test against a logistic regression model that uses only fixed parameters. In the logistic regression model, the coefficient value $\beta$ is fixed. The probability that driver $n$ belongs to recent crash category $i$, where the two categories are involved and not involved, is shown in the following equation:

$$P_n(i) = \frac{\exp(\beta X_{ni})}{\sum_{j} \exp(\beta X_{nj})} \tag{1}$$

where $X_{ni}$ is the vector of explanatory variables for driver $n$ and category $i$, and the sum runs over the two categories $j$.

In the mixed logit model, it is assumed that the coefficient is not fixed to $\beta$ but is individual-specific, $\beta_n$, and the probability of crash involvement could be expressed as shown in the following equation:

$$P_n(i) = \frac{\exp(\beta_n X_{ni})}{\sum_{j} \exp(\beta_n X_{nj})} \tag{2}$$

In order to derive a consistent estimate through the above equation, it is necessary to use various data for each individual. That is, it is difficult to use the above equation with limited crash data. In the mixed logit model, it is assumed that $\beta_n$ is a random parameter that is estimated for each individual. Therefore, given $\beta_n$, the probability of having a crash experience for driver $n$ could be expressed by the following equation:

$$P_n(i \mid \beta_n) = \frac{\exp(\beta_n X_{ni})}{\sum_{j} \exp(\beta_n X_{nj})} \tag{3}$$

Since the above equation is a conditional probability, the unconditional probability is derived as equation (4) through integration using the probability density function of $\beta$:

$$P_n(i) = \int P_n(i \mid \beta)\, f(\beta \mid \varphi)\, d\beta \tag{4}$$

where $f(\beta \mid \varphi)$ is a probability density function for $\beta$ having a parameter of $\varphi$. The normal distribution is most widely used as the probability density function of random parameters, and a uniform distribution is appropriate for dummy variables [34]. In this study, the multidimensional integration of $\beta$ is necessary because the factors related to driver characteristics are reflected by several variables instead of one. It is difficult to calculate a multidimensional integral because of the complicated processes of numerical integration, such as the quadrature method, but this integral can be calculated by a simulation-based maximum likelihood method. In many works in the literature, Halton draws have proven to be the most efficient way of estimating coefficients [35–37]. We used the free statistical package R to implement the mixed logit model in this study. The risk factors include driver characteristics such as age, driving experience, traffic violation rate, job change rate, and previous crash rate. The response variable is binary: whether or not a driver had a crash experience in the period from July 2014 to June 2017.
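To make the simulation step concrete, the following is a minimal sketch of how the integral over the random parameters can be approximated with Halton draws. It is written in Python for illustration and is not the R implementation used in the study. It assumes independent, normally distributed random parameters and normalizes the utility of the "not involved" category to zero, so that the conditional probability reduces to a logistic function. All function names and numerical values are hypothetical.

```python
import math
from statistics import NormalDist

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]  # one Halton base per parameter


def halton(index: int, base: int) -> float:
    """Return element `index` (index >= 1) of the Halton sequence in `base`."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result


def simulated_crash_probability(x, means, sds, n_draws=200):
    """Approximate the unconditional crash probability by averaging the
    conditional logit probability over Halton-based draws of the coefficients.

    x     : explanatory variables for one driver
    means : coefficient means
    sds   : coefficient standard deviations (0 means a fixed parameter)
    """
    if len(x) > len(PRIMES):
        raise ValueError("add more prime bases for additional parameters")
    std_normal = NormalDist()
    total = 0.0
    for r in range(1, n_draws + 1):
        utility = 0.0
        for k, (xk, mu, sd) in enumerate(zip(x, means, sds)):
            if sd > 0:
                # Map the uniform Halton point to a standard normal draw.
                z = std_normal.inv_cdf(halton(r, PRIMES[k]))
                beta = mu + sd * z
            else:
                beta = mu
            utility += beta * xk
        total += 1.0 / (1.0 + math.exp(-utility))  # conditional probability
    return total / n_draws


if __name__ == "__main__":
    # Made-up driver with two covariates; the second coefficient is random.
    print(simulated_crash_probability([1.0, 0.3], [-2.0, 1.5], [0.0, 2.0]))
```

In estimation, this simulated probability would be evaluated for every driver and the resulting simulated log-likelihood maximized over the means and standard deviations. When all standard deviations are zero, the function reduces to the fixed-parameter logit probability.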

4. Results and Discussion

The mixed logit models estimated in this study converged successfully for all vehicle types, and the variables with random parameters showed statistically significant results; that is, the standard deviations of the heterogeneous variables were statistically significant. The estimated models differed significantly among commercial vehicle types: the variables exhibiting heterogeneity were different for each vehicle type, and the magnitudes of the statistically significant coefficients were also different. First, the log-likelihood ratio test was used to compare the explanatory power of the mixed logit model with that of the model using only fixed parameters. The test results are shown in Table 5.

In the results of the log-likelihood ratio test, it was found that all models in this study that applied random parameters were superior to models using fixed parameters. The express bus model was statistically significant at the level of 95% confidence and the other models were significant at the level of 99.9% confidence. This is because random parameters applied to each model were able to reflect driver heterogeneity, which could not be considered in the fixed-parameter model.
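The comparison described above can be sketched as follows. The test statistic is twice the difference between the log-likelihoods of the mixed logit model and the fixed-parameter model, and it is compared with a chi-square distribution whose degrees of freedom equal the number of added parameters (here, the standard deviations of the random parameters). The log-likelihood values in the example are hypothetical and are not taken from Table 5.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{j<df/2} (x/2)^j / j!
        term = math.exp(-half)
        total = term
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return total
    # Odd df: erfc term plus a finite sum over half-integer powers.
    total = math.erfc(math.sqrt(half))
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (j + 0.5)
    return total


def likelihood_ratio_test(ll_fixed: float, ll_mixed: float, df: int):
    """Return the LR statistic and its p-value for nested models."""
    statistic = 2.0 * (ll_mixed - ll_fixed)
    return statistic, chi2_sf(statistic, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods; two random parameters were added.
    stat, p_value = likelihood_ratio_test(ll_fixed=-1000.0, ll_mixed=-997.0, df=2)
    print(f"LR statistic = {stat:.2f}, p-value = {p_value:.4f}")
```

A p-value below 0.05 corresponds to significance at the 95% confidence level, and a p-value below 0.001 to the 99.9% level reported for the other models.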

The coefficient estimates obtained using the mixed logit model are provided in Table 6. Among the eight variables used in this study, the variables that showed statistical significance differed between commercial vehicle types. Driving experience, traffic violation rate, and crash rate were statistically significant at the 99% confidence level in all models, and job change rate was statistically significant for all vehicle types except local buses. Driver age was statistically significant at the 95% confidence level in all models for the age group of 61 and older. For the other age groups, statistical significance differed by vehicle type. Heterogeneous variables also had different levels of statistical significance depending on the vehicle type. The traffic violation rate exhibited heterogeneity across drivers for all vehicle types, and the driving experience variable exhibited heterogeneity for all vehicle types except express buses. The age group of 31 to 40 years exhibited heterogeneity for trucks and taxis, while the older driver group (51 to 60 years) and the job change rate exhibited heterogeneity only for taxis.

The coefficient of the oldest age group is positive for all vehicle types. This could be interpreted as showing that the oldest group of drivers had an increased risk of crash involvement compared to drivers in the reference group (21 to 30 years). This finding is consistent with the findings reported in previous studies [38–43]. Similarly, Valent et al. [39] reported that drivers aged 65 and above had a significantly increased risk of fatal injury for most kinds of transportation modes. As is well known, as people age, their cognitive and perceptual faculties deteriorate, which could increase crash risk [44–46]. Notably, local bus drivers in their 50s and 60s also had positive coefficients, while results for this age group with other vehicle types were not significant. Unlike other types of vehicles, local buses have tightly spaced schedules, which means that drivers have to speed up and cut other vehicles off in order to meet the intervals, leading to high labor intensity for the driver. Therefore, for local bus drivers, the age range for older driver management should be wider than for drivers of other vehicles.

On the other hand, truck and taxi drivers in their 30s and 40s exhibited heterogeneity. For heterogeneous variables, whose effects on crash involvement differ across drivers, the share of drivers for whom the coefficient is positive can be calculated when the standard deviation of the coefficient is statistically significant. For the traffic violation rate, which is heterogeneous for all vehicle types, the shares of drivers for whom a higher violation rate increases crash involvement were 69.89%, 93.67%, 75.53%, and 83.28% across the four vehicle types. Generally, traffic violations are a major risk to road safety, as confirmed by the results of many previous studies. The results of this study also show that, for up to 93.67% of drivers, depending on the vehicle type, a higher traffic violation rate increases crash risk when all other variables are held constant. Similarly, many researchers have found a positive relationship between traffic violations and crash occurrence [6, 47–50]. However, most of these studies have been conducted on general drivers; only a few have been conducted on commercial vehicle drivers [10]. For commercial vehicle drivers, heterogeneity in traffic violations occurred because the vehicle type and driving environment are different from those of general drivers. For local buses in particular, a higher violation rate is associated with reduced crash involvement for about 30% of drivers. This is because the violation rate used in this study counts all traffic violations, and not all traffic violations defined by national law are directly related to crashes. Violations found in the previous literature to increase crash risk include speeding and drunk driving; however, the count also includes items not directly related to crash involvement, such as parking violations or designated lane violations. Therefore, a more detailed analysis of traffic violations is required to identify crash-related violations of commercial vehicle drivers.
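The percentages discussed above follow from the estimated mean and standard deviation of a random parameter. A minimal sketch, assuming the coefficient is normally distributed across drivers: the share of drivers with a positive coefficient is then the standard normal distribution function evaluated at the ratio of mean to standard deviation. The numerical values below are hypothetical and are not taken from Table 6.

```python
from statistics import NormalDist


def share_positive(mean: float, sd: float) -> float:
    """Share of drivers whose coefficient is positive, assuming the
    coefficient is normally distributed with the given mean and sd."""
    if sd <= 0:
        raise ValueError("sd must be positive for a random parameter")
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(0.0)


if __name__ == "__main__":
    # Hypothetical estimates: a mean about half a standard deviation
    # above zero gives a share of roughly 70%.
    print(f"{share_positive(0.5244, 1.0):.2%}")
```

The larger the mean relative to the standard deviation, the closer the share is to 100%, which is the pattern reported for driving experience in the opposite (negative) direction.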

For driving experience, express buses had a fixed coefficient of −0.02, whereas local buses, trucks, and taxis exhibited heterogeneity. For the vehicle types with heterogeneity in driving experience, the shares of drivers for whom crash risk decreases as driving experience increases were as follows: 99.77% for local buses, 95.66% for trucks, and 99.99% for taxis. This indicates that most drivers with short driving experience are likely to have recently been involved in a crash, regardless of vehicle type. This is in line with the fact that accident rates tend to diminish with experience [9, 51, 52]. Cooper et al. [52] revealed that crash rates of novice drivers aged 16 to 55 decreased with increasing experience. McCartt et al. [9] used survival analysis to determine that the risk of a first crash during the first month of licensure was much higher than during any of the next 11 months. As has been previously found, driving experience has a positive effect on driving skills [53, 54].

For express bus and truck drivers, the relationship between job change rate and crashes was found to be statistically significant, with positive coefficients, meaning that a job change increases the risk of crash involvement. The job change rate is a variable that represents a comprehensive measure of overall driver behavior. Frequent job changes imply that the duration of driving experience at a job position is not sufficient for that driver to be fully adept at the job and that the driver’s overall skills and adaptation to the commercial vehicle industry may not be satisfactory. However, little research has been conducted to evaluate job change as a predictor of crash involvement of commercial vehicle drivers. Corsi and Fanara [55] reported that motor carriers with high driver turnover had significantly higher crash rates than those with lower turnover rates. Extending this research, Staplin and Gish [56] estimated the risk of crash involvement as a function of job change rate among truck drivers. However, these studies were based on univariate analysis, which might have caused confounding issues.

Meanwhile, taxi drivers exhibited heterogeneity in job change rate: for 46.9% of drivers a higher job change rate increased crash involvement, and for 53.1% it decreased crash involvement. In this context, the outcomes of this study provide solid evidence that the job change rate is a reliable crash predictor for express bus and truck drivers. For taxi and local bus drivers, it is not clear that a higher job change rate increases the risk of a crash; therefore, when job change rates are used for commercial vehicle driver safety management, they need to be applied to specific vehicle types.

In contrast, a driver’s previous safety performance, represented by the crash rate variable, has a statistically significant effect on crash involvement for all commercial vehicle drivers. This is consistent with previous findings [1, 57]. Since the coefficient of the crash rate was not heterogeneous for any vehicle type, intensive safety management is needed for drivers who have higher crash rates, regardless of vehicle type.

5. Conclusions

The present study provides the first report on crashes of commercial vehicle drivers operating four different types of vehicles in Korea. The information in 627,594 driver records obtained from the Korea Transportation Safety Authority was used in this study to evaluate the effects of driver-related factors on crash involvement of four types of commercial vehicles: local buses, express buses, taxis, and trucks. Then, the outcomes were compared across vehicle types. We selected a mixed logit model able to account for unobserved heterogeneity in the driver data and revealed the relationships between commercial vehicle driver characteristics and crash involvement. The dependent variable was the crash involvement of commercial vehicle drivers for the most recent three years, and five driver-related factors were specified as explanatory variables.

The log-likelihood ratio test showed that the mixed logit model derived in this study was superior to the logit model using fixed parameters. The estimated outcomes showed that driver-related factors have common effects on crash involvement: driving experience diminished crash involvement, while traffic violations, job change, and previous crash involvement increased it. However, the magnitude of the effects and the heterogeneity varied across different types of commercial vehicles. In the case of local buses, unlike other vehicle types, the job change rate was not statistically significant and the range of ages that increased the risk of crash involvement was wider than for other types. Moreover, the previous crash rate increased crash involvement for all vehicle types, and taxis, in particular, had a higher coefficient than other vehicle types. Therefore, the results of this study support the contention that the safety management policy for commercial drivers needs to be set differently according to the vehicle type. Furthermore, because all the variables used in this study were measurable, the expected crash involvement could be estimated from commercial vehicle driver records by vehicle type. By properly using the outcomes, it should, therefore, be possible to proactively identify groups of accident-prone commercial vehicle drivers and to implement effective measures such as education and, if necessary, enforcement.

This study has several limitations that stem from the data available. It did not reflect other variables that might affect crash involvement, such as the average daily driving time, which is related to labor intensity. In addition, the relationship between specific types of traffic violations and crash involvement needs to be studied further by vehicle type. If more information about commercial drivers becomes available in the future, more in-depth research to improve traffic safety will be possible.

Data Availability

The data used to support the findings of this study have not been made available because of KTSA policy.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This work was supported by an Incheon National University Research Grant in 2016. The authors would like to thank the staff of the Big Data Center at the Korea Transportation Safety Authority for their kind assistance in providing the data used in this study.