Abstract

High-severity crashes are one of the negative consequences of suburban transportation and arise from a range of factors. Their most important consequences include fatalities, injuries, and medical costs, as well as road and vehicle damage and psychological effects. The goal of this research is to identify the factors that contribute to different crash severity levels in order to reduce the likelihood of such crashes. The study is distinctive in that it compares the capabilities of several discrete choice methods to determine which performs best given the available database and research restrictions. Furthermore, a data fusion approach allows the study to draw on a wide range of characteristics that influence crash severity. To this end, three discrete choice models, namely ordered logit (OL), multinomial logit (MNL), and mixed logit (ML), were used to examine the factors influencing crash severity on suburban highways. The data come from crash records and traffic counters in Khorasan Razavi province in the northeast of Iran. A spatial-temporal analysis of the crash data with a data fusion approach was conducted to prepare a multisource data set with a wide spectrum of independent variables, so that the logit models yield reliable results. The independent variables cover geometric design, time-related factors, weather and environmental conditions, land use, traffic attributes, vehicle characteristics, and driver characteristics. ML provided the best fit to the available data set among the discrete choice techniques compared. In addition, in all three logit models, the coefficients of geometric design, vehicle characteristics, driver characteristics, land use, and weather and environmental conditions are significant, which demonstrates the value of multisource data in identifying the factors that affect crash severity.

1. Introduction

Road traffic crashes are one of the most serious health problems and rank as the ninth leading cause of death worldwide [1]. Nearly 1.35 million people are killed and 20 to 50 million people are injured in road traffic accidents each year around the world [2]. Over the past decade, travel demand has increased dramatically owing to population growth and urban infrastructure development [3]. Global road traffic deaths were expected to rise by more than 35 percent between 2010 and 2020 as a result of continued economic expansion and greater motorization [4]. Road traffic crashes will also become the sixth largest cause of death by 2030 if no action is taken [5, 6].

The burden of road traffic injuries varies greatly between different parts of the world. Approximately 90% of those killed in road crashes live in low- and middle-income nations [5]. Iran is in the Eastern Mediterranean region, which has the world’s second highest traffic-related fatality rate [7]. Furthermore, road traffic crashes are the second leading cause of death, the first leading cause of years of life lost due to premature mortality (YLL), the second leading cause of disability-adjusted life years (DALYs) after cardiovascular diseases, and the most common cause of injury in Iran [7, 8]. According to studies, road traffic crashes claim the lives of 13.5 percent of Iranians, a high percentage compared to the rest of the globe and the Eastern Mediterranean region [9, 10]. More than half of all road traffic fatalities occur among those aged 15 to 44, who are considered the community’s economically productive group [5, 11]. It is therefore critical to use accurate and valid techniques to investigate the factors that contribute to crashes.

Crash prevention requires a thorough understanding of the numerous factors that influence road crashes [12]. Meanwhile, recent improvements in intelligent transportation systems have made it possible to collect data in a more consistent and appropriate manner [13, 14]. The data used in this study were collected over a four-year period (2013–2016) from all roadways in Khorasan Razavi province in the northeast of Iran. Unlike the majority of the research considered in the literature review section, this study compares several prominent logit family models, namely ordered logit (OL), multinomial logit (MNL), and mixed logit (ML), in order to identify the one that performs best. The effectiveness of a comprehensive approach to identifying and analyzing crash-related data is another noteworthy aspect of this study. The data from crashes and traffic counters were integrated and classified into geometric design, time-related, weather and environmental conditions, land use, traffic attributes, vehicle characteristics, and driver characteristics variables, which were then used and evaluated in the logit family models. This diverse set of factors helps to generate reliable and accurate results. In summary, the study’s main contributions are the collection of diverse spatial-temporal variables, the creation of a crash data set, and the use of data fusion and statistical modelling tools on this data set to obtain dependable results and inform sound safety policies.

2. Literature Review

Previous studies have defined road safety indicators [15–20] and have examined influential factors affecting road safety indicators [19, 21, 22]. For example, Anastasopoulos and Mannering [23] consider the effect of pavement, traffic, and geometric characteristics on crash injury severity by using random-parameter logit and fixed-parameter logit models. The model comparison shows the statistical superiority of the random-parameter logit model compared to the fixed-parameter logit model. Also, models based on individual crash data provide a better overall fit relative to the models based on the proportions of crashes by severity type.

In the study by Azimi et al. [24], crash severity is explored specifically for large trucks. The independent variables consist of driver action, driver age, driver condition, driver distraction, vision obstruction, vehicle defect, the vehicle involved, road surface condition, road alignment, roadway grade, weather condition, and lighting condition. The results show that the impacts of lighting conditions and driving speed varied significantly across observations. This variation could be attributed to driver actions, driver condition, and vision obstruction.

Wu et al. [25] use roadway segment length, average daily traffic (ADT), average annual precipitation, average annual snowfall, percentage of trucks, number of interchanges, speed limit, friction number of pavement surface, number of horizontal curves, ADT of trucks, and number of grade breaks as independent variables. Penmesta et al. [26] used MNL to model and identify important road characteristics (or locations) based on crash injury severity, as well as to identify drivers who are more likely to contribute to crashes given a road feature.

The current study offers significant advantages over much of the previous research and can fill existing gaps. First, applying several models to the same data set in order to find the best-performing one is rare in prior studies. Most of these studies reported their findings using only one logit model, whereas we used three distinct logit models (OL, MNL, and ML). The second advantage that distinguishes this paper is its use of a wide variety of variables.

As previously stated, we look into the effects on crash severity of independent variables covering geometric design, time-related factors, weather and environmental conditions, land use, traffic attributes, vehicle characteristics, and driver characteristics. It is also worth noting that most previous studies exploring the key factors that contribute to crash severity have only looked at limited categories of independent variables [27–29]. As a result, the current study provides a more comprehensive result that can be used to improve the analysis.

Furthermore, in developing nations like Iran, crash severity analysis using a broad range of independent factors is a relatively new field. Such research could support developing countries in their transition to development, which makes this paper even more valuable. The following sections provide more information on the data and the study’s context.

3. Methodology

Crash severity models are designed to predict the likelihood that a crash will fall into one of the severity levels, given that the crash has occurred. The three crash severity models are described in this section: the ordered logit, multinomial logit, and mixed logit models.

3.1. Ordered Logit

One of the essential elements in building crash severity models is the ordered nature of crash data (for example, severity recorded as damage, injury, or fatal) [30]. In such models, the most common technique is to use a latent variable Z to capture the ordering of the data. This unobserved variable is specified for each observed crash as a linear function [31]:

$$Z = \beta x + \varepsilon,$$

where $x$ is the vector of variables that determines the discrete ordering of each recorded crash, $\beta$ is a vector of estimable parameters, and $\varepsilon$ is the error term. Indexing the three severity levels as $i = 0$ (damage), $1$ (injury), and $2$ (fatal), the observed ordered crash severity, $y$, is defined as follows [32]:

$$y = \begin{cases} 0 & \text{if } Z \le \mu_0, \\ 1 & \text{if } \mu_0 < Z \le \mu_1, \\ 2 & \text{if } Z > \mu_1, \end{cases}$$

where the $\mu$ are estimable threshold parameters that define $y$ and are estimated jointly with the model parameters $\beta$. If $\mu_0$ is set to 0, the probability function will be as follows [32]:

$$P(y = i) = \Lambda(\mu_i - \beta x) - \Lambda(\mu_{i-1} - \beta x),$$

where $\mu_i$ and $\mu_{i-1}$ are the upper and lower thresholds of severity level $i$, respectively (with $\mu_{-1} = -\infty$ and $\mu_2 = +\infty$), and $\Lambda(\cdot)$ is the cumulative distribution function of $\varepsilon$. Assuming a logistic distribution for the error term yields the ordered logit model.
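As an illustration of how the ordered logit probabilities follow from the latent variable and its thresholds, the following minimal Python sketch computes the probability of each severity level. It assumes three levels (damage, injury, fatal) and the first threshold fixed at 0; the second threshold and the value of the linear index are hypothetical and are not estimates from this study.

```python
import math
from typing import List, Sequence


def logistic_cdf(v: float) -> float:
    """Standard logistic CDF, 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))


def ordered_logit_probs(xb: float, thresholds: Sequence[float]) -> List[float]:
    """Return the probability of each ordered level for a linear index xb.

    thresholds must be increasing; each level lies between two consecutive
    thresholds, with -infinity and +infinity as the outer bounds.
    """
    # Cumulative probabilities P(Z <= mu) at each threshold, padded with 0 and 1.
    cdf = [0.0] + [logistic_cdf(mu - xb) for mu in thresholds] + [1.0]
    # Each level's probability is the difference of adjacent cumulative values.
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]


# Hypothetical values for illustration only (not estimates from this study).
thresholds = [0.0, 2.0]   # first threshold fixed at 0, second one assumed
probs = ordered_logit_probs(xb=0.5, thresholds=thresholds)
for level, p in zip(["damage", "injury", "fatal"], probs):
    print(f"{level}: {p:.3f}")
```

Raising the linear index shifts probability mass from the lowest level towards the highest one, which is why a single coefficient moves all severity levels in one direction in this model.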

The ordered logit technique, however, has two flaws. First, the under-reporting of crash severity data makes the model vulnerable to inconsistent and biased parameter estimates. Second, in many circumstances a negative coefficient on x does not give a correct picture of how the variable actually affects the data. A thorough examination of the data is therefore essential to obtain a better prediction.

3.2. Multinomial and Mixed Logit

Let the utility that individual n derives from alternative j in choice situation t be denoted as [33]

$$U_{njt} = \beta_n' x_{njt} + \varepsilon_{njt},$$

where $x_{njt}$ is a vector of observed attributes, $\beta_n$ is a vector of utility coefficients that vary randomly across individuals, and $\varepsilon_{njt}$ is a random term representing the unobserved component of utility. The vector $x_{njt}$ can include 0/1 terms to allow for alternative-specific constants and individual attribute levels, alongside continuous attributes [33].

The unobserved term $\varepsilon_{njt}$ is assumed to be iid extreme value. Under this assumption, the probability that individual n chooses alternative i in choice situation t, conditional on $\beta_n$, is given by the logit formula:

$$L_{nit}(\beta_n) = \frac{\exp(\beta_n' x_{nit})}{\sum_{j} \exp(\beta_n' x_{njt})}.$$

When the coefficients are fixed across individuals, this formula is the multinomial logit model.

The researcher does not observe the utility coefficients of each individual but knows that they vary across individuals. The cumulative distribution function of $\beta_n$ in the population is $F(\beta \mid \theta)$, which depends on parameters $\theta$. The distribution can be continuous or discrete, distinct elements of $\beta_n$ may follow different distributions (including some being fixed), and the elements can be correlated with one another [34].

Given the researcher’s information, the choice probability for the individual's sequence of choices with continuous F is

$$P_n(\theta) = \int \prod_{t} L_{n i_t t}(\beta)\, f(\beta \mid \theta)\, d\beta,$$

where $i_t$ denotes the alternative chosen in choice situation t and f is the density associated with F. The mixed logit probability is thus an integral of the multinomial logit probability over a density of the parameters [35]. If F is discrete, with probability mass $s_r$ at support point $\beta_r$, the mixed logit formula is as follows:

$$P_n(\theta) = \sum_{r} s_r \prod_{t} L_{n i_t t}(\beta_r).$$

McFadden and Train [36] demonstrated that any random utility choice model can be approximated to any degree of accuracy by a mixed logit with an appropriate distribution of parameters. Distributions considered for F include the normal, log-normal, and triangular, of which the normal is the most commonly used in mixed logit models. This result means that the mixed logit model places no theoretical constraints on the choice behaviour it can represent [37].
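Because the mixed logit integral has no closed form for a continuous F, it is usually approximated by simulation: the logit formula is evaluated at random draws of the coefficients and the results are averaged. The Python sketch below illustrates this idea for a single choice situation with independent normal coefficients. It is not the estimation procedure used in this study, and the function names, attribute values, means, and standard deviations are all hypothetical.

```python
import math
import random
from typing import List, Sequence


def logit_probs(beta: Sequence[float], x_alts: Sequence[Sequence[float]]) -> List[float]:
    """Multinomial logit probabilities of all alternatives in one choice situation."""
    v = [sum(b * x for b, x in zip(beta, x_j)) for x_j in x_alts]
    v_max = max(v)  # subtract the maximum for numerical stability
    expv = [math.exp(v_j - v_max) for v_j in v]
    total = sum(expv)
    return [e / total for e in expv]


def mixed_logit_prob(chosen: int, x_alts, mean, sd, n_draws: int = 2000, seed: int = 0) -> float:
    """Simulated mixed logit probability of the chosen alternative.

    Coefficients are drawn as independent normals (an assumption of this
    sketch); the logit probability is averaged over the draws.
    """
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_draws):
        beta = [rng.gauss(m, s) for m, s in zip(mean, sd)]
        acc += logit_probs(beta, x_alts)[chosen]
    return acc / n_draws


# Hypothetical inputs: three severity outcomes, two alternative-specific
# constants (injury, fatal) and one attribute with a random coefficient.
x_alts = [
    [0.0, 0.0, 0.0],  # damage (base outcome)
    [1.0, 0.0, 0.5],  # injury
    [0.0, 1.0, 1.0],  # fatal
]
mean = [-0.5, -1.5, 0.8]
sd = [0.0, 0.0, 0.6]  # only the third coefficient varies across observations
p_fatal = mixed_logit_prob(chosen=2, x_alts=x_alts, mean=mean, sd=sd)
print(f"simulated P(fatal) = {p_fatal:.3f}")
```

Setting every standard deviation to zero makes all draws equal to the mean, so the simulated probability collapses to the multinomial logit probability; this is the sense in which the multinomial logit is a special case of the mixed logit.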

4. Data

Khorasan Razavi province, located in the northeast of Iran, is one of the country's most populous provinces. The province is diverse in both road and weather characteristics, ranging from narrow roads in the relatively mountainous north and northeast to wide roads with a good level of service in the desert environment of the south. This road and climate diversity makes the province a good sample for crash severity models. This study analyzes crash and traffic data for four years, from 2013 to 2016. Table 1 shows the candidate independent variables in the data set.

The dependent variable explored in this study is crash severity, examined at three levels: damage, injury, and fatal. The total number of observed crashes in the data is 485. Table 1 shows the share of each crash severity. Also, Table 2 shows independent variables in the data set.

To integrate the crash information with traffic attributes recorded by traffic counters, such as traffic volume and average speed, circular buffers of different radii were drawn around each traffic counter. Crashes falling inside a buffer were then combined with the traffic information from the adjacent counter. This integration was carried out in ArcGIS software. Because the number of crashes is low, a larger buffer radius around each traffic counter captures more crashes. Finally, for the spatial analysis, the logit models were calibrated with a 2500 m radius. For the temporal features, the traffic data used in modelling are those recorded one hour before the occurrence of each crash. Figure 1 shows an output of ArcGIS software.
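The spatial and temporal matching described above can be sketched in plain Python. The study performed this step in ArcGIS, so the code below is only an illustrative stand-in. It assumes that each crash is assigned to its nearest traffic counter within the 2500 m radius and that "one hour before" means the counter records in the hour preceding the crash; the field names and sample records are hypothetical.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def fuse(crashes, counters, records, radius_m=2500.0):
    """Attach to each crash its nearest counter within radius_m and that
    counter's traffic records from the hour before the crash."""
    if not counters:
        return []
    fused = []
    for crash in crashes:
        # Nearest counter (assumption: one counter per crash).
        dist, counter = min(
            ((haversine_m(crash["lat"], crash["lon"], c["lat"], c["lon"]), c) for c in counters),
            key=lambda pair: pair[0],
        )
        if dist > radius_m:
            continue  # crash lies outside every buffer and is dropped
        start = crash["time"] - timedelta(hours=1)
        window = [
            r for r in records
            if r["counter_id"] == counter["id"] and start <= r["time"] < crash["time"]
        ]
        fused.append({**crash, "counter_id": counter["id"], "traffic": window})
    return fused


# Hypothetical records for illustration only (not data from this study).
counters = [{"id": "C1", "lat": 36.3000, "lon": 59.6000}]
crashes = [
    {"id": 1, "lat": 36.3100, "lon": 59.6000, "time": datetime(2015, 3, 1, 14, 30)},
    {"id": 2, "lat": 36.3500, "lon": 59.6000, "time": datetime(2015, 3, 2, 9, 0)},
]
records = [
    {"counter_id": "C1", "time": datetime(2015, 3, 1, 14, 0), "volume": 420, "avg_speed": 87.0},
    {"counter_id": "C1", "time": datetime(2015, 3, 1, 12, 0), "volume": 390, "avg_speed": 91.0},
]
fused = fuse(crashes, counters, records)
for row in fused:
    print(row["id"], row["counter_id"], len(row["traffic"]))
```

The radius parameter makes the trade-off mentioned in the text explicit: a larger value keeps more crashes in the fused data set, at the cost of pairing crashes with traffic conditions measured farther away.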

5. Results

In this study, crash severity is modelled by three logit family models: ordered, multinomial, and mixed logit. This section presents the results and their interpretation.

5.1. Ordered Logit Model

Table 3 shows the result of calibrating the ordered logit model for crash severity. Independent variables consist of traffic, geometric, human, vehicle, weather, land use, and time variables in this model.

The results show that minimum headway, a traffic attribute variable, is one of the variables that influence crash severity. As the minimum headway increases, crash severity decreases, because the driver has more time to decide and react before the crash. The nonseparated two-way road increases crash severity: on such roads, collisions can occur with higher intensity because there is no protection between the two directions of travel. Recreational land use also increases crash severity on suburban roads, since sudden changes in speed at recreational sites raise the probability of more severe collisions. Snowy weather, with a coefficient of −1.198, reduces crash severity because drivers drive more cautiously in snow. The negative coefficient for the negative slope turn indicates lower crash severity, as vehicles travel at low speeds on such turns. If the vehicle is a bus, truck, or trailer, crash severity is lower than for other vehicles, which the lower speed of these vehicles can explain. Agricultural land use is also associated with more severe crashes. The presence of agricultural machinery and greater road damage are the two main factors that increase crash severity in areas with this land use.
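To make the size of an ordered logit coefficient more concrete, it can be converted to an odds ratio. The short Python sketch below does this for the snowy weather coefficient of −1.198 reported above. It assumes a standard proportional-odds ordered logit in which snowy weather enters as a 0/1 indicator, so that the exponential of the coefficient multiplies the odds of a crash exceeding any given severity threshold.

```python
import math

beta_snow = -1.198                  # coefficient reported in the text
odds_ratio = math.exp(beta_snow)    # multiplicative change in the odds of higher severity
print(f"odds ratio: {odds_ratio:.3f}")
print(f"change in odds: {100 * (odds_ratio - 1):.1f}%")
```

Under these assumptions, the odds of a more severe outcome in snowy weather are roughly 0.30 times the odds in other weather, holding the remaining variables constant.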

5.2. Multinomial Logit Model

Table 4 shows the results of calibrating the multinomial logit model for the severity of crashes.

The variable “narrow width of the passage” increases the severity of crashes. Office-commercial land use has a significant effect on the occurrence of damage crashes. On suburban roads, places with office-commercial land use have heavy and crowded traffic, which can contribute to low-severity (damage) crashes. “Speed limit of 40 km/h for vehicles” is one of the traffic attribute variables that affect the occurrence of damage crashes. “A malfunction of the vehicle’s braking system” has a positive effect on the occurrence of damage crashes. It can be expected that drivers are aware of the defect in the brake system and, as a result, drive with caution. “Driver operating under the age of eighteen” is involved in the occurrence of damage crashes. Despite the greater excitability of people at this age, the fear of possible crashes and of the consequences of driving without a license overcomes that excitement to some extent, causing them to drive more cautiously and leading to less severe (damage) crashes. “Nonseparated two-way road” as a geometric design variable is significant in both damage and fatal crashes. “Driver’s unfamiliarity with the road” is a significant factor in both damage and fatal crashes.

In most cases, unfamiliar drivers drive cautiously, which makes crashes less severe. In the cases where these drivers drive recklessly, their unfamiliarity with the road leads to high-severity crashes that end in fatalities. “Rainy weather” has a direct effect on both injuries and fatalities. “The slippery road surface after rain” directly contributes to the occurrence of injuries and fatalities (higher severity). “Residential land use” is effective in preventing the occurrence of injuries. In suburban areas with residential land use, such as residential towns, vehicle speed is generally lower and drivers drive more cautiously, which effectively reduces injuries. “Snowy weather” has a positive effect on the occurrence of injuries due to limited visibility and slippery road surfaces. “Negative slope turn” has a positive impact on injuries, since a negative slope makes it more difficult to control the vehicle. Both buses and vans are expected to be among the heavy vehicles involved in severe-injury crashes. Owing to the larger dimensions and weight of these two vehicle types, their collisions cause crashes with severe injuries. The same interpretation applies to trailers for fatal crashes. The “male driver” positively affects injuries, for two reasons: first, male drivers are more likely to drive on suburban roads than female drivers; second, male drivers drive more recklessly than female drivers. “The occurrence time of the accident in the afternoon, 2 to 5 pm”, has a negative effect on the occurrence of fatal crashes. Adequate light and heavier traffic in the afternoon reduce crash severity. “Agricultural land use” contributes to the occurrence of fatal crashes for two important reasons: the presence of agricultural machinery (with safety deficiencies such as inadequate lighting) and greater road damage in areas with this land use. The “speed range of 80–100 km/h” has a negative effect on fatal crashes. One plausible reason is that this range does not exceed the average permitted speed (100 km/h) on suburban roads, where the limit is 90 km/h on two-way roads and 110 km/h on freeways. The “motorcycle” is a contributing factor in both damage and fatal crashes. In Iran, many motorcycles lack safety equipment and are ridden at unauthorized speeds and in the middle sections of the route.

5.3. Mixed Logit Model

Table 5 shows the output of the mixed logit model of crash severity, which includes geometric design, time-related, weather and environmental conditions, land use, traffic attributes, vehicle characteristics, and driver characteristics variables, examined at three levels of severity: damage, injury, and fatal.

As in the multinomial logit model, the variable “path width” positively affects damage crashes. “Road shoulder” is one of the geometric design variables that plays a role in the occurrence of damage crashes. The presence of a shoulder is in itself an influential factor in decreasing crash severity or avoiding crashes altogether. Still, when the shoulder is made of soil, it leads to crashes with severe damage. As in the multinomial logit model, office-commercial land use significantly affects the occurrence of damage crashes, and “40 km/h speed limit for vehicles” is one of the traffic variables that affect the occurrence of damage crashes. “Vehicle brake system malfunction” contributes to the occurrence of damage crashes. The “underage driver (under the age of eighteen)” is involved in damage crashes. The “nonseparated two-way road” as a geometric design variable is significant in damage and fatal crashes, similar to the multinomial logit model. The “driver’s unfamiliarity with the road” is an influential factor in both damage and fatal crashes, and “rainy weather” directly affects both injuries and fatalities. The “motorcycle” is one of the influential variables in the occurrence of crashes on suburban roads. The variable “continuous technical defect in the vehicle” contributes to both injury and fatal crashes. The driver is often aware of such defects, for example broken lights and tire defects, but ignores them, which leads to more severe crashes. The “residential land use” is effective in preventing injuries. As in the multinomial logit model, “snowy weather” positively affects the occurrence of injuries due to limited visibility and slippery road surfaces. The “bus” and the “truck” cause crashes with severe injuries. The “time of the accident in the afternoon, from 2 to 5 pm” decreases the probability of fatal crashes, and “agricultural land use” contributes to fatal crashes. As in the multinomial logit model, the “speed range of 80–100 km/h” has a negative effect on fatal crashes.

6. Conclusion

High crash severity is one of the negative consequences of suburban transportation. Fatalities, injuries and hospital expenditures, road and vehicle damage, and psychological repercussions are all part of the cost of serious crashes. This study used the logit model family to identify the factors that influence the severity of crashes on suburban roads in Khorasan Razavi province in the northeast of Iran. Crash severity is classified into three categories: damage, injury, and fatal. We had the opportunity to analyze one of the most comprehensive databases, built by integrating crash data with data obtained from traffic counters over a four-year period, with geometric design, time-related factors, weather and environmental conditions, land use, traffic attributes, vehicle characteristics, and driver characteristics as independent variables. The following conclusions are drawn from interpreting our data using three logit models (OL, MNL, and ML) and comparing their results:

(1) The ordered logit model has a ρ² of 0.035, the multinomial logit model has a ρ² of 0.129, and the mixed logit model has a ρ² of 0.135. In terms of ρ², the fit improves from the ordered logit model to the multinomial logit model and then to the mixed logit model.

(2) The geometric design, vehicle characteristics, driver characteristics, land use, and weather and environmental condition variables are the most likely to be crucial for crash severity, because these factors were significant in all of the logit models examined in this study. As a result, developing countries such as Iran should pursue methods that place greater emphasis on these influential factors in order to mitigate the severity of crashes.

(3) Based on the findings of this study, the following policies can be suggested as effective in reducing the severity of crashes in Khorasan Razavi province. Geometric design features such as “road shoulders” and “nonseparated two-way roads” can be improved. Educating underage members of society can help to promote good behavior and prevent many of the human-related factors from occurring in the future. Efficient traffic management strategies should be planned for adverse weather conditions. To reduce vehicle-related factors, the traffic police can be stricter about vehicle defects. It is also proposed that traffic attributes such as “speed limit” be better controlled by increasing the number of traffic cameras and police officers.
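The goodness-of-fit comparison in conclusion (1) can be illustrated with a short Python sketch. It assumes that the reported measure is McFadden's likelihood ratio index, defined as one minus the ratio of the fitted log-likelihood to the log-likelihood of a reference model without explanatory variables. The log-likelihood values below are hypothetical and are not taken from this study.

```python
def mcfadden_rho2(ll_model: float, ll_null: float) -> float:
    """Likelihood ratio index: 1 - LL(model) / LL(null)."""
    if ll_null >= 0:
        raise ValueError("null log-likelihood must be negative")
    return 1.0 - ll_model / ll_null


# Hypothetical log-likelihoods for illustration only.
ll_null = -500.0
ll_model = -450.0
rho2 = mcfadden_rho2(ll_model, ll_null)
print(f"rho-squared = {rho2:.3f}")
```

A higher value means that the fitted model improves more on the reference model, which is the basis for ranking the three logit models by this index.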

The conclusions of this study could be extended in a number of future research directions. First, to give a quantitative comparison across diverse contexts, it would be useful to assess the model's geographic transferability against models from developed countries [38]. It is also worthwhile to consider temporal transferability, to see whether the findings are consistent over time or change from year to year. Furthermore, more research can be done to identify why crash data are underrepresented in the data collection; one option is to expand the data set's geographical coverage. Finding methods to increase the number of crash records, such as developing upsampling algorithms to simulate missing observations, could also be valuable [39]. Finally, it would be advantageous to evaluate the contribution of the proposed transportation policies if they are implemented in Khorasan Razavi province in the future.

Data Availability

Access to data is restricted due to third-party rights.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors gratefully acknowledge the Road Maintenance and Transportation Organization (RMTO) for providing crash data.