Mathematical Problems in Engineering

Special Issue: Green Transportation System and Safety

Research Article | Open Access

Volume 2014 | Article ID 832723

Jian-feng Xi, Hai-zhu Liu, Wei Cheng, Zhong-hao Zhao, Tong-qiang Ding, "The Model of Severity Prediction of Traffic Crash on the Curve", Mathematical Problems in Engineering, vol. 2014, Article ID 832723, 5 pages, 2014.

The Model of Severity Prediction of Traffic Crash on the Curve

Academic Editor: Wuhong Wang
Received 05 Sep 2013
Revised 20 Nov 2013
Accepted 29 Nov 2013
Published 09 Jan 2014


Focusing on traffic crashes on curved road segments, this study establishes a logistic regression model for predicting crash severity on curves, based on a sample crash database of 20000 entries collected from 4 regions of China and 15 evaluation indicators involving driver, driving environment, and traffic environment factors. Maximum Likelihood Estimation and backward stepwise elimination were used for data analysis, which concluded that the three main contributory factors to crash severity on curves are weather, roadside protection facilities, and pavement structure. Hosmer and Lemeshow tests were used to verify the reliability of the model, and the model variables are also discussed.

1. Introduction

Curved road segments are a major component of road geometric design and, because of their alignment characteristics, are the most crash-prone of all road geometric elements. In 2010, crashes on curved segments accounted for 10.5% of all traffic crashes in China and for 12.89% of all traffic deaths, as shown in Figure 1.

Much research has been conducted on the characteristics and causes of crashes on curved segments. It ranges from studies based on macroeconomic statistics, which compare the crash rates of straight and curved segments [1, 2], through studies based on vehicle dynamics, which focus on improving vehicle safety and performance on curves [3, 4], to studies of human factors, which concentrate on drivers and driver behavior [5]. Vehicle speed and geometric design are also topics of study where crashes on curves are concerned [6, 7]. Regarding crash severity, several mathematical models have been established, such as the Bayesian Method [8], the Ordered Probit Model [10], and the Neural Network Approach [9]. However, few of these studies aimed specifically at investigating crash severity on curved road segments.

Analysis and prediction models are based on crash data. In the past, the data used in crash studies on curved roads were mostly gathered from specific segments, and crash samples and crash-related factors were not duly considered. As a result, crash analyses often failed to explain the real causes of crashes, and the resulting models were often questionable in terms of reliability and adaptability.

In this study, the crash data consist of 500 samples chosen at random, according to the data gathering and management standards of China’s “National Road Traffic Crash Information System,” from a crash database of 20000 valid entries covering 4 regions of China from 2004 to 2008. The database comprises 60 items of crash information, which is comprehensive enough to recreate the process of a crash and provides an important basis for the analysis of causes. Crash data are the foundation of crash causation analysis, and the structure and form of the crash data determine the model used for that analysis. Since the causes of a crash cannot be attributed to a single point of interpretation [11], and various internal and external factors affect drivers’ reactions, this paper analyzes the severity of crashes on curves comprehensively from three aspects: driver, vehicle type, and road environment.

In recent years, the logistic regression model has been widely applied to practical problems in fields such as banking [12], genomics [13], and psychopathology [14]. Many researchers have also used it to analyze road safety, for example, single-vehicle motorcycle crashes [15] and crash prevention [16, 17]. Descriptive factors are coded as natural numbers starting from 0 and can thus be transformed into discrete variables. With this transformation, traffic crash data meet the requirements of the logistic regression model, so a severity prediction model for traffic crashes on curved segments can be built from these data. The model [18] overcomes the deficiencies of the traditional Mantel-Haenszel analysis method and of linear regression analysis, because it can accommodate multiple influencing factors, both discrete and continuous. It can effectively analyze mixed influences and interactions among external variables and therefore provides a methodological basis for describing quantitatively the relationship between multiple influencing factors and crash severity on curves.

2. Establishing the Logistic Regression Model

2.1. Binary Logistic Regression Probability Formula

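The severity of a traffic crash on a curved segment is regarded as the dependent variable $y_i$: when the $i$th crash involves any number of deaths, $y_i = 1$; otherwise, $y_i = 0$.

Assuming there are $n$ influencing factors that are related to the dependent variable $y$ and are denoted $x_1, x_2, \ldots, x_n$, the probability of a fatal crash under the impact of the $n$ influencing factors is

$$P(y = 1 \mid x_1, \ldots, x_n) = \frac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n)}{1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n)},$$

where $x_1, x_2, \ldots, x_n$ are the influencing factors of crash severity, each of which can be a continuous, categorical, or dummy variable, and $\beta_0, \beta_1, \ldots, \beta_n$ are the regression coefficients.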
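As a small illustration of the probability formula in this subsection, the following Python sketch evaluates the binary logistic probability for a single crash record. The function name and the coefficient values are hypothetical and chosen only for the example; they are not estimates from this study.

```python
import math

def fatal_crash_probability(beta0, betas, xs):
    """Binary logistic probability P(y = 1 | x) for one crash record."""
    if len(betas) != len(xs):
        raise ValueError("betas and xs must have the same length")
    # Linear predictor: constant plus the weighted influencing factors
    z = beta0 + sum(b * x for b, x in zip(betas, xs))
    # Numerically stable evaluation of exp(z) / (1 + exp(z))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical coefficients and a record with two binary factors
p = fatal_crash_probability(-1.0, [0.8, 0.5], [1, 0])
print(round(p, 4))
```

Because the result is always strictly between 0 and 1, it can be read directly as the predicted probability that a crash with the given factor values is fatal.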

2.2. Binary Logistic Regression Model Parameter Estimation

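Considering the severity of the crash to be a dichotomous dependent variable $y$, $x_1, x_2, \ldots, x_n$ are its corresponding independent variables. Let $\beta = (\beta_0, \beta_1, \ldots, \beta_n)^T$, and let $\mathbf{x}_i = (1, x_{i1}, \ldots, x_{in})^T$ be the corresponding variable vector of the $i$th crash, $i = 1, 2, \ldots, m$. Thus, a logistic model may be established as [15]

$$\ln \frac{p_i}{1 - p_i} = \mathbf{x}_i^T \beta,$$

in which $p_i = P(y_i = 1 \mid \mathbf{x}_i)$.

The model uses the Maximum Likelihood Estimation method for parameter estimation. The likelihood function of $\beta$ can easily be derived from the binary logistic regression model, which is

$$L(\beta) = \prod_{i=1}^{m} p_i^{y_i} (1 - p_i)^{1 - y_i},$$

where $p_i = \exp(\mathbf{x}_i^T \beta) / (1 + \exp(\mathbf{x}_i^T \beta))$. Therefore, the logarithmic likelihood function is

$$\ln L(\beta) = \sum_{i=1}^{m} \left[ y_i \mathbf{x}_i^T \beta - \ln\left(1 + \exp(\mathbf{x}_i^T \beta)\right) \right].$$

For $\partial \ln L(\beta) / \partial \beta = 0$, it resulted in

$$\sum_{i=1}^{m} (y_i - p_i)\, \mathbf{x}_i = 0.$$

In the logistic regression model, the Newton-Raphson method is used. Define $X = (\mathbf{x}_1, \ldots, \mathbf{x}_m)^T$, $Y = (y_1, \ldots, y_m)^T$, and let $p = (p_1, \ldots, p_m)^T$, $W = \mathrm{diag}(p_1(1 - p_1), \ldots, p_m(1 - p_m))$; then

$$\beta^{(t+1)} = \beta^{(t)} + (X^T W X)^{-1} X^T (Y - p),$$

in which $p$ and $W$ are evaluated at $\beta^{(t)}$.

If the number of iterations is $t$, then the maximum likelihood estimate of $\beta$ becomes

$$\hat{\beta} = \beta^{(t)}.$$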
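The Newton-Raphson estimation described in this subsection can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the function names, the convergence tolerance, the iteration cap, and the toy data are all hypothetical, and each row of the design matrix is assumed to start with 1 for the constant term.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_logistic(X, y, max_iter=25, tol=1e-8):
    """Newton-Raphson maximum likelihood estimate of the coefficients."""
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, row))))
             for row in X]
        # Score vector X^T (y - p)
        grad = [sum(row[j] * (yi - pi) for row, yi, pi in zip(X, y, p))
                for j in range(d)]
        # Information matrix X^T W X with W = diag(p_i (1 - p_i))
        info = [[sum(row[j] * row[k] * pi * (1.0 - pi) for row, pi in zip(X, p))
                 for k in range(d)] for j in range(d)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Toy data (hypothetical): one binary factor, 1 of 4 fatal when x = 0, 3 of 4 when x = 1
X = [[1, 0]] * 4 + [[1, 1]] * 4
y = [0, 0, 0, 1, 0, 1, 1, 1]
beta_hat = fit_logistic(X, y)
print(beta_hat)
```

For this toy data the estimate can be checked by hand: with a single binary factor the fitted probabilities equal the observed proportions, so the constant is the log-odds of 1/4 and the factor coefficient is the difference between the log-odds of 3/4 and 1/4.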

3. Selection of Crash Impact Factor

A detailed analysis of the structure of the crash database shows that it is a superdimensional structure with radial, multidimensional, and multilevel characteristics. Each crash record contains multiple data attributes, and each value reflects one aspect of a traffic crash’s characteristics. The data attributes are organized under five factors: basic crash information, personnel information, vehicle information, road information, and environmental information. Logically, the five factors and the data attributes therefore form a two-level structure, with the five factors in the upper layer and the data attributes in the lower layer. In addition, the attributes and their values form sets that follow specific logical relations, and a specific hierarchy exists within each attribute.

500 crash samples, randomly selected from a crash information database of 20000 entries, have been analyzed. The causes of crash are mostly related to bad subjective judgment and human errors, vehicle performance issues, change of external environment, and change of road conditions. However, vehicle performance is generally not considered in crash analysis and is neglected in this study.
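The random selection of 500 samples from the 20000 database entries could be reproduced along the following lines. This is only a sketch: the source does not state how the draw was performed, so simple random sampling without replacement, the function name, and the fixed seed are all assumptions made for illustration.

```python
import random

def draw_sample(n_entries=20000, sample_size=500, seed=2014):
    """Draw distinct entry indices uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the draw is repeatable
    return sorted(rng.sample(range(n_entries), sample_size))

indices = draw_sample()
print(len(indices), indices[:5])
```

Fixing the seed makes the selected subset reproducible, which is useful when the same 500 records have to be re-examined later.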

Therefore, using an analysis system with three categories (driver, driving environment, and road environment), the model selects 15 evaluation items as independent variables. The evaluation of the independent variables is shown in Table 1.

Table 1: Evaluation of independent variables.

| Category | Variable content | Variable assignment |
|---|---|---|
| Driver | Gender | Female = 0, male = 1 |
| Driver | Age classification | Age from 16 to 25 = 0, 26 to 35 = 1, 36 to 45 = 2, above 46 = 3 |
| Driver | Household register | Agricultural household = 0, nonagricultural household = 1 |
| Driver | Driving experience classification | 1–5 years = 0, 6–10 years = 1, 11–15 years = 2, 16–20 years = 3, above 20 years = 4 |
| Driver | Accident responsibility | Minor responsibility = 0, equal responsibility = 1, main responsibility = 2, all responsibility = 3 |
| Driving environment | Weather | Sunny = 0, not sunny (rain, snow, fog, cloudy, wind) = 1 |
| Driving environment | Terrain | Plain = 0, others = 1 |
| Driving environment | Visibility | Above 200 m = 0, 100–200 m = 1, 50–100 m = 2, below 50 m = 3 |
| Driving environment | Road surface condition | Dry = 0, not dry (rain, ice) = 1 |
| Driving environment | Lighting condition | Day = 0, night with light = 1, night without light = 2 |
| Road environment | Traffic signal type | With signal control = 0, without signal control = 1 |
| Road environment | Roadside protection facility type | With roadside protection = 0, without roadside protection = 1 |
| Road environment | Road physical isolation | With isolation = 0, without isolation = 1 |
| Road environment | Pavement structure | Asphalt pavement = 0, not asphalt pavement = 1 |
| Road environment | Road conditions | Road surface in good condition = 0, pavement damage = 1 |
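To show how the assignments in Table 1 turn a raw crash record into model inputs, here is a small Python sketch covering a few of the variables. The field names of the raw record are hypothetical, and the handling of boundary values that the table leaves open (age 46 and over, visibility of exactly 100 m or 200 m) is an assumption.

```python
def age_class(age):
    """Age classes of Table 1; ages of 46 and over are assumed to fall in class 3."""
    if age < 16:
        raise ValueError("Table 1 defines no class below age 16")
    if age <= 25:
        return 0
    if age <= 35:
        return 1
    if age <= 45:
        return 2
    return 3

def visibility_class(metres):
    """Visibility classes of Table 1; 200 m and 100 m are assumed to fall in class 1."""
    if metres > 200:
        return 0
    if metres >= 100:
        return 1
    if metres >= 50:
        return 2
    return 3

def encode_record(record):
    """Map a raw record (hypothetical field names) to the coded variables."""
    return {
        "gender": 1 if record["gender"] == "male" else 0,
        "age": age_class(record["age"]),
        "weather": 0 if record["weather"] == "sunny" else 1,
        "visibility": visibility_class(record["visibility_m"]),
        "roadside_protection": 0 if record["has_roadside_protection"] else 1,
        "pavement": 0 if record["pavement"] == "asphalt" else 1,
    }

sample = {
    "gender": "male", "age": 30, "weather": "rain", "visibility_m": 150,
    "has_roadside_protection": False, "pavement": "asphalt",
}
encoded = encode_record(sample)
print(encoded)
```

Coding every factor as a small non-negative integer in this way is what allows the descriptive crash attributes to be used as discrete variables in the logistic regression.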

4. Prediction Model of Traffic Crash Severity on Curved Segments

The model used stepwise regression with backward elimination to analyze the independent variables. In step 1, all 15 independent variables were entered into the model, and variables were then removed step by step based on the probability of the likelihood-ratio test. In step 13, weather, roadside protective facility type, road pavement, and the constant remained in the model. Results are shown in Table 2.
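The backward elimination procedure described above can be sketched generically in Python. This is an illustration under assumptions rather than the procedure of the statistical package used in the study: `log_likelihood` is a hypothetical caller-supplied function that fits the model on a subset of variables and returns its maximized log-likelihood, each variable is assumed to contribute one parameter, and the removal threshold `alpha` is taken to be the 0.05 level mentioned in the text.

```python
import math

def lr_p_value(ll_full, ll_reduced):
    """p-value of the likelihood-ratio test for dropping one parameter (df = 1)."""
    stat = max(0.0, 2.0 * (ll_full - ll_reduced))
    # Survival function of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(stat / 2.0))

def backward_eliminate(variables, log_likelihood, alpha=0.05):
    """Remove, one per step, the variable whose removal is least significant."""
    current = list(variables)
    history = [tuple(current)]  # history[0] is step 1, the full model
    while current:
        ll_full = log_likelihood(tuple(current))
        candidates = []
        for v in current:
            reduced = tuple(u for u in current if u != v)
            candidates.append((lr_p_value(ll_full, log_likelihood(reduced)), v))
        p_max, weakest = max(candidates)
        if p_max <= alpha:
            break  # every remaining variable is significant
        current.remove(weakest)
        history.append(tuple(current))
    return current, history

# Toy log-likelihood (hypothetical): each variable adds a fixed gain
gains = {"a": 5.0, "b": 0.1, "c": 3.0}

def toy_log_likelihood(subset):
    return -100.0 + sum(gains[v] for v in subset)

kept, steps = backward_eliminate(["a", "b", "c"], toy_log_likelihood)
print(kept, len(steps))
```

Under this one-removal-per-step scheme, starting from 15 variables at step 1 leaves 3 variables at step 13, which is consistent with the three factors reported for the final model.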

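Table 2: Logistic regression results.

| Variable | B | S.E. | Wald | df | Sig. | Exp(B) |
|---|---|---|---|---|---|---|
| Roadside protective equipment | −1.321 | 0.853 | 2.398 | 1 | 0.122 | 0.267 |
| Road pavement | 1.46 | 0.531 | 7.546 | 1 | 0.006 | 4.304 |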
Taking a significance level of 0.05 and applying backward stepwise selection over 13 screening steps, the model identified weather, roadside protective facility type, and road pavement structure as the factors correlated with crash severity on curves. In the Hosmer and Lemeshow test (Table 3), the significance level of 0.674 is greater than 0.05, so the null hypothesis that the model fits the data cannot be rejected. In addition, the chi-square value fell from 7.856 at the first screening step to 0.79 at the 13th, which supports the adequacy of the final model.

Table 3: Hosmer and Lemeshow test (columns: step, chi-square, degrees of freedom, significance level).
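For readers unfamiliar with the Hosmer and Lemeshow test, the following Python sketch computes its statistic by sorting the records by predicted probability, splitting them into groups, and comparing observed and expected numbers of fatal crashes in each group. The grouping rule, the function names, and the toy data are assumptions; the p-value helper covers only even degrees of freedom, for which the chi-square tail has a simple closed form.

```python
import math

def hosmer_lemeshow(probs, outcomes, groups=10):
    """Return (statistic, degrees of freedom) of the Hosmer-Lemeshow test."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        size = len(chunk)
        mean_p = expected / size
        if 0.0 < mean_p < 1.0:
            stat += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return stat, groups - 2

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df."""
    if df <= 0 or df % 2:
        raise ValueError("this helper supports positive even df only")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Toy data (hypothetical): 8 records in 4 groups of 2
probs = [0.25] * 4 + [0.75] * 4
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
stat, df = hosmer_lemeshow(probs, outcomes, groups=4)
print(stat, df, chi2_sf_even_df(stat, df))
```

As a consistency check on the reported figures, if the final test had 2 degrees of freedom (an assumption, since the value is not given in the text), `chi2_sf_even_df(0.79, 2)` evaluates to about 0.674, which matches the significance level quoted for the 13th step.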


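The probability of a fatal crash from the logistic regression is expressed by

$$P = \frac{\exp(\beta_0 + \beta_w x_w - 1.321\, x_r + 1.460\, x_p)}{1 + \exp(\beta_0 + \beta_w x_w - 1.321\, x_r + 1.460\, x_p)},$$

where $x_w$, $x_r$, and $x_p$ denote weather, roadside protection facility type, and pavement structure as coded in Table 1, and $\beta_0$ and $\beta_w$ are the constant and the weather coefficient.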

5. Conclusion

(1) Not all of the 15 impact factors were selected for the curve crash severity prediction model. Some were excluded partly because of weak correlations, but this does not mean that they have little impact on the severity of a crash. For example, visibility and road surface condition are closely related to weather, so their effects may already be reflected indirectly in the final model.

(2) In bad weather, the road surface friction coefficient decreases and visibility is reduced, so vehicles on curved segments are prone to crossing the median, causing sideswipe, rollover, or rear-end crashes. In rainy or foggy weather, a water film forms on the road surface and reduces tire friction. Weather is therefore one of the main factors affecting crash severity on curved segments, and the severity is greater still under ice and snow conditions.

(3) Highway safety design often focuses on road alignment and profile but gives too little consideration to roadside facilities. In practice, however, the roadside environment is closely related to traffic crashes. Run-off-road crashes occur frequently on curves, mainly because of improper speed control and the roadside environment, especially on mountainous curves where one side of the road is a deep ravine or a cliff. Roadside protection facilities on curved segments can greatly reduce the severity of a crash. Hence, better design and installation of roadside protection facilities should be one of the priorities in preventing severe crashes.

(4) Compared with gravel pavement, asphalt roads generally have higher crash rates. Compared with cement concrete pavement, asphalt roads have a smoother surface and allow greater involuntary lateral movement of vehicles. Drivers therefore usually feel more comfortable on an asphalt surface, which often leads to high-speed driving and speed-related crashes. Asphalt pavement is also more sensitive to temperature: its structural strength decreases at high temperatures, and it cracks at low temperatures. Drivers often neglect the effect of temperature changes on the condition of the road surface and on safe driving, which is another important factor in severe crashes.

(5) The existing road traffic crash database of China was originally designed to determine responsibility for crashes for legal purposes. Although its 60 data items or attributes are very comprehensive, the database still needs improvement in order to describe the whole process of an actual traffic crash better and to serve as the basis for crash prediction and prevention studies.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgment

This project was supported by the NSFC (Grant no. 50808093).


References

  1. T. Miller, D. Lestina, M. Galbraith, T. Schlax, P. Mabery, and R. Deering, “United States passenger-vehicle crashes by crash geometry: direct costs and other losses,” Accident Analysis and Prevention, vol. 29, no. 3, pp. 343–352, 1997.
  2. C. Zhang and J. N. Ivan, “Effects of geometric characteristics on head-on crash incidence on two-lane roads in Connecticut,” Journal of the Transportation Research Board, vol. 1908, pp. 159–164, 2006.
  3. S. K. Rao, “Unit curve for design of highway transitions,” Journal of Transportation Engineering, vol. 121, no. 2, pp. 169–175, 1995.
  4. J. B. Beniley and W. E. Gallagher, “Review of highway curve design,” Journal Highway, vol. 18, no. 11, pp. 7–16, 2005.
  5. S. Schick, “Accident related factors,” Trace Coordinator V3, 2009.
  6. R. Yuanyuan, Research on Driving Dangerous Area and Driving Behavior Model in Road Curved Section, Traffic Environment and Safety Technology, Jilin University, Changchun, China, 2011.
  7. W. Chaoshen, Study on Road Traffic Safety Analysis and Countermeasures in Curve, Transportation Planning and Management, Chang'an University, Xi'an, China, 2010.
  8. J. Ma, K. M. Kockelman, and P. Damien, “A multivariate Poisson-lognormal regression model for prediction of crash counts by severity, using Bayesian methods,” Accident Analysis and Prevention, vol. 40, no. 3, pp. 964–975, 2008.
  9. B. Mediouny, Z. Anis, and H. S. Habib, “Motorway traffic crash prediction based on the network approach: application to the ringway of Paris,” Evaluation at Optimization System in the Production Services, vol. 1, pp. 1–8, 2011.
  10. S. M. Rifaat and H. C. Chin, “Accident severity analysis using ordered probit model,” Journal of Advanced Transportation, vol. 41, no. 1, pp. 91–114, 2007.
  11. R. Howard, Statistical Modeling and Analysis of Injury Severity Sustained by Occupants of Passenger Vehicle Involved in Crashes with Large Trucks, University of Nevada, Reno, Nev, USA, 2010.
  12. S. Y. Sohn and H. S. Kim, “Random effects logistic regression model for default prediction of technology credit guarantee fund,” European Journal of Operational Research, vol. 183, no. 1, pp. 472–478, 2007.
  13. C. M. Zhang, H. D. Fu, Y. Jiang, and T. Yu, “High-dimensional pseudo-logistic regression and classification with applications to gene expression data,” Computational Statistics and Data Analysis, vol. 52, no. 1, pp. 452–470, 2007.
  14. C. R. García-Alonso, J. Guardiola, and C. Hervás-Martínez, “Logistic evolutionary product-unit neural networks: innovation capacity of poor Guatemalan households,” European Journal of Operational Research, vol. 195, no. 2, pp. 543–551, 2009.
  15. V. Shankar and F. Mannering, “An exploratory multinomial logit analysis of single-vehicle motorcycle accident severity,” Journal of Safety Research, vol. 27, no. 3, pp. 183–194, 1996.
  16. M. A. Abdel-Aty and H. T. Abdelwahab, “Predicting injury severity levels in traffic crashes: a modeling comparison,” Journal of Transportation Engineering, vol. 130, no. 2, pp. 204–210, 2004.
  17. R. Harb, E. Radwan, X. Yan, A. Pande, and M. Abdel-Aty, “Freeway work-zone crash analysis and risk identification using multiple and conditional logistic regression,” Journal of Transportation Engineering, vol. 134, no. 5, pp. 203–214, 2008.
  18. A. S. Al-Ghamdi, “Using logistic regression to estimate the influence of accident factors on accident severity,” Accident Analysis and Prevention, vol. 34, no. 6, pp. 729–741, 2002.

Copyright © 2014 Jian-feng Xi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
