## Combined Prediction Model of Death Toll for Road Traffic Accidents Based on Independent and Dependent Variables

#### Abstract

To build a combined model that follows the variation pattern of road traffic accident death toll data, reflects the influence of multiple factors on traffic accidents, and improves prediction accuracy, a Verhulst model was built from the death toll of road traffic accidents in China from 2002 to 2011, and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors for a multivariate linear regression model of the death toll. The two models were then merged into a combined prediction model with weight coefficients, which were calculated with the Shapley value method by assessing the contribution of each model. Finally, the combined model was used to recalculate the death toll from 2002 to 2011 and was compared with the Verhulst and multivariate linear regression models. The results showed that the new model not only characterizes the death toll data but also quantifies the degree of influence of each factor on the death toll, and that it has high accuracy and strong practicability.

#### 1. Introduction

With the gradual progress of the “Science and Technology Action Plan of Traffic Safety” and the implementation of the “Law of the People’s Republic of China on Road Traffic Safety,” the number of traffic accidents and the degree of injury have shown a decreasing trend since 2004; however, the death toll has remained at about 60 000 every year. Among the four traffic accident indicators, the death toll has a direct effect on the public's sense of security and on social stability, so knowing the future death toll is of great significance for formulating subsequent traffic management measures and policies and for guiding the development of traffic safety technology. Therefore, the prediction of the death toll has long been a focus of related studies [1–5]. However, traffic accidents are difficult to forecast because of their randomness, and because the related forecasting methods are affected by various factors, their precision is difficult to guarantee.

Commonly used prediction methods include regression analysis, exponential smoothing, fuzzy analysis, and time series methods. In the area of traffic safety and accidents, gray theory, the Markov method, and artificial neural networks are the major prediction methods. For example, Li et al. [6] introduced the Markov chain prediction method on the basis of the gray model GM (1, 1) for traffic accident prediction and built a gray Markov prediction model. Making use of the advantages of artificial neural networks, such as strong nonlinear approximation, fuzzy reasoning, and self-learning, Dong and Shi [7] built a BP neural network prediction model of traffic accidents. Zhang et al. [8] used the ARIMA model to study the stationarity of the time series of traffic accident mortality per 100 000 people in China from 1970 to 1997, and used SPSS software to fit the model and make forecasts. They concluded that the ARIMA model could improve prediction accuracy and could be applied to different kinds of nonseasonal and seasonal time series. Based on gray system theory and Markov theory, Zhao and Xu [9] used the system cloud gray model SCGM(1, 1) c to fit the general trend of road traffic time series data and put forward the gray weighting Markov model SCGM(1, 1) c, which could be used to predict the number of traffic accidents. This model is suitable for dynamic prediction with short time series, little data, and moderate random fluctuations.

Each of the above methods has its own strengths and its own defects. For example, the GM (1, 1) model in gray theory starts from the traffic accident data alone, analyzes their characteristics and pattern of change, and predicts the future trend. The model is easy to use and does not need to consider other factors, but it can only describe a monotonic process. Combining it with Markov theory gives the gray Markov model, which is applicable to the random fluctuation of traffic accidents; however, this model has no uniform standard for classifying the system states. An artificial neural network simulates the way the human brain takes in information and outputs decisions; the details of information processing and model building are hidden, which makes the method simple and convenient, but its accuracy depends heavily on the data. The multiple regression method builds a mathematical relationship between accident results and related factors and quantifies how, and how strongly, each factor influences accidents. Its accuracy is comparatively poor, however, because the selection of factors is variable and because the future trends of the factors must be predicted before the accidents themselves can be predicted, which means that predicted data are used as inputs to the next prediction.

This paper summarizes and analyzes the advantages and disadvantages of the above models. First, based on an analysis of the characteristics of the accident data [10, 11], the Verhulst model, the model in gray theory most suitable for traffic accidents, is used to make a preliminary prediction that characterizes the changing trend of accidents. Meanwhile, to reflect the impact of other factors on traffic accidents, a multiple regression model of the traffic accident death toll is built to analyze the dependency relationships of traffic accidents. Then, to combine the advantages of the two models, a combined prediction model of the traffic death toll, based on independent and dependent variables, is built. This model reflects both the fluctuation pattern of traffic accidents and the dependence of the traffic death toll on the interaction of multiple factors.

#### 2. The Verhulst Model of Traffic Death Toll

##### 2.1. Verhulst Model

In recent years, the traffic death toll in China has followed a saturated S-shaped process, so the Verhulst model is suitable for predicting it [12]. The principle and procedure of building the Verhulst model are as follows.

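*(1) Model Building.* Let the original data sequence of traffic accident death toll be $X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$, where $n$ is the number of data points.

From the original data, the first-order accumulated generating operation (1-AGO) sequence of the death toll is built as follows:

$$X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right). \tag{1}$$

In the above formula, the relationship between $x^{(1)}(k)$ and $x^{(0)}(k)$ is $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$, $k = 1, 2, \ldots, n$. $Z^{(1)}$ is the consecutive neighbor mean sequence of $X^{(1)}$, $Z^{(1)} = \left(z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n)\right)$:

$$z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, n. \tag{2}$$

Then

$$x^{(0)}(k) + a z^{(1)}(k) = b\left(z^{(1)}(k)\right)^{2} \tag{3}$$

is called the gray Verhulst model, and

$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b\left(x^{(1)}\right)^{2} \tag{4}$$

is the whitenization equation of the Verhulst model.

Let $\hat{a}$ be the parameter vector to be determined, $\hat{a} = (a, b)^{T}$, in which $a$ is the development gray number and $b$ is the inner control gray number.

Discretizing formula (4) gives

$$x^{(0)}(k) = -a z^{(1)}(k) + b\left(z^{(1)}(k)\right)^{2}, \quad k = 2, 3, \ldots, n. \tag{5}$$

Solving the problem by the least squares method gives

$$\hat{a} = \left(B^{T}B\right)^{-1}B^{T}Y. \tag{6}$$

In the formula,

$$B = \begin{bmatrix} -z^{(1)}(2) & \left(z^{(1)}(2)\right)^{2} \\ -z^{(1)}(3) & \left(z^{(1)}(3)\right)^{2} \\ \vdots & \vdots \\ -z^{(1)}(n) & \left(z^{(1)}(n)\right)^{2} \end{bmatrix}, \quad Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}. \tag{7}$$

The parameter values of $a$ and $b$ can then be obtained. Substituting them into the whitenization equation, formula (4), and solving it gives the time response function of the accumulated generating sequence:

$$x^{(1)}(t) = \frac{a x^{(1)}(0)}{b x^{(1)}(0) + \left(a - b x^{(1)}(0)\right)e^{at}}. \tag{8}$$

The time response formula of the model is

$$\hat{x}^{(1)}(k+1) = \frac{a x^{(1)}(0)}{b x^{(1)}(0) + \left(a - b x^{(1)}(0)\right)e^{ak}}. \tag{9}$$

The inverse accumulated reduction formula is

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k). \tag{10}$$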
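The model-building steps of this subsection (accumulation, consecutive neighbor means, least squares estimation of $a$ and $b$, time response, and inverse accumulated reduction) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fit_gray_verhulst` and `predict_gray_verhulst` are hypothetical, the initial value $x^{(1)}(0)$ is assumed to equal the first observation $x^{(0)}(1)$, and the demonstration data are synthetic rather than the death toll data of this paper.

```python
import numpy as np


def fit_gray_verhulst(x0):
    """Estimate (a, b) of the gray Verhulst model from the original sequence x0."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                       # 1-AGO sequence x1(k)
    z1 = 0.5 * (x1[1:] + x1[:-1])            # neighbor means z1(k), k = 2..n
    B = np.column_stack((-z1, z1 ** 2))      # rows: [-z1(k), z1(k)^2]
    Y = x0[1:]                               # x0(k), k = 2..n
    # Least squares solution of B @ [a, b] = Y
    (a, b), *_ = np.linalg.lstsq(B, Y, rcond=None)
    return a, b


def predict_gray_verhulst(x0_first, a, b, steps):
    """Return (x1_hat, x0_hat) for k = 0..steps-1 from the time response formula.

    Assumption: the initial value x1(0) is taken as the first observation.
    """
    k = np.arange(steps)
    x1_hat = a * x0_first / (b * x0_first + (a - b * x0_first) * np.exp(a * k))
    # Inverse accumulated reduction: x0_hat(k+1) = x1_hat(k+1) - x1_hat(k)
    x0_hat = np.concatenate(([x1_hat[0]], np.diff(x1_hat)))
    return x1_hat, x0_hat


if __name__ == "__main__":
    # Synthetic S-shaped accumulated sequence (illustrative only, not real data)
    k = np.arange(12)
    x1_true = 100.0 / (1.0 + 19.0 * np.exp(-0.5 * k))
    x0 = np.concatenate(([x1_true[0]], np.diff(x1_true)))

    a, b = fit_gray_verhulst(x0)
    x1_hat, x0_hat = predict_gray_verhulst(x0[0], a, b, len(x0))
    print("a =", a, "b =", b, "saturation level a/b =", a / b)
    print("fitted x0:", np.round(x0_hat, 3))
```

Passing a value of `steps` larger than the length of the input sequence extrapolates the fitted curve beyond the observed period. `np.linalg.lstsq` is used here instead of forming $\left(B^{T}B\right)^{-1}B^{T}Y$ explicitly; both give the least squares solution, and the former is usually better conditioned numerically.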

*(2) Residual Test of the Model.* The accuracy of a model must be tested before it is used for prediction, to confirm that the model is valid. The residual test is commonly used for this purpose: model values and measured values are compared point by point.

Calculate $\hat{x}^{(1)}(i)$ with the model, convert it to $\hat{x}^{(0)}(i)$ by the inverse accumulated generating operation, and then calculate the absolute and relative error sequences of the original sequence $x^{(0)}$ and $\hat{x}^{(0)}$:

$$\Delta^{(0)}(i) = \left|x^{(0)}(i) - \hat{x}^{(0)}(i)\right|, \quad \phi(i) = \frac{\Delta^{(0)}(i)}{x^{(0)}(i)} \times 100\%, \quad i = 1, 2, \ldots, n. \tag{11}$$
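The point-by-point residual test can be sketched in a few lines of Python. The function name `residual_test` and the numbers in the usage example are hypothetical and serve only to illustrate the calculation; they are not values from this paper.

```python
import numpy as np


def residual_test(x0, x0_hat):
    """Return the absolute errors and the relative errors (in percent).

    x0     : measured values of the original sequence
    x0_hat : model values after inverse accumulated reduction
    """
    x0 = np.asarray(x0, dtype=float)
    x0_hat = np.asarray(x0_hat, dtype=float)
    abs_err = np.abs(x0 - x0_hat)        # absolute error at each point
    rel_err = abs_err / x0 * 100.0       # relative error at each point, in %
    return abs_err, rel_err


if __name__ == "__main__":
    # Hypothetical measured and fitted values, for illustration only
    measured = [100.0, 110.0, 120.0]
    fitted = [100.0, 107.8, 123.0]
    abs_err, rel_err = residual_test(measured, fitted)
    print("absolute errors:", abs_err)
    print("relative errors (%):", rel_err)
```

The relative error divides by the measured value, so the sketch assumes a strictly positive original sequence, which holds for death toll counts.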

##### 2.2. Death Toll Forecast

Based on the traffic accident death toll data in China from 2002 to 2011 and the above calculation method of the Verhulst model, a prediction model can be built. The specific data are shown in Table 1.