Research Article  Open Access
Shahriar Afandizadeh, Shahab Hassanpour, "Evaluating the Effect of Roadway and Development Factors on the Rural Road Safety Risk Index", Advances in Civil Engineering, vol. 2020, Article ID 7820565, 14 pages, 2020. https://doi.org/10.1155/2020/7820565
Evaluating the Effect of Roadway and Development Factors on the Rural Road Safety Risk Index
Abstract
As roadway and development factors are identified as the most effective factors contributing to road traffic accidents, investigating these factors could help reduce the accident frequency rate. However, previous works focused on investigating the effect of roadway factors on the accident frequency rate using statistical analysis. The present study first aimed to evaluate the effect of roadway and development factors on the accident frequency rate using ANOVA and Chi-square tests on a rural road. Secondly, it aimed to develop a rural road safety risk index based on K-means clustering and Gaussian models. The findings indicated that the operating speed and the difference between posted speed limits and the operating speed are the pivotal factors influencing the accident frequency rate. Moreover, clustering analysis of the roadway and development factors on the two-lane, two-way Borujerd-Khorramabad road indicated six clusters, which were identified as highly risky, relatively highly risky, moderately risky, relatively low risky, low risky, and not risky (safe) clusters. Across the clusters, the accident frequency rate increased as the difference between the posted speed limits and the operating speed decreased from its value in the safe cluster. In addition, the risk index model based on the Gaussian model showed that the average reducing factor of the accident frequency rate reached 0.99 per km/hr increase in the difference between the posted speed limits and the operating speed among low risky and safe clusters, while it was equal to 1.17 in risky and unsafe clusters. The comparison of the clusters revealed that the accident occurrence probability in risky clusters was higher than in low risky or safe clusters. The maximum and minimum values of the safety risk index were observed in the sixth and the third clusters, respectively.
1. Introduction
Road traffic accidents cost most countries 3% of their gross domestic product [1], and traffic safety has become one of the most challenging issues in the recent decades. According to the report by the WHO, 1.52 million people are killed in traffic accidents every year [2]. Particularly, the cost of road fatalities and injuries is 2.19% of the gross national product in Iran, which is higher than the global average [3]. Location is considered as a crucial parameter in crash analyses since it closely relies on the identification of the traffic and geometric conditions that are related to an accident [4]. Anderson and Krammes [5] indicated that curves with a degree of curvature greater than four had higher accident rates. These curves required speed reductions while there was no need for such a decrease in the curves with lower values. In addition, Caliendo and Lamberti [6] studied the influence of radius on accident rates and found a decrease in accident rates by increasing the radius between 200 and 500 m. Similarly, Cenek et al. [7] investigated the relationship for a wider range of radii, while Hauer [8] confirmed such a relationship for all radii. Hauer [9] also reported that curves with large deflection angles are more risky than the smaller ones.
Other research studies focused on evaluating geometric variables such as the lane and shoulder width, pavement type, skid resistance, annual average daily traffic, spiral transitions, and passing behavior [7, 10]. Furthermore, some other studies examined the relationship between speed and curvature [11, 12]. According to Tate and Turner [13], the difference between the negotiation speed and design speed on curves has a significant effect on the injury crash rate. Studying the relationship between the operating speed and accident frequency rate, Bird and Hashim [14] indicated that higher operating speeds generally cause fewer accidents. Likewise, Wang et al. [15] investigated the relationship between average operating speed and accident severity and found that a 1% increase in the average operating speed results in a 0.074% decrease in the number of minor injuries and a 0.095% increase in the number of fatalities. Other studies evaluated the relationship between higher speed limits and the probability of accident severities and reported that higher speed limits increased the probability of a more severe accident and that accident severity increased outside level and straight roadways [16, 17]. Furthermore, Thomas [18] examined the influence of the segment length on crash analysis outside intersection-related sites and concluded that no single length performs better than any other and that the segment length to use depends solely on the type of research. A few studies indicated that, based on geometric and environmental features, variable-length segments perform better in crash analysis than fixed-length segments [19, 20]. Moreover, Caliendo and Lamberti [6] focused on the relationship between roadway factors and crash rates and demonstrated that segment types, access control, sight distance, and design consistency were highly correlated with crash rates.
Therefore, this study aims to evaluate the effect of roadway and development factors on accident frequency using ANOVA and Chi-square tests on a rural road. Moreover, it develops a rural road safety risk index based on K-means clustering and Gaussian models to provide a technique for supporting road safety analysis.
The organization of the remaining parts of the study is as follows. In Section 2, the literature review is presented, together with a discussion of previous studies related to the importance of factors contributing to road accidents and previous methods for accident data analysis. Additionally, Section 3 involves a description of data collection, followed by explaining the method of the present study about significance and clustering analyses and proposing the safety risk index. The obtained results regarding the proposed method are presented in Section 4. In Section 5, a sensitivity analysis is conducted by comparing the proposed safety risk index of the current study and that of the other studies. Finally, Section 6 contains the conclusion about the obtained results.
2. Literature Review
Several studies focused on driving safety affected by various factors and investigated the relationship between these factors and road accidents. Road accident data are classified as big data and include many attributes belonging to the accident such as driver attributes, environmental causes, as well as traffic, vehicle, and geometric characteristics and the location nature and the time of the day. In addition, data related to road accidents are taken for a long period of time and available as datasets, statistical tables and reports, or even Global Positioning System data. According to several studies, statistical and data mining techniques are proper for analyzing the road accident data [21–24]. Lee et al. [25] designed a statistical framework as a fine choice for analyzing the road accidents with geometric factors including driver characteristics and road layout, along with the design of the car and weather condition. However, most road accidents are attributed to the “human factor,” most especially to road safety violations [26].
Some researchers investigated the effect of roadway factors on the number of road accidents on urban highways. They applied different techniques to establish a relationship between these factors and the accident frequency rate [27–29]. In addition, others reported that not only roadway factors but also development factors including land use and accessibility number are the main factors influencing the number of traffic accidents on multilane highways. They found a robust relationship between the accident frequency rate and development factors. In order to reduce accident frequency rates, it is vital to apply development factors in accident analysis in order to promote safety on roads [28–34].
Shirmohammadi et al. [35] highlighted the clustering drivers regarding driving behaviors and skills as important factors which contribute to road accidents using the clustering analysis. Shen et al. [36] used clustering analyses to identify accident blackspots on rural roads. In addition, Alotaibi [37] employed data mining techniques to simplify road accident data since such methods are novel and superior to classical statistical techniques and help the researchers to discover the relationship between the hidden data. Several data mining methods in the transportation field are broadly utilized for road accident data analysis, including clustering algorithms, as well as classification and association rule mining [38, 39], although accident data are heterogeneous (different variables).
Among accident data analysis methods, clustering analysis is the best way to find between-data correlations that would probably otherwise remain unknown [40]. Moreover, data mining techniques are useful for handling accident data [41]. Ma and Kockelman [42] classified road segments with similar characteristics and used a linear regression model to estimate crash frequency within each cluster. Other studies employed clustering analysis for roadway crashes and safety projects [43–45]. Similarly, Sekuła et al. [46] proposed a clustering approach to predict the probability of a collision occurring in the proximity of planned road maintenance operations (i.e., work zones). Several other studies also concluded that data mining techniques are more advanced and perform better than traditional statistical techniques [47–51].
To the best of our knowledge, no study has investigated the effect of roadway and development factors, especially operating speed and the difference between posted speed limits and operating speed, on the accident frequency rate on rural roads. Furthermore, we did not find any previous study on developing a rural safety risk index using roadway and development factors. In addition, previous studies only used clustering analysis for drivers’ behavioral characteristics concerning accidents. Given this, the novelty of the present study is, firstly, investigating the effects of roadway and development factors on the accident frequency rate. Secondly, it applies clustering analysis and the Gaussian model to develop a rural risk index of the clusters based on roadway and development factors. Moreover, finding the contributing factors to accidents plays an important role in collision statistics, which is considered another reason for developing the subjective and driver-based evaluation of road safety risk. Finally, SPSS 17.0 and MATLAB R2013a software were employed to obtain the results.
3. Research Method
The process of evaluating the effect of roadway and development factors on the accident frequency rate for the development of a rural road safety risk index is performed as follows (see Figure 1).
3.1. Case Study Area
Lorestan Province has an area of 29,308 km² and a population of about 1.76 million. The capital city, Khorramabad, is located in the southern part of Lorestan. The province is widely known as a popular tourist destination. Since the Borujerd-Khorramabad road lies on the transit route from the north to the south of Iran, it is the most densely populated part of the Lorestan road network, and the number of motor vehicle accidents rose steadily from 2013 to 2016. A comparison of the motor vehicle accidents from 2013 to 2016 along the Borujerd-Khorramabad road revealed that the mortality rate reached up to 67% and the injury rate up to 30%. In total, there were 1409 accidents during this period.
3.2. Data Collection
The accident frequency rate, normalized by the segment length, was used for this study and covers the accidents that occurred during three years (2013–2016). The evaluation of roadway and development factors and the development of the rural risk index were based on the factors considered in previous studies [28, 30–34] and on the data available from the local police accident reports from 2013 to 2016 for the Borujerd-Khorramabad rural road. Using roadway and development factors not only makes the risk index more practical for rural roads but can also help reduce fatality and injury rates from accidents in the future. The roadway variables were average operating speed (km/hr), the difference between posted speed limits and operating speed (km/hr), annual average daily traffic (veh/day), segment length (km), the presence or absence of a speed control camera, homogeneous sections, and gradient (%). Moreover, development factors included dominant land uses along the roadways and the number of accessibility (Table 1). In this study, the two-lane, two-way rural highway of the Borujerd-Khorramabad road in Lorestan Province, Iran (Figure 2(a)), was considered as a case study, and the location map of the study area is shown in Figure 2(b). The geometric and traffic characteristics were classified into homogeneous sections, and based on the available information, some independent variables were used to divide the road network into homogeneous sections as well.
 
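[Table 1] Note. AADT: annual average daily traffic.
[Figure 2: (a) the two-lane, two-way Borujerd-Khorramabad rural highway in Lorestan Province, Iran; (b) location map of the study area]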
The Borujerd-Khorramabad road is a two-lane, two-way road on which the lane and shoulder widths are constant along the whole road, at 3.65 and 1.85 meters, respectively. The road pavement is in a relatively good condition along the road sections, with a present serviceability index (PSI) of 3. The road sections are away from the zone of influence of intersections, towns, and so on. In addition, the value of side friction is taken as 0.35 for the road sections according to AASHTO [52]. The speed limit ranges from 40 km/hr to 95 km/hr, with an average of 63 km/hr over the road sections. Other geometric characteristics of the rural road, including the characteristics of curvature and gradient sections, are described in Table 1.
Therefore, based on the output of this approach, each road section was assigned a number of accidents, which varied from 0 to 13 per section. Considering the dynamic nature of traffic variables (i.e., operating speed and volume), traffic conditions were expressed by annual averages, while road geometry was represented by categorical variables. The final dataset included 106 road sections (total length = 172 km) after the exclusion of sections with missing traffic or geometry data.
3.3. Significance Analysis
The ANOVA test is one of the most widely applied methods in transportation data analysis [53–56]. It is used here to evaluate whether the contributing factors have a significant impact on the accident frequency rate at the 0.05 level. Thus, the study examined the significance of the association between roadway and development factors and the accident frequency rate. The hypotheses were as follows.
H_{0}: there are no associations between roadway and development factors and the accident frequency rate.
H_{1}: there are associations between roadway and development factors and the accident frequency rate.
Accordingly, the hypothesis H_{0} was rejected and the hypothesis H_{1} was accepted when the p-value (Sig.) was less than 0.05.
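As an illustration of the test statistic behind this decision rule, the F-statistic of a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square. The following Python sketch is not the SPSS procedure used in this study; the function name one_way_anova_f and the sample values are hypothetical and serve only to show the computation. The p-value compared against 0.05 would then be read from the F distribution with the returned degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per factor level.
    """
    k = len(groups)                      # number of factor levels
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Variation explained by the factor (between groups).
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Residual variation inside each group (within groups).
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical accident frequency rates for three levels of one factor.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f_stat, df_between, df_within)
```

A large F-statistic relative to the critical value of the F distribution at the 0.05 level corresponds to rejecting H_{0} for that factor.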
3.4. Clustering Analysis
Clustering is one of the most commonly used data mining methods, and there are many clustering algorithms such as K-means and K-modes [21, 57]. The K-means algorithm is a centroid-based technique, while the K-modes algorithm is designed for nominal data. The K-means algorithm is considered one of the most popular data mining techniques for identifying clusters based on accident frequencies [58, 59].
Clustering techniques raise the problem of determining the best number of clusters, since the K-means algorithm requires the number of clusters K as an input. Within this framework, the optimal number of clusters is determined by the Elbow method [60]. This method depends on both the measure of similarity within a cluster and the parameters used for partitioning. The steps for identifying the optimal number of clusters are summarized as follows [61].
(1) Computing the clustering algorithm (i.e., K-means) for different values of K, from K = 2 to K = 15
(2) Calculating the total within-cluster sum of squares (wss) for each value of K
(3) Plotting the curve of wss against the number of clusters K
(4) Considering the location of a bend (knee) in the plot as a general indicator of the appropriate number of clusters
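The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SPSS or MATLAB procedure used in the study: the K-means routine is a plain Lloyd iteration with random restarts, the names sq_dist and kmeans_wss are hypothetical, and the toy data (three well-separated groups of points) replace the road section data, so a smaller range of K is scanned than the K = 2 to 15 used in step (1).

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_wss(points, k, restarts=10, max_iter=100, seed=0):
    """Run K-means several times and return the smallest total wss found."""
    best = float("inf")
    for r in range(restarts):
        rng = random.Random(seed + r)
        centroids = rng.sample(points, k)   # initial centroids: k data points
        for _ in range(max_iter):
            # Assignment step: attach each point to its nearest centroid.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
                clusters[nearest].append(p)
            # Update step: move each centroid to the mean of its members.
            updated = [
                tuple(sum(axis) / len(members) for axis in zip(*members))
                if members else centroids[i]
                for i, members in enumerate(clusters)
            ]
            if updated == centroids:
                break
            centroids = updated
        # Total within-cluster sum of squares for this run.
        wss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
        best = min(best, wss)
    return best


# Toy data: three groups of four points each (hypothetical values).
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

# Steps (1)-(2): wss for each candidate K; steps (3)-(4) inspect this curve.
wss_by_k = {k: kmeans_wss(points, k) for k in range(1, 7)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 3))
```

Plotting wss_by_k against K and looking for the value after which the curve flattens corresponds to steps (3) and (4); in practice the bend is judged visually, so the choice of K remains partly subjective.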
3.5. Development of the Road Safety Risk Index
In developing a risk index, it is vital to consider the fundamental elements that can contribute to road safety [62]. Ahmadinejad et al. [63] proposed a suitable index for road safety based on deceleration numbers and safety parameters (e.g., crash rate and crash frequency rate). Their results indicated a significant correlation between safety parameters and deceleration numbers. Many studies defined safety risk by considering three variables, namely exposure, probability, and consequence [64, 65], as shown in the following equation:
Risk = Exposure × Probability × Consequence, (1)
where Exposure = a measure to quantify the “exposure” of road users to potential roadway hazards; Probability = a measure to quantify the chance of a vehicle being involved in a collision; and Consequence = a measure to quantify the severity level resulting from potential collisions.
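A minimal sketch of how the three components could be combined is given below. It assumes the multiplicative form commonly used for this kind of index (risk as the product of exposure, probability, and consequence); the function name risk_index and the input values are hypothetical and are not taken from the study.

```python
def risk_index(exposure, probability, consequence):
    """Combine the three risk components multiplicatively (assumed form)."""
    components = {
        "exposure": exposure,
        "probability": probability,
        "consequence": consequence,
    }
    for name, value in components.items():
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    return exposure * probability * consequence


# Hypothetical scores for one road section.
example_risk = risk_index(exposure=2.0, probability=0.5, consequence=3.0)
print(example_risk)
```

Under this form, a section with zero exposure or zero collision probability has zero risk regardless of the other components, which is one reason the product is often preferred over a sum.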
4. Results and Discussion
4.1. Significance Analysis
To examine the effect of roadway and development factors on the accident frequency rate, the ANOVA test was run, the results of which are presented in Table 2. As shown in Table 2, operating speed and the difference between posted speed limits and operating speed have significant effects on the accident frequency rate due to Sig. (0.000) < 0.05. However, no significance is observed between the other factors and the accident frequency rate.
 
[Table 2: results of the ANOVA test] Note. Coefficients in italics are not statistically significant; significance is assessed at the 0.05 level.
4.2. K-Means Clustering
The average linkage hierarchical clustering was used to determine the number of clusters, although identifying the optimal heterogeneous clusters with it occasionally has some limitations and deficiencies. Given these limitations, K-means clustering is applied after the number of clusters has been determined. In this clustering method, the centroids (i.e., the cluster center means) generated from the average linkage hierarchical clustering are used as a starting point [66, 67].
Cluster analysis applies algorithms to collate individual variables with similar scores [68]. Based on the squared Euclidean distance measure, the cluster analysis utilizes the scores derived from the grouping variables. In the current study, the grouped variables included the accident frequency rate, operating speed, the difference between posted speed limits and the operating speed, segment length, annual average daily traffic, the number of accessibility, and dominant land uses along the roadways, as well as the presence or absence of a speed control camera, curvature, and gradient.
The standardized scores (Z-scores) of the variables are used to avoid the problem of comparing Euclidean distances based on different measurement scales [69]. Based on Figure 3, the optimal number of clusters is determined as six, based on the distinctive break (elbow) identified from the squared Euclidean distance in comparison with the agglomeration coefficients. Table 3 presents the final cluster centers for the independent and dependent variables.
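The effect of standardization on the distance measure can be sketched as follows. The example is illustrative only: the function names and the three sample rows (operating speed in km/hr, AADT in veh/day) are hypothetical, and the sample standard deviation is assumed for the Z-scores.

```python
from statistics import mean, stdev


def zscores(column):
    """Standardize one variable to mean 0 and (sample) standard deviation 1."""
    mu, sigma = mean(column), stdev(column)
    if sigma == 0:
        return [0.0 for _ in column]     # constant variable carries no signal
    return [(x - mu) / sigma for x in column]


def standardize(rows):
    """Apply Z-scores column by column to a list of equal-length tuples."""
    columns = [zscores(list(col)) for col in zip(*rows)]
    return [tuple(row) for row in zip(*columns)]


def sq_euclidean(p, q):
    """Squared Euclidean distance between two observations."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


# Hypothetical sections: (operating speed in km/hr, AADT in veh/day).
rows = [(60.0, 4000.0), (80.0, 12000.0), (100.0, 8000.0)]
z_rows = standardize(rows)

raw_distance = sq_euclidean(rows[0], rows[1])     # dominated by AADT
z_distance = sq_euclidean(z_rows[0], z_rows[1])   # both variables contribute
print(raw_distance, z_distance)
```

Without standardization, the variable with the largest numerical range (here AADT) would dominate the squared Euclidean distance and therefore the cluster assignment.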

When the ANOVA test was applied to the variables in the clusters to find the most effective factors in the accident frequency rate, only the difference between posted speed limits and operating speed was identified as the most effective variable among the roadway and development factors, owing to its maximum F-statistic (Tables 4 and 5). Regarding the accident frequency rate, the clusters are arranged in order as highly risky, relatively high risky, moderately risky, relatively low risky, low risky, and not risky (safe) clusters (Figure 4(a)).
 
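[Table 4] Note. Coefficients in italics are not statistically significant; significance is assessed at the 0.05 level.
[Table 5] Note. Coefficients in italics are not statistically significant; significance is assessed at the 0.05 level.
[Figure 4: (a) clusters ordered by accident frequency rate; (b) relationship between the difference between posted speed limits and operating speed and the accident frequency rate; (c) probability of each cluster]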
The F tests should be used only for descriptive purposes because the clusters are chosen to maximize the differences among the cases in different clusters. However, the observed significance levels are not corrected for this and, thus, cannot be interpreted as the tests of the hypothesis that the cluster means are equal.
Similarly, based on the results of the Chi-square (χ²) test (Table 5), the maximum χ² is observed for the difference between posted speed limits and the operating speed. Accordingly, the maximum χ² indicates how much this factor (i.e., the difference between posted speed limits and operating speed) affects the accident frequency rate. Hence, the maximum χ² was employed in the proposed model to discover the relationship between this variable and the accident frequency rate (Figure 4(b)). Additionally, the Chi-square distribution probability function was utilized to obtain the probability of each cluster (Figure 4(c)). As displayed, the maximum and minimum probabilities are found for the fifth and the second clusters, respectively.
To understand the effect of the difference between posted speed limits and the operating speed on the accident frequency rate, the probability of the occurrence was obtained for each cluster. Based on Figure 4(b), when the difference between posted speed limits and the operating speed reduces from the safe cluster, the probability of accident occurrence risk in each cluster increases (Figure 4(c)). Therefore, the following results are obtained by comparing the difference between posted speed limits and the operating speed and the probability in each cluster (Figure 4).
As shown, the first cluster, namely, “relatively high risk,” is ranked the second based on the accident frequency rate, and its probability risk value is less than 10%. Hence, the occurrence of an accident is relatively low in this cluster.
The second cluster is ranked the fourth, “relatively low risk,” based on the observed accident frequency rate, and its probability risk value is less than 5%; thus, the incidence of a high accident frequency rate is very low in this cluster.
Likewise, the third cluster is ranked the sixth, “safe cluster,” based on the accident frequency rate. Identically, the probability risk value is less than 5%, which demonstrates that the accident occurrence is very low in this cluster.
The fourth cluster is ranked the fifth, “low risk,” based on the accident frequency rate. Comparing the probability risk value in this cluster with that of the safe cluster shows that the probability of accident occurrence in this cluster is 10%, which might lead to a lower accident rate.
In addition, the fifth cluster is ranked the third, “moderately risky,” based on the accident frequency rate. The accident occurrence probability of this risky cluster is 85%, which is high compared with the other clusters, and thus, the accident frequency rate is expected to show a significant increase.
Finally, the sixth cluster is ranked the first, “high risk,” based on the accident frequency rate. The probability of accident occurrence in this cluster is less than 5%, indicating that accidents of this cluster type may occur less often than in the other risky clusters.
Therefore, the probability of the occurrence of a moderate risky cluster is higher as compared to the other clusters, and more accident frequency rates occur in this cluster. Furthermore, the difference between the posted speed limits and operating speed in this cluster is nearly 18.69 km/hr which is near to the mean of the difference between the posted speed limits and operating speed. As a result, the accident frequency rate significantly increases by decreasing the difference between the posted speed limits and operating speed from the safe cluster (Figure 5).
4.3. Assessment of the Association of Posted Speed Limits and the Operating Speed on the Accident Frequency Rate
The relationship between the accident frequency rate and the difference between posted speed limits and the operating speed, as well as the behavior of the accident frequency in risky and non-risky clusters, was evaluated using the Gaussian function. The findings (Figure 6) indicated that this function performs better based on the coefficients (with 95% confidence bounds) and the goodness-of-fit parameters, including the sum of the squared errors, R-square, adjusted R-square, and root mean square error, presented in Table 6. According to the Gaussian function, the difference between posted speed limits and the operating speed can produce both increasing and decreasing trends in the accident frequency rate in each cluster. The average reducing factor of the accident frequency rate is 0.99 per km/hr increase in the difference between posted speed limits and the operating speed among the safe clusters. This means that drivers in safe clusters maintain an operating speed lower than the posted speed limits. Hence, as the difference between the posted speed limits and the operating speed increases toward its maximum, the number of accidents per length decreases by a factor of 0.99. From this finding, one can infer that drivers in safe clusters do not exceed the speed limits. However, in risky and unsafe clusters, drivers exceed the speed limits, and their operating speed is higher than the speed limits, which, in turn, could lead to an average rise of 1.17 in the accident frequency rate. In other words, where the difference is at its minimum, the number of accidents per length rises by a factor of 1.17. Therefore, the growth factor in risky and unsafe clusters is 1.18 times the accident frequency rate in low risky and safe clusters.
These results are consistent with the findings of the probability of accident occurrence risk when the difference between posted speed limits and the operating speed reduces from the safe cluster in which drivers keep the minimum difference, and therefore, the probability of accident occurrence risk in each cluster increases.
 
[Table 6] Note. SSE: the sum of the squared errors; RMSE: root mean square error.
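The behavior described by the Gaussian model can be illustrated with a short Python sketch. The functional form y = a·exp(−((x − b)/c)²) is an assumption (it is the single-term Gaussian offered by common curve-fitting tools), and the parameter values below are placeholders, not the fitted coefficients of Table 6. The sketch evaluates the per-km/hr change factor on either side of the peak and reproduces the ratio between the two factors reported in the text.

```python
import math


def gaussian(x, a, b, c):
    """Single-term Gaussian: amplitude a, peak location b, width c (assumed form)."""
    return a * math.exp(-((x - b) / c) ** 2)


def per_unit_factor(x, a, b, c):
    """Multiplicative change in the fitted rate when x increases by 1 km/hr."""
    return gaussian(x + 1.0, a, b, c) / gaussian(x, a, b, c)


# Placeholder parameters for illustration only (not the fitted values).
a, b, c = 1.7, 0.0, 25.0

factor_right = per_unit_factor(10.0, a, b, c)    # beyond the peak: rate falls
factor_left = per_unit_factor(-10.0, a, b, c)    # before the peak: rate rises

# Ratio of the two average factors reported in the text (1.17 and 0.99).
growth_ratio = 1.17 / 0.99
print(factor_right, factor_left, round(growth_ratio, 2))
```

Under this reading, a factor below 1 corresponds to the decreasing branch of the curve (safe clusters) and a factor above 1 to the increasing branch (risky clusters); the ratio 1.17/0.99 rounds to the growth factor of 1.18 quoted above.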
As an example, in Figure 6 and Table 7, when the difference between posted speed limits and the operating speed is 0, the accident frequency rate is 1.7. In such cases, drivers are categorized in the high-risk cluster based on the proposed risk index. In Leur and Sayed’s study [62], drivers are categorized in the high-risk cluster when the accident frequency rate is 11.1. However, when the difference between posted speed limits and the operating speed is −20 km/hr, the accident frequency rate is 0.6, and drivers are categorized in the relatively high-risk cluster according to the proposed risk index. In Leur and Sayed’s study [62], drivers are categorized in the relatively high-risk cluster when the accident frequency rate is 12.12. Thus, comparing the results of the present study with those of Leur and Sayed [62] shows that the proposed method categorizes the high-risk and relatively high-risk clusters similarly to Leur and Sayed’s study [62], although the accident frequency rates and risk index values differ.

4.4. Safety Risk Index Model
Based on the findings of the ANOVA and Chi-square tests on the roadway and development factors and their effects on the accident frequency rate, only operating speed and the difference between posted speed limits and the operating speed were employed in the safety risk index model in equation (2). The results of the safety risk index for each cluster are displayed in Table 8. Based on the obtained data, the third cluster, with the lowest risk index, is regarded as the safest cluster, while the sixth cluster is considered an unsafe cluster with the maximum risk index among the six clusters.

Moreover, the Chi-square distribution probability function was used as a probability generator for obtaining the probability of each cluster, the results of which are presented in Table 7. The final risk for the study by Leur and Sayed [70] was obtained according to the values of the accident frequency rate, probability values, and exposure or scores. As shown in Table 8, the findings of the ANOVA test also confirmed that the proposed model has a high power for predicting the risk of the clusters.
4.5. Future Research Works
Future works might consider investigating the effect of geometric factors such as road width, as well as weather and lighting conditions, on accident frequency and on the development of the rural risk index. In addition, data mining and multi-criteria decision-making approaches, including decision tree techniques, fuzzy AHP, and fuzzy COPRAS, could be noteworthy for extending this risk index for drivers on rural roads based on databases and experts’ opinions in the field.
5. Sensitivity Analysis
To examine the reliability of the safety risk index for clusters, a sensitivity analysis was performed between the results of the proposed model and the findings of Leur and Sayed [70], as shown in Figure 7. Based on Figure 7, it is evident that the value of risk index from cluster 1 to 5 is close to the value of risk index in clusters in Leur and Sayed’s study [70] except the sixth cluster. In addition, the maximum risk index of the proposed study is observed in the sixth cluster which is a highrisk one. However, in the sixth cluster of Leur and Sayed [70], the risk index is 0.636 which is lower than the proposed study which makes it different. This difference is due to the use of the difference between posted speed limits and operating speed in the development of the rural risk index in the present study. This discrepancy can include more high risk drivers in the sixth cluster for the proposed study.
6. Conclusions
Roadway and development factors are known to be the most influential parameters contributing to road traffic accidents, so applying them in safety analysis could be instrumental in reducing the accident frequency rate and in preventing growth in the fatality and injury rates on rural roads. This study therefore evaluated the effect of roadway and development factors on accident frequency in order to develop a rural road safety risk index using K-means clustering and a Gaussian model. Based on the ANOVA test and on the clustering and risk analyses, the main findings of the study are summarized as follows.
(1) Based on the results of the ANOVA test, among roadway and development factors, only the operating speed and the difference between posted speed limits and the operating speed had significant effects on the accident frequency rate. Furthermore, the results of the Chi-square test demonstrated that the operating speed, at its maximum chi-square value in the risk index, has a lower effect on the accident frequency rate than the difference between posted speed limits and the operating speed.
(2) Based on the K-means clustering analysis of roadway and development factors with respect to the accident frequency rate, six readily interpretable clusters were identified, describing drivers as high-risk, relatively high-risk, moderately risky, relatively low-risk, low-risk, and not risky (safe). Comparing the clusters by accident frequency rate revealed that the sixth cluster was the high-risk cluster, whereas the third cluster was the safe cluster.
(3) A risk index model based on the Gaussian model was proposed to analyze the behavior of the accident frequency rate across clusters and to obtain the risk value. In the low-risk and safe clusters, the average reducing factor of the accident frequency rate was 0.99 per km/hr increase in the difference between the posted speed limits and the operating speed. In the risky and unsafe clusters, however, the average increasing factor of the accident frequency rate was 1.17. The growth factor in risky and unsafe clusters was therefore about 1.18 times that in low-risk and safe clusters (1.17/0.99 ≈ 1.18).
(4) Comparing the difference between posted speed limits and the operating speed with the probability of accident occurrence shows that, as this difference decreases relative to the safe cluster, the probability of accident occurrence in each cluster increases, followed by an increase in the accident frequency rate. The maximum probability of accident occurrence was observed in the fifth cluster, where it reached 85%.
(5) Sensitivity analysis showed that the proposed safety risk index performs better in predicting the risk values for the clusters than the index of Leur and Sayed [70].
(6) The proposed risk index model is a useful tool for obtaining the safety risk value in studies concerning the accident rate and clustering analysis of drivers on rural roads.
Finally, this study can help safety research organizations, such as governmental institutes and police centers, take the maximum risk value into account when framing their plans and strategies for minimizing accidents.
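The relation between the two factors reported in finding (3) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the baseline rate of 1.0 are hypothetical, and treating each factor as compounding multiplicatively per km/hr is an assumption made here for illustration, not a description of the fitted Gaussian model.

```python
# Factors quoted in the conclusions, per km/hr change in the difference
# between the posted speed limit and the operating speed.
SAFE_FACTOR = 0.99    # low-risk and safe clusters
RISKY_FACTOR = 1.17   # risky and unsafe clusters


def relative_growth(risky: float = RISKY_FACTOR, safe: float = SAFE_FACTOR) -> float:
    """Ratio of the risky-cluster factor to the safe-cluster factor."""
    return risky / safe


def scaled_rate(base_rate: float, factor: float, delta_kmh: float) -> float:
    """Scale a baseline accident frequency rate by a per-km/hr factor.

    ASSUMPTION: the factor compounds multiplicatively for each km/hr of
    change; the source does not state this functional form.
    """
    return base_rate * factor ** delta_kmh


if __name__ == "__main__":
    print(f"risky/safe ratio: {relative_growth():.2f}")
    # Hypothetical baseline rate of 1.0 (arbitrary units), 5 km/hr change.
    print(f"safe clusters:  {scaled_rate(1.0, SAFE_FACTOR, 5):.3f}")
    print(f"risky clusters: {scaled_rate(1.0, RISKY_FACTOR, 5):.3f}")
```

The first printed value reproduces the ratio of about 1.18 stated in the conclusions; the other two only show how quickly the two regimes would diverge under the multiplicative assumption.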
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare no conflicts of interest.
References
 World Health Organization (WHO), “Road traffic injuries,” 2018, http://www.who.int/newsroom/factsheets/detail/roadtrafficinjuries.
 M. Ahadi, M. Hassanpour, P. Bashiri, and P. Bashiri, “Strategies to promote safety to prevent pedestrian accidents in the city of Qazvin,” Safety Promotion and Injury Prevention, vol. 4, no. 3, pp. 143–150, 2017.
 World Health Organization, Violence, Injury Prevention. Global Status Report on Road Safety 2013: Supporting a Decade of Action, World Health Organization, Geneva, Switzerland, 2013.
 M.-I. M. Imprialou, M. Quddus, D. E. Pitfield, and D. Lord, “Revisiting crash-speed relationships: a new perspective in crash modelling,” Accident Analysis & Prevention, vol. 86, pp. 173–185, 2016.
 I. B. Anderson and R. A. Krammes, “Speed reduction as a surrogate for accident experience at horizontal curves on rural two-lane highways,” Transportation Research Record: Journal of the Transportation Research Board, vol. 1701, no. 1, pp. 86–94, 2000.
 C. Caliendo and R. Lamberti, “Relationships between accidents and geometric characteristics for four lanes median separated roads,” in Proceeding of the Road Safety on Three Continents, Moscow, Russia, September 2001.
 P. D. Cenek, R. B. Davies, and R. J. Henderson, “Crash risk relationships for improved road safety management (no. 488),” 2012.
 E. Hauer, “Traffic conflicts and exposure,” Accident Analysis & Prevention, vol. 14, no. 5, pp. 359–364, 1982.
 E. Hauer, “Safety and the choice of degree of curve,” Transportation Research Record, vol. 1665, no. 1, pp. 22–27, 1999.
 M. G. Karlaftis and I. Golias, “Effects of road geometry and traffic volumes on rural roadway accident rates,” Accident Analysis & Prevention, vol. 34, no. 3, pp. 357–365, 2002.
 V. Andjus and M. Maletin, “Speeds of cars on horizontal curves,” Transportation Research Record, vol. 1612, no. 1, pp. 42–47, 1998.
 J. Collins, K. Fitzpatrick, K. M. Bauer, and D. W. Harwood, “Speed variability on rural two-lane highways,” Transportation Research Record, vol. 1658, no. 1, pp. 60–69, 1999.
 F. Tate and S. Turner, “Road geometry and drivers’ speed choice,” Road & Transport Research: A Journal of Australian and New Zealand Research and Practice, vol. 16, no. 4, p. 53, 2007.
 R. Bird and I. Hashim, “Exploring relationship between safety and consistency of geometry and speed on British roads (no. 061509),” 2006.
 X. Wang, T. Fan, M. Chen, B. Deng, B. Wu, and P. Tremont, “Safety modeling of urban arterials in Shanghai, China,” Accident Analysis & Prevention, vol. 83, pp. 57–66, 2015.
 S. Dissanayake and I. Ratnayake, “Identification of factors leading to high severity of crashes in rural areas using ordered probit modeling,” Journal of the Transportation Research Forum, vol. 45, no. 2, pp. 87–101, 2006.
 N. V. Malyshkina, F. L. Mannering, and S. A. Labi, Influence of Speed Limits on Roadway Safety in Indiana, Joint Transportation Research Program, West Lafayette, IN, USA, 2007.
 I. Thomas, “Spatial data aggregation: exploratory analysis of road accidents,” Accident Analysis & Prevention, vol. 28, no. 2, pp. 251–264, 1996.
 G. Koorey, “Road data aggregation and sectioning considerations for crash analysis,” Transportation Research Record, vol. 2103, no. 1, pp. 61–68, 2009.
 J. M. P. Mayora, “Relevant variables for crash-rate prediction on Spain’s two-lane rural roads,” in Proceedings of 82nd Annual Meeting, Transportation Research Board, Washington, DC, USA, January 2003.
 J. Han, M. Kamber, and J. Pei, Data Mining Concepts and Techniques: The Morgan Kaufmann Series in Data Management Systems, Elsevier, Amsterdam, Netherlands, 3rd edition, 2011.
 P. T. Savolainen, F. L. Mannering, D. Lord, and M. A. Quddus, “The statistical analysis of highway crash-injury severities: a review and assessment of methodological alternatives,” Accident Analysis & Prevention, vol. 43, no. 5, pp. 1666–1676, 2011.
 F. L. Mannering, V. Shankar, and C. R. Bhat, “Unobserved heterogeneity and the statistical analysis of highway accident data,” Analytic Methods in Accident Research, vol. 11, pp. 1–16, 2016.
 G. Janani and N. R. Devi, “Road traffic accidents analysis using data mining techniques,” JITA-Journal of Information Technology and Applications, vol. 14, no. 2, 2016.
 C. Lee, F. Saccomanno, and B. Hellinga, “Analysis of crash precursors on instrumented freeways,” Transportation Research Record, vol. 1784, no. 1, pp. 1–8, 2002.
 M. J. Sullman, M. L. Meadows, and K. B. Pajo, “Aberrant driving behaviours amongst New Zealand truck drivers,” Transportation Research Part F: Traffic Psychology and Behaviour, vol. 5, no. 3, pp. 217–232, 2002.
 C. Wang, M. A. Quddus, and S. G. Ison, “The effect of traffic and road characteristics on road safety: a review and future research direction,” Safety Science, vol. 57, pp. 264–275, 2013.
 M. Mohanty and A. Gupta, “Factors affecting road crash modeling,” Journal of Transport Literature, vol. 9, no. 2, pp. 15–19, 2015.
 H. Shirmohammadi, A. S. Najib, and F. Hadadi, “Identification of road critical segments using wavelet theory and multi-criteria decision-making method,” European Transport-Trasporti Europei, vol. 68, no. 2, 2018.
 T. Litman, “Measuring transportation: traffic, mobility and accessibility,” ITE Journal, vol. 73, no. 10, p. 28, 2003.
 J. Withanaarachchi, S. Setunge, and S. Bajwa, “Traffic impact assessment and land use development and decision making,” in Proceedings of International Conference on Disaster Management, pp. 256–273, Kumamoto, Japan, August 2012.
 A. Bako and I. Musa, “Effect of land use on road traffic accidents in urban Zaria area, Nigeria,” BEST: International Journal of Humanities, Arts, Medicine and Sciences (BEST: IJHAMS), vol. 2, no. 1, pp. 35–42, 2014.
 C. Berthod, “Land use planning measures promoting road safety,” in Proceedings of TAC 2016: Efficient Transportation-Managing the Demand-2016 Conference and Exhibition of the Transportation Association of Canada, Transportation Association of Canada (TAC), Ottawa, Ontario, Canada, November 2016.
 S. B. Kusselson, Investigating How Land Use Patterns Affect Traffic Accident Rates Near Frontage Road Cross-Sections: A Case Study on Interstate 610 in Houston, Texas, Oklahoma State University, Stillwater, OK, USA, 2013.
 H. Shirmohammadi, F. Hadadi, and M. Saeedian, “Clustering analysis of drivers based on behavioral characteristics regarding road safety,” International Journal of Civil Engineering, vol. 17, no. 8, pp. 1–14, 2019.
 L. Shen, J. Lu, M. Long, and T. Chen, “Identification of accident blackspots on rural roads using grid clustering and principal component clustering,” Mathematical Problems in Engineering, vol. 4, pp. 1–12, 2019.
 A. S. Alotaibi, “Density-based clustering for road accident data analysis,” International Journal of Advanced and Applied Sciences, vol. 5, no. 8, pp. 113–121, 2018.
 S. K. Barai, “Data mining applications in transportation engineering,” Transport, vol. 18, no. 5, pp. 216–223, 2003.
 P. C. Srividhya, “A comparative analysis of clustering approach for predicting road traffic accident dataset,” International Journal of Advanced Research in Computer and Communication Engineering, vol. 6, no. 6, pp. 468–473, 2017.
 B. Depaire, G. Wets, and K. Vanhoof, “Traffic accident segmentation by means of latent class clustering,” Accident Analysis & Prevention, vol. 40, no. 4, pp. 1257–1266, 2008.
 S. Kumar and D. Toshniwal, “A data mining framework to analyze road accident data,” Journal of Big Data, vol. 2, no. 1, p. 26, 2015.
 J. Ma and K. Kockelman, “Crash frequency and severity modeling using clustered data from Washington state,” in Proceedings of 2006 IEEE Intelligent Transportation Systems Conference, pp. 1621–1626, IEEE, Toronto, Canada, September 2006.
 S. Y. Sohn, “Quality function deployment applied to local traffic accident reduction,” Accident Analysis & Prevention, vol. 31, no. 6, pp. 751–761, 1999.
 T. F. Golob and W. W. Recker, “A method for relating type of crash to traffic flow characteristics on urban freeways,” Transportation Research Part A: Policy and Practice, vol. 38, no. 1, pp. 53–80, 2004.
 S. C. Wong, B. S. Y. Leung, B. P. Loo, W. T. Hung, and H. K. Lo, “A qualitative assessment methodology for road safety policy strategies,” Accident Analysis & Prevention, vol. 36, no. 2, pp. 281–293, 2004.
 P. Sekuła, Z. Vander Laan, K. Farokhi Sadabadi, and M. J. Skibniewski, “Predicting work zone collision probabilities via clustering: application in optimal deployment of highway response teams,” Journal of Advanced Transportation, vol. 1, no. 1529, pp. 1–16, 2018.
 L. Y. Chang and W. C. Chen, “Data mining of tree-based models to analyze freeway accident frequency,” Journal of Safety Research, vol. 36, no. 4, pp. 365–375, 2005.
 S. Kumar and D. Toshniwal, “Analysing road accident data using association rule mining,” in Proceedings of 2015 International Conference on Computing, Communication and Security (ICCCS), pp. 1–6, IEEE, Pamplemousses, Mauritius, December 2015.
 A. Tavakoli Kashani, A. Shariat-Mohaymany, and A. Ranjbari, “A data mining approach to identify key factors of traffic injury severity,” Promet-Traffic&Transportation, vol. 23, no. 1, pp. 11–17, 2011.
 J. Abellán, G. López, and J. De Oña, “Analysis of traffic accident severity using decision rules via decision trees,” Expert Systems with Applications, vol. 40, no. 15, pp. 6047–6054, 2013.
 S. Kumar and D. Toshniwal, “A data mining approach to characterize road accident locations,” Journal of Modern Transportation, vol. 24, no. 1, pp. 62–72, 2016.
 AASHTO, A Policy on Geometric Design of Highways and Streets, AASHTO, Washington, DC, USA, 1984.
 X. Qu, Q. Meng, and S. Li, “Analyses and implications of accidents in Singapore Strait,” Transportation Research Record, vol. 2273, no. 1, pp. 106–111, 2012.
 X. Qu, Y. Yang, Z. Liu, S. Jin, and J. Weng, “Potential crash risks of expressway on-ramps and off-ramps: a case study in Beijing, China,” Safety Science, vol. 70, pp. 58–62, 2014.
 S. Jin, X. Qu, and D. Wang, “Assessment of expressway traffic safety using Gaussian mixture model based on time to collision,” International Journal of Computational Intelligence Systems, vol. 4, no. 6, pp. 1122–1130, 2011.
 Z. Liu, Y. Yan, X. Qu, and Y. Zhang, “Bus stop-skipping scheme with random travel time,” Transportation Research Part C: Emerging Technologies, vol. 35, pp. 46–56, 2013.
 P. N. Tan, M. Steinbach, and V. Kumar, “Cluster analysis: basic concepts and algorithms,” Introduction to Data Mining, vol. 8, pp. 487–568, 2006.
 A. M. Aljofey and K. Alwagih, “Analysis of accident times for highway locations using K-means clustering and decision rules extracted from decision trees,” International Journal of Computer Applications Technology and Research, vol. 7, no. 01, pp. 001–011, 2018.
 W. Budiawan, S. Saptadi, A. Arvianto, and P. Andarani, “Implementation K-means clustering analysis of traffic accident in Semarang city using Weka interface,” International Journal of Science and Engineering Investigations, vol. 7, no. 81, pp. 83–86, 2018.
 L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: an Introduction to Cluster Analysis, John Wiley & Sons, Hoboken, NJ, USA, 2009.
 A. Kassamara, “Determining the optimal number of clusters: 3 must known methods – unsupervised machine learning,” 2015, http://www.sthda.com/english/wiki/determiningthe optimalnubmerofclusters3mustknown methodsunsupervisedmachinelearning.
 U. D. Hasmukhrai, K. V. Ganeshbabu, and P. J. Gundaliya, “Identification of crash risk index for urban road: a case study of Ahmedabad city,” International Journal of Innovative Research in Technology, vol. 2, no. 12, pp. 134–140, 2016.
 M. Ahmadinejad, S. Afandizadeh Zargari, and R. Jalalkamali, “Are deceleration numbers a suitable index for road safety?” Proceedings of the Institution of Civil Engineers-Transport, vol. 171, no. 5, pp. 247–252, 2017.
 W. Haddon, “Advances in the epidemiology of injuries as a basis for public policy,” Public Health Reports, vol. 95, no. 5, p. 411, 1980.
 M. J. Koornstra, “The evolution of road safety and mobility,” IATSS Research, vol. 16, no. 2, pp. 129–148, 1992.
 M. Sarstedt and E. Mooi, Cluster Analysis in: A Concise Guide to Market Research, Springer, Berlin, Germany, 2014.
 R. C. De Amorim, “Constrained clustering with Minkowski weighted k-means,” in Proceedings of 2012 IEEE 13th International Symposium on Computational Intelligence and Informatics (CINTI), pp. 13–17, IEEE, Budapest, Hungary, November 2012.
 B. S. Everitt, Cluster Analysis, Halsted Press, New York, NY, USA, 3rd edition, 1993.
 G. W. Milligan and L. M. Sokol, “A two-stage clustering algorithm with robust recovery characteristics,” Educational and Psychological Measurement, vol. 40, no. 3, pp. 755–759, 1980.
 P. D. Leur and T. Sayed, “Development of a road safety risk index,” Transportation Research Record, vol. 1784, no. 1, pp. 33–42, 2002.
Copyright
Copyright © 2020 Shahriar Afandizadeh and Shahab Hassanpour. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.