New technologies and traffic data sources provide great potential to extend advanced strategies in freeway safety research. The High Definition Monitoring System (HDMS) data contribute comprehensive and precise individual vehicle information. This paper proposes an innovative Variable Speed Limit (VSL) based approach to manage crash risks by intervening in traffic flow dynamics on freeways using HDMS data. We first conducted an empirical analysis on real-time crash risk estimation using a binary logistic regression model. Then, intensive microscopic simulations based on AIMSUN were carried out to explore the effects of various intervention strategies with respect to a 3-lane freeway stretch in China. Different speed limits with distinct compliance rates under specified traffic conditions have been simulated. By taking into account the trade-off between safety benefits and delay in travel time, the speed limit strategies were optimized under various traffic conditions and the model with gradient feedback produces more satisfactory performance in controlling real-time crash risks. Last, the results were integrated into lane management strategies. This research can provide new ideas and methods to reveal the freeway crash risk evolution and active traffic management.

1. Introduction

There is a growing body of evidence confirming a positive relationship between the road safety benefits and vehicle speed enforcement, especially on freeways. In China, for example, Shanghai and Jiangsu with intensive freeway networks are actively employing intelligent technology systems for coordinating traffic flow and improving road safety. Previous studies have highlighted the higher vehicle speed on freeways associated with increased crash risk and injury severity [1, 2]. Meanwhile, speed variation among vehicles can disturb traffic flow and create more conflict situations [3]. Active traffic management (ATM) has been emerging in recent years aiming to provide traffic control to improve traffic flow and reduce congestion on freeways. Proper traffic control can significantly reduce delays and improve traffic distribution at a bottleneck, especially under congested [4] and work zone conditions [5, 6]. As a key application of ATM, Variable Speed Limit (VSL) systems aim to dynamically regulate freeway speeds based on real-time traffic flow information.

In the last decades, VSL has been intensively investigated in two main directions: traffic enhancement and safety improvement [7]. For instance, Hegyi et al. [8] used the macroscopic traffic flow model METANET with coordinated control of ramp metering and VSL to minimize the total time vehicles spent on the road; the method significantly reduced congestion. Naïve and Empirical Bayes methods have been used to evaluate the effects of a VSL system, and the results indicate that VSL reduces crashes by 4.5% to 8% [9]. However, advisory VSL, unlike mandatory VSL, does not show a significant impact on traffic flow [10]. Especially under low speed limits, some drivers tend to violate the limits in pursuit of their personal benefits. Hence low speed limits may widen the range of flows under homogeneous traffic and contribute to a rise in lane changes [11]. Recent studies have shown that lane-changing maneuvers are a major source of traffic disturbance on a multilane freeway. Therefore, a proper setting is essential in VSL strategies. Instead of being used only before or during periods of high congestion, VSL can be applied during off-peak periods as well [12]. So far, only a limited number of studies have focused on enhancing freeway safety by intervening in traffic flow dynamics based on VSL.

This study aims to apply real-time crash prediction in traffic control management. Previous studies on crash precursors have employed different kinds of traffic data such as loop detectors [1, 2, 13-15], automatic vehicle identification [16, 17], traffic counter data [18], and weather [19, 20] as well as road geometry data [21]. Different data mining and detection methods have been utilized to fully investigate the interrelationship between crash risk and traffic operation data. Abdel-Aty et al. [1] developed a neural network-based classifier to evaluate rear-end crash risk with traffic parameters from five stations; it identified 75% of the crashes, with a false positive rate of 34%. Ahmed et al. [16] exploited automatic vehicle identification (AVI) data, and their model achieved an accuracy of 75.93% and 72.92% for rear-end and all crashes, respectively. As large numbers of false positives might affect drivers’ compliance with the system and reduce its effectiveness, various refining approaches have been employed to optimize the evaluation algorithm. Traffic data with high resolution and multiple sources are needed for a better evaluation model with higher predictive accuracy and robustness. Ahmed et al. [17] enhanced the AVI data with remote traffic microwave sensor data, and their model successfully identified 88.9% of the crashes with a false positive rate of only 6.5%. Kwak et al. [22] found that traffic flow characteristics leading to crashes differ by segment type and traffic flow state.

However, with respect to data type and resolution, the detectors are often limited: traffic data from continuous detectors cannot be collected, or the collected data do not meet the requirements of the models. For instance, in China, detectors are installed far apart on freeways and most segments have not been equipped with detectors or surveillance devices. Regarding methods, generalized linear models can provide direct evidence of the traffic parameters’ impacts on crash risk. Dealing with highly nonlinear relationships between traffic flow and crashes requires more flexible, nonlinear, and computationally intensive models [23]. Recent studies found that nonlinear models are capable of achieving higher crash prediction accuracy with fewer false positives. However, the limitations of the available nonlinear models include the heavy computations needed to reveal deeper connections between the traffic parameters as well as for model calibration. Meanwhile, few studies have thoroughly investigated the application of real-time crash risk prediction. Based on an accurately quantified crash risk evaluation, proper traffic management strategies can be applied and therefore improve road safety. For example, Yu et al. [7] proposed a VSL control algorithm for mountainous freeways and the result indicated a positive outcome in crash risk control. However, most VSL control studies are not safety-oriented, and so the only parameter utilized in the crash risk model is speed. Meanwhile, it is generally hard to obtain accurate speed variation data, and the effect of speed dispersion on traffic safety has not been intensively investigated [3].

New technologies and traffic data sources provide great potential to extend advanced strategies in freeway safety research. For example, the High Definition Monitoring System (HDMS) data contribute comprehensive and precise individual vehicle information, including vehicle type, speed, lane number, and plate number, as well as high quality photos captured by advanced vehicle license plate recognition systems. In China, HDMS have been installed on major freeways for public security management. The major contributions of this paper consist of the following aspects (Figure 1): (1) to employ HDMS data with individual vehicle information to study the crash mechanism on freeways; (2) to develop a Logistic Regression model for real-time crash risk estimation; (3) to evaluate safety benefits of the optimized VSL based on enhanced AIMSUN simulations on a 100 km freeway stretch; (4) to investigate sensitivities of VSL impacts on driver compliance.

2. Crash Risk Model

2.1. Data Preparation

The study area is G15 Freeway in Nantong, Jiangsu Province, China, with a total length of approximately 100 km, from Sutong Bridge to Fuan Toll (as shown in Figure 2).

Data are obtained from the Public Security Traffic Managing System of the Traffic Management Research Institute, Ministry of Public Security. The freeway is a 6-lane one (3 lanes in each direction). The primary dataset includes all crash data and HDMS data from January to October 2016. The extracted HDMS data cover the lane number, direction, vehicle type and speed, recorded time of vehicle passing, etc. The study area includes five pairs of HD cameras.

The raw crash dataset includes 5924 crashes. However, the majority of crashes are not recorded with detailed location or direction information. 88% of the crashes involve multiple vehicles. Among them, 96% of the crashes are recorded with causes of hitting fixed objects such as guardrails and medians, or hitting unfixed objects such as crash barriers. In order to investigate the impact of traffic dynamics on crashes, the traffic status prior to crashes has been examined, as shown in Figure 3. In Figure 3, the raw HDMS data are aggregated at 5-minute intervals. The datasets from April 1 to May 31, 2016, have been utilized to show the relationship between traffic volume and speed. The solid curve “max” shows the outer margin of the data during this period and the dashed curve shows the 85th percentile margin of the data during this period. The dashed curve demonstrates that, under most conditions, the speed/volume distribution is within the curve. The speed/volume data 5-10 minutes prior to each crash have been plotted in Figure 3 as well. The symbol “o” refers to single-vehicle crashes and the symbol “x” refers to multivehicle crashes.

First, the “max” curve shows a trend similar to that of the traditional capacity/speed curve. However, because the traffic state is normal under most conditions, it is difficult to observe the saturated flow state at different speeds. Hence, the “max” curve mainly reflects the non-free-flow state, in which the volume is approaching the maximum capacity. The area beyond the “max” curve reflects the chaotic flow state or congestion state. Second, comparing the states of single-vehicle crashes and multivehicle crashes, we find that single-vehicle crashes are more likely to occur within the 85th percentile curve; i.e., these crashes are likely to occur under free flow conditions. This is consistent with several existing studies [24]. Another finding is that multivehicle crashes form the majority in the area beyond the “max” curve. This indicates that when the traffic state approaches the congestion state, crashes are likely to involve multiple vehicles; in particular, when the speed is below 60 km/h, only multivehicle crashes are detected. Meanwhile, under all conditions, when the speed is below 40 km/h, no single-vehicle crashes have been recorded.

Additionally, as single-vehicle crashes are usually caused by random effects, such as driver distraction and vehicle breakdown, only multivehicle crashes with detailed temporal and spatial information are utilized in this study to investigate the relationship between traffic dynamics and crash risk. The data 5-10 minutes prior to the crashes are utilized to represent the traffic status prior to crashes. This method is commonly used in existing studies [15] as well.

A matched case-control method is utilized to extract the related samples for each sampled crash. A 4:1 control-case ratio is used, as recommended in several previous studies [2]. Each crash case and noncrash case is matched with corresponding traffic data on the same road segment. The four control samples are selected from 14 days before the recorded crash time, 7 days before, 7 days after, and 14 days after. Considering the transferability of the model, the HDMS data 5-10 minutes prior to the samples are aggregated as traffic flow, mean speed, and speed dispersion, which are labeled as Q, V, and DV, respectively.
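The sampling and aggregation steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout `(timestamp, speed_kmh)`, the function names, the use of the population standard deviation as the speed dispersion DV, and the scaling of the 5-minute count to veh/h are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def control_times(crash_time):
    """Return the four matched control timestamps (14 and 7 days before/after)."""
    return [crash_time + timedelta(days=d) for d in (-14, -7, 7, 14)]


def aggregate_window(records, event_time):
    """Aggregate HDMS records observed 5-10 minutes before event_time.

    records: iterable of (timestamp, speed_kmh) tuples for one station
             (hypothetical layout).
    Returns (Q, V, DV), or None when the window holds no vehicles.
    """
    start = event_time - timedelta(minutes=10)
    end = event_time - timedelta(minutes=5)
    speeds = [v for t, v in records if start <= t < end]
    if not speeds:
        return None  # e.g. missing HDMS data: the sample would be filtered out
    q = len(speeds) * 12          # 5-minute count scaled to veh/h (assumption)
    v = mean(speeds)              # mean speed
    dv = pstdev(speeds)           # speed dispersion as std. dev. (assumption)
    return q, v, dv
```

The same `aggregate_window` call would be applied to the crash time and to each of the four control times, so that cases and controls are described by identical variables.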

Some filtering rules are also applied to select the available samples. Due to occasional HDMS system failure, some samples would be matched with invalid HDMS data or missing HDMS data. Noise and outliers are removed from the final dataset. Finally the crash dataset contains 171 samples and the control dataset has 618 non-crash samples. The summary statistics of variables are listed in Table 1.

2.2. Binary Logistic Regression Model

Logistic regression analysis is commonly used to quantify the crash risk in real-time crash analysis. The traffic condition can be divided into two parts, crash cases ($y=1$) and noncrash cases ($y=0$), with respective probabilities $p$ and $1-p$. The probability of a crash occurrence is estimated by

$$p = \frac{\exp\left(\beta_0 + \boldsymbol{\beta}^{T}\mathbf{X}\right)}{1 + \exp\left(\beta_0 + \boldsymbol{\beta}^{T}\mathbf{X}\right)}. \tag{1}$$

The odds of crash occurrence can be calculated as

$$\frac{p}{1-p} = \exp\left(\beta_0 + \boldsymbol{\beta}^{T}\mathbf{X}\right), \tag{2}$$

where $\mathbf{X}$ denotes the input variable series; $\beta_0$ denotes the constant in the logistic model; $\boldsymbol{\beta}$ denotes the coefficients for the independent input variables. Here $\boldsymbol{\beta}$ can be estimated by maximizing the log-likelihood function in

$$LL(\boldsymbol{\beta}) = \sum_{i=1}^{N}\left[y_i \ln p_i + (1-y_i)\ln(1-p_i)\right]. \tag{3}$$

The crash risk model is estimated with the binary logistic regression procedure in SPSS 19. Backward LR (likelihood ratio) variable selection is applied to select the significant parameters in the proposed model. As shown in Table 2, results indicate that flow and speed dispersion are significant variables for crash risk estimation. Larger values of Q and DV indicate a higher crash risk. The AUC (area under the receiver operating characteristic curve, which illustrates the performance of the classifier) value indicates that the logistic regression model can successfully classify most of the crash and noncrash cases.
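To make the model structure concrete, the sketch below evaluates a logistic crash probability, the corresponding log-likelihood, and a rank-based AUC in plain Python. It is only illustrative: the function names are hypothetical, the coefficients passed in are placeholders rather than the Table 2 estimates, and the authors fitted their model in SPSS, not with this code.

```python
import math


def crash_probability(x, beta0, beta):
    """Logistic probability of a crash for predictor values x (e.g. [Q, DV])."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(samples, labels, beta0, beta):
    """Binary log-likelihood: sum of y*ln(p) + (1-y)*ln(1-p) over all samples."""
    ll = 0.0
    for x, y in zip(samples, labels):
        p = crash_probability(x, beta0, beta)
        ll += y * math.log(p) + (1 - y) * math.log(1 - p)
    return ll


def auc(scores, labels):
    """AUC as the share of (crash, noncrash) pairs ranked correctly.

    Ties between a crash score and a noncrash score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the sample size, which is acceptable for a dataset of a few hundred cases and controls such as the one described here.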

In order to compare the predictive ability of the HDMS data with that of other kinds of traffic data, such as loop detector data [22] or Automatic Vehicle Identification (AVI) data [16], several existing studies are listed in Table 3. Table 3 reveals that the HDMS data provide better evidence for crash risk estimation and that the corresponding model has a relatively better prediction accuracy despite the simple form of the model and the discrete distribution of the HDMS devices.

The spatial issue should be addressed for the implementation of VSL. Hence, another two models have been formulated to investigate the spatial effect, a downstream model and an upstream model. As shown in Figure 4, samples with crashes located downstream of the HDMS are the downstream samples and vice versa. Finally the downstream dataset contains 105 crashes and 360 noncrashes, and the upstream dataset contains 68 crashes and 266 noncrashes (the samples located just at the HDMS station are classified into both downstream and upstream samples).

As before, binary logistic regression has been used to estimate the crash risk models. The results are shown in Table 4.

Results indicate that the performance of the crash risk models considering the spatial effects is similar to the performance of the crash risk model for the whole segment. The reason for this is that the crash risk is stable on each segment and the traffic parameters of adjacent locations on the same segment are highly correlated, which has been shown by Fang et al. [25]. Hence, the model for the whole segment is used subsequently to estimate the real-time crash risk.

3. VSL Based on Microsimulation

3.1. Aimsun API

In order to verify the method based on dynamic VSL control of crash risk, a sub-segment of the G15 Freeway segment utilized in Section 2 is employed in the simulation (Segment 1 in Figure 2). The segment extends from “Nantong Toll Interchange” to “Sutong Bridge” and the total length is approximately 6 km.

Aimsun API (Application Programming Interface) can be a helpful platform to evaluate certain traffic management strategies. We can obtain the necessary real-time traffic data (flow, speed, occupancy, etc.) with required aggregation levels or detailed vehicle information. The project is built with Visual C++ 6.0 based on Visual Studio 2005. Using Aimsun API functions, the detectors, VMS, and traffic control plans are modeled and the attributes are defined in our in-depth simulations.

To simulate the real HDMS data, AIMSUN API function is used to gather the real-time vehicle information. The scenarios are tested on the G15 Freeway with a design speed of 120km/h. In order to code the freeway segment in the simulation, Baidu Map GIS data source is utilized to build the freeway network. The drivers are assumed to comply with the speed limit, with a certain compliance rate, when the VSL starts to function on the segment.

The step size is 1 second. The Aimsun software development kit has been utilized to develop a module to extract the parameters for the crash risk evaluation model during the simulation process. The values for Q, V, and DV, with a 5-minute aggregation, are recorded every minute. Thus the crash risk can be estimated every minute and exported as a report for final analysis, together with the average delay of all vehicles over the whole simulation process.

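The real-time crash risk probability ($P_t$) for time period ($t$) is estimated with (4) using the parameters in Table 2:

$$P_t = \frac{\exp\left(\beta_0 + \beta_Q Q_t + \beta_{DV} DV_t\right)}{1 + \exp\left(\beta_0 + \beta_Q Q_t + \beta_{DV} DV_t\right)}. \tag{4}$$

The average delay (AD) for each simulation application can be calculated with

$$AD = \frac{1}{n}\sum_{i=1}^{n}\left(T_i^{r} - T_i^{e}\right), \tag{5}$$

where $T_i^{r}$ denotes the real time vehicle $i$ spent on the freeway, $T_i^{e}$ denotes the expected time vehicle $i$ spent on the freeway, and $n$ denotes the number of vehicles passing by.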
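The per-minute risk evaluation and the average delay can be sketched as below. This is an illustrative reading of the procedure, not the authors' Aimsun module: the input layout `(minute_index, speed_kmh)`, the function names, the use of the standard deviation for DV, and the coefficient arguments (placeholders for the Table 2 values) are assumptions.

```python
import math
from statistics import pstdev


def average_delay(actual_times, expected_times):
    """Mean of (actual - expected) travel time over all vehicles, in seconds."""
    n = len(actual_times)
    return sum(a - e for a, e in zip(actual_times, expected_times)) / n


def rolling_risk(passages, n_minutes, beta0, beta_q, beta_dv, window=5):
    """Crash risk re-evaluated every minute on the preceding `window` minutes.

    passages: list of (minute_index, speed_kmh) for vehicles passing the
              detector (hypothetical layout).
    Returns one value per evaluation minute; None when fewer than two
    vehicles were observed, because no dispersion can be computed.
    """
    risks = []
    for t in range(window, n_minutes + 1):
        speeds = [v for m, v in passages if t - window <= m < t]
        if len(speeds) < 2:
            risks.append(None)
            continue
        q = len(speeds) * 60 / window   # count in the window scaled to veh/h
        dv = pstdev(speeds)             # speed dispersion (assumed std. dev.)
        z = beta0 + beta_q * q + beta_dv * dv
        risks.append(1.0 / (1.0 + math.exp(-z)))
    return risks
```

Because consecutive 5-minute windows overlap by four minutes, the resulting risk series is smoothed, which helps avoid reacting to a single noisy minute.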

3.2. Simulation Calibration
3.2.1. Speed Distribution and Compliance Calibration

The speed distribution and compliance level are calibrated before the simulations. The original speed limit for the G15 Freeway is 120 km/h and the proportion of vehicles with speed above 120 km/h is set as the non-compliance level. Drivers tend to speed on the freeway because it is designed with favorable alignments, especially long straight stretches. The traffic on the freeway mainly comprises private cars and trucks. All the speed data for May 2016 are used to calibrate the parameters. Figure 5 presents the speed distributions for different vehicle types. The calibration includes the speed distribution and the compliance rate. The compliance rates for cars and trucks are 88.37% and 79.05%, respectively. Other parameters such as lane width, lane number, and road type are also calibrated. Part of the final scenario settings are listed in Table 5.

3.2.2. Calibration of Traffic Temporal Distribution

Traffic temporal distribution should also be addressed to validate the simulations.

In existing studies, aggregate statistics such as the GEH statistic by FHWA have been used for validation [7]. In this study, HDMS data have been utilized to calibrate the simulations. One hour of HDMS data, starting at 2016-05-05, 13:50:00, is used to validate the simulation. The distribution diagram of time headway in Figure 6 indicates that the temporal distribution of the real data is similar to that of the simulation data, especially when the time headway exceeds 10 seconds. A significant difference appears in the distribution when the time headway is less than 4 seconds. The reason is that in the real world some drivers may drive aggressively with a relatively short following distance, especially under congestion, or try to change lanes with a relatively small transverse distance, leading to a relatively high proportion of short time headways. In the simulation scenarios, by contrast, the vehicles follow the car-following and lane-changing rules and never exceed the limit values. An additional Pearson correlation test shows that the correlation coefficient of the two curves is 0.75, which suggests moderate fidelity of the simulation.
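A comparison of this kind can be sketched as follows: headways are derived from passage times, binned into relative frequencies, and the two histograms are compared with the Pearson correlation coefficient. The bin width, the number of bins, and the function names are assumptions for illustration; the paper does not state how the two curves were discretized.

```python
import math


def headways(passage_times):
    """Time headways (s) between consecutive vehicles at one detector."""
    ts = sorted(passage_times)
    return [b - a for a, b in zip(ts, ts[1:])]


def histogram(values, bin_width, n_bins):
    """Relative frequency per bin; values beyond the last bin are put into it."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v // bin_width), n_bins - 1)] += 1
    total = len(values)
    return [c / total for c in counts]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

In use, `histogram(headways(real_times), 1.0, 15)` and the same call on the simulated passage times would give the two curves passed to `pearson`.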

3.3. VSL Performance Assessment under Different Flow Conditions

A set of six speed limits has been tested to evaluate the VSL performance under different flow conditions, namely 90, 80, 70, 60, 50, and 40 km/h. The traffic demand ranges from 2,000 to 5,000 veh/h. The Aimsun simulation results depend on the random seeds, reflecting the impact of random factors, and simulations were replicated five times to account for the variability. Each replication has 20 minutes to warm up with the traffic demand 2,000 veh/h and 60 minutes more to simulate the whole process with a different flow.

As the traffic is dynamic, the real traffic flow varies over time, as does the crash risk. Thus the objective is to keep the crash risk within an acceptable limit. In this study, the commonly used 85th percentile index in traffic safety is selected as the crash risk threshold for each replication; i.e., the traffic is evaluated as safe below that threshold. Once the crash risk exceeds the limit, proper strategies should be implemented to minimize the risk. As inappropriate speed limits would decrease the capacity and increase traffic delay, a comprehensive analysis should be made to achieve an optimal cost benefit ratio. Figure 7 shows plots of crash risk (the possibility of crash occurrence) and average delay in relation to speed limit under different flow conditions.
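The thresholding step can be illustrated with a short sketch. It assumes that the 85th percentile is taken over the per-minute risk series of one replication and that linear interpolation between ranks is used; both choices, and the function names, are assumptions, since the text does not specify the percentile convention.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def exceedances(risks, q=85):
    """Return the risk threshold and the indices of minutes that exceed it."""
    threshold = percentile(risks, q)
    return threshold, [i for i, r in enumerate(risks) if r > threshold]
```

Minutes returned by `exceedances` are those in which, under this reading, a control action such as a lower speed limit would be considered.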

Figure 7 indicates that a lower speed limit would achieve lower crash risk; however the benefits vary at different flow levels. The average delay increases more and more as the speed limit is set lower. Hence, the crash risk benefit and average delay loss under different speed limits can be related to the original crash risk and the average delay of the replication without VSL, so that trade-offs can be made as suggested in Table 6. The results above are based on the hypothesis that all drivers would comply with the VSL; i.e., compliance rate equals 100%. However, when the VSL is advisory and also inappropriate, most of the drivers would not comply with it; thus the VSL strategy does not have any significant impact on traffic conditions [10]. Hence, simulations with a lower compliance rate are also examined, the rate being set to 50%. The results are shown in Figure 8. Figure 8 indicates that at flows of 2,000 to 3,000 veh/h and speed limits below 70 km/h, with a compliance rate of 50% the crash risk increases as the speed limit decreases. The compliance rate of 50% is low, with half of the drivers driving at the VSL and the other half driving as they prefer. At low flow levels, drivers would pursue higher driving speeds, so that speed dispersion would increase and make accidents more likely.
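The effect of partial compliance on speed dispersion can be illustrated with a simplified two-group calculation. Assume, purely for illustration, that a share c of drivers travels exactly at the speed limit and the remaining share travels at a single preferred speed; the standard deviation of such a two-point speed distribution is |preferred - VSL| * sqrt(c(1 - c)). The preferred speed of 110 km/h and the function names below are hypothetical and are not taken from the simulations.

```python
import math


def mixture_speed_dispersion(vsl, preferred, compliance):
    """Std. dev. of speed when a share `compliance` drives at `vsl`
    and the remaining drivers at `preferred` (two-point distribution)."""
    return abs(preferred - vsl) * math.sqrt(compliance * (1 - compliance))


def dispersion_table(limits, preferred, compliance):
    """Dispersion for each candidate speed limit, as (limit, std. dev.) pairs."""
    return [(v, mixture_speed_dispersion(v, preferred, compliance)) for v in limits]


# Hypothetical preferred speed of 110 km/h at low flow, 50% compliance.
for limit, dv in dispersion_table([90, 70, 60, 40], 110, 0.5):
    print(limit, dv)
```

Under these assumptions the dispersion grows linearly as the limit is lowered and vanishes at full compliance, which is in line with the explanation that low limits with 50% compliance raise speed dispersion at low flow levels.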

4. Application of VSL Strategies in Simulation

4.1. VSL Control Strategy

The objective of the VSL control strategy is to manage the traffic within an acceptable crash risk level, and feedback is needed to adapt the strategy to the real-time traffic condition. Two kinds of strategies have been implemented in the simulations. The first is implementing and withdrawing the optimal VSL gradually (Strategy A) and the other is implementing and withdrawing the optimal VSL rapidly (Strategy B). Strategy A is described in Figure 9. When the real-time crash risk exceeds the acceptable risk threshold, the corresponding target VSL is set and the VSL value is decreased at a gradient of 5 km/h every minute towards the target VSL. In Strategy B, when the crash risk exceeds the threshold, the VSL is set to the target VSL immediately. The HDMS data are extracted from the HDMS database starting at 2016-05-02, 13:50:00. The corresponding traffic demand and speed distribution in the simulation are calibrated again to fit the real HDMS data. The compliance rates in the simulations are set at 88.37% for cars and 79.05% for trucks, which are the statistical results from Section 3.2. The warm-up period is 15 minutes and the whole simulation time is 75 minutes. The raw data and the two simulations are labeled “raw,” “Strategy A,” and “Strategy B.” The simulations are started with the same random seed.
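The difference between the two strategies can be sketched as a one-minute update rule. The 5 km/h step and the 120 km/h default limit come from the text; applying the same 5 km/h gradient to the withdrawal phase, the function names, and the threshold and target values used in the usage note are assumptions. The sketch is open loop: it takes a given risk series and does not model how the risk responds to the posted limit.

```python
def next_vsl(current, risk, threshold, target, default=120, step=5, gradual=True):
    """One control update per minute.

    Strategy A (gradual=True): move by `step` km/h towards the goal.
    Strategy B (gradual=False): jump to the goal immediately.
    The goal is the target VSL while the risk exceeds the threshold,
    otherwise the default speed limit.
    """
    goal = target if risk > threshold else default
    if not gradual:
        return goal
    if current > goal:
        return max(current - step, goal)
    return min(current + step, goal)


def run_strategy(risks, threshold, target, gradual):
    """Apply the update rule to a per-minute risk series, starting at 120 km/h."""
    vsl, trace = 120, []
    for r in risks:
        vsl = next_vsl(vsl, r, threshold, target, gradual=gradual)
        trace.append(vsl)
    return trace
```

For example, with a hypothetical threshold of 0.2 and target of 110 km/h, the risk series `[0.3, 0.3, 0.1]` yields the limits 115, 110, 115 under the gradual rule and 110, 110, 120 under the immediate rule.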

4.2. Application Results

The simulation results with different strategies are shown in Figure 10. The crash risk curve of the raw HDMS data displays two crash risk peaks after 40 minutes, exceeding the crash risk threshold significantly. Compared with the raw curve, Strategy B decreases the average crash risk by 10.15%, but the risk is still high after the simulation time of 50 minutes. The crash risk curve indicates that there are no significant crash risk peaks under the control of Strategy A. The average crash risk is decreased by 22.63% by Strategy A compared with the raw curve, and the crash risk remains at a low level. Strategy A outperforms Strategy B, in which the speed limit is set to the target speed immediately. Under Strategy B, when drivers pass the speed limit sign, they have to decelerate rapidly to comply with the speed limit. As a result, the speed dispersion increases rapidly and the crash risk increases as well.

Average delays are 56.31 s, 24.87 s, and 91.84 s for Raw, Strategy A, and Strategy B, respectively. Thus Strategy A generates the shortest travel time and can control the traffic condition efficiently and steadily, whereas improper speed limit implementation may lead to unexpected traffic congestion.

5. Discussion

Results of this study demonstrate that the proposed VSL method could improve traffic safety, but more developments are required to produce integrated control strategies that are efficient and also applicable in real time to large-scale networks [26]. Integrated control with ramp metering and VSL has been used to improve traffic flow efficiency based on an optimal VSL rate [27]. Lane management (LM) has been employed in many countries and regions, such as the US, Europe, and China. Lord et al. [28] indicated that truck-free freeways would have a better safety record than mixed traffic and that separating truck traffic from passenger cars improves safety. Toll lanes have been suggested to separate light and heavy vehicles, and the method could reduce total travel costs [29]. In this study, lane management has also been simulated. To separate the cars and the trucks, a solid line rule has been applied to restrict lane-changing behavior on the mainline. Trucks can only access the outer lane while cars can access the other two lanes. Each condition with lane management has been simulated with different VSL values under different traffic conditions. The results are shown in Figure 11.

Figure 11 indicates that, under the traffic conditions of 2,000 veh/h and 3,000 veh/h, lane management has little impact on the crash risk. However, under traffic conditions of 4,000 veh/h or 5,000 veh/h, lane management has significant impacts on ameliorating the crash risk. By implementing lane management, the crash risk could be substantially controlled despite high speed limits.

As shown in Figure 11(b), lane management at lower traffic levels would have no impact on the average delay. Thus, by implementing lane management properly with VSL, traffic management administration would get significant improvement in crash risk without affecting the level of service of the freeway.

6. Conclusions

The study proposes an innovative dynamic variable speed approach through intervening in traffic flow dynamics. A binary logistic regression model based on HDMS data is built to estimate crash risk. HDMS data provide detailed vehicle information instead of aggregated data from loop detectors or other detectors. They provide better evidence on the crash mechanism. Microsimulations have been conducted with the AIMSUN simulation software. AIMSUN API is utilized to extract the detailed real-time vehicle information to calculate the crash risk. Different speed limits with several compliance rates under certain traffic conditions have been simulated. Considering the trade-off between safety benefits and travel time delay, we aim to optimize speed limit strategies under various traffic conditions.

Two kinds of VSL strategies have been applied to control the real-time crash risk in the simulated conditions of real traffic accidents; the strategy of implementing and withdrawing the optimal VSL gradually (gradient control) could provide better control effects and keep the crash risk at a lower level. Furthermore, lane management control has also been assessed. Results indicate that such integrated control could significantly reduce the crash risk without increasing average traffic delay. The trends in optimal integrated traffic control to reduce real-time crash risk prove to be promising.

Several potential directions are open for future exploration. For example, further work is being conducted to study the performance of the strategies when applied to various road types or road networks. In future studies, as more and more surveillance devices and vehicle on-board devices are installed, real-time data such as weather conditions and driving behavior could be obtained. Meanwhile, with a continuous spatial distribution of surveillance devices and detectors, the aggregated traffic control of multiple segments could be investigated to achieve balanced traffic conditions in the road network, as more driver-friendly integrated control strategies are developed to fit the new era of ITS.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

The research is supported by the National Key Research and Development Program (2016YFC0802701), the National Natural Science Foundation (71301119, 71871161, and 51708421), Shanghai Pujiang Program, and the Fundamental Research Funds for the Central Universities. The authors are indebted to Professor Xiaobo Qu for his insightful suggestions to improve this manuscript.