Abstract

The time gap is defined as the time difference between the rear of a vehicle and the front of its follower; it affects both safety and the saturation flow rate of a roadway segment. In this study, naturalistic driving data were examined to measure the time gaps of seven different drivers in a car-following scenario under steady-state conditions. The measurements were taken from a 13-km section of a Dulles Airport access road in Washington, DC. In total, 168,053 time gap samples were obtained covering seven speed intervals. Analysis of the data revealed a large variation in time gaps within individual drivers’ driving data, with coefficients of variation as high as 63.8% observed for some drivers. Results also showed that the variability within drivers was more significant at speeds higher than 54 km/h. In addition, there was a large variability between drivers. At speeds above 108 km/h, minimum time gaps left by some drivers could be 1.6 times as long as those left by others. Several statistical distributions were used to fit the data of the seven drivers as well as the data for all drivers combined for each speed interval. The selected distributions passed the goodness-of-fit (Kolmogorov-Smirnov, Chi-square, and Anderson-Darling) criteria only when the number of samples was reduced. Data reduction was not performed randomly, but rather in a manner intended to preserve the distribution observed when all the samples were used. It is therefore recommended that empirical measures of distributions be used in traffic microsimulation software rather than theoretical distributions selected on the basis of statistical tests. This should lead to more realistic simulations of naturalistic traffic behavior, resulting in more precise predicted measures of performance (travel time, fuel consumption, and gas emissions).

1. Introduction

Time headway between vehicles is an important microscopic traffic flow property. It is used in many areas of traffic engineering, as it is directly related to traffic safety, the level of service of a roadway segment, and the capacity of transportation facilities. In addition, distributions of time headways are essential in many traffic microsimulation software packages since the generation of vehicles using these tools is usually based on these distributions. Two techniques, both of which depend on where the measuring device is installed, are used to take empirical measurements of time headways on roadway infrastructure. In point measuring techniques, a sensor, such as an induction loop, is installed at a certain position on the roadway to record the time when a vehicle crosses its path. The arrival times of successive vehicles are used to calculate time headways. With this procedure, the time headway used by several drivers is estimated at a certain location of the roadway. In the second technique, a distance sensor, such as radar, is installed in a vehicle to measure the distance between the following vehicle and its leader. The measured distance and the speed of the following vehicle are used to compute time headways. With this technique, the following vehicle’s time headway is calculated at any time along the roadway. Even though both techniques measure the time gap between vehicles, the results are rather different. With the first technique, the behavior of several drivers is identified at a certain location of the roadway, while the second technique identifies the behavior of one driver all along the roadway.

Using either of the measuring techniques described above, two different concepts of time headway can be identified: normal time headway (i.e., time headway) and time gap. Time headway is the time between the moment the front (or rear) bumper of a vehicle passes a designated point and the moment the front (or rear) bumper of its follower reaches that same point, which means that the length of the leading (or follower) vehicle is taken into account in computing the time headway. Time gap is measured from the rear of a vehicle to the front of its follower. The point measuring technique calculates time headway directly as the difference in arrival times between successive vehicles, which makes its computation accurate and precise. Calculating the time gap from point measurements, however, is more complex and more prone to accumulating error, as it involves the estimation of the vehicle’s length and speed. A more accurate way to calculate time gap is the floating vehicle technique, where the time headway is accurately derived using a known test vehicle’s length and speed, with the latter measured accurately using the vehicle’s on-board diagnostic (OBD) system.
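The relationship between the two quantities under the floating vehicle technique can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names and the 4.5 m vehicle length are assumptions introduced here, and the headway formula follows the definition used later in the filtering criteria (distance gap plus vehicle length, divided by follower speed).

```python
def time_gap(distance_gap_m, follower_speed_mps):
    """Time gap (s): rear of leader to front of follower, at the follower's speed."""
    if follower_speed_mps <= 0:
        raise ValueError("follower speed must be positive")
    return distance_gap_m / follower_speed_mps


def time_headway(distance_gap_m, vehicle_length_m, follower_speed_mps):
    """Time headway (s): the time gap plus the time needed to cover one vehicle length."""
    if follower_speed_mps <= 0:
        raise ValueError("follower speed must be positive")
    return (distance_gap_m + vehicle_length_m) / follower_speed_mps


# Hypothetical reading: 20 m radar gap, 25 m/s follower speed, 4.5 m car (assumed length).
gap_s = time_gap(20.0, 25.0)
headway_s = time_headway(20.0, 4.5, 25.0)
print(f"time gap = {gap_s:.2f} s, time headway = {headway_s:.2f} s")
```

The difference between the two values is the vehicle length divided by the speed, which is why the time gap isolates the driver-dependent part of the headway.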

Traffic engineering researchers are usually more interested in time headway than time gap for the simple reason that it represents a link between the micro- and macroscopic scales. Indeed, the inverse of the average measured headway at a certain location is nothing but the flow traversing that location. As a result, since the start of traffic modeling in the 1930s, several researchers have measured time headways and fitted mathematical distributions to the collected data. However, time gap is more meaningful to drivers, as time gap is the driver-dependent component of time headway. For instance, in a car-following state, when a driver is constrained from overtaking a leading vehicle, the driver adapts their driving behavior depending on their perceived safety margins.
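The micro-to-macro link mentioned above can be illustrated with a short calculation. The sketch below assumes headways expressed in seconds and returns flow in vehicles per hour; the function name and the sample values are hypothetical.

```python
def flow_from_headways(headways_s):
    """Flow (vehicles/h) as the inverse of the mean time headway (s) at a location."""
    if not headways_s:
        raise ValueError("at least one headway is required")
    mean_headway = sum(headways_s) / len(headways_s)
    return 3600.0 / mean_headway


# Three hypothetical headways with a mean of 1.5 s correspond to 2,400 vehicles/h.
print(flow_from_headways([1.0, 2.0, 1.5]))
```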

The subject of time headway and/or gap distribution continues to be of interest to traffic engineers, mainly because car technologies and human behavior change over time. For example, driving behavior in the 1960s or even in the 1990s was different than it is today. The present study reports time gaps, measured using the floating car technique, adopted by seven different drivers in car-following conditions (spacing less than 100 m) and within steady-state conditions (absolute speed difference between subject and leading vehicle not exceeding 5%, with acceleration/deceleration less than 0.2 m/s²). This study is novel in that the data used were naturalistic, as drivers used the test vehicles during their normal day-to-day driving routine for an entire year; no experimenter was present to dictate their driving behavior. This paper is organized as follows. First, a brief summary of available literature that deals with time headway/time gap distribution is provided. The collected naturalistic data used in the study are then described. Finally, the findings of the study and their interpretations are presented, followed by the conclusions reached at the end of the study.

2. Literature Review

Even though the present study deals with observed time gap distributions in a steady-state car-following scenario, the literature review included both time headway and time gap studies for all traffic conditions, with most existing literature dealing with the former.

In his 1990 publication, May presented a chapter related to time headways and their distributions [1], reporting that the subject of time headways and related mathematical distributions has been studied since the 1930s. He cited the works of many researchers who dealt with this subject during the period from 1930 to 1990 [2–22]. Based on his work and those he cited, May proposed a classification scheme for time headway distributions consisting of a random distribution state for low flow levels, an intermediate distribution state for moderate flow levels, and a constant distribution state for high flow levels. May described the different statistical distributions used for each classification scheme, including negative exponential, shifted negative exponential, normal, lognormal, Pearson type III, gamma, Erlang, and composite (combination of two different statistical distributions), such as the hyperlang.

During the 1990s, research continued in the area of vehicle time headway and time gap statistical analyses. Mei and Bullen collected time headway data on the two southbound lanes of a four-lane freeway near downtown Pittsburgh during morning rush hour [23]. During their measurements, the traffic was smooth with an average traveling speed of about 75 km/h and an average flow rate of about 2,400 vehicles per hour per lane. They concluded that the shifted lognormal distribution with 0.3 or 0.4 s shifts fitted the measured data well. Griffiths and Hunt proposed the double-displaced negative exponential (DDNE) distribution to model vehicle headways in urban areas [24]. Sullivan and Troutbeck suggested the use of the Cowan M3 headway distribution for modeling urban traffic flow [25]. In 1996, Luttinen proposed different scientific procedures to identify and estimate statistical models for time headway data and to test their goodness of fit [26]. He used his proposed procedures for data collected from Finnish two-lane two-way roads. Hoogendoorn and Bovy developed a statistical procedure for estimating composite time headway distributions such as the generalized queueing model (GQM) [27]. Their proposed procedure was applied to traffic data collected on a two-lane rural road in the Netherlands and they claimed that headway distributions could be realistically replicated with the Pearson-III-based mixed-vehicle-type GQM.

In the 2000s, more researchers worked on measuring time headways and time gaps and tried to fit distributions to the collected data. Al-Ghamdi collected time headway data from 20 urban sites (13 freeways and 7 arterials) in Saudi Arabia [28]. He found that negative exponential, shifted exponential, and gamma distributions reasonably fitted the low (less than 400 vehicles/h) and medium (between 400 and 1,200 vehicles/h) states of flow on freeways and the Erlang distribution properly fitted the high traffic flow state (more than 1,200 vehicles/h). He also found that the gamma distribution gave a decent fit for a large range of flows on arterials (around 60–1,200 vehicles/h). Chandra and Kumar used five different sections of uninterrupted flow in six-lane divided urban highways in New Delhi, India, to measure headways [29]. They found the hyperlang distribution to be the best fit for the collected measurements under mixed traffic conditions and within a traffic flow ranging from 900 to 1,600 vehicles/h. Bham and Ancha used four data sets collected using aerial photography over a ramp weave, ramp merge, lane drop, and basic freeway sections to measure time headway and time gaps [30]. They found that the shifted lognormal distribution gave a better fit to measured data for both properties at all observed sites as compared to the shifted gamma distribution. Zhang et al. evaluated existing distribution models on headway observations obtained from regular lanes and high-occupancy-vehicle (HOV) lanes from different periods of the day on interstate I-5 in Seattle, Washington [31]. They found that the DDNE model provided the best fit to their headway data, especially for HOV lanes for different traffic flow levels. Yin et al. used digital cameras to collect time headway data from the busiest expressway in Beijing, China [32]. They concluded that the lognormal distribution was adequate for free-flowing traffic conditions, while the loglogistic distribution was more suitable in congested traffic conditions. Ha sampled time headway measurements from Road RN118 and freeway A6 databases in France [33], finding that the double gamma and gamma-GQM models were best in terms of fitting the collected measurements. Jang et al. collected time headway data from a suburban arterial in South Korea [34]. The modeling revealed that the Johnson model was the best fit for headway data for a flow of 600–840 vehicles/h, while the Johnson model provided the best fit for traffic flows of 900–1,740 vehicles/h. Habtemichael et al. used the gamma-GQM model to fit time headway data collected from the fast lane of freeway A1 in Switzerland under dry weather, light rain, medium rain, and heavy rain intensities [35]. They found that the four parameters of the used model changed depending on the weather condition and concluded that traffic characteristics and driver behavior were more inconsistent under medium rain than with all other intensities. Dubey et al. measured time gap data on the entire width of a major arterial in the city of Chennai, India [36]. They found the composite Weibull/Extreme Value distribution to be the best fit for their data. Pascucci analyzed data recorded by both inductive loops and radar sensors on two-lane two-way rural roads in Northern Italy to study the effects of traffic parameters (flow rate and percentage of heavy vehicles) on headway distributions [37]. He found that the gamma-GQM model was the best fitting model for time headway data, but he concluded that its parameters lacked physical interpretations, as they do not allow a direct comparison of traffic parameters. Moridpour studied trajectory data provided for a section of highway I-80 in Berkeley, California, to evaluate headway distribution models for passenger cars and heavy vehicles [38]. She found that lognormal distribution models with shifts ranging from 0.26 to 0.40 s were suited to heavy vehicle time headways, while lognormal distribution models with shifts ranging from 0.08 to 0.16 s were suited to passenger car time headways.

One of the objectives of the study described in this paper was to evaluate time gap distribution as a function of vehicle speed. Few researchers have evaluated such a relationship. Taieb-Maimon and Shinar performed a field study to evaluate drivers’ actual headways in car-following situations [39]. In their experiment, participants were asked to maintain what they referred to as “minimum safe distance” and “comfortable, normal distance with no intention to pass” behind a lead car whose speed was varied from 50 to 100 km/h. Their results showed that participating drivers adjusted their distance headways in relation to speed, from an average of 9.5 m at 50 km/h to an average of 19 m at 100 km/h, which resulted in an average minimum time headway nearly constant with speed and ranging from 0.64 s to 0.69 s. Dey and Chandra analyzed headway data obtained from simulation runs to develop relationships between desired time gap and speed for five categories of vehicles (car, heavy vehicle, motorized two-wheeler, three-wheeler, and tractor) [40]. They found that the gamma distribution was the best model for the desired space gap during the steady-state car-following situation under mixed traffic conditions. They also reported that the desired time gap decreased as the steady-state vehicle speed increased, with a higher rate of decrease at lower speeds. In addition, they stated that the desired time gap was longer for heavy vehicles than for cars. Brackstone et al. collected data using an instrumented testing vehicle for two groups of drivers: 6 active participants and 123 passive subjects who were observed following the instrumented vehicle during data collection runs [41]. They revealed a large degree of variation in time headway between or within subjects. When they averaged the results across participants, they found that average headway decreased with an increase in speed up to a speed of 15 m/s, after which it remained constant. Zou collected a dataset from loop detector data on interstate I-35 in Austin, Texas, in order to construct bivariate distributions describing the characteristics of speed and headway [42]. He showed that there was a weak dependence between speed and headway and the correlation could vary depending on the traffic condition, with the strongest dependence achieved under congested traffic states.

3. Materials and Methods: Collected Naturalistic Data

Recent technological developments in data acquisition and storage have made the collection of naturalistic driving data feasible. In 2002, the Virginia Tech Transportation Institute (VTTI) initiated a study where 100 cars were instrumented and driven by designated drivers around the Washington, DC area. The completed study produced a database containing 207,000 trips completed by 108 drivers. This translates into 337,000 hours of data and approximately 12 billion database observations (the sampling period was equal to one-tenth of a second).

For the purpose of this study, only the data set related to 1,180 car-following events spanning around 10 hours was used. The car-following data were collected on an approximately 13-km long segment of a Dulles Airport access road. A single section was chosen for this study to maintain facility homogeneity in terms of free-flow speed, speed-at-capacity, saturation flow rate, and jam density. The evaluated data were collected from seven different drivers. For each event, the naturalistic data used in this study included the instantaneous speed of the following vehicle, the instantaneous speed of the leading vehicle, and the gap distance between the two vehicles. The instantaneous speed of and gap distance from the leading vehicle were measured using a radar system installed in the instrumented following vehicle. The instantaneous speed of the following vehicle was measured using its OBD system.

The data were then filtered to ensure that only steady-state car-following events were analyzed. This was done by extracting from the raw data only the events meeting the four criteria presented below. Note that the thresholds used with these criteria were based on the authors’ experience with car-following/steady-state conditions.
(1) Absolute value of the speed difference between the leader and follower vehicle < 5%
(2) Acceleration and/or deceleration during the considered event < 0.2 m/s²
(3) Headway (calculated as the sum of the measured distance gap and the instrumented car length, the whole divided by the follower speed) < 4 s
(4) Distance spacing < 100 m
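The four criteria above can be sketched as a per-sample predicate. This is a hedged illustration rather than the authors' filtering code. The `Sample` record, the 4.5 m car length, taking the 5% speed difference relative to the follower speed, and treating "distance spacing" as the radar distance gap are all assumptions made here. The paper applies the criteria to events, whereas this sketch tests one 0.1 s sample at a time.

```python
from dataclasses import dataclass

CAR_LENGTH_M = 4.5  # assumed instrumented-car length; not given in the text


@dataclass
class Sample:
    follower_speed: float  # m/s, from the follower's OBD system
    leader_speed: float    # m/s, from the radar
    gap: float             # m, radar distance gap to the leader
    accel: float           # m/s^2, follower acceleration (negative when decelerating)


def is_steady_state(s, car_length=CAR_LENGTH_M):
    """Return True if a sample meets the four steady-state car-following criteria."""
    if s.follower_speed <= 0:
        return False
    # (1) relative speed difference below 5% (relative to follower speed: assumption)
    speed_ok = abs(s.leader_speed - s.follower_speed) / s.follower_speed < 0.05
    # (2) acceleration or deceleration magnitude below 0.2 m/s^2
    accel_ok = abs(s.accel) < 0.2
    # (3) headway = (distance gap + car length) / follower speed, below 4 s
    headway_ok = (s.gap + car_length) / s.follower_speed < 4.0
    # (4) distance spacing below 100 m (interpreted here as the radar gap: assumption)
    spacing_ok = s.gap < 100.0
    return speed_ok and accel_ok and headway_ok and spacing_ok


samples = [
    Sample(27.0, 26.5, 20.0, 0.1),   # passes all four criteria
    Sample(27.0, 24.0, 20.0, 0.1),   # speed difference of about 11%
    Sample(10.0, 10.0, 60.0, 0.0),   # headway of about 6.5 s
]
kept = [s for s in samples if is_steady_state(s)]
print(len(kept))
```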

Events were then filtered according to the follower speed. Seven speed intervals were chosen in steps of 5 m/s (18 km/h). The considered speed intervals in km/h were 0–18, 18–36, 36–54, 54–72, 72–90, 90–108, and higher than 108. The 5 m/s step was chosen for convenience rather than any physical reason. Figure 1 shows a filtered event for one of the drivers. In this event, where the follower car is travelling at a speed (98–102 km/h) a little higher than the leader (95–98 km/h), the time gap decreased from 0.82 to 0.67 s. All the time gap data points for this event and driver would therefore be saved as individual time gap measurements for the speed interval 90–108 km/h.
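The grouping of time gap samples by driver and speed interval can be sketched as below. The interval bounds are inferred from the stated 18 km/h step and the open-ended top interval above 108 km/h; treating each interval as lower-inclusive, and the record layout `(driver_id, follower_speed_kmh, time_gap_s)`, are assumptions of this sketch.

```python
from collections import defaultdict

STEP_KMH = 18.0    # 5 m/s
N_INTERVALS = 7    # six 18 km/h intervals plus one open-ended interval


def interval_index(speed_kmh):
    """Index 0..6 of the speed interval containing speed_kmh (lower bound inclusive)."""
    if speed_kmh < 0:
        raise ValueError("speed must be non-negative")
    return min(int(speed_kmh // STEP_KMH), N_INTERVALS - 1)


def interval_label(index):
    """Human-readable label for an interval index."""
    low = int(index * STEP_KMH)
    if index == N_INTERVALS - 1:
        return f">= {low} km/h"
    return f"{low}-{low + int(STEP_KMH)} km/h"


def group_time_gaps(records):
    """Collect time gaps into lists keyed by (driver_id, interval index)."""
    groups = defaultdict(list)
    for driver_id, speed_kmh, gap_s in records:
        groups[(driver_id, interval_index(speed_kmh))].append(gap_s)
    return groups


# Hypothetical records resembling the event described above (follower at 98-102 km/h).
records = [("A", 98.0, 0.82), ("A", 100.0, 0.75), ("A", 102.0, 0.67), ("A", 112.0, 0.70)]
for (driver, idx), gaps in sorted(group_time_gaps(records).items()):
    print(driver, interval_label(idx), gaps)
```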

Once all the events were filtered, time gap data per driver and per speed interval were obtained.

Table 1 shows the number of collected samples. In total, 168,053 time gap samples were obtained, and most of the data belong to the speed interval . A commercially available software program was then used for each dataset (per driver and per class of speed) to find the descriptive statistics as well as the best statistical distributions fitting the dataset for each driver. The data were then analyzed per speed interval grouping all the drivers. This was done in spite of the fact that the number of samples per speed interval differed from one driver to another. For example, in the speed interval more than 50% of the total data belongs to Driver 363 and therefore their behavior would have had the greatest effect on the results for this speed interval.

4. Results and Discussion

The collected time gap data were analyzed using descriptive statistics per driver and for all drivers combined. In addition, statistical distributions were used to fit the data.

4.1. Descriptive Statistics per Driver

Driver behavior is the most influential and the most difficult factor to model in traffic engineering because driving decisions mainly depend on the driver’s temperament and attitude toward risk. Even the same driver’s temperament can change from day to day or even hour to hour. This is illustrated by the large amount of variation both within and between drivers in the time gap data. For instance, Table 2 presents the mean and coefficient of variation (COV) calculated for all drivers and all speed intervals. The COV is defined as the standard deviation divided by the mean and is expressed as a percentage. COVs as high as 63.8% were calculated for Driver 462, demonstrating their inconsistency while driving. This driver, while driving at speeds ≥ 108 km/h, left a time gap as low as 0.48 s in some instances and as high as 2.99 s in others. Table 2 also shows that the variability of most drivers was more significant at speeds higher than 54 km/h. At these higher speeds, most drivers had a mean time gap greater than the median, indicating a right-skewed distribution with a larger concentration of shorter time gaps.
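The descriptive statistics discussed above can be reproduced for any driver and speed interval with a few lines. The sketch below is illustrative only: the sample values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which estimator the commercial software applies.

```python
import statistics


def describe(time_gaps_s):
    """Mean, median, COV (%), minimum, and maximum of a list of time gaps in seconds."""
    mean = statistics.fmean(time_gaps_s)
    std = statistics.stdev(time_gaps_s)  # sample standard deviation (assumption)
    return {
        "mean": mean,
        "median": statistics.median(time_gaps_s),
        "cov_pct": 100.0 * std / mean,
        "min": min(time_gaps_s),
        "max": max(time_gaps_s),
    }


# Hypothetical right-skewed sample: many short gaps and one long gap.
stats = describe([0.5, 0.6, 0.7, 0.8, 3.0])
print(stats)
# A mean above the median signals the concentration of shorter time gaps noted in the text.
print(stats["mean"] > stats["median"])
```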

The between-drivers’ variability is also clearly noticeable in Table 2 (looking at the mean values for different drivers) and Figure 2. In this figure, the time gap box and whisker plots for Driver 358 (Figure 2(a)) and Driver 462 (Figure 2(b)) are shown. For all speed intervals, the median for Driver 358 is above 1.5 s, whereas Driver 462’s median is below 1.5 s. This indicates that the latter driver drove more aggressively than the former. At speeds above 108 km/h, the minimum time gap left by Driver 358 was 0.76 s, which is 1.6 times that left by Driver 462.

4.2. Descriptive Statistics for Combined Drivers

The data for all seven drivers were combined for each speed interval in order to capture the general behavior of different drivers. Figure 3 shows the box and whisker plot for all drivers. From this figure, it appears that, in general, drivers leave longer time gaps at lower speeds than at higher speeds. This is observed by looking at the variation in the median across the speed intervals. For instance, at a speed interval of , the median is about 2.5 s but drops to 1.2 s at a speed interval of . This indicates that, in general, the driver leaves a longer distance gap at higher speeds; however, since the speed is higher, the time gap is reduced.

4.3. Statistical Distributions

Commercially available software was used to find the distributions that best fit the data per driver and for all drivers combined. The software uses the maximum likelihood method [43] to estimate the parameters of the statistical models. The software also uses three goodness-of-fit methods, namely, the Kolmogorov-Smirnov (K-S), the Chi-square (χ²), and the Anderson-Darling (A-D) methods [43]. None of these methods is regarded as the most reliable, since they are based on different approaches. For more details about these methods and their test statistics, refer to [43]. Several known distributions (about 50) were tested using the software and a ranking was established according to the goodness-of-fit method used. Figure 4 shows the results of the best fitting, as obtained using the K-S method, for five different drivers and for all drivers combined for the speed interval . As mentioned earlier, since each driver’s behavior was different, different types of distributions were found more suitable for different drivers. For example, as Figure 4 shows, Johnson SB, Dagum, log-Pearson 3, Burr, General Extreme Value, and Pearson 5 were found to be the best statistical distributions to fit the time gap data for speed interval for Driver 304, Driver 316, Driver 350, Driver 358, Driver 363, and for all drivers, respectively. The term “best” describing a distribution fitting was based on the goodness-of-fit method. In fact, in most cases, what is described as a “best” fit using one method could be a badly ranked fit using other methods. This can be seen in Table 3 where the 10 best ranked distributions, using the K-S test, fitting the data for “all drivers” and the speed interval were found to be differently ranked by other methods. For instance, the “Wakeby” distribution is ranked second using the K-S test, while it is not even ranked using the χ² test and is ranked 45th using the A-D test.
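The fit-then-rank procedure described above can be sketched with two candidate distributions. This is not the commercial software's implementation, which is unnamed and tests about 50 distributions. The lognormal and exponential candidates are chosen here only because their maximum likelihood estimates have closed forms, and the synthetic data stand in for measured time gaps.

```python
import math
import random


def lognormal_mle(xs):
    """Closed-form maximum likelihood estimates (mu, sigma) of a lognormal distribution."""
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def lognormal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def exponential_cdf(x, mean):
    # The MLE of the exponential mean is the sample mean.
    return 1.0 - math.exp(-x / mean)


def ks_statistic(xs, cdf):
    """Largest distance between the empirical CDF of xs and a candidate CDF."""
    ordered = sorted(xs)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def rank_by_ks(xs):
    """Fit each candidate by maximum likelihood and rank by K-S statistic (smallest first)."""
    mu, sigma = lognormal_mle(xs)
    mean = sum(xs) / len(xs)
    candidates = {
        "lognormal": lambda x: lognormal_cdf(x, mu, sigma),
        "exponential": lambda x: exponential_cdf(x, mean),
    }
    return sorted((ks_statistic(xs, cdf), name) for name, cdf in candidates.items())


# Synthetic positive "time gaps" (hypothetical data, fixed seed for repeatability).
rng = random.Random(0)
gaps = [rng.lognormvariate(0.2, 0.4) for _ in range(2000)]
ranking = rank_by_ks(gaps)
for d_value, name in ranking:
    print(f"{name}: D = {d_value:.4f}")
```

Ranking by a different statistic, such as Anderson-Darling, weights the tails more heavily and can reorder the candidates, which is consistent with the rank disagreements reported in Table 3.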

Other types of distributions were found for other speed intervals.

For instance, Table 4 shows the best distributions for all speed intervals for all drivers when using the K-S test. Equations for the probability density functions (pdf) as well as the values of the different parameters used with these distributions are also presented in Table 4.

When working with the data, it was observed that the number of samples might affect the decision as to whether or not to reject a given distribution. Other research efforts presented in the Literature Review section of this paper had a very small number of samples compared to this study. The theoretical distributions found in this study did not pass any goodness-of-fit criteria when all the samples were used. For that reason, a MATLAB reduction algorithm was developed to reduce the number of samples while keeping the initially observed distribution. This means that if x% of the data was found in bin y, the reduction algorithm would keep the same proportion x for the same bin y but using only a percentage of the actual data. As an example, the results of the rejection decision for the specific data (all drivers, speed interval ) are shown in Table 5. The number of samples was reduced to keep only 50%, 25%, 10%, 5%, and 1% of the data while maintaining the distribution observed with 100% of the data points. Starting at 10% of the actual data (8,861 points), the decision to reject at the 0.01 significance level changed to “No” using the K-S method. Results for the other methods are also presented in this table. These results suggest that it is better to use empirically observed time gap distributions in traffic simulation software than theoretical distributions selected on the basis of a statistical test. Storing all the measurements may be impractical, but an empirical distribution may be more accurate than an assumed theoretical one.
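A proportion-preserving reduction of the kind described above can be sketched as follows. The original MATLAB algorithm is not given in the text, so the equal-width binning, the number of bins, and the evenly strided selection within each bin are assumptions of this sketch. The second function gives the standard large-sample K-S critical value for a fully specified distribution, which shrinks as the sample size grows. It is only an approximation when the parameters are estimated from the same data.

```python
import math
import random
import statistics


def reduce_keep_proportions(samples, fraction, n_bins=50):
    """Keep roughly `fraction` of the samples in every bin, without random selection."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    low, high = min(samples), max(samples)
    width = (high - low) / n_bins or 1.0  # guard against identical samples
    bins = [[] for _ in range(n_bins)]
    for x in samples:
        bins[min(int((x - low) / width), n_bins - 1)].append(x)
    reduced = []
    for members in bins:
        keep = round(len(members) * fraction)
        if keep == 0:
            continue
        members.sort()
        stride = len(members) / keep
        # Evenly spaced picks within the bin preserve its internal shape.
        reduced.extend(members[int(i * stride)] for i in range(keep))
    return reduced


def ks_critical_value(n, alpha=0.01):
    """Asymptotic one-sample K-S critical value for n samples at significance alpha."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)


# Synthetic stand-in for measured time gaps (hypothetical data, fixed seed).
rng = random.Random(1)
full = [rng.lognormvariate(0.2, 0.4) for _ in range(10_000)]
reduced = reduce_keep_proportions(full, 0.10)
print(len(full), len(reduced))
print(round(statistics.median(full), 3), round(statistics.median(reduced), 3))
print(round(ks_critical_value(len(full)), 4), round(ks_critical_value(len(reduced)), 4))
```

Because the reduction leaves the shape of the data essentially unchanged, the K-S distance to a fitted distribution stays about the same while the critical value grows by a factor of about √10 for a tenfold reduction. This is one way to see why the rejection decision can change as the number of samples decreases.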

5. Conclusions and Recommendations for Future Research

Time gap is an important property in traffic engineering since it is directly related to safety. For that reason, time gaps were measured, using naturalistic driving data, for seven different drivers in a car-following scenario within steady-state conditions. Because the data were naturalistic, they provided a direct measure of real driver behavior on a typical US divided urban highway. Research findings led to the following conclusions:
(1) Time gap variability within drivers was found for each speed interval
(2) Time gap variability between drivers was found for each speed interval
(3) Time gap seems to decrease with an increase in vehicle speed
(4) Several statistical distributions could be used to fit time gap data
(5) The number of samples was found to affect the decision on whether or not to reject a statistical distribution at a certain level of significance
(6) Mathematically defined statistical distributions work well as theoretical tools, but empirically observed distributions are preferable for use in traffic microsimulation software
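As an illustration of conclusion (6), an empirically observed distribution can be used in a simulation without storing every measurement by keeping only a table of quantiles and sampling from it by inverse transform. The sketch below is one possible approach under stated assumptions, not a method taken from the study: the function names, the table size of 101 points, and the linear interpolation between quantiles are all choices made here.

```python
import random


def build_quantile_table(samples, n_points=101):
    """Compress observed time gaps into n_points evenly spaced empirical quantiles."""
    if n_points < 2 or not samples:
        raise ValueError("need at least one sample and two table points")
    ordered = sorted(samples)
    last = len(ordered) - 1
    return [ordered[round(i * last / (n_points - 1))] for i in range(n_points)]


def draw_time_gap(table, rng):
    """Draw one time gap by inverse-transform sampling with linear interpolation."""
    u = rng.random() * (len(table) - 1)
    i = min(int(u), len(table) - 2)
    return table[i] + (u - i) * (table[i + 1] - table[i])


# Hypothetical observed time gaps between 0.5 s and 3.0 s.
observed = [0.5 + 0.01 * k for k in range(251)]
table = build_quantile_table(observed)
rng = random.Random(42)
draws = [draw_time_gap(table, rng) for _ in range(1000)]
print(len(table), round(min(draws), 2), round(max(draws), 2))
```

The table size trades storage against fidelity in the tails. A finer table, or the full sorted sample, reproduces extreme time gaps more faithfully.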

The analysis described in this paper is based on the results of seven different drivers in the Washington, DC area. It is therefore recommended that more studies, similar to the one described in this paper, be performed to include more drivers and roadway facilities (freeway, arterials, collectors, etc.). The results of such studies could be aggregated by drivers’ age, climatic conditions, and other variables. Since human behavior differs from region to region, it is also recommended that the study described in this paper be performed on a regional basis. Although the data were gathered in 2002, we believe the results presented are still reflective of current-day drivers given that the saturation flow rates of roadways have remained fairly constant over the past 30 years. Furthermore, the adaptive cruise control (ACC) systems that were introduced and tested in the mid to late 1990s had a time gap setting of 1 s [44], which is consistent with current-day ACCs.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to thank ENIT students Mohamed Houssem Harizi and Mohamed Wesleti for their efforts in analyzing some of the data presented in this paper. The authors acknowledge the financial support provided by the Mid-Atlantic Transportation Sustainability University Transportation Center (MATS-UTC) and the University Mobility and Equity Centre (UMEC) in funding this research effort.