Many studies on traffic congestion analysis and management assume that recurrent bottlenecks can be identified accurately. As one of the most widely used traffic detection devices, loop detectors provide reliable multidimensional data for traffic bottleneck identification. Although considerable effort has gone into developing bottleneck identification methods based on loop detector data, existing studies say little about the precise position of bottlenecks or about algorithm efficiency when faced with large amounts of real-time data. This paper aims to improve the quality of bottleneck identification while avoiding an excessive data processing burden. A fusion method of loop detector data with different collection cycles is proposed. It first determines the occurrence and approximate locations of bottlenecks using large cycle data, which are more accurate in determining bottleneck occurrence. Then, the small cycle data are used to determine the accurate location and duration of the bottlenecks. A case study is presented to verify the proposed method, using a large set of 30 s raw loop detector data from a selected urban expressway segment in California. The identification result is also compared with the classical transformed cumulative curves method. The results show that the fusion method is valid for both bottleneck identification and location positioning. We conclude by discussing future improvements and potential applications.

1. Introduction

Accurate traffic bottleneck identification plays an important role in calibrating fundamental diagrams and improving the efficiency of congestion mitigation. Over the past decades, numerous efforts have been made to study traffic bottlenecks. The research subjects range from the characteristics of traffic bottlenecks [1–3] to bottleneck identification [4–6] and treatment [7, 8]. According to the trigger mechanism, researchers divide traffic bottlenecks into two categories: recurrent and nonrecurrent bottlenecks [9–11]. Recurrent bottlenecks, usually caused by normal (stationary) congestion, are relatively fixed in both temporal and spatial dimensions. On urban expressways, they usually occur at locations where the roadway capacity suddenly changes, such as on-ramps, off-ramps, and work zones. Nonrecurrent bottlenecks are usually caused by an unexpected reduction of capacity due to, for example, traffic incidents, weather, or special events. Because the locations and durations of nonrecurrent bottlenecks are random, they are difficult to predict and control; the scope of this paper is limited to recurrent bottlenecks. In addition, both observations and theories have confirmed that traffic states are relatively stationary during peak periods [12]. Thus, many studies on the analysis, operation, control, and management of transportation networks are conducted under the assumption of recurrent bottlenecks [13–17].

To identify traffic bottlenecks, it is necessary to determine their spatial locations and to estimate their durations. Thus, reliable data sources that can represent the spatial-temporal variation of traffic flow are crucial to the accuracy of the identification results. Inductive loop detectors, which count traffic flow for each lane in short consecutive cycles such as 30 s, can provide appropriate data for identifying bottlenecks. They are also among the most widely used traffic flow detection devices worldwide [18–20]. Take the PeMS (Performance Measurement System) project as an example: it is a comprehensive traffic information collection and evaluation system covering all major metropolitan areas of California in the United States, and many of its traffic evaluation reports are generated from loop detector data. Although probe/GPS data have become very popular for identifying short-term traffic conditions (including bottleneck identification) in recent years [21–23], feedback from practice shows that using probe data faces multiple problems. The biggest challenge is sample bias. For agencies, whether traffic departments or companies that can obtain users' GPS data (e.g., map apps), the sampled floating cars cover only a part (or even a very small proportion) of the vehicles on the road. This leads to frequent misinterpretation of bottlenecks [24]. Moreover, the accuracy of bottleneck identification via probe data is also reduced by unstable transmission efficiency [25]. In contrast, loop detectors, which have been used for vehicle detection since the 1970s, can continually capture traffic flow, occupancy rate, and speed from all passing vehicles [26]. Thus, the value of loop detector data remains to be further explored, and bottleneck identification methods based on loop detector data are still needed.

In the literature, two streams of research have made major contributions to bottleneck identification methods based on loop detector data. Their results are known as the TCC (transformed cumulative curves) method [4, 27, 28] and the AIA (automatic identification algorithm) [5, 6, 29]. However, both the TCC method and the AIA can only locate traffic bottlenecks at the positions of the nearest detector stations. When a bottleneck occurs somewhere between two detectors, the reported location is the coordinate of either the upstream or the downstream detector, which introduces errors into the identification results. More effort is needed to locate the exact position where the bottleneck occurs. Meanwhile, the efficiency of the identification method deserves more attention if the method is to be applied in the real world. In the PeMS project, for example, data have been accumulated since 1999, and over 35,000 detectors report data every 30 seconds. This is a huge database, and any operation on it may increase the workload on the system.

In addition, we noticed that researchers usually assume or use a single uniform data collection cycle (e.g., 30 s, 3 min, or 5 min). This has prompted much discussion of the “optimal data collection cycle” problem: small cycles such as 30 s can determine the accurate location of the closest detector but increase the likelihood of misinterpreting small fluctuations in traffic flow as bottlenecks. Qiao reported that there is a wide range of fluctuations in raw data collected at 15 s–30 s intervals. Data aggregation is a proper way to solve this problem and also reduces archiving space by hundreds if not thousands of times [30]. Park proposed a set of statistical methods based on loop detector data to identify the optimal aggregation interval sizes for travel time estimation and forecasting. When the error is taken into account, the results show that the optimal aggregation size is not the smallest cycle [31]. Large cycles such as 5 min, on the other hand, perform better in detecting the occurrence of bottlenecks but make it more difficult to locate the accurate positions where bottlenecks occur. In this case, why not use data based on a large cycle and a small cycle at the same time?

In this paper, we propose a fusion method of loop detector data using different collection cycles so as to achieve higher quality of bottleneck identification. The occurrence of traffic bottlenecks and the positions of the related loop detector stations are first determined using the large cycle data. Then, the accurate coordinates and duration of the bottlenecks are identified based on the small cycle data. This fusion method takes advantage of the different loop detector collection cycles and combines the speed data with the occupancy rates upstream and downstream of traffic bottlenecks. It contributes to the literature by improving the accuracy of positioning while avoiding the data processing burden, since the small cycle data are introduced only when the occurrence of a bottleneck is identified by the large cycle data. Meanwhile, the analysis of the spatial-temporal stability of bottlenecks can help local traffic departments effectively identify traffic bottlenecks and develop mitigation strategies.

The rest of the paper is organized as follows. Section 2 reviews related literature on identifying traffic bottlenecks with loop detector data. Section 3 presents the proposed method, describing in turn how to identify the location of a bottleneck, its impact area, and its duration. Section 4 provides a case study of the fusion method, presenting the bottleneck identification results and related discussion. Finally, Section 5 concludes the paper and provides recommendations for future work.

2. Literature Review

The literature review presents a number of themes relevant to bottleneck identification methods reported in this paper. A critical review of algorithms using loop detector data is provided in order to contextualize the proposed fusion method.

The TCC method, presented by Cassidy and Windover in 1995 [4], uses loop detector data to determine bottleneck locations according to the tendency of cumulative vehicle arrival curves at the detector stations. It is also capable of analyzing the details of flow features. The TCC method has received much attention since it was proposed, and many researchers have used it to study traffic characteristics at bottlenecks [32–34]. In 1996, Lawson et al. [27] proposed an approach that modifies the input-output (or queuing) diagram to measure the travel time of vehicles and the length distribution of a queue. Their method can report the maximum length of the queue and the time when the maximum queue occurs. In 2005, Bertini and Myton [28] used inductive loop detector data from PeMS to analyze freeway bottlenecks in Orange County, California. Using cumulative curves of vehicle count and occupancy rate, they identified ten bottlenecks in time and space over one morning peak period. By clarifying the trigger mechanism of bottlenecks, their work provides a foundation for determining ramp metering rates and addresses the characteristics that give rise to bottlenecks. After decades of development, the TCC method has become relatively mature. However, the cumulative arrival curve is a visual representation of observations collected directly from the roadways, without specific mathematical functions. In practice, it is difficult to apply the TCC method to large amounts of data and produce real-time identification results for traffic bottlenecks.

In contrast, the AIA methods, including the threshold method and the ASDA-FOTO (automatic tracking of moving traffic jams and forecasting of traffic objects) model, have become more popular in recent years. The threshold method determines the location of traffic bottlenecks by comparing the magnitude of speed or occupancy rate with thresholds. Zhang and Levinson [29] proposed an identification method based on occupancy rate. When the minimum occupancy rate is higher than 25%, a traffic bottleneck is forming around the location where the lowest occupancy rate is detected. When the maximum occupancy rate is less than 20%, the traffic is in free flow. Chen et al. [5] proposed an identification algorithm based on speed, which calculates the speed differences between adjacent sections. When a bottleneck occurs, the speed in the bottleneck drops rapidly below the critical speed, causing a relatively large speed difference between the bottleneck and the nonbottleneck area. If these conditions are detected, the bottleneck is identified. Liu and Fei [35] modified Chen’s method to account for possible misjudgment in special traffic conditions, which improves the accuracy of bottleneck identification. The ASDA-FOTO model proposed by Kerner et al. [6] is based on three-phase traffic flow theory. According to this theory, there are three forms of road traffic flow: free flow, synchronized flow, and wide moving jams (referred to here as blocking flow). Therefore, the purpose of traffic bottleneck identification is to distinguish the synchronized flow and the blocking flow in road traffic, which can be achieved by the FOTO model. Afterwards, the ASDA model can be used to predict the spatial and temporal variation of the blocking flow. Overall, the threshold method works by comparing the values of critical indexes (speed or occupancy rate) with their thresholds, so the critical values of the parameters must be determined first. However, the critical values of traffic parameters vary with the environment and traffic conditions, making the calibration of the thresholds cumbersome and difficult. Furthermore, although the ASDA-FOTO model overcomes the difficulty of parameter calibration in the threshold method and achieves high reliability with fewer loop detectors in practice, it still cannot effectively filter the random fluctuation of detection data. Some researchers have also tried to combine the above methods for bottleneck identification. One example is the work of Li [36], who proposed an automatic deformation accumulation curve method by combining the TCC method, the threshold method, and the ASDA-FOTO model. The results show that it performs better in identifying traffic congestion, achieving high accuracy and effectively reducing the impact of random fluctuations of traffic flow. However, when used to analyze the characteristics of traffic flow, this method cannot reveal traffic features in as much detail as the traditional methods.

In addition, there is another stream of researchers who developed indicators relating to bottlenecks to better study traffic flow theory. For example, Treiber et al. in 2000 introduced the notion of “bottleneck strength,” defined by a local drop of capacity, to study different kinds of congested traffic forming using data from several German freeways [37]. Later, Treiber and Kesting analyzed the loop detector data from several hundred traffic jams in 2011. They gave data-driven estimates of the “bottleneck strength” for the quantitative estimates of the severity of the associated bottlenecks to confirm the convective instability in congested traffic flow [38].

This paper is concerned with improving the efficiency of bottleneck identification. The literature provides insights into critical traffic parameters but is less informative about their in-depth use. This paper contributes to the research literature by proposing a fusion method of loop detector data using different collection cycles. It first determines the occurrence and approximate locations of bottlenecks using large cycle data, which are more accurate in determining bottleneck occurrence. Then, it introduces the small cycle data to determine the accurate locations and durations of the bottlenecks.

3. Fusion of Loop Detector Data with Different Collection Cycles

3.1. Identifying Bottleneck Location

The fusion method is based on two postulations. Postulation 1: loop detector data from at least four stations are needed to identify a bottleneck. Postulation 2: traffic flow across the lanes of the same road section is homogeneous, since a bottleneck occurring on a specific lane leads to a capacity drop for all lanes and the congestion eventually extends to the whole section.

In order to identify the occurrence of a bottleneck and its specific location accurately, the fusion method uses two types of loop detector data with different collection cycles, one large and one small. First, data collected with the large cycle (e.g., data updated every 5 minutes) are used to determine the approximate location of the bottleneck by finding the nearest loop detector station according to the characteristics of the speed variation at the bottleneck. When a traffic bottleneck occurs, the travel speed at the bottleneck center decreases rapidly below the critical speed. At the same time, vehicle speed gradually decreases from upstream of the bottleneck toward its center due to the flow chopping effect of the bottleneck. Meanwhile, the speed in the downstream area remains at a relatively high level.
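The large cycle series can be derived from the raw small cycle readings. The following is a minimal sketch of that aggregation step, assuming a plain arithmetic mean over ten 30 s speed readings per 5 min cycle; the function name and the averaging rule are illustrative assumptions, and a flow-weighted mean may be preferable in practice.

```python
from statistics import mean


def aggregate_speeds(small_cycle_speeds, cycles_per_large=10):
    """Average consecutive small-cycle speeds into large-cycle speeds.

    cycles_per_large=10 corresponds to 30 s readings grouped into 5 min
    cycles (an assumption matching the examples in the text).
    A trailing incomplete group is dropped.
    """
    n_full = len(small_cycle_speeds) // cycles_per_large
    return [
        mean(small_cycle_speeds[g * cycles_per_large:(g + 1) * cycles_per_large])
        for g in range(n_full)
    ]


# Twenty 30 s readings become two 5 min values; the 3 leftovers are dropped.
large = aggregate_speeds([60] * 10 + [30] * 10 + [50] * 3)
```

Only the large cycle values need to be scanned continuously; the raw readings of a given large cycle are revisited only if that cycle is flagged.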

Figure 1 shows a typical layout of loop detector stations on an urban expressway with an on-ramp, where represents the abscissa at the place of which the center position of the loop detector station locates. As we can see from Figure 1, there are two lanes on the mainline, and thus two loop detectors are placed at each detection station. Here, we define that the traffic data collected by the station are the average value of that collected by the loop detectors at station based on Postulation 2. Assume that the driving direction is from left to right, the spatial relationship of the loop detector stations in Figure 1 can be expressed as . It can be determined that traffic bottlenecks occur around loop detector station , if the following conditions of equations (1) to (3) are observed simultaneously:where , , , and represent the traffic speed at loop detector station , , , and in a cycle of , respectively; represents the critical speed that is used to determine whether a bottleneck occurs; and represents the set of loop detector stations.
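The speed pattern described above can be sketched as a simple screening rule. The conditions below are an assumed reading of the prose (speed at the candidate station below the critical speed, speed not increasing over the two upstream stations toward it, and the next downstream station at or above the critical speed); they are not a reproduction of equations (1) to (3), and the function and parameter names are hypothetical.

```python
def bottleneck_stations(speeds, v_crit):
    """Return indices of stations that look like an active bottleneck.

    speeds: large-cycle speeds ordered upstream -> downstream.
    Assumed conditions (not the paper's exact equations (1)-(3)):
      - speed at station i is below v_crit,
      - speed does not increase over the two upstream stations toward i,
      - speed at the next downstream station is at or above v_crit.
    Four stations are inspected per candidate, in line with Postulation 1.
    """
    hits = []
    for i in range(2, len(speeds) - 1):
        below_critical = speeds[i] < v_crit
        decreasing_upstream = speeds[i - 2] >= speeds[i - 1] >= speeds[i]
        free_downstream = speeds[i + 1] >= v_crit
        if below_critical and decreasing_upstream and free_downstream:
            hits.append(i)
    return hits


# Speeds in km/h at four consecutive stations; the critical speed is illustrative.
flagged = bottleneck_stations([100, 80, 40, 95], v_crit=60)
```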

Furthermore, the spatial relationship between the bottleneck and its nearest loop detector station can be determined based on speed and occupancy rate. Figure 2 shows the situation when the bottleneck occurs closest to and upstream of station . In this situation, the speed of the vehicle coming out from the bottleneck and detected by station will be lower than the critical speed. Thus, it will report that there is a bottleneck happened at the location , which is also the location of the station . Due to the bottleneck, few cars can drive out, and thus the occupancy rate detected at station will be relatively small.

Figure 3 shows the situation when the bottleneck occurs closest to and downstream of station . At this point, the speed of the vehicle driven out from the bottleneck to its downstream loop detector station will not be lower than the critical speed since the vehicle close to station is out the range of the bottleneck. Meanwhile, the occupancy rate detected at the upstream station will be relatively higher.

Let be the critical occupancy rate. Therefore, the relative position of the bottleneck can be identified as upstream or downstream of its closest loop detector station according to the occupancy rate. Two situations are included as follows:Situation 1. If , the traffic bottleneck is located upstream of loop detector station Situation 2. If , the traffic bottleneck is located downstream of loop detector station
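The two situations can be sketched as a single comparison. The direction of the inequality is inferred from the description of Figures 2 and 3 (low occupancy at the station when the bottleneck is upstream of it, high occupancy when it is downstream); the handling of the boundary case and the threshold value used in the example are assumptions.

```python
def bottleneck_side(occupancy, occ_crit):
    """Classify the bottleneck relative to its nearest detector station.

    occupancy: occupancy rate measured at the nearest station (fraction).
    occ_crit:  critical occupancy rate (fraction).
    Assumption: occupancy below the critical value means few vehicles are
    released past the station, so the bottleneck lies upstream; otherwise
    vehicles are queuing over the station and the bottleneck lies downstream.
    """
    return "upstream" if occupancy < occ_crit else "downstream"


# Illustrative values only; 0.20 is not a calibrated threshold.
side = bottleneck_side(0.08, occ_crit=0.20)
```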

Secondly, data collected with a smaller cycle (e.g., data updated every 30 seconds) are used to determine the specific location of the bottleneck. It is assumed that a large data collection cycle consists of small cycle . When a bottleneck is detected within one of the large collection cycles , it must can be detected in the () small cycle by satisfying equations (1) to (3). Then, the accurate time when the bottleneck occurs can be identified, and also the accurate location can be determined based on equations (4) and (5), expressed as follows:

If ,

If ,where is the actual location of the bottleneck within , is the position coordinate of loop detector station , is the traffic flow in the cycle within when the bottleneck is detected, is the density in the cycle within when detecting the bottleneck, is the traffic flow at the bottleneck location, is the density at the bottleneck detected in the cycle within , and is the speed when the bottleneck spreads upstream or dissipates downstream.

When the bottleneck is located between two loop detectors, the flow and density of the bottleneck in equations (4) and (5) cannot be obtained directly from the loop detectors. One possible solution is to calibrate these values with traffic data from bottlenecks that occurred at loop detector stations. Another is to estimate the flow and density values directly from historical data.
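The speed at which a bottleneck spreads upstream or dissipates downstream is commonly computed with the standard shock wave relation, the ratio of the flow difference to the density difference between two traffic states. The sketch below shows only that relation; it is not a reproduction of equations (4) and (5), and the function name, argument names, and units are assumptions.

```python
def wave_speed(q_bottleneck, k_bottleneck, q_station, k_station):
    """Speed of the interface between two traffic states, in km/h.

    Flows are in veh/h and densities in veh/km. A negative result means
    the interface moves against the driving direction (queue growing
    upstream); a positive result means it moves downstream.
    """
    dk = k_bottleneck - k_station
    if dk == 0:
        raise ValueError("densities are equal; wave speed is undefined")
    return (q_bottleneck - q_station) / dk


# Congested state (1200 veh/h, 80 veh/km) meeting a freer state (1800 veh/h, 30 veh/km).
w = wave_speed(1200, 80, 1800, 30)
```

Multiplying the magnitude of this speed by the time elapsed within the large cycle gives the distance covered by the interface, which is then offset from the station coordinate in the direction given by Situation 1 or Situation 2.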

Since loop detectors can only collect data including flow, speed, and occupancy rate, the density in equations (4) and (5) cannot be obtained directly. But it can be calculated based on the relationship between occupancy rate and density, which is expressed as follows:where is the length of the vehicle and is the width of the loop detector.
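A common form of the occupancy-density relationship divides the occupancy by the sum of the vehicle length and the detector length along the direction of travel. The sketch below uses that textbook form as an assumption about equation (6); the parameter names, the units, and the example lengths are illustrative.

```python
def density_from_occupancy(occupancy, vehicle_length_m, detector_length_m):
    """Estimate density in veh/km per lane from an occupancy fraction.

    Assumes occupancy = density * (vehicle length + detector length),
    with lengths in metres measured along the direction of travel.
    """
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy must be a fraction between 0 and 1")
    effective_length_m = vehicle_length_m + detector_length_m
    if effective_length_m <= 0:
        raise ValueError("effective length must be positive")
    return occupancy * 1000.0 / effective_length_m


# 25% occupancy with a 4 m vehicle and a 1 m detector (illustrative values).
k = density_from_occupancy(0.25, vehicle_length_m=4.0, detector_length_m=1.0)
```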

3.2. Identifying Impact Area and Duration of the Bottleneck

Section 3.1 introduces how to identify the start location of a bottleneck, but its spatial and temporal range also needs to be studied, since vehicles accumulate and the congestion spreads upstream from the start point. Given the method of identifying the start location discussed in Section 3.1, we only need to determine the farthest location upstream of the bottleneck that the congestion reaches in order to obtain the spatial range of the traffic congestion.

Assume that the loop detector station detects a bottleneck occurring within a cycle . When the traffic congestion formed by the bottleneck spreads to another upstream station , the speed at station quickly falls down below the critical speed, whilst the speed at station is still larger than that. This relationship can be expressed as follows:where , , and are the speed at the loop detector stations , , and in cycle , respectively. The upstream location of the bottleneck can be calculated bywhere and are the traffic volumes at station and station from the to the small cycle within the large cycle , respectively; and are the density at stations and from the to the within , respectively which can be calculated by using equation (6); and is the position coordinate of station . It should be noticed that even if the congestion is caused by the bottleneck which is detected by station , it should not spread to the upstream station in the next few cycles of and equation (8) is still true (in this case, ). Herein, the length of traffic congestion caused by the bottleneck can be determined by using equations (4), (5), and (8), which is expressed as

Furthermore, define is the time when the bottleneck is detected and is the time when the bottleneck disappears. and can be determined by examining the truth of equations (1) to (3). The bottleneck duration time can be calculated as

Finally, we also hope to report delay information caused by the bottleneck for better evaluating its impact. Here, assume that all vehicles that are in the middle of a bottleneck between two loop detector stations will drive at the speed that must be smaller than the critical speed. If the road section between two stations are marked as the congestion section in a cycle of , thenwhere represents the set of upstream congested road sections after the bottleneck is detected by the loop detector station and represents the speed detected by the loop detector station during cycle . If , it indicates that the section between station and its upstream station is a congested section.

According to equation (11), the congestion section of the whole process from bottleneck occurrence to bottleneck dissipation can be obtained, and the delay caused by each congestion section can be calculated. The delay caused by the bottleneck in cycle is then defined as , which is expressed aswhere is the delay of road section in cycle . It can be calculated aswhere is the length of road , is the traffic volume of road section in cycle , and is the free flow speed. After the bottleneck occurs, the total delay caused by the bottleneck can be expressed as
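The delay computation can be sketched as follows, assuming that the delay of a congested section is the extra travel time relative to free flow multiplied by the traffic volume, and that a section counts as congested when its detected speed is below the critical speed. This is an assumed reading of the definitions above rather than a reproduction of the numbered equations, and all names and units are hypothetical.

```python
def section_delay(length_km, volume_veh, speed_kmh, free_flow_kmh):
    """Vehicle-hours of delay on one road section in one cycle.

    Assumed form: volume * (L / v - L / v_free), floored at zero.
    """
    if speed_kmh <= 0 or free_flow_kmh <= 0:
        raise ValueError("speeds must be positive")
    extra_time_h = length_km / speed_kmh - length_km / free_flow_kmh
    return volume_veh * max(extra_time_h, 0.0)


def bottleneck_delay(sections, v_crit, free_flow_kmh):
    """Sum the delay over the congested upstream sections in one cycle.

    sections: iterable of (length_km, volume_veh, speed_kmh) tuples.
    Only sections with speed below v_crit are treated as congested.
    """
    return sum(
        section_delay(length, volume, speed, free_flow_kmh)
        for length, volume, speed in sections
        if speed < v_crit
    )


# One congested and one uncongested 1 km section (illustrative values).
cycle_delay = bottleneck_delay(
    [(1.0, 100, 25.0), (1.0, 100, 90.0)], v_crit=60.0, free_flow_kmh=100.0
)
```

Summing the per-cycle values from the cycle in which the bottleneck is detected to the cycle in which it disappears gives the total delay.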

4. Case Study

In this section, we apply the proposed fusion method to a real-world study field. Bottlenecks on an urban expressway are identified using a raw loop detector dataset provided by the PeMS data warehouse (http://pems.dot.ca.gov). Raw loop detector data are reported every 30 s, including vehicle counts, occupancy rates, and speed for all lanes.

We choose the urban expressway segment on northbound State Route 73 passing through Laguna Hills and Laguna Niguel in Orange County, California, for the case study. Within the study area, there are 12 mainline loop detector stations with ID numbers 1208789, 1208942, 1208944, 1210477, 1210696, 1210687, 1210679, 1210661, 1210505, 1210487, 1210474, and 1210465 (for convenience, we rename the station ID as ). Based on our daily observation of real-time traffic conditions on the PeMS map, traffic bottlenecks usually appear during peak periods on weekdays. Thus, the data sample used for method validation was collected from 06:00 to 21:00 on Wednesday, May 30, 2018.

Figures 4–6 show the spatial-temporal speed distributions of the chosen stations. We can see from the figures that there are different congestion degrees of bottlenecks detected by stations , , and . Taking Figure 4 as an example, it can be directly observed that during the time periods of 08:00–11:00 and 13:00–14:00, several bottlenecks occurred around the location of loop detector station . Among them, the bottlenecks around 08:00 and 09:30 are relatively heavy, causing congestion to spread continuously to the upstream expressway. Then, during the time periods of 08:20–09:00 and 13:00–14:00, station also detected some bottlenecks. But their impact areas are relatively small so that congestion is reported by the nearby stations.

Table 1 shows the results of the proposed method, including the occurrence time of bottlenecks, the upstream/downstream of the bottlenecks (represented as O-UP and O-DOWN), bottleneck duration, and delay caused by the bottleneck, under the assumption that the large cycle equals 5 min and the small cycle equals 30 s. As the results show, 21 bottlenecks were detected. Among them, eight bottlenecks are detected at locations downstream of , 12 bottlenecks are detected at locations downstream of , and one bottleneck is detected at a location downstream of . The heaviest congestion happens from 15:50 to 15:55, with a maximum length of 3,190 m located from to . By 15:55, the longest delay reaches 7.27 hours at station . The results in Table 1 are consistent with what we observed in Figures 4–6, which supports the validity of the proposed method in identifying bottlenecks.

Figure 7 shows the cumulative vehicle arrival curve obtained by applying the classical transformed cumulative curves (TCC) method. The bottleneck identification results of the TCC method generally match those of the proposed method. Table 2 reports the calculation time of the two methods for five runs in MATLAB. Each run gives a consistent bottleneck identification result, with a slight difference in calculation time for both methods. In every run, the proposed method takes only about half the time of the TCC method. Although the time difference is small in our case, the saving will be significant at a larger scale. In summary, the identified bottlenecks are valid and of high quality in terms of positioning accuracy and computational efficiency.

5. Conclusion

This paper studies traffic bottleneck identification. Based on a review of the existing bottleneck identification methods, a novel method is proposed that fuses loop detector data with different collection cycles. The algorithm makes full use of the features of the different loop detector data scales. On the one hand, it uses the large cycle data to achieve a coarse positioning while avoiding the error caused by the volatility of the small cycle data. On the other hand, it achieves accurate positioning by using the small cycle data after the occurrence of a bottleneck has been determined. The methods of calculating the spatial range, duration, and delay of the bottleneck are also elaborated. Finally, a case study is used to verify the proposed algorithm.

The purpose of this paper, and the major motivation for proposing the fusion method, is to identify bottlenecks on urban expressways efficiently using loop detector data and to produce more accurate results. However, the accuracy of the identification result depends largely on the quality of the loop detector data. Since a large amount of traffic data can now be obtained in multiple ways, such as GPS and floating cars, the fusion of multisource data is expected to have broader application prospects. Besides, although this paper focuses on bottleneck identification on urban expressways, the proposed method can also be applied to other roads where loop detectors are available.

Data Availability

The data used in this study can be obtained from the PeMS data warehouse, which is available online (http://pems.dot.ca.gov).

Conflicts of Interest

The authors declare no conflicts of interest.


Acknowledgments

This study was supported by the 2016 Xihua University Young Scholar Reserve Talent Project, Sichuan Provincial Department of Education Scientific Program (no. 18ZB0565), 2015 Natural Science Key Foundation of Xihua University (no. Z1520315), and the Open Research Subject of Key Laboratory of Vehicle Measurement, Control and Safety, Xihua University (no. szjj2016-014).