Abstract

Accessibility plays a crucial role in evaluating and optimizing the service quality of public transport systems. Traditionally, bus accessibility studies have taken an aggregate perspective and analyzed the service radius of a station, considering only the front end of a trip. This paper models bus accessibility at the microlevel and identifies bottleneck stations using the opportunity method with AVL (Automatic Vehicle Location) data. The research has three defining elements: (1) bus accessibility is redefined with the bus station as the research object; (2) the running time between interstation sections is selected as the control index; and (3) a new measure index of accessibility, the farthest direct reachable distance, is proposed. Moreover, by introducing a quality control algorithm, the study establishes upper and lower limits of accessibility based on the probability index of bilateral specifications, which can be used to identify bottleneck stations at a given significance level. Harbin City is selected as the case study for verifying the methodology, and the results show that the accessibility method is effective and reliable. Additionally, by taking regional travel demand as the adjustment parameter of the quality control algorithm, the study derives the ideal accessibility interval of stations and accurately locates the bottleneck stations. The study adds to the understanding of accessibility in the construction and management of bus systems and provides a reference for subsequent optimization.

1. Introduction

As a city develops, its spatial structure may shift from a stable state to one of continual change, and its layout may become multicentered [1, 2]. Expansion into marginal areas, however, produces a marked imbalance between employment and housing and creates greater travel obstacles, including larger space-time barriers [3]. Accessibility is one of the most important means of overcoming constraints of time and space. Limited accessibility affects the balance of development between central and other areas [4] and may even induce serious social problems [5]. Balanced public transport accessibility therefore plays a significant role in promoting social equity [6]. Hansen first proposed the concept of accessibility in 1959, defining it as the degree and difficulty of interaction between nodes in a network [7]. Similarly, bus accessibility can be defined as the difficulty for travelers of getting from one part of the city to another by bus [8]. Evaluating bus accessibility makes it possible to consider the interaction between land use and bus service comprehensively, so that bus service can guide urban spatial structure and functional layout. Additionally, bus accessibility evaluation can guide urban construction led by public transit, especially high-capacity rail transit systems. Planners can determine the type of land use and the degree of intensification from the distribution of bus accessibility, thereby balancing travel demand and reducing the pressure on the transportation system.

The evaluation of bus accessibility is varied and complex and involves many elements. The bus network is one of the modeling bases for measuring bus accessibility. The main way to build the topology of a bus network is to treat bus stations as nodes and the route sections between two stations as edges. Building on related theories, many scholars have studied the performance of bus networks, which provides a good starting point for accessibility evaluation. For example, two parameters, line function loss and connectivity, are often used to measure the traffic function and connectivity of subway lines [9]. Some studies split a complex network with multiple weights into several networks with single weights, investigated the global synchronization problem of the new multiweight network model, and gave simulation results of network equilibrium [10]. Wei et al. constructed a new type of bus route network and studied the spatial characteristics of urban bus routes using community detection [11]. Additionally, with progress in graph theory and network theory, many novel networks, such as weighted space-L networks [12], directed space-L networks [13], and a hybrid of random and scale-free networks [14], have been used to model bus networks.

Although the indexes and targets used to evaluate accessibility in the existing literature are diverse and complicated, the models can generally be divided into four types: the Space Separation Model, the Gravity Model, the Cumulative Opportunities Model, and the Utility Measure Model. Within this general framework, the concrete evaluation methods for public transport accessibility mainly include TTSAT (time-based transit service area tool) [15], LUPTAI (land use and public transport accessibility index) [16], GTFS (General Transit Feed Specification) [17], and PTAL (public transport accessibility level) [18]. For example, Ding et al. employed principal component analysis to measure accessibility based on the gravity model [19]. Hyun et al. calculated accessibility at three different stages based on time distance and demonstrated its spatial transfer effect [20]. Mavoa et al. measured bus service through a combined bus and walking accessibility index and transit frequency [21]. Furthermore, accessibility-based location selection for urban POIs (Points of Interest), such as medical facilities [22], educational facilities [23], and leisure facilities [24], has become a research hotspot in recent years. For instance, Oviedo et al. used multisource data to study the impact of accessibility on employment, especially its contribution to relatively poor groups [25]. As research continues, the spatial-temporal dynamics of travel have attracted increasing interest. Using an integrated commuter mode, Owen et al. focused on accessibility in continuous time and estimated a binomial logistic model of bus commuting [26]. Guida et al. studied accessibility and equity by analyzing the travel behavior of the ageing population [27]. Järv et al. presented a conceptual framework for location-based dynamic accessibility modeling that captures the temporal correlation of people, traffic, and social activity locations [28]. Big data sources offer new possibilities for urban mobility and accessibility studies; in particular, mobile cellular data from cell phones simplify the study of group travel patterns and traffic geography. Garcia et al. used the Google API to construct OD matrices and analyzed the independent impact of dynamic accessibility and its associated components [29]. However, little attention has been paid to systematically analyzing the effect of temporal resolution on the results. In response, Marcin et al. addressed the loss of accuracy caused by progressively lower temporal resolution, aiming to guide the selection of an appropriate temporal resolution in accessibility studies [30].

Most studies focus on the connotation of accessibility, the establishment of quantitative models, the determination of evaluation indexes, and the optimization of accessibility. The theory of accessibility has been widely used in urban planning and construction practice. However, most existing studies analyze bus accessibility from a macroscopic point of view, and few theories have been developed at the microscopic level (such as stations). For stations, most studies consider only the front end of the trip and analyze the service radius of the station, that is, they evaluate how difficult it is to reach the station. For passengers, besides the accessibility from the departure point (home, workplace, school, etc.) to the bus station, the spatial range that a bus can reach from the station is the key index of the station’s service capacity. Additionally, existing models are mainly limited to evaluating the accessibility of public transport and do not identify weak points and bottlenecks in the public transport network. The travel bottleneck is a relatively new and important concept applied to traffic congestion, but bottlenecks in public transport have received little theoretical attention. Bottleneck identification models for public transport can help us understand how the system operates and optimize service efficiency as far as possible. This paper responds to these challenges by applying an opportunity model based on AVL and spatial data to address the following research questions: (1) How can the time factor be considered when measuring the spatial range that can be reached from a bus station within the bus network? (2) How can bottlenecks in public transportation networks be accurately located based on accessibility?

The paper is structured as follows. Section 2 discusses the methodology for modeling bus accessibility and identifying bottleneck stations. The opportunity method with AVL data is employed to establish time and space limits. POI (point of interest) data are used to describe land use and to calibrate the relevant parameters of the subsequent model. Additionally, we describe the framework of using quality control to analyze the specification range and probability indexes for bottleneck identification. Section 3 presents the case study to demonstrate the results of accessibility calculation and bottleneck identification. The paper ends with the discussion and conclusions in Section 4.

2. Methodology

2.1. Methodology Framework

The paper investigates bus accessibility and bottleneck stations under time-space variation. The methodology framework consists of data processing, accessibility calculation, bottleneck station identification, and case verification, as shown in Figure 1.

2.2. Accessibility Model
2.2.1. Theoretical Basis

Based on the opportunity method, this paper employs a geographic information system to measure the accessibility of public transport from both micro- and macroperspectives [31, 32]. The public transportation system includes three levels: station, line, and area network. The smaller the research scale, the higher the accuracy of the research results. In addition, from the perspective of travelers’ needs, time and distance are the most critical factors affecting travel decisions. Therefore, the single station is selected as the research object in this study. Taking time into account, the accessibility of public transport is defined as the maximum sum of the distances directly reachable from a given station within a reliable time. The sum of the accessibility of all stations constitutes the accessibility level of the whole network. This definition has two prominent advantages: ① it takes a single station as the research foundation and indirectly reflects the service level of the station by measuring the travel opportunities available from it; ② it provides a basis for finely identifying the bottleneck points of the public transport network and gives users some reference for travel choices. According to the definition, three factors affect the spatial accessibility of a bus station: the number of stops, the longest distance, and the travel time of each shift.

Therefore, the spatial accessibility of each station is calculated by the following equations:

$$\mathrm{MSDAD} = \sum_{j} D_j \, f(t_j), \tag{1}$$

$$f(t) = \begin{cases} 1, & t < t_{\mathrm{limit}}, \\ 0, & t \ge t_{\mathrm{limit}}, \end{cases} \tag{2}$$

$$t_j = \sum_{i=1}^{n_j} t_{ij}, \tag{3}$$

$$t_{\mathrm{limit}} = n_j \, \bar{t}, \tag{4}$$

where MSDAD is the spatial accessibility level of a station; $D_j$ is the maximum distance that a bus on the j-th line stopping at the station can reach by continuing from the station; $f(t)$ is a discriminant function used to determine whether the running time of public transport vehicles exceeds the limit; $t_{\mathrm{limit}}$ is the time limit, that is, the reliable travel time for a bus on the j-th line to continue from the station to its destination, which is an ideal time taken at the average level of the whole city; $n_j$ is the number of interstation sections passed by a bus on the j-th line between the station and the destination; $t_j$ is the actual travel time required by a bus on the j-th line to continue from the station to the destination; $t_{ij}$ is the actual running time required for a bus on the j-th line to pass through the i-th interstation section after the station; and $\bar{t}$ is the average running time of all interstation sections in the city.

The larger the value of MSDAD, the greater the travel opportunity obtained from the station, the more likely the travel demand of passengers is to be met, and thus the higher the accessibility level. Likewise, a line or network whose stations have high accessibility is itself at a relatively good accessibility level. Therefore, this paper uses the accessibility of stations to build the accessibility calculation model of a line, as shown in the following equation:

$$A_{bl} = \sum_{s} \mathrm{MSDAD}_s, \tag{5}$$

where $A_{bl}$ is the accessibility of a bus line, numerically equal to the sum of the accessibility levels of all stations $s$ on the line. Similarly, if the accessibility level of a region needs to be calculated, the accessibility values of all stations in the region are summed.
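As a minimal sketch of the station and line measures described above, the following Python snippet assumes that each line serving a station contributes its remaining distance to the terminal only when its running time to the terminal stays below the time limit, and that the limit equals the number of remaining interstation sections times the citywide average section time. The names `LineFromStation`, `within_limit`, `msdad`, and `line_accessibility`, as well as all sample numbers, are illustrative assumptions and do not come from the study.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class LineFromStation:
    """One line serving a station, seen from that station onward (hypothetical structure)."""
    distance_to_terminal_m: float   # farthest directly reachable distance on this line
    section_times_s: List[float]    # running time of each remaining interstation section


def within_limit(line: LineFromStation, city_avg_section_time_s: float) -> bool:
    """Discriminant function: True when the running time is below the limit.

    Assumption: t_limit = (number of remaining sections) * (citywide average section time).
    """
    t_limit = len(line.section_times_s) * city_avg_section_time_s
    return sum(line.section_times_s) < t_limit


def msdad(lines: Iterable[LineFromStation], city_avg_section_time_s: float) -> float:
    """Station accessibility: sum of reachable distances over lines that pass the time check."""
    return sum(
        line.distance_to_terminal_m
        for line in lines
        if within_limit(line, city_avg_section_time_s)
    )


def line_accessibility(station_values: Iterable[float]) -> float:
    """Line accessibility: sum of the accessibility of the stations on the line."""
    return sum(station_values)


# Made-up example: two lines stop at the station, citywide average is 120 s per section.
lines = [
    LineFromStation(5000.0, [100.0, 120.0, 110.0]),  # 330 s < 360 s, counted
    LineFromStation(8000.0, [200.0, 150.0]),         # 350 s >= 240 s, not counted
]
station_value = msdad(lines, 120.0)
```

Under these assumptions a slow line contributes nothing, so a station served only by congested lines ends up with a low value even when its lines are long.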

2.2.2. Calculation Procedure

Taking the running time between stations as the control index and dividing a day into three periods (morning peak, 6:00–9:30; evening peak, 16:00–19:30; and off-peak, the remaining hours), we calculate the bus accessibility of the whole city over one week. The running time between stations is a crucial index of the reliability of bus running time. In this study, it refers to the time from arrival at one station to arrival at the next, including the dwell time at the first station and the driving time before reaching the next. The calculation steps are as follows.

Step 1. Match the station number with the actual station according to the line name.

Step 2. Establish two lists for each station i: it = [] stores the running times of the interstation section starting from station i, and ia = [] stores the accessibility of station i. Traverse the inbound and outbound data of all routes and shifts and match the “O_TIME” index with stations to obtain the inbound time of every route and shift at each station. For a single shift of a single line, the arrival time at the i-th station is T_i and the arrival time at the (i + 1)-th station is T_{i+1}, so T_{i+1} − T_i is appended to the list it, as shown in Figure 2.

Step 3. From the list it of each interstation section, calculate the average running time of the i-th interstation section. Then calculate the average running time of all interstation sections in the whole city, which is the citywide average used in equation (4).

Step 4. According to equation (3), calculate the theoretical time t_j required for a bus on the j-th line to reach the terminal from the i-th station. If t_j < t_limit, go to Step 5; otherwise, move on to the next line.

Step 5. Calculate the distance between the i-th station and the terminal station of the j-th line from the latitude and longitude of the stations [33], and add the result to ia.

Step 6. Repeat Steps 4 and 5 for all routes and all stations to obtain the accessibility of every station.
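The steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the input layout (arrival times in seconds per shift), the haversine great-circle formula for Step 5 (the text only says the distance is derived from latitude and longitude), the form of the time limit (remaining sections times the citywide average), and every function name and sample value are hypothetical.

```python
import math
from collections import defaultdict
from datetime import time
from statistics import mean


def period_of(t):
    """Assign a clock time to one of the three periods named in the text."""
    if time(6, 0) <= t <= time(9, 30):
        return "morning_peak"
    if time(16, 0) <= t <= time(19, 30):
        return "evening_peak"
    return "off_peak"


def section_times(trips):
    """Step 2: collect T_{i+1} - T_i for every shift of one line.

    `trips` is a list of shifts; each shift is a list of arrival times in
    seconds, ordered by station (hypothetical input layout).
    """
    it = defaultdict(list)
    for arrivals in trips:
        for i in range(len(arrivals) - 1):
            it[i].append(arrivals[i + 1] - arrivals[i])
    return dict(it)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (assumed distance formula for Step 5)."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def station_accessibility_on_line(i, avg_times, city_avg, coords):
    """Steps 4-5 for station i of one line.

    avg_times[k] is the average running time of section k (Step 3), city_avg
    is the citywide average section time, coords[k] is (lat, lon) of station k.
    """
    remaining = avg_times[i:]
    if not remaining:  # terminal station: no section left to travel
        return 0.0
    t_j = sum(remaining)
    t_limit = len(remaining) * city_avg  # assumed form of the time limit
    if t_j < t_limit:
        return haversine_m(*coords[i], *coords[-1])
    return 0.0


# Toy data: two shifts on a three-station line (all numbers are made up).
trips = [[0, 100, 250], [0, 120, 260]]
it = section_times(trips)
avg_times = [mean(it[i]) for i in sorted(it)]
coords = [(45.750, 126.630), (45.755, 126.640), (45.760, 126.650)]
city_avg = 130
ia = [station_accessibility_on_line(i, avg_times, city_avg, coords)
      for i in range(len(coords))]
```

In the toy data the first station passes the time check (255 s against a limit of 260 s) and is credited with its straight-line distance to the terminal, while the second station fails it (145 s against 130 s) and contributes nothing. Repeating the call over every line that serves a station and summing the results gives that station's accessibility.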

2.3. Identification for Bottleneck Stations

The ultimate goal of accessibility research is to identify unreliable units (stations or lines) and provide a basis for the subsequent optimization of station or line layout. Because of the large amount of bus station data, this paper employs a QCA (Quality Control Algorithm), which judges the degree to which test data meet quality standards (such as a specification range) through the probability indexes of bilateral specifications, to identify bottleneck stations with low accessibility [34, 35]. The QCA compares the average accessibility of all stations on the same line and establishes upper and lower limits for evaluating public transport accessibility according to the significance level. When the accessibility of an evaluated station is less than the lower limit, the station is an accessibility bottleneck station. The calculation model involves the following quantities: the critical accessibility value AC, where “+” denotes the upper bound and “−” the lower bound; the average accessibility of the stations; a statistical constant k, corresponding to the chosen confidence level; and an adjustment parameter M, which is the travel demand within the service radius of the station.

3. Case Study and Result Analysis

3.1. Study Data and Area

Part of the research data comes from the AVL data provided by Harbin Bus Company for May 1 to May 7, 2017. A single day of data is about 30 MB and contains about 300,000 inbound and outbound records. The relevant data attributes are shown in Table 1. The other part comes from vector map data of Harbin provided by OpenStreetMap; crawler tools were used to capture the latitude and longitude of all stations in the city.

Harbin City is used for verification (as shown in Figure 3). The capital of Heilongjiang Province, it is located in the far northeast of China. As of February 2020, there were 256 bus lines in Harbin (the main urban area); the length of the bus network reached 997.4 kilometers, the density of the line network reached 2.47 km/km², and the average daily passenger volume of buses and trams was 3,676,200 (https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=CMFD&dbname=CMFD202102&filename=1021030779.nh&uniplatform=NZKPT&v=lc4N3PNCTp3TIbtOIgb1Doj9ikTLNXOVYLgIFoDQc2wms1oyRiuep2ZRWJBeY4QU). According to the traffic analysis reports released by Amap, however, Harbin has been listed among the ten most congested cities in China for many years (https://report.amap.com/share.do?id=a187527876d07ac50177142eba987ce0; https://baijiahao.baidu.com/s?id=1622808967853154807; https://wenku.baidu.com/view/95cddad46bdc5022aaea998fcc22bcd126ff42ca.html). At the same time, Harbin is known as one of the cities with the least rational road planning in the country, which directly affects the development of the city. Traffic congestion is increasing and has a negative impact on economic and social development and on the functioning of the city. The most direct and obvious harm is the inconvenience to urban residents and the significant increase in travel costs. Therefore, compared with other cities, applying the above model to Harbin meets a more urgent demand and has greater practical value.

According to Figure 3, six districts are selected as research areas including Nangang district, Pingfang district, Daoli district, Xiangfang district, Daowai district, and Songbei district. The government has invested more in public transport in these areas, making them more representative as they have better infrastructure. However, due to the low level of economic growth, the traffic department has never recorded AVL data for other regions, which are relatively backward in bus development.

3.2. Analysis of Validity Based on Spatial-Temporal Difference

According to the calculation results of the proposed accessibility model, it can be seen that the performance of bus accessibility varies significantly in different time periods (flat peak and peak) and different areas (downtown and surrounding areas). Based on existing calculation results, the validity of the model is further analyzed. Referring to Wei et al.’s [36] modeling ideas in the accessibility calculation of rail transit stations, this paper verifies the model by comparing the horizontal distribution and vertical changes of accessibility values.

3.2.1. Validation Analysis Based on Time Difference of Accessibility

According to the calculated results, the accessibility level is classified using the Jenks natural breaks method: Grade 5: 0–1500 m; Grade 4: 1500–3000 m; Grade 3: 3000–6000 m; Grade 2: 6000–12,000 m; Grade 1: greater than 12,000 m. Figures 4–9 show the accessibility performance in different time periods of a given day. There is a significant difference in public transport accessibility between peak hours (morning and evening) and off-peak hours. During the morning and evening peaks, only the proportion of stations with Grade 5 accessibility increases significantly. As noted in the traffic analysis report cited above, traffic congestion in Harbin is severe, especially during peak hours, when the surge in traffic volume disrupts bus operation the most. As a result, the maximum distance that buses can reach within a reliable time drops steeply on most routes, which increases the number of stations with Grade 5 accessibility and decreases the number of stations in the other grades.
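For illustration, the grade boundaries listed above can be applied with a small lookup. The sketch below only applies the published break values and does not recompute Jenks natural breaks from data. The function name `accessibility_grade` is hypothetical, and assigning a value that falls exactly on a boundary to the lower-distance grade is an assumption based on Grade 1 being described as "greater than 12,000 m".

```python
from bisect import bisect_left

# Upper bounds (in meters) of Grades 5, 4, 3, and 2 as listed in the text;
# anything above the last bound is Grade 1.
BREAKS_M = [1500, 3000, 6000, 12000]
GRADES = [5, 4, 3, 2, 1]


def accessibility_grade(value_m: float) -> int:
    """Map a station accessibility value (m) to its grade, 5 (worst) to 1 (best)."""
    # bisect_left keeps a value equal to a break in the lower-distance grade
    # (assumed boundary convention).
    return GRADES[bisect_left(BREAKS_M, value_m)]


grade = accessibility_grade(3366.1)  # a value between 3000 m and 6000 m
```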

Combined with Figures 4–9, the weekly averages of accessibility (Figure 10) show little difference between morning and evening peak hours. The average accessibility in off-peak hours is far greater than in peak hours, by a factor of about 5.76, which means that the average distance reachable by bus from a station within the ideal time range during peak hours is less than one-fifth of that during off-peak hours. This time difference in bus accessibility is basically consistent with the Harbin urban traffic congestion index in different periods. In addition, the average off-peak accessibility first falls and then rises over the week, with its lowest point on Wednesday. This is likely an episodic condition linked to something that happened that day, such as bad weather or a large event, but the exact cause cannot be determined with the available data and is left for future research. The main purpose of Figure 10 is to show the difference in accessibility between peak and off-peak hours, not the fluctuation within a week. Furthermore, the one-week average of accessibility is used for model validation and bottleneck station identification below, which weakens the effect of unanticipated events.

3.2.2. Validity Analysis Based on Differences in Accessibility of Line

In this paper, equation (5) is used to calculate the accessibility of all bus lines in the urban area, which helps evaluate the service level of each line. The model’s validity is also analyzed through the accessibility differences between lines and between stations on the same line. Bus lines No. 58 and No. 31 in Harbin are selected as examples; both run through Harbin’s relatively prosperous downtown area and have large travel demand, a large temporal and spatial span, and high research value. According to the model, the accessibility of line No. 58 is better than that of line No. 31. For example, from Nanzhi Road Station to Haxi Wanda plaza Station, the accessibility values of the two lines are 162,839.2 and 135,167.5, respectively. This result is basically consistent with the time impedance reported by the Gaode map: 46 minutes for the former, clearly better than 63 minutes for the latter. For line No. 31, the accessibility distribution is divided at the Harbin Railway Bureau, which lies near the center of Harbin, where the dense population and heavy traffic have a more prominent influence on public transportation; this is consistent with the pattern of traffic congestion in Harbin.

3.3. Bottleneck Stations Identification Based on Accessibility

The ultimate goal of this study is to use the quality control algorithm to find the bottleneck stations on a line from the accessibility of the stations on that line and thereby provide a focus for subsequent optimization. As public welfare infrastructure, public transportation serves the largest number of users, so the evaluation of public transportation services must take the size of travel demand into account. Because the number of POI facilities reflects the level of travel demand in an area to a certain extent, a buffer with a radius of 500 m is drawn around each station, and the number of POI facilities inside it is used as the adjustment parameter of the quality control algorithm. Applying the quality control model of Section 2.3, this paper takes Harbin bus line No. 31 as an example for bottleneck analysis. The statistical constant is 1.96, corresponding to a 95% confidence level.
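The demand proxy described above can be sketched as a count of POIs inside a 500 m buffer around each station. In the snippet below, the haversine distance, the min-max normalization (chosen because the text later reports a normalized demand of 0 for the station with the lowest demand), the function names, and all coordinates are assumptions made for illustration only.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def poi_count_within(station, pois, radius_m=500.0):
    """Number of POIs whose distance to the station is at most radius_m."""
    return sum(
        1 for poi in pois
        if haversine_m(station[0], station[1], poi[0], poi[1]) <= radius_m
    )


def min_max_normalize(counts):
    """Scale counts to [0, 1]; the smallest count maps to 0 (assumed normalization)."""
    lo, hi = min(counts), max(counts)
    if hi == lo:
        return [0.0 for _ in counts]
    return [(c - lo) / (hi - lo) for c in counts]


# Hypothetical station and POIs (coordinates are made up).
station = (45.750, 126.630)
pois = [
    (45.750, 126.630),  # at the station
    (45.753, 126.630),  # about 334 m north, inside the buffer
    (45.760, 126.630),  # about 1112 m north, outside the buffer
]
count = poi_count_within(station, pois)
demand = min_max_normalize([2, 5, 8])
```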

As shown in Table 2, all bottleneck stations of bus line No. 31 are identified by judging the degree to which the test data meet the quality standards. For example, the accessibility of Haxi Wanda plaza is 3366.1, significantly lower than its lower limit of 4465.98, so it is defined as a bottleneck station. The normalized travel demand of Heilongjiang University Station is 0, so its limits cannot be calculated through the above model. Because this station has the lowest travel demand, its upper and lower limits are set to the minimum values among the other stations. At the 95% confidence level, 17 bottleneck stations are located, which exceeds 50% of all stations and indirectly reflects the unreliable operation of bus line No. 31. The same method is used to determine whether the stations on every other bus line in Harbin are abnormal. Starting from the bottleneck stations, optimizing bus shifts and other related indicators will help improve the accessibility of the whole public transportation system in a more targeted way.
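The decision rule described above can be sketched as follows, taking the per-station limits as given rather than computing them. Only the Haxi Wanda plaza accessibility (3366.1) and its lower limit (4465.98) come from the text; the upper limits, the second station, and the Heilongjiang University accessibility value are hypothetical, as are the names `Limits`, `fill_missing_limits`, and `bottlenecks`. Reading "the minimum values of other stations" as the smallest lower limit and the smallest upper limit among the other stations is also an assumption.

```python
from typing import Dict, List, NamedTuple, Optional


class Limits(NamedTuple):
    lower: float
    upper: float


def fill_missing_limits(limits: Dict[str, Optional[Limits]]) -> Dict[str, Limits]:
    """Give stations without computable limits the minimum limits of the others."""
    known = [lim for lim in limits.values() if lim is not None]
    fallback = Limits(min(l.lower for l in known), min(l.upper for l in known))
    return {name: (lim if lim is not None else fallback) for name, lim in limits.items()}


def bottlenecks(accessibility: Dict[str, float], limits: Dict[str, Limits]) -> List[str]:
    """Stations whose accessibility is below their lower limit."""
    return [name for name, value in accessibility.items() if value < limits[name].lower]


accessibility = {
    "Haxi Wanda plaza": 3366.1,         # value reported in the text
    "Station B (hypothetical)": 9000.0,
    "Heilongjiang University": 5000.0,  # hypothetical value
}
raw_limits = {
    "Haxi Wanda plaza": Limits(4465.98, 12000.0),  # upper limit is hypothetical
    "Station B (hypothetical)": Limits(5000.0, 13000.0),
    "Heilongjiang University": None,    # normalized demand 0: limits not computable
}
limits = fill_missing_limits(raw_limits)
flagged = bottlenecks(accessibility, limits)
```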

4. Discussion and Prospect

4.1. Discussion

(1) This paper explored a novel model for evaluating urban bus accessibility from the microperspective using GIS software and AVL data. The model calculates bus accessibility at different scales and accurately identifies the bottleneck stations with the worst balance between demand and supply, so that the level and fairness of urban public transport service can be optimized effectively. The paper employed the opportunity method and bus trajectory data to describe the ability of buses to meet travel demand, dynamically monitoring and tracking the variation of service level in time and space. Furthermore, this paper took bus stations as the research object, took the interstation section time as the control index, and put forward a new measure index of station accessibility, the farthest direct reachable distance. This index indirectly reflects the service level of a station by measuring the travel opportunities available, provides a basis for finely identifying the bottleneck points of the public transport network, and gives travelers some reference for travel choices. The Tanimoto coefficient was finally used to discuss the coupling relationship between public transport accessibility and external development factors and to describe the imbalance in the layout of bus facilities and bus service.

(2) Quality control theory was selected to establish a multiparameter evaluation probability model for identifying bottlenecks because the model systematically considers all stations of one line, which addresses the problem of the confidence level and dynamically supports the implementation of the identification system. Furthermore, the paper simplified the evaluation of travel demand by basing it on points of interest instead of bus ridership data, which reduces the difficulty of data collection and calculation.

(3) Taking Harbin as an example and using one week of AVL data, we calculated the accessibility of stations and routes in the whole city during the morning and evening peak periods and the off-peak periods. ① Based on the spatial-temporal difference of accessibility and its coupling with the congestion state, we verified the effectiveness of the refined, station-scale bus accessibility model based on AVL data. In engineering practice, the accessibility of different stations and lines can be used to optimize the public transport network, which provides some reference for actual traffic planning. ② A comprehensive analysis of bus accessibility in different dimensions showed little difference between morning and evening rush hours, while the average accessibility in off-peak hours was more than five times that in peak hours. There were significant differences in public transport accessibility between different areas of Harbin, with the maximum difference exceeding a factor of 30. The development of public transport showed an obvious regional imbalance, but this imbalance was related to social and economic development. ③ For the quality control of bus stations, the statistical constant was 1.96, and 17 bottleneck stations were found on the No. 31 bus route of Harbin at a 95% confidence level. With quality control applied to public transport accessibility, bottleneck stations can be identified at different confidence levels, and the error range can be given.

(4) In general, this method has several advantages over other accessibility methods, which matter for the planning of urban traffic networks and bus stations. ① Finer scale: Previous studies mainly evaluated the accessibility of public transport from the regional perspective. In this paper, the accessibility of a single station is modeled, forming a complete evaluation system from point to line to surface. In this way, the planning and management department can clearly grasp the rationality of the infrastructure layout and obtain a reference for the next optimization strategy. ② More practical: Existing research offers few models of bus bottlenecks, and most studies stop at accessibility evaluation. With the help of a quality control algorithm, this paper locates weak points in public transport networks from the perspective of accessibility, on which researchers can base more accurate transportation policies and management plans. In particular, the probability value of each bottleneck station is given, which makes the identification model more reliable and flexible. More importantly, users can plan their travel more reasonably according to the accessibility of each stop and whether it is a bottleneck, and can even change their departure stop. ③ Smaller amounts of data: Only the AVL data within the research area are needed, without assistance from other multisource spatial data, and the processing can be conveniently completed in ArcGIS. In addition, the model does not need to forecast traffic congestion or other socioeconomic factors, which makes the application process more straightforward and saves time. ④ No complicated algorithm: Only the departure and arrival times at each bus station need to be calculated, which are the simplest indexes in evaluating buses. Compared with more intricate algorithms, including nested logit “logsum” [8], comparative assessment [17], 3SFCA [22], and the online route planning algorithm [29], the model proposed in this manuscript does not require any specially designed algorithm and has a low threshold, which makes it suitable for different cities and gives it better application prospects.

4.2. Prospect

With the help of urban computing and big spatial data, the data sources needed for bus accessibility evaluation are easy to obtain. Repeating the accessibility calculation at regular intervals makes it possible to evaluate and track, dynamically, the improvement of bus system services in time and space. By promoting the development of urban public transport, this method can support the optimal allocation of urban public transport resources, thereby improving the attractiveness of public transport and increasing its mode share.

Some problems remain to be discussed further: (1) The conclusions of this paper can only be verified through the temporal and spatial differences of urban public transport accessibility and the general characteristics of coupling. The accuracy and effectiveness of this method relative to traditional accessibility calculation models based on bus trajectory data need further study. (2) Our primary concern is the accessibility of traditional public transportation, and the influence of other shared modes, such as bike-sharing and the subway under construction, is ignored. (3) Although peak and off-peak hours are distinguished, the influence of other objective factors (nonworking days, weather, etc.) on accessibility and bottleneck stations is not considered. Future research should consider the relationship between different modes in detail and attach importance to connection accessibility in order to find the causes of bottlenecks. In addition, future research should analyze different demand scenarios to establish a more stable and reliable bottleneck identification model.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant 42171451.