#### Abstract

Performance analysis of transport systems usually requires modeling the transfer of passengers between trains, cars, or buses, among others. This paper proposes a methodology for modeling and analysis of bus transportation using link streams. Link streams are particular cases of stream graphs whose cliques provide information about available time intervals for connecting buses. These cliques are obtained by algorithms from the literature with good scalability. They are used to quantify performance indicators such as transfer time, bunching, congestion, and number of transferred passengers. The results are obtained for real-world data of a bus terminal in the city of Curitiba, Brazil. They reveal important issues regarding transfer delays and available capacity for transport. The proposed performance analysis can be used to support urban planners in planning and improving transport operation.

#### 1. Introduction

The operation of urban transport systems has been considered in the literature using different approaches [1–6]. These works model and simulate transport systems, particularly for planning schedules to improve system efficiency. The authors of [1, 2] developed a computational framework for modeling a transport system based on planning, station design, integration, and access and evaluated their effects on system performance. In [3] a parallel operation management system has been developed to detect the number of passengers at stations as well as traffic flow and queuing length of vehicles in private lanes. This system provides information to deal with transportation management by improving and optimizing emergency situations or rescheduling vehicles based on conditions detected in videos of traffic. In [4–6] simulation and intelligent support based on artificial intelligence techniques are used to help control center operators to take strategic decisions about a fleet of buses in an urban network.

Recently, transportation companies have invested in technologies for online supervision of operations and resources [7, 8]. For instance, two-way communication between bus drivers and operators in [7] allows solutions using negotiation-based strategies. In [8] a global positioning system (GPS) is integrated with wireless communication to track vehicles in real time. All these approaches utilize information and communication technology (ICT) to predict, simulate, and control different situations in transport systems. In this sense, the available online information can be used for planning, controlling, and coordinating transit of buses in a reactive manner. However, formal methodologies are still necessary to analyze and detect issues in advance.

This paper aims to analyze multiple connections in a bus terminal. By means of an innovative performance analysis method based on link streams [9], we model a transport system of multiple bus line connections in a bus terminal (or hub). Link streams are particular cases of stream graphs, which are graphs whose nodes and edges (links) have a time to live. A link stream is a stream graph in which only edges have a time to live; i.e., nodes are permanent. A detailed introduction to stream graphs and link streams can be found in [10]. Link streams have been used in many applications. For instance, nodes and edges in a graph may represent individuals connected by a phone call which persists for a given period of time. The authors of [11] use link streams to capture dynamics of contacts between individuals. They consider activities as link streams to predict the number of links that occur during a given period of time and show that a combination of structural and temporal characteristics of the link stream leads to performance improvements. In [12] a trace of real-world interactions between individuals captured with RFID sensor technology was collected in a high school in France during approximately 8 days of 2012. The study of links created and destroyed during these days has revealed new patterns of human interaction in a different time scale. Another important issue in link streams is computing cliques. A clique is a set of nodes such that any 2-combination of these nodes is connected by an edge. A first algorithm to compute maximal cliques in a link stream was proposed in [9] and implemented in [13]. Recently, it was improved in [14, 15] by adapting the Bron-Kerbosch algorithm. Moreover, the authors developed an algorithm that detects maximal time intervals during which interactions occur. This notion was initially introduced by [9] using instantaneous link streams and extended to link streams with duration by [16].

Our main contribution is to define a set of procedures to represent arrivals and departures of buses as a link stream whose cliques provide the necessary information to compute performance indicators such as transfer times, bunching, congestion, and number of passengers transferred between buses. To the best of our knowledge, this is the first time link streams are used to model and analyze transport systems.

The paper is organized as follows. Section 2 presents the background on link streams and cliques. Section 3 shows how link streams and cliques model a bus transportation system and defines indicators for the performance analysis. Section 4 presents the case study of Curitiba’s public transportation with conclusions in Section 5.

#### 2. Link Streams

A simple undirected graph $G = (V, E)$ is an ordered pair with a set of nodes $V$ and a set of links $E \subseteq V \otimes V$ (with $u \neq v$ for $uv \in E$, where $V \otimes V$ denotes the set of unordered pairs of nodes). In this case, nodes $u$ and $v$ are linked together in $G$.

Stream graphs and link streams are graphs with time information. A comprehensive study on stream graphs and link streams can be found in [10]. A stream graph is a tuple $S = (T, V, W, E)$ such that $T$ is a time interval, $V$ is a set of nodes, $W \subseteq T \times V$ is a set of temporal nodes, and $E \subseteq T \times V \otimes V$ is a set of links. If $(t, uv) \in E$ then $(t, u) \in W$ and $(t, v) \in W$. This means that nodes $u$ and $v$ and a link $uv$ exist at time $t$. A stream graph is then a graph whose nodes and links exist at a given time $t \in T$. If $W = T \times V$ then all nodes exist for all times in $T$, and $L = (T, V, E)$ is a link stream.

Link streams are used to model interactions between individuals or objects over time, such as meetings in social networks or email exchanges [12]. In this paper, we use link streams for modeling transport systems because nodes (bus stops) exist during the whole period of time considered for analysis. However, links between nodes only occur at given time intervals during which passengers are transferred.
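As a minimal sketch of how such a link stream could be stored in practice, the snippet below keeps one record per link with its interval of existence. The `Link` class, the `links_at` helper, and the choice of integer minutes with inclusive bounds are illustrative assumptions, not part of the formalism of [10].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One link of a discrete-time link stream (hypothetical layout)."""
    start: int  # first minute at which the link exists (inclusive)
    end: int    # last minute at which the link exists (inclusive)
    u: str      # one endpoint (permanent node)
    v: str      # the other endpoint (permanent node)


def links_at(links, t):
    """Return the set of unordered node pairs linked at time t."""
    return {frozenset((l.u, l.v)) for l in links if l.start <= t <= l.end}


# Toy data, invented for illustration only.
toy_stream = [Link(3, 4, "a", "b"), Link(4, 6, "b", "c")]
print(links_at(toy_stream, 4))
```

Nodes do not carry any time information here, which is what distinguishes a link stream from a general stream graph.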

In a simple undirected graph $G = (V, E)$, a clique $C$ of $G$ is a set of nodes $C \subseteq V$ which are connected to each other; i.e., for all $u, v \in C$ with $u \neq v$, there exists a link $uv \in E$. A clique $C$ is maximal if there is no other clique $C'$ such that $C \subset C'$.

A similar definition is used for a stream graph $S = (T, V, W, E)$. According to [10], a clique of $S$ is a subset $C$ of $W$ such that all pairs of nodes involved in $C$ are linked in $S$. Because nodes in stream graphs are not present all the time in general, it is necessary to distinguish compact cliques, which are subsets of $W$ whose nodes are linked all together during a given time interval. This is the case for link streams, which are compact stream graphs [10] as all nodes are present within $T$. Therefore, a (compact) clique in a link stream is given by a time interval $[b, e] \subseteq T$ and a subset of nodes $X \subseteq V$ which are linked all together during this time interval. Similarly, a clique $(X, [b, e])$ is maximal if there is no other clique $(X', [b', e'])$ such that $X \subseteq X'$ and $[b, e] \subseteq [b', e']$. Figure 1 shows examples of cliques in a link stream $L$.

An algorithm has been proposed in [17] to detect maximal cliques in link streams. It is used here to compute operational performance metrics of bus transport in a terminal as described in the following. It is also important to note that $T$ can be a continuous or discrete time interval without loss of generality [10].
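To make the notion of a maximal clique concrete, the following is a brute-force sketch for a small discrete-time link stream. It is not the algorithm of [17]: it enumerates every subset of nodes and is therefore exponential in the number of nodes. The function name `maximal_cliques`, the tuple layout `(start, end, u, v)`, and the example links are assumptions made for illustration.

```python
from itertools import combinations


def maximal_cliques(nodes, links, horizon):
    """Brute-force maximal cliques of a discrete-time link stream.

    links: iterable of (start, end, u, v) with inclusive integer times.
    horizon: (t0, t1), the inclusive time interval under study.
    Returns a list of (frozenset_of_nodes, start, end).
    """
    t0, t1 = horizon
    # Snapshot of linked pairs for every time step.
    snap = {t: set() for t in range(t0, t1 + 1)}
    for b, e, u, v in links:
        for t in range(max(b, t0), min(e, t1) + 1):
            snap[t].add(frozenset((u, v)))

    candidates = []
    for k in range(2, len(nodes) + 1):
        for group in combinations(sorted(nodes), k):
            pairs = [frozenset(p) for p in combinations(group, 2)]
            run_start = None
            # Iterating up to t1 + 1 closes a run that reaches the horizon end.
            for t in range(t0, t1 + 2):
                ok = t <= t1 and all(p in snap[t] for p in pairs)
                if ok and run_start is None:
                    run_start = t
                elif not ok and run_start is not None:
                    candidates.append((frozenset(group), run_start, t - 1))
                    run_start = None

    # Runs are already maximal in time for their node set, so a candidate is
    # non-maximal only if a strictly larger node set covers its interval.
    def dominated(c):
        x, b, e = c
        return any(x < y and b2 <= b and e <= e2 for y, b2, e2 in candidates)

    return [c for c in candidates if not dominated(c)]


# Invented example: three nodes over 30 time steps.
example_links = [(3, 4, "1", "2"), (16, 18, "1", "2"), (26, 30, "1", "2"),
                 (24, 27, "1", "3"), (26, 27, "2", "3")]
for clique in maximal_cliques({"1", "2", "3"}, example_links, (1, 30)):
    print(sorted(clique[0]), clique[1], clique[2])
```

In this toy input the pair {2, 3} is linked only while node 1 is also linked to both, so it does not appear as a maximal clique on its own.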

#### 3. Modeling Bus Transportation Systems with Link Streams

Based on bus timetables we identify time intervals during which buses are in the terminal for boarding and alighting of passengers. Several buses arrive at and depart from the terminal within this time interval, and a common time interval can be built to characterize the transfer time between connecting buses. This relationship between buses at the terminal is the basic information we need to build a link stream and compute maximal cliques according to Figure 2. It is important to emphasize that the relationship between buses of a terminal is established as soon as a bus is detected at the terminal in a given time interval. Link streams capture only the relationship between buses present in the terminal at the same time. The GPS data of buses (in the case of online monitoring systems) is only used to filter the time intervals when a bus is at the terminal as explained in Section 4.

The first two steps of Figure 2 generate Tables 1 and 2. They represent an example of arrivals and departures of buses for lines 1, 2, and 3 during a time interval of 30 minutes in the terminal. The data set of arrivals and departures is shown in Table 1.

The information of Table 1 is then used to build Table 2, which contains only the minutes when more than one bus is in the terminal. According to Table 2, there are buses from lines 1 and 2 in minutes 3 to 4, 16 to 18, and 26 to 30. A similar reasoning is made for lines 1 and 3. In fact, all three lines have buses in the terminal during minutes 26 and 27. This is exactly the link stream information needed to compute cliques. The information of Table 2 is then used as input to the clique computation algorithm.
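The step from presence intervals (as in Table 1) to pairwise co-presence intervals (as in Table 2) amounts to intersecting intervals. The sketch below illustrates this under stated assumptions: the function name `copresence_links` and the arrival and departure minutes in `presence` are hypothetical, chosen only so that lines 1 and 2 overlap in minutes 3 to 4, 16 to 18, and 26 to 30 as in the text.

```python
from itertools import combinations


def copresence_links(presence):
    """Build link-stream links from per-line presence intervals.

    presence: {line: [(arrival_min, departure_min), ...]}, inclusive minutes.
    Returns a sorted list of (start, end, line_a, line_b).
    """
    links = []
    for a, b in combinations(sorted(presence), 2):
        for s1, e1 in presence[a]:
            for s2, e2 in presence[b]:
                start, end = max(s1, s2), min(e1, e2)
                if start <= end:  # the two stays overlap
                    links.append((start, end, a, b))
    return sorted(links)


# Hypothetical timetable (not the actual content of Table 1).
presence = {
    "1": [(1, 4), (14, 18), (24, 30)],
    "2": [(3, 6), (16, 20), (26, 30)],
    "3": [(22, 27)],
}
for link in copresence_links(presence):
    print(link)
```

Each returned tuple is one link of the link stream, ready to be passed to a clique computation.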

The corresponding link stream can be visualized in Figure 3 using a computational tool [18]. Nodes in Figure 3 correspond to buses of lines 1 to 3 (vertical axis), and edges connect two buses in the terminal at the same time (horizontal axis). The correspondence between the information of Table 2 and the visualization of Figure 3 is straightforward.

The maximal cliques shown in Table 3 are then obtained by the algorithm presented in [17]. The clique $(\{1, 2, 3\}, [26, 27])$ corresponds to the largest time interval having buses from lines 1, 2, and 3 all together in the terminal. There is no other maximal clique in *L* involving 3 nodes (lines), but there are other maximal cliques with fewer than three nodes, such as $(\{1, 2\}, [26, 30])$, for instance.

It should be noted that the time interval of clique $(\{1, 2, 3\}, [26, 27])$ is the intersection of the intervals of the overlapping cliques involving lines $\{1, 2\}$ and lines $\{1, 3\}$. It means that cliques are useful for both local analysis involving transfers between two lines and global analysis with more than two lines as detailed below.

A clique is the right structure to capture all relationships between buses in the terminal from a common structure (link stream) that contains only information about a 2-combination of buses during a given time interval. A clique provides information about not only which buses are present in the terminal but also during which time interval. It is important to emphasize that computing cliques is computationally intensive, and some of the performance metrics proposed below could be obtained with less complex structures (congestion, for example) than cliques. However, cliques are computed only once for a given set of lines and time horizon. The proposed performance measures are then filtered from clique information as explained below.

##### 3.1. Performance Metrics

The performance metrics we are interested in for evaluating a transport system are (i) time interval for transfers between two particular lines, (ii) bunching of buses of the same line, (iii) congestion of buses in the terminal, and (iv) number of passengers who are successfully transferred between local and express lines. These performance metrics can be computed using link streams and cliques as follows.

###### 3.1.1. Time Interval for Transfers

Cliques provide the necessary information about how many and during how much time buses from different lines are in the terminal. They involve the computation of the maximal common interval during which a meeting of one or more buses occurs and allows transfer of passengers. Table 4 shows cliques for all eight lines operating in the terminal between 7:00 and 7:30 am. This is the basic information necessary for the performance analysis accomplished here.

The transfer times between lines 1 and 2, for instance, are obtained from Table 4 in a two-step procedure. The first step consists of selecting cliques of Table 4 containing only lines 1 and 2 as shown in Table 5.

The second step obtains the maximal nonoverlapping time intervals containing nodes $\{1, 2\}$. According to Table 6, there are three such intervals, with durations of 2, 3, and 4 min, respectively. They represent the corresponding transfer times for these lines.

This procedure can also be done for lines 1 and 3, or any set of lines. In other words, transfer times for connecting buses can be obtained by filtering cliques computed for all bus lines to select lines of interest and nonoverlapping time intervals.
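The two-step filtering can be sketched as follows. This is an illustrative reading of the procedure, not the authors' implementation: it selects every clique whose node set includes the lines of interest and merges overlapping or adjacent minute intervals. The function name `transfer_intervals` and the sample cliques are assumptions.

```python
def transfer_intervals(cliques, lines):
    """Maximal nonoverlapping intervals during which all `lines` are linked.

    cliques: iterable of (set_of_lines, start_min, end_min), inclusive minutes.
    lines: the lines of interest, e.g. {"1", "2"}.
    """
    wanted = frozenset(lines)
    # Step 1: keep the intervals of cliques that contain the wanted lines.
    spans = sorted((b, e) for nodes, b, e in cliques if wanted <= nodes)
    # Step 2: merge intervals that overlap or touch (discrete minutes).
    merged = []
    for b, e in spans:
        if merged and b <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((b, e))
    return merged


# Hypothetical cliques, invented for illustration.
sample_cliques = [
    (frozenset({"1", "2"}), 3, 4),
    (frozenset({"1", "2"}), 16, 18),
    (frozenset({"1", "3"}), 24, 27),
    (frozenset({"1", "2", "3"}), 26, 27),
    (frozenset({"1", "2"}), 26, 30),
]
intervals = transfer_intervals(sample_cliques, {"1", "2"})
durations = [e - b + 1 for b, e in intervals]
print(intervals, durations)
```

Because bounds are inclusive minutes, the duration of an interval is `end - start + 1`.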

###### 3.1.2. Bunching

Bunching is a phenomenon characterized by a concentration of buses in a single area. It is detected here when the headway between two consecutive buses of the same line is less than 1 minute. Usually it occurs due to delays in departures or arrivals of buses and reduces the transport efficiency. When bunching occurs, buses eventually leave the terminal almost empty or completely full of passengers. At certain times of the day (peak traffic of buses or passengers), it is observed that bunching events occur due to a large number of passengers to be transported or a high frequency of buses. This effect causes performance issues such as poor quality of service and excessive waiting time, as pointed out by [19–22].

In this work, bunching detection is done by distinguishing buses of the same line and assuming that link streams are built using a sampling time rate of 1 min or less. This way, headways equal to or less than 1 min can be captured. A bunching event is then identified by taking cliques of the same line buses.

For instance, the link stream shown in Figure 4 considers only buses of line 1 in the terminal from 7:00 to 7:30 am with different labels assigned to different buses. Labels begin with a line number followed by two digits of the bus number. For instance, label 104 means bus 04 of line 1. Similarly, labels 107, 109, 111, 120, 121, 126, and 130 represent the other seven buses of line 1, for a total of eight different buses.

According to Figure 4, buses 104, 120, and 126 are in the terminal at 7:08 am, causing a bunching event in line 1 during 1 min. All these bunching events are detected by maximal cliques as shown in Table 7. The largest bunching event is composed of buses 111, 120, and 126 and lasts 4 min, between 7:09 and 7:12 am (in bold).
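A bunching filter over cliques of labeled buses could look like the sketch below. It assumes three-character string labels whose first character is the line number, which holds for lines 1 to 8 as used here; the function name `bunching_events` is hypothetical. The first two sample cliques restate the events mentioned in the text (minutes counted after 7:00 am), and the third is invented to show that mixed-line cliques are ignored.

```python
def bunching_events(cliques, line):
    """Return cliques formed only by buses of `line`, longest event first.

    cliques: iterable of (set_of_bus_labels, start_min, end_min), inclusive.
    Each result is (sorted_labels, start_min, end_min, duration_min).
    """
    prefix = str(line)
    events = []
    for buses, start, end in cliques:
        same_line = all(label.startswith(prefix) for label in buses)
        if len(buses) >= 2 and same_line:
            events.append((sorted(buses), start, end, end - start + 1))
    return sorted(events, key=lambda ev: ev[3], reverse=True)


labeled_cliques = [
    ({"104", "120", "126"}, 8, 8),    # 7:08 am, 1 min
    ({"111", "120", "126"}, 9, 12),   # 7:09 to 7:12 am, 4 min
    ({"104", "203"}, 8, 8),           # invented mixed-line clique
]
for event in bunching_events(labeled_cliques, 1):
    print(event)
```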

###### 3.1.3. Congestion Analysis

The congestion analysis is obtained by the number of buses that are in the terminal at a given time interval. This information is also given by cliques by using different labels for different buses of the same line and computing the maximum number of buses at each time. Table 8 shows the number of buses in the terminal minute by minute from 7:00 to 7:30 am by taking the cliques of Table 4 but marking buses of the same line with different labels. This information is used to monitor the terminal occupation as detailed in Section 4.
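One way to read this procedure is that, since all buses present at the same minute are pairwise linked, the largest clique covering a minute contains every bus in the terminal at that minute. The sketch below follows that reading; the function name `congestion` and the sample cliques are assumptions, and a minute with a single bus (which forms no clique) is reported as 0.

```python
def congestion(cliques, t0, t1):
    """Size of the largest clique covering each minute in [t0, t1].

    cliques: iterable of (set_of_bus_labels, start_min, end_min), inclusive.
    Returns {minute: number_of_buses}; 0 when no clique covers the minute.
    """
    return {
        t: max((len(buses) for buses, b, e in cliques if b <= t <= e), default=0)
        for t in range(t0, t1 + 1)
    }


# Hypothetical cliques of labeled buses, minutes after the start of the period.
terminal_cliques = [
    ({"101", "201"}, 2, 7),
    ({"101", "201", "601"}, 4, 5),
]
occupancy = congestion(terminal_cliques, 1, 8)
print(occupancy)
```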

###### 3.1.4. Number of Transferred Passengers

In order to estimate the number of passengers that are successfully transferred from local lines to the express line it is necessary to have a model for passengers’ mobility. This model should take into account the time interval necessary for transferring of passengers between bus stops (required transfer time) as well as the time interval during which buses are currently in the terminal (available transfer time).

The required transfer time of passengers is based on three components: (i) passengers’ boarding time, (ii) passengers’ alighting time, and (iii) walking time necessary for passengers moving between alighting and boarding stops inside the terminal.

The walking time is computed from the distance between drop-off and boarding areas divided by the standard walking speed of passengers. According to [23], the average reference for walking speed is 1.20 m/s for pedestrians. Boarding and alighting times per passenger depend on bus model [24] as shown in Table 9.

The transfer time *tt* is then given by the boarding time *tb* plus walking time *tw* and alighting time *ta*, as shown in (1):

$$\mathit{tt} = \mathit{tb} + \mathit{tw} + \mathit{ta} \qquad (1)$$

The total number of passengers transferred in a given time interval is computed in four steps:

(i) Identify the time interval when the express line is in the terminal

(ii) Identify all local lines that are able to transfer passengers to the express line within this time interval

(iii) Calculate the required transfer time

(iv) Verify how many passengers have been transferred within the available transfer time

The first step is performed from maximal cliques involving all local and express lines in a given time interval as shown in Table 10 for the express line 1 (buses 101 and 110).

The goal of step 1 is to find the time interval during which the express line is at the platform and the corresponding transfer time available for each local line arriving within this time interval. According to Table 10, line 1 (101 in boldface) starts a transfer at time 6:02 and ends at time 6:07 (the first five rows of Table 10). It means that line 1 (using bus number 101) is available for transfer during a total time of 6 min (interval [6:02, 6:07], both minutes included). Moreover, three events representing arrivals of local lines occur at times 6:02, 6:04, and 6:05 (second column of Table 10 in boldface). They are important to compute how much time each local bus has to transfer passengers until the express line departs. This is captured in step 2 according to Table 11.

In Table 11, time intervals are extended to the corresponding express departure time (6:07 am for bus 101 and 6:12 am for bus 110, respectively, in boldface). Moreover, only cliques with the greatest number of nodes are kept among cliques with the same initial time. This is equivalent to computing all buses that arrive at the same time. For instance, the cliques that start at 6:04 reduce to a single set of buses arriving at 6:04. These buses have until 6:07 am to make transfers (second row of Table 11).

The final step simply extracts the maximum time interval for each transfer according to Table 12. For instance, the maximum transfer time between buses 201 and 101 is 6 min (first row of Table 12).
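The extraction of available transfer times can be sketched as below, assuming cliques of labeled buses with inclusive minute bounds counted after 6:00 am. The function name `available_transfer_times` is hypothetical. In the sample data, the arrival of bus 201 at 6:02 and the departure of bus 101 at 6:07 come from the text, the arrival of bus 404 at 6:05 is inferred from its 3 min of available time, and the arrival minute of buses 601 and 808 is an assumption.

```python
def available_transfer_times(cliques, express):
    """Available transfer time (min) from each local bus to `express`.

    cliques: iterable of (set_of_bus_labels, start_min, end_min), inclusive.
    The time runs from the first minute a local bus shares the terminal with
    the express bus until the express bus departs.
    """
    with_express = [(buses, b, e) for buses, b, e in cliques if express in buses]
    if not with_express:
        return {}
    departure = max(e for _, _, e in with_express)
    first_seen = {}
    for buses, b, _ in with_express:
        for bus in set(buses) - {express}:
            first_seen[bus] = min(b, first_seen.get(bus, b))
    return {bus: departure - b + 1 for bus, b in first_seen.items()}


express_cliques = [
    ({"101", "201"}, 2, 7),
    ({"101", "201", "601", "808"}, 4, 7),          # arrival minute assumed
    ({"101", "201", "601", "808", "404"}, 5, 7),   # arrival minute inferred
]
print(available_transfer_times(express_cliques, "101"))
```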

Therefore, Table 12 shows all available transfers (lines and transfer times) between local and express lines occurring from 6:00 to 6:10 am. This information will be used in Table 13 to compute how many passengers will be successfully transferred to the express line. It is done based on the current number of passengers that arrive at the terminal from local lines, the required transfer time computed by (1), the available transfer time given by Table 12, and the express bus capacity.

The first column of Table 13 contains the express bus number waiting at the boarding platform. The second column shows the local buses that arrive at the same time with the respective number of passengers in the third column. The next two columns show the required and available times for transferring passengers between local and express lines, respectively. The last three columns show how many passengers need to be transferred to the express line, the express bus capacity, and the corresponding balance of passengers (negative values mean that the express bus is full), respectively.

According to Table 13, while the express bus 101 is waiting at the boarding platform, a total of 151 passengers from buses 201, 601, 808, and 404 need to be transferred. The next express bus 110 needs to board 134 passengers from buses 502, 703, 806, and 604. Note that both express buses depart with available places. In addition, all passengers are successfully transferred as the available transfer time is greater than the required transfer time. It can be seen from Table 13 that transfers from buses 404 to 101 and 502 to 110 (in boldface) are very sensitive to delays. If bus 404 is delayed by about 1 min, for instance, the available transfer time decreases to 2 min and only about 33 passengers will be transferred. The other 16 passengers must wait for the next express bus.

#### 4. Case Study

Our methodology models the transport system by means of link streams considering bus lines running all day long. Cliques are then obtained from link streams which allow carrying out several analyses. This is an advantage in terms of scalability as described in the following.

Our case study is taken from Curitiba’s public transportation. Curitiba is a city in Southern Brazil with approximately 2 million people and around 2,200,000 passengers transported daily. Curitiba’s public transportation has become known due to its Bus Rapid Transit (BRT) system [3]. In BRT, buses run in private lanes connecting terminals which are served by local bus lines connecting neighborhoods. There are five types of bus lines: fast express lines, express lines, direct lines, interneighborhood lines, and local lines. Express lines are mainly used to connect terminals and local lines connect the neighborhoods of terminals. About 2,145 buses run every day in the city serving 615 lines and 9,684 stops. Terminal Centenário was taken for analysis. Eight lines and 61 buses operate in this terminal every weekday according to Table 14. Link streams are obtained for June 06, 2016, from bus schedules provided by [25]. In Table 14, line number 1 is an express line which connects terminal Centenário to terminal Campo Comprido. The other lines 2 to 8 are local lines connecting the neighborhood of terminal Centenário.

##### 4.1. Available Transfer Time for Connecting Buses

Figure 5 shows the corresponding link streams generated by lines 1 to 8 for the peak periods of 6:00 to 8:00 am, 12:00 to 2:00 pm, and 5:00 to 7:00 pm, corresponding to 120 min in Figures 5(a)–5(c), respectively. It can be seen from Figure 5 that buses are more concentrated in the morning than in the evening or afternoon in the terminal. This behavior is also highlighted in Section 4.3.

The corresponding cliques obtained for all bus lines during the day according to Section 3.1.1 provide the necessary information to build the transfer time distribution between connecting buses (from 2 to 8 buses) as shown in Figure 6.

According to Figure 6, most connections occur with transfer time of 1 min involving 2 to 6 lines. There are two transfers of 1 and 2 min involving eight lines and 41 transfers of 1 min involving 5 lines. Transfers of more than 7 min occur but they are not relevant. Table 15 shows the average transfer time and respective standard deviation of data in Figure 6.

Two particular transfers of passengers are important as they connect the current terminal with two other terminals. Figure 7 shows the distribution of transfer times between lines 1 and 6 over the whole day. It can be seen that transfer times of 3 min are the most frequent, with a mean of 4.8 min and a standard deviation of 2.26 min. The cumulative distribution shows that 50% of transfer times are below 4 min.

Figure 8 shows the distribution of transfer times between lines 1 and 8 over the whole day. It can be seen that transfer times of 3 and 7 min are the most frequent, with a mean of 6 min and a standard deviation of 3.29 min. The cumulative distribution shows that 50% of transfer times are below 6 min.

These transfer times will be used to compute the number of transferred passengers between local and express lines as described in Section 4.4.

##### 4.2. Bunching

The analysis of bunching was made for the express line 1. This line serves passengers who are transferring from neighborhood lines to the central city. Moreover, bunching in line 1 should be avoided as the platform does not allow more than one express bus at the same time. It means that if an express bus is boarding/dropping off passengers and a new bus arrives then it must wait for using the platform. Although there are 31 buses in line 1 according to Table 14, only 24 buses are running between 6:00 and 8:00 am. Figure 9 shows the corresponding link stream for this time interval of 120 min, and Table 16 presents the bunching events according to Section 3.1.2.

According to Table 16, two bunching events last for 9 min and a bunching of three buses lasts for 4 min (in bold). According to timetables provided by URBS [25], buses 101, 128, 104, 120, 130, 125, and 102 are extra buses; i.e., they are scheduled to increase the transport capacity at given times as shown in Section 4.4.

By removing extra buses 101, 128, 104, 120, 130, 125, and 102, Table 17 shows the remaining bunching events. This might indicate that a reschedule of timetables is necessary.

##### 4.3. Congestion Analysis

The congestion analysis is made by counting the number of buses in the terminal at certain times according to the procedure of Section 3.1.3. Figures 10–12 show the congestion generated by buses of lines 1 to 8 for the peak periods of 6:00 to 8:00 am, 12:00 to 2:00 pm, and 5:00 to 7:00 pm (periods of 120 min), respectively.

As mentioned in Section 4.1, buses are more concentrated in the morning with a maximum of 10 buses at 06:25 am. This behavior does not repeat in the afternoon or evening when maximums of 6 and 8 buses are observed, respectively. Moreover, a total of 674, 371, and 537 buses pass through the terminal in the morning, afternoon, and evening, respectively.

##### 4.4. Number of Transferred Passengers

In this section we check if the number of buses at the terminal can handle the passenger demand arriving from local bus lines. This is accomplished by computing the difference between the number of passengers coming from local buses and the capacity provided by the express line as explained in Section 3.1.4. The express bus capacity is 250 passengers. We are considering 20 m of distance between the boarding and alighting points. The intervals for analysis are also set according to peak times of 6:00 to 8:00 am, 12:00 to 2:00 pm, and 5:00 to 7:00 pm. The number of passengers in local lines is given by historical data provided by URBS. Passengers arriving at the terminal by walking are not considered in this analysis as they represent a small part of passengers passing through the terminal.
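As a worked illustration of the walking component *tw* of (1), the distance of 20 m assumed here and the reference walking speed of 1.20 m/s from Section 3.1.4 give the following value. The conversion to minutes is added only for comparison with the transfer times, which are expressed in minutes.

$$\mathit{tw} = \frac{20\ \text{m}}{1.20\ \text{m/s}} \approx 16.7\ \text{s} \approx 0.28\ \text{min}$$

The boarding and alighting components depend on the bus model and on the number of passengers, so they are not reproduced here.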

Figure 13 shows the link stream for buses between 6:00 and 8:00 am. It is possible to visualize all connections between buses minute by minute during this time interval.

Figure 14 presents an illustrative detail (marked by an arrow in Figure 13) to show the information provided within a time interval expressed in minutes after 6:00 am. During this time interval, passengers are able to make transfers among six buses (123, 401, 501, 601, 703, and 809) as shown in Figure 14.

Table 18 shows the number of passengers transferred from local to express lines between 6:00 and 8:00 am. It is obtained from the link stream and corresponding cliques of Figure 13 as presented in Section 3.1.4.

According to Table 18, some passengers cannot board the express bus as it is full (negative numbers in bold). The most critical event occurs when bus 115 is ready for boarding 365 passengers. In this case, 115 passengers (in bold) must wait for the next express bus 105. When it arrives, 115 plus 67 passengers that newly arrived from local buses should be boarded. This also occurs for express bus 123 which cannot board all 312 passengers, and 62 passengers (negative number in bold) must wait for the next bus 127.
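The carry-over of passengers between consecutive express buses can be sketched as a running balance. The function name `board_with_carry_over` is hypothetical; the capacity of 250 passengers and the figures for buses 115 and 105 are those quoted in the text. The sketch ignores the required transfer time and only tracks capacity.

```python
def board_with_carry_over(demands, capacity):
    """Track the capacity balance of consecutive express buses.

    demands: list of (express_bus, newly_arrived_passengers) in departure order.
    Returns rows (express_bus, passengers_to_board, balance, left_waiting),
    where a negative balance means the express bus departs full.
    """
    rows, waiting = [], 0
    for bus, arriving in demands:
        to_board = waiting + arriving          # leftovers plus new arrivals
        boarded = min(to_board, capacity)
        waiting = to_board - boarded           # passengers left for the next bus
        rows.append((bus, to_board, capacity - to_board, waiting))
    return rows


# Bus 115 faces 365 passengers; 67 new passengers arrive for bus 105.
for row in board_with_carry_over([("115", 365), ("105", 67)], capacity=250):
    print(row)
```

With these figures, bus 115 leaves 115 passengers behind and bus 105 has to board 182 passengers, which fits within its capacity.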

Table 19 shows connecting buses in detail for express buses 115 and 127. Note that not all passengers from local buses 502 and 805 are successful as the required transfer time is greater than the available time (in bold). Only 36 passengers are successful, and 26 passengers from bus 502 and 23 passengers from bus 805 cannot make the transfer, according to (1). A similar case occurs for transferring passengers from buses 502 to 127 (in bold) when 35 passengers cannot make the transfer.

Table 20 shows the number of passengers transferred from local to express lines between 12:00 and 2:00 pm. There are no capacity issues; i.e., no passengers must wait for the next express bus due to lack of available places. However, the required transfer time is an issue for connecting buses 807 with 112 and 808 with 116 according to Table 21.

Table 21 shows in detail that not all passengers of buses 807 and 808 are successful as the required transfer time is greater than the available time (in bold). In both cases, the available transfer time of 1 minute allows 16 passengers to be transferred. Therefore, 22 passengers from bus 807 and 37 from 808 must wait for the next express bus.

Similarly to the previous peak period of Table 20, there are no capacity issues for transferring passengers between local and express lines between 5:00 and 7:00 pm according to Table 22. However, the required transfer time is an issue for connecting buses 601 with 109 as shown in Table 23.

Table 23 shows in detail that not all passengers of bus 601 are successful (in bold). The available transfer time of 1 minute allows 16 passengers to be transferred, and 42 passengers cannot make the transfer to bus 109 according to (1), as an additional 2.45 min would be necessary.

#### 5. Conclusions

In this work, we proposed a methodology for modeling and analysis of bus transportation using link streams. Link streams are temporal graphs whose cliques provide information about sets of nodes linked at given times. This concept was used to model transfers between bus lines. Basically, the methodology is divided into three steps: (i) timetables of arrivals and departures of buses in a terminal (or hub) provide information about pairs of buses waiting for boarding or dropping off passengers in a given time interval; (ii) the corresponding link stream is then built; (iii) an efficient algorithm computes the maximal cliques of the link stream. Performance metrics such as transfer time, bunching, congestion, and number of transferred passengers are proposed and obtained from cliques. In particular, the results were obtained from real-world data of a bus terminal in the city of Curitiba, Brazil. This case study revealed transfer times of around 1 to 4 min, which seem to be suitable for most passengers. However, the available transfer time between local and express lines can be an issue at certain times, as not all passengers are successful and some must wait for the next express bus. This effect can be combined with an excess of passengers from previous arrivals, causing a cascade effect which delays passengers' departures, so that it takes time to resume the normal operation of the terminal. Bunching events are also evaluated and should be carefully considered as they can be caused either by bad bus schedules or intentionally by extra buses which increase the transport capacity during certain periods of time. It is also clear from congestion metrics of the terminal that there is more traffic of buses and passengers in the morning than in the afternoon or evening. The relevant result is that these performance metrics were obtained from cliques of link streams generated from bus timetables. This methodology has good scalability properties and can also be applied to other modes of transport.

#### Data Availability

The Curitiba public bus transport data used to support the findings of this study are available through .csv files from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This study was financed in part by CAPES, Brazil (Edital 013/2012-DINTER UTFPR/IFSC). The authors also thank KTH, URBS, Setransp, and the Municipality of Curitiba for technical support.