#### Abstract

To better understand the mechanism of air traffic delay propagation at the system level, an efficient modeling approach based on the epidemic model for delay propagation in airport networks is developed. The normal release rate (NRR) and average flight delay (AFD) are considered to measure airport delay. Through fluctuation analysis of the average flight delay based on complex network theory, we find that the long-term dynamic of airport delay is dominated by the propagation factor (PF), which reveals that the long-term dynamic of airport delay should be studied from the perspective of propagation. An integrated airport-based Susceptible-Infected-Recovered-Susceptible (ASIRS) epidemic model for air traffic delay propagation is developed from the network-level perspective, to create a simulator for reproducing the delay propagation in airport networks. The evolution of airport delay propagation is obtained by analyzing the phase trajectory of the model. The simulator is run using the empirical data of China. The simulation results show that the model can reproduce the evolution of the delay propagation in the long term and its accuracy for predicting the number of delayed airports in the short term is much higher than the probabilistic prediction method. The model can thus help managers as a tool to effectively predict the temporal and spatial evolution of air traffic delay.

#### 1. Introduction

Flight delay is one of the most important performance indicators of the air transportation system. Delays have become increasingly serious and directly damage the quality of civil aviation services, for example through declines in operational safety, increases in operating costs, and more serious environmental pollution. Notably, commercial aviation players understand delays as the difference between the scheduled and real times of departing or arriving flights [1]. According to the Federal Aviation Administration (FAA), a flight can be considered delayed if the operation takes place 15 minutes after schedule [2].

The delay of an individual flight seems to be random at a glance. A flight delay may be transferred and amplified by consequent operations. Some delays that originate from upstream flights spread to downstream flights, which is particularly evident when an aircraft flies multiple flight legs. This phenomenon is defined as delay propagation (DP) [3–6]. Actually, DP causes delays to obey certain statistical laws [7] when long-term delay records for a large number of flights are taken into account, which form airport delay propagation. Consequently, a congested airport may propagate delays to connecting airports through the delayed flights, which eventually has an impact on the performance of a significant part of the entire network [8]. The delay can be magnified when it is examined in multiairport networks [9].

There are many factors affecting airport delays. The factors can be divided into two categories: propagation factors (PF) (airport delays caused by connected delayed airports) and nonpropagation factors (NPF) (airport delays caused by original factors, such as extreme weather and equipment trouble) [10]. As the number of flights increases, increasingly more airport delays are caused by the PF [11]. If there are many delayed flights in one airport, the connected airports may become delayed, which can affect further operations in a cascade-like effect. Due to the complexity of air transportation, the mechanisms of delay propagation at the airport level are not fully understood. Therefore, research on the mechanism of delay propagation is timely yet challenging.

Delay spreading has received considerable attention from the Air Traffic Management (ATM) community during the last decade. Some studies [12–14] have established flight delay propagation models based on Bayesian networks and analyzed the internal factors influencing air traffic delay propagation. Fleurquin [15] introduced an agent-based model that reproduces the delay propagation patterns observed in U.S. performance data and identified passenger and crew connectivity as the most relevant internal factor contributing to delay spreading. Qiu [16] constructed a joint distribution of continuous flight delays by using the 2D copula function. Wong and Tsai [17] established a Cox proportional hazards model for flight delay propagation based on survival analysis theory. Kafle and Zou [3] proposed a delay propagation model based on an econometric method and analyzed the influencing factors by using the Heckman two-step method.

To the best of our knowledge, gaps still remain in the understanding of delay propagation in airport networks. The process of delay propagation needs to be analyzed from a broader, network-based perspective because flight scheduling for airlines and airport operations is increasingly synchronized at the level of network operation. The links in an airport network are the direct operations by the airlines connecting the airports [18], which is an essential feature of the network structure of the air traffic system. Therefore, the propagation dynamics cannot be understood without referring to the underlying complex network structure. Some scholars have used complex network theory to characterize transportation [19–22]. The initial studies [23, 24] identified a high heterogeneity in the traffic sustained by each edge. Most airport networks exhibit a heavy-tailed degree distribution, which is often well approximated over a significant range of degree values by a power law ($P(k) \sim k^{-\lambda}$), from which the name “scale-free network” originated [25–29]. Further studies have used complex network characteristics to explain the propagation of air traffic congestion and flight delay [29–34]. From a macroperspective, airport delay is usually caused by air traffic congestion. Thus, the application of complex network theory to air traffic problems is feasible. However, most of these studies focused on the delay propagation between sequential flights from the perspective of single-airport operation.

As mentioned above, due to the large number of airports and their complex interactions, the features of delay propagation cannot be understood from the information of an individual airport. Complex network theory and its associated metrics and tools offer a suitable approach to studying the air transport system beyond what classical techniques provide. In this paper, we propose a network-based approach to modeling airport delay propagation. There are several classical epidemic models in complex network theory, such as the Susceptible-Infected (SI) model, the Susceptible-Infected-Susceptible (SIS) model, the Susceptible-Infected-Recovered (SIR) model, and the Susceptible-Infected-Recovered-Susceptible (SIRS) model. Because the propagation mechanism of SIRS is the most similar to that of airport delay propagation (the details can be seen in Section 4.1), the SIRS model is used in this paper to understand the process of air traffic delay propagation in the context of an airport network and to explain the spreading characteristics between different airports. The SIRS model has normally been used to simulate how diseases [35], safety risks [36], or computer viruses [37] spread.

To the best of the authors’ knowledge, this study is the first to apply the SIRS model to air traffic delay propagation. First, the metrics for measuring airport delays are introduced. Then, the fluctuation of airport delay is studied at different time scales based on complex network theory in order to determine whether it is dominated by the PF or the NPF. Next, an integrated airport-based SIRS (ASIRS) model is developed. Finally, the effectiveness and accuracy of this model are demonstrated using empirical data of China.

The outline of the remainder of this paper is as follows. The data sources and a measurement of airport delay are provided in Section 2. Section 3 is devoted to fluctuations of the average flight delay and the determination of airport delay status. The ASIRS model is established in Section 4. The data-driven description of the ASIRS model is presented in Section 5. Finally, Section 6 contains the conclusions.

#### 2. Data Description and Airport Delay Measurement

The dataset analyzed in this paper is provided by the Civil Aviation Administration of China (CAAC) and consists of all flight information of China from June to December 2015. It contains 93630 flight records in total, and the number of carriers in the data is 295. Each historical flight record consists of flight ID, date of flight, real and scheduled departure (arrival) times (Beijing Time), origin, and destination. A short sample of the original database is shown in Table 1.

There are 205 domestic airports in the database. A sample of airports is shown in Table 2.

The identification of the airport delay state is the first problem to be solved. The delay state of an airport is the aggregate manifestation of individual flight delays. Thus, the airport delay is measured by the delays of arriving and departing flights.

In our study, the normal release rate (NRR) and average flight delay (AFD) are considered to measure airport delay:

(1) The NRR of an airport is the ratio of the number of normally released flights to the total number of departure flights. According to the Federal Aviation Administration (FAA), a flight is considered abnormal if the departure operation takes place more than 15 minutes after schedule:

$$\mathrm{NRR} = \frac{N_{\mathrm{dep}} - N_{\mathrm{abn}}}{N_{\mathrm{dep}}}, \tag{1}$$

where $N_{\mathrm{dep}}$ represents the number of departure flights and $N_{\mathrm{abn}}$ represents the number of abnormal flights.

(2) The AFD of an airport is the ratio of the total delay time to the total number of all the departure and arrival flights of the airport:

$$\mathrm{AFD} = \frac{\sum_{j=1}^{N_{\mathrm{dep}} + N_{\mathrm{arr}}} d_j}{N_{\mathrm{dep}} + N_{\mathrm{arr}}}, \tag{2}$$

where $N_{\mathrm{arr}}$ represents the number of arrival flights and $d_j$ represents the delay of flight $j$.

The NRR and AFD are counted every 1 hour using the above database. Partial statistical results are shown in Tables 3 and 4.
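As an illustration of the hourly counting, the following is a minimal sketch rather than the authors' implementation. It assumes NRR is the share of departures that are not more than 15 minutes late and AFD is the total delay divided by the number of departures plus arrivals. The record keys (`origin`, `destination`, `sched_dep`, `act_dep`, `sched_arr`, `act_arr`) are hypothetical, flights are binned by scheduled hour, and early operations are counted as zero delay; all three choices are assumptions.

```python
from collections import defaultdict
from datetime import datetime

ABNORMAL_THRESHOLD_MIN = 15  # departure delay (minutes) above which a flight is abnormal


def delay_minutes(scheduled, actual):
    """Delay in minutes; early operations count as zero delay (an assumption)."""
    return max(0.0, (actual - scheduled).total_seconds() / 60.0)


def hour_bin(moment):
    """Start of the hour that contains the given datetime."""
    return moment.replace(minute=0, second=0, microsecond=0)


def hourly_metrics(flights):
    """Return {(airport, hour_start): (nrr, afd)} from flight records.

    Each record is a dict with the hypothetical keys origin, destination,
    sched_dep, act_dep, sched_arr, act_arr (datetime objects).
    """
    dep_count = defaultdict(int)
    abnormal = defaultdict(int)
    arr_count = defaultdict(int)
    total_delay = defaultdict(float)

    for f in flights:
        dep_key = (f["origin"], hour_bin(f["sched_dep"]))
        dep_delay = delay_minutes(f["sched_dep"], f["act_dep"])
        dep_count[dep_key] += 1
        total_delay[dep_key] += dep_delay
        if dep_delay > ABNORMAL_THRESHOLD_MIN:
            abnormal[dep_key] += 1

        arr_key = (f["destination"], hour_bin(f["sched_arr"]))
        arr_count[arr_key] += 1
        total_delay[arr_key] += delay_minutes(f["sched_arr"], f["act_arr"])

    metrics = {}
    for key in set(dep_count) | set(arr_count):
        n_dep, n_arr = dep_count[key], arr_count[key]
        # NRR is undefined for an hour without departures.
        nrr = (n_dep - abnormal[key]) / n_dep if n_dep else None
        afd = total_delay[key] / (n_dep + n_arr)
        metrics[key] = (nrr, afd)
    return metrics
```

Binning by scheduled rather than actual time is a design choice of this sketch; the source does not state which one was used.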

#### 3. Fluctuations of Airport Delay and Determination of Airport Delay Status

##### 3.1. Fluctuations of Airport Delay

To explore the propagation laws of airport delay, the delay fluctuations should be studied from the perspective of airport networks. Fluctuations can be considered by investigating the coupling between the average flux and the fluctuations, which is in effect a mean and standard deviation analysis, as developed in [38–42]. It is found that the standard deviation and the average flux on individual nodes obey a unique scaling law as

$$\sigma_i \sim \langle f_i \rangle^{\alpha}, \tag{3}$$

where $f_i(t)$ denotes the flux of node $i$ in time interval $t$, $\langle f_i \rangle$ is its time average, $\sigma_i$ is its standard deviation, and $\alpha$ is the scaling exponent.

As the strength of the external driving force increases, the value of $\alpha$ gradually increases.

A method to separate the internal dynamics from the external fluctuations of complex systems is also proposed in [38–42].

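The dynamical variable $f_i(t)$ can be separated into two components:

$$f_i(t) = f_i^{\mathrm{ext}}(t) + f_i^{\mathrm{int}}(t), \tag{4}$$

where $f_i^{\mathrm{ext}}(t)$ is generated by external factors and $f_i^{\mathrm{int}}(t)$ is generated by internal factors. They can be described as follows:

$$f_i^{\mathrm{ext}}(t) = A_i \sum_{j=1}^{N} f_j(t), \qquad A_i = \frac{\sum_{t=1}^{T} f_i(t)}{\sum_{t=1}^{T} \sum_{j=1}^{N} f_j(t)}, \qquad f_i^{\mathrm{int}}(t) = f_i(t) - f_i^{\mathrm{ext}}(t), \tag{5}$$

where $N$ is the number of nodes and $T$ is the number of time intervals.

Furthermore, whether the fluctuations are mainly internally or externally imposed can be determined from the ratio of the standard deviations of the two components:

$$\eta_i = \frac{\sigma_i^{\mathrm{ext}}}{\sigma_i^{\mathrm{int}}}. \tag{6}$$

If $\eta_i \gg 1$, the system dynamics are dominated by the network-wide factors, while for $\eta_i \ll 1$, local dynamics overshadow the network-imposed changes.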
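The separation can be sketched in a few lines. This is an illustrative sketch, not the authors' code. It assumes the usual form of the method in which the external part of node $i$ at each time step is its long-run share of the total flux multiplied by the network-wide total at that step, and the internal part is the remainder. The function name `external_internal_ratio` and the input layout are hypothetical.

```python
from statistics import pstdev


def external_internal_ratio(flux):
    """Return {node: sigma_ext / sigma_int}.

    flux maps each node to its time series f_i(t); all series must have
    the same length (hypothetical input layout).
    """
    nodes = list(flux)
    n_steps = len(flux[nodes[0]])
    # Network-wide total flux at each time step.
    total_per_step = [sum(flux[i][t] for i in nodes) for t in range(n_steps)]
    grand_total = sum(total_per_step)

    ratios = {}
    for i in nodes:
        share = sum(flux[i]) / grand_total               # long-run share of node i
        external = [share * tot for tot in total_per_step]
        internal = [f - e for f, e in zip(flux[i], external)]
        sigma_int = pstdev(internal)
        # A node whose flux tracks the network total exactly has no internal part.
        ratios[i] = pstdev(external) / sigma_int if sigma_int > 0 else float("inf")
    return ratios
```

A ratio well above 1 for most airports would indicate that network-wide factors dominate, which is the reading the text applies to the AFD series.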

We aggregate the data and carry out the scaling law analysis at different time scales. In our study, $f_i$ represents the AFD of airport $i$ aggregated at the time scale $\Delta t$, and $\sigma_i$ represents its standard deviation.

Figure 1 shows the relationship between $\sigma_i$ and $\langle f_i \rangle$, with time scale $\Delta t$ = 1 h, 3 h, and 6 h. The scaling law between $\sigma_i$ and $\langle f_i \rangle$ can be clearly observed.
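The scaling exponent can be estimated as the slope of a straight-line fit in log-log coordinates. The sketch below is one common way to do this and is not necessarily the fitting procedure used for Figure 1. The function name `scaling_exponent` and the input layout (one AFD series per airport at a fixed time scale) are assumptions.

```python
import math
from statistics import mean, pstdev


def scaling_exponent(series_by_airport):
    """Least-squares slope of log(standard deviation) against log(mean).

    series_by_airport maps an airport code to its AFD time series at one
    fixed time scale (hypothetical input layout).
    """
    xs, ys = [], []
    for series in series_by_airport.values():
        avg, sigma = mean(series), pstdev(series)
        if avg > 0 and sigma > 0:  # the logarithm is undefined otherwise
            xs.append(math.log(avg))
            ys.append(math.log(sigma))
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx
```

Running the function once per time scale (1 h, 3 h, 6 h) on the correspondingly aggregated series would give one exponent per panel.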

**Figure 1:** Relationship between the standard deviation and the mean of AFD at different time scales; panels (a)–(c).

It can be seen that the value of the scaling exponent $\alpha$ increases as the time scale $\Delta t$ increases, suggesting that the system may have an inhomogeneous influence, as pointed out by Eisler and Kertesz [43]. The reason for this result is that the fluctuations of AFD are due mainly to network-wide factors such as the PF when $\Delta t$ is larger; on the contrary, the fluctuations of AFD are due mainly to local factors such as the NPF when $\Delta t$ is smaller. When $\Delta t$ is larger, airport delays may be caused by connected delayed airports: some delays that originate from upstream flights spread to downstream flights, which is particularly evident when an aircraft flies multiple flight legs. When $\Delta t$ is smaller, airport delays may be caused by original factors, such as extreme weather and equipment trouble. Additionally, the ratio $\eta$ of external to internal fluctuations for the 1-hour interval is calculated using the above method. The result reveals that the average $\eta$ is 5.665353, which shows that the dynamic of airport delay at the 1-hour scale is dominated by the PF. The larger the time scale is, the larger the value of $\eta$ is: when the time scale is 3 hours, the value of $\eta$ is 6.431234, and when the time scale is 6 hours, the value of $\eta$ is 8.534778. Thus, we have to study how the delay originating from an airport propagates to other airports at a large time scale.

##### 3.2. Determination of Airport Delay Status

NRR and AFD are used to determine whether an airport is in the delay state. The specific criterion is as follows: if , airport $i$ is in the delay state in the time interval $t$. To explore the characteristics of airport delay propagation, the length of the time interval is 1 hour, as the dynamic of airport delay at the 1-hour scale is dominated by the PF. When delay propagates in the airport network, the airport delay is usually severe, whereas the airport delay induced by the NPF is small at the time scale of one hour. Thus, in order to eliminate the influence of the NPF, the AFD threshold should be relatively large and the NRR threshold should be relatively small. Here, and .

#### 4. Epidemic Model of Airport Delay Propagation

##### 4.1. Airport-Based Susceptible-Infected-Recovered-Susceptible Model

From the analysis of delay propagation above, we find that the epidemic model in a complex network is a valuable research tool for the exploration of fundamental laws and trends of delay propagation in airport networks. There are three kinds of individuals in the SIRS model: susceptible ones (*S*), infected ones (*I*), and recovered ones (*R*). The susceptible ones are currently in a healthy state, and when they contact the source of the infection, they become infected ones with an infection rate $\beta$. The infected ones are unhealthy, and they can infect susceptible ones. The infected ones are cured with a cure rate $\gamma$ and become recovered ones, which are healthy and have immunity. The immunity disappears under certain situations, and the recovered ones become susceptible ones with an immunity-loss rate $\delta$. The infectious mechanism is described in Figure 2.

In an airport network, the original airport delay may be due to capacity reduction, airport equipment trouble, and extreme weather. In the process of delay propagation for resource-shared flights, delays are propagated from an upstream flight at the departure airport to the arrival airport. As shown in Figure 3, the airports with “delay root” represent the susceptible ones, the airports with “delay propagation” represent the infected ones, and the airports with “delay termination” represent the recovered ones. The propagation of airport delay has traditionally been described as graphs with vertices representing airports and edges representing connectivity. When the delay is serious in one airport, the delay of its connected airports may be increased due to the delay spreading. Furthermore, the delay of spread airports may be absorbed in the subsequent operations, and they would not be influenced again by the same initial airport delay. However, they may be affected by another original airport delay later. Because of the complexity of airport networks [32], the evolution of delay within them possesses the characteristics of propagation in complex networks.

From the above analysis, we find that the infectious mechanism of SIRS is similar to the propagation characteristics of airport delay discussed before. Suppose there are three kinds of airports in the network at time $t$: non-delayed airports (*S*), which are easily infected; delayed airports (*I*); and recovered airports (*R*), which used to be delayed but are back to normal. The recovered airports only have immunity to the current delay spread and may become susceptible ones later. As the stochastic process is applied to all flights, the airports are affected probabilistically.

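The dynamics of the ASIRS model can be written as

$$\begin{cases} \dfrac{dS(t)}{dt} = -\beta S(t) I(t) + \delta R(t), \\[2mm] \dfrac{dI(t)}{dt} = \beta S(t) I(t) - \gamma I(t), \\[2mm] \dfrac{dR(t)}{dt} = \gamma I(t) - \delta R(t), \end{cases} \tag{7}$$

where $S(t)$, $I(t)$, and $R(t)$ represent the fraction of susceptible airports, infected airports, and recovered airports, respectively, at time $t$; $\beta$ is the infection rate; $\gamma$ is the cure rate; and $\delta$ is the immunity-loss rate.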
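To make the dynamics concrete, the following sketch integrates a standard SIRS system numerically. It assumes the usual mass-action form, in which susceptible airports become infected at rate $\beta S I$, infected airports recover at rate $\gamma I$, and recovered airports lose immunity at rate $\delta R$. The function names and the parameter values in the usage line are illustrative only and are not the calibrated values of the paper.

```python
def asirs_rhs(state, beta, gamma, delta):
    """Right-hand side of the assumed SIRS equations for the fractions (S, I, R)."""
    s, i, r = state
    return (
        -beta * s * i + delta * r,   # dS/dt
        beta * s * i - gamma * i,    # dI/dt
        gamma * i - delta * r,       # dR/dt
    )


def simulate_asirs(s0, i0, r0, beta, gamma, delta, t_end, dt=0.01):
    """Integrate with the classical fourth-order Runge-Kutta scheme.

    Returns a list of (t, S, I, R) tuples, including the initial state.
    """
    state = (s0, i0, r0)
    trajectory = [(0.0, s0, i0, r0)]
    n_steps = int(round(t_end / dt))
    for n in range(1, n_steps + 1):
        k1 = asirs_rhs(state, beta, gamma, delta)
        k2 = asirs_rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)), beta, gamma, delta)
        k3 = asirs_rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)), beta, gamma, delta)
        k4 = asirs_rhs(tuple(x + dt * k for x, k in zip(state, k3)), beta, gamma, delta)
        state = tuple(
            x + dt * (a + 2 * b + 2 * c + d) / 6.0
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        trajectory.append((n * dt, *state))
    return trajectory


# Illustrative run: 90% susceptible and 10% infected airports at t = 0.
trajectory = simulate_asirs(0.9, 0.1, 0.0, beta=0.5, gamma=0.2, delta=0.05, t_end=50.0)
```

Because the three right-hand sides sum to zero, $S + I + R$ stays equal to 1 up to rounding error, which is a quick sanity check on any implementation.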

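Assume that the proportion of infected airports, susceptible airports, and recovered airports at the initial moment is $I_0$, $S_0$, and $R_0$, respectively:

$$I(0) = I_0, \qquad S(0) = S_0, \qquad R(0) = R_0. \tag{8}$$

The phase trajectory of the ASIRS model is analyzed. The S-I plane is called the phase plane, and the domain of the phase trajectory is $D$:

$$D = \{(S, I) \mid S \ge 0,\ I \ge 0,\ S + I \le 1\}. \tag{9}$$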

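Let $\rho = \gamma/\beta$, where $\beta$ is the infection rate and $\gamma$ is the cure rate. Neglecting the immunity-loss term within a single propagation episode and eliminating $dt$ from the equations for $S$ and $I$ gives $dI/dS = \rho/S - 1$, from which the following equation can be obtained:

$$I = (S_0 + I_0) - S + \rho \ln \frac{S}{S_0}, \tag{10}$$

where $S_0$ and $I_0$ are the initial fractions of susceptible and infected airports.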

The phase trajectory diagram is shown in Figure 4.

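When $t \to \infty$, the limit values of $S(t)$, $I(t)$, and $R(t)$ are $S_\infty$, $I_\infty$, and $R_\infty$, respectively. In the following, $\rho = \gamma/\beta$ denotes the ratio of the cure rate $\gamma$ to the infection rate $\beta$, and $S_0$ and $I_0$ are the initial fractions of susceptible and infected airports.

(1) No matter how the initial values $S_0$ and $I_0$ change, the airport delay situation will eventually disappear, $I_\infty = 0$.

(2) In equation (10), let $I = 0$. The value of $S_\infty$ can be calculated, which is the root of equation (11) in the range $(0, \rho)$. $S_\infty$ is the abscissa of the intersection point between the phase trajectory and the transverse axis in the range $(0, \rho)$:

$$S_0 + I_0 - S_\infty + \rho \ln \frac{S_\infty}{S_0} = 0. \tag{11}$$

(3) If $S_0 > \rho$, $I(t)$ increases first; when $S = \rho$, $I(t)$ reaches its maximum and then decreases to zero. At the same time, $S(t)$ is monotonically reduced to $S_\infty$:

$$I_{\max} = S_0 + I_0 - \rho \left(1 + \ln \frac{S_0}{\rho}\right), \tag{12}$$

where $I_{\max}$ is the maximum of $I(t)$.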

According to the above analysis, the following conclusions can be drawn, where $\rho = \gamma/\beta$ is the ratio of the cure rate $\gamma$ to the infection rate $\beta$:

(1) If $S > \rho$, $I$ increases and the airport delay will spread to more airports.

(2) If $S < \rho$, $I$ decreases, the delay situation of the airport network will be alleviated, and the airport delay will not spread to others.

(3) If $S = \rho$, $I$ reaches the maximum and the delay situation of the airport network is the most serious.
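The threshold behaviour can be checked numerically. The sketch below is an illustration under a stated assumption: it uses the classical phase trajectory $I = S_0 + I_0 - S + \rho \ln(S/S_0)$ with $\rho = \gamma/\beta$, which holds when the immunity-loss term is neglected. The function names and the bisection tolerance are hypothetical choices.

```python
import math


def peak_infected(s0, i0, rho):
    """Maximum infected fraction along the trajectory (reached where S = rho)."""
    if s0 <= rho:
        return i0  # I never grows, so the initial value is already the maximum
    return s0 + i0 - rho * (1.0 + math.log(s0 / rho))


def final_susceptible(s0, i0, rho, tol=1e-12):
    """Root of S0 + I0 - S + rho*ln(S/S0) = 0 on (0, min(S0, rho)), by bisection."""
    def g(s):
        return s0 + i0 - s + rho * math.log(s / s0)

    lo, hi = 1e-300, min(s0, rho)
    if g(lo) > 0:
        return 0.0  # the root lies below floating-point resolution
    # g is negative at lo, non-negative at hi, and increasing in between.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

For example, with $S_0 = 0.9$, $I_0 = 0.1$, and an illustrative $\rho = 0.4$, the initial susceptible fraction exceeds the threshold, so the infected fraction first grows before it declines.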

##### 4.2. Parameter Analysis

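As mentioned above, the ASIRS model contains three parameters: the infection rate $\beta$, the cure rate $\gamma$, and the immunity-loss rate $\delta$. To investigate the change of $S$, $I$, and $R$ over time, suppose , , and . The change of $S$, $I$, and $R$ is shown in Figure 5.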

###### 4.2.1. Analysis of Parameter $\beta$

First, the influence of the infection rate $\beta$ on airport delay propagation is discussed. Figure 6 shows the changes of $S$, $I$, and $R$ under different values of $\beta$ with the assumptions that .

**Figure 6:** Changes of $S$, $I$, and $R$ under different values of $\beta$; panels (a)–(c).

As seen, the higher $\beta$ is, the earlier $I$ and $R$ reach their peak values and the higher the peak values are. The higher $\beta$ is, the earlier $S$ reaches its minimum and the lower the minimum is. Thus, it can be concluded that, as the speed at which airport delay propagates increases, more airports will be infected more quickly, and more airports will be recovered.

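###### 4.2.2. Analysis of Parameter $\gamma$

Figure 7 shows the changes of $S$, $I$, and $R$ under different values of the cure rate $\gamma$ with the assumption that .

**Figure 7:** Changes of $S$, $I$, and $R$ under different values of $\gamma$; panels (a)–(c).

When the value of $\gamma$ becomes smaller, the peak of $I$ appears later and the trough of $S$ also appears later. The trends of the curves with $\gamma$ = 0.5/1 and $\gamma$ = 0.1/0.2 are not the same. Thus, the influence of $\gamma$ is much more complex than that of the infection rate $\beta$.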

###### 4.2.3. Analysis of Parameter $\delta$

Figure 8 shows the changes of $S$, $I$, and $R$ under different values of the immunity-loss rate $\delta$ with the assumption that .

**Figure 8:** Changes of $S$, $I$, and $R$ under different values of $\delta$; panels (a)–(c).

The higher $\delta$ is, the earlier $I$ reaches its peak value and the higher the peak value is. The lower $\delta$ is, the later $S$ reaches its minimum, the lower the minimum is, and the higher $R$ is. Parameter $\delta$ has little effect on $S$, $I$, and $R$ in the early stage and mainly affects the later stage.

#### 5. Case Study

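##### 5.1. Statistical Calculation of $S$, $I$, and $R$

According to Section 3.2, we calculate $S$, $I$, and $R$ based on the following criteria:

(1) If airport $i$ satisfies the delay criterion of Section 3.2 in the time interval $t$, airport $i$ is infectious at time $t$.

(2) If airport $i$ satisfies the delay criterion in the time interval $t-1$ but not in the time interval $t$, the airport is recovered at time $t$ and infectious at time $t-1$.

(3) Apart from the above situations, the airport is susceptible and can be easily infected by an infectious airport.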
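The labeling can be sketched as follows. This is an illustration under an assumed reading of the criteria: an airport is infectious in every hour in which it meets the delay criterion, recovered in the first hour after an infectious hour, and susceptible otherwise. That reading matches the ZBNY example in Section 5.2, where each infected period is followed by one recovered hour. The function names and the boolean input layout are hypothetical.

```python
def classify_states(delayed):
    """Label each hour 'S', 'I', or 'R' for one airport.

    delayed is a list of booleans, one per hour, that is True when the
    airport meets the delay criterion in that hour (hypothetical input).
    """
    states = []
    for t, flag in enumerate(delayed):
        if flag:
            states.append("I")               # delayed now: infectious
        elif t > 0 and delayed[t - 1]:
            states.append("R")               # delayed one hour ago: recovered
        else:
            states.append("S")               # otherwise: susceptible
    return states


def network_fractions(delayed_by_airport):
    """Per-hour fractions (S, I, R) over all airports.

    Every airport must provide the same number of hourly flags.
    """
    states = [classify_states(flags) for flags in delayed_by_airport.values()]
    n_airports = len(states)
    n_hours = len(states[0])
    fractions = []
    for t in range(n_hours):
        column = [s[t] for s in states]
        fractions.append(
            tuple(column.count(label) / n_airports for label in ("S", "I", "R"))
        )
    return fractions
```

The hourly fractions returned by `network_fractions` are the quantities against which the model parameters would then be calibrated.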

##### 5.2. Determination of ASIRS Parameters

The hourly states of all the airports in the network can be identified based on the above criteria using Tables 1 and 2. Taking the airport ZBNY on October 1, 2015, for example, Figure 9(a) shows the time-varying state of airport ZBNY. It can be seen that the airport ZBNY is infected in {6:00–10:00, 14:00–20:00}, susceptible in {0:00–6:00, 11:00–14:00, 21:00–24:00}, and recovered in {10:00–11:00, 20:00–21:00}. We also investigate the air traffic flow of ZBNY, which is shown in Figure 9(b). Comparing Figures 9(a) and 9(b), we find that the larger the traffic flow is, the more the airport tends to be infected. It should be noted that the flight delay can be serious even when the airport has a small number of flights, which may be induced by the delay propagation in the airport network.

**Figure 9:** (a) Time-varying state of airport ZBNY; (b) air traffic flow of ZBNY.

In addition, we find that there are 7 infectious airports in 4:00–5:00: ZBAA, ZGSZ, ZUCK, ZGGG, ZSHC, ZSPD, and ZHCC. Six of these are among the top 10 airports by throughput in China (the throughput of airports is provided by the CAAC), as shown in Table 5. These six airports handle many flights. Large departure delays may influence the operations of the arrival airports, and flights will then arrive at or depart from those airports with large delays. Thus, the phenomenon of delay propagation appears.

The simulation model is the ASIRS model established in Section 4.1, as shown in equation (7). The simulation method is as follows: we use the real flight information of China to calculate the values of $S$, $I$, and $R$ and then calibrate the infection rate $\beta$, the cure rate $\gamma$, and the immunity-loss rate $\delta$. Thus, the ASIRS model for simulating the real flight data can be obtained.

Next, we calculate the values of $S$, $I$, and $R$. Partial statistical results are shown in Table 6.

The parameters in the ASIRS model are calibrated based on the statistical values of $S$, $I$, and $R$ by using the numerical simulation method. An ASIRS model can be calibrated in this way for each day's traffic situation. Taking October 1, 2015 (there are 16804 scheduled flights connecting 199 different commercial airports), for example, we find that the model fits the actual operation situation best when .
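One simple way to carry out such a calibration is an exhaustive grid search that minimizes the squared error between simulated and observed hourly fractions. The sketch below uses that technique as a stand-in; the source only states that a numerical simulation method is used, so the grid search, the forward-Euler integrator, the function names, and the candidate values are all assumptions.

```python
from itertools import product


def simulate_hourly(s0, i0, r0, beta, gamma, delta, hours, substeps=20):
    """Forward-Euler integration of the assumed SIRS equations, sampled hourly."""
    s, i, r = s0, i0, r0
    dt = 1.0 / substeps
    samples = [(s, i, r)]
    for _ in range(hours):
        for _ in range(substeps):
            ds = -beta * s * i + delta * r
            di = beta * s * i - gamma * i
            dr = gamma * i - delta * r
            s, i, r = s + dt * ds, i + dt * di, r + dt * dr
        samples.append((s, i, r))
    return samples


def calibrate(observed, candidates):
    """Return ((beta, gamma, delta), error) with the smallest squared error.

    observed is a list of hourly (S, I, R) fractions; candidates is the
    list of values tried for each of the three parameters.
    """
    s0, i0, r0 = observed[0]
    hours = len(observed) - 1
    best, best_err = None, float("inf")
    for beta, gamma, delta in product(candidates, repeat=3):
        simulated = simulate_hourly(s0, i0, r0, beta, gamma, delta, hours)
        err = sum(
            (a - b) ** 2
            for obs, sim in zip(observed, simulated)
            for a, b in zip(obs, sim)
        )
        if err < best_err:
            best, best_err = (beta, gamma, delta), err
    return best, best_err
```

A grid search scales cubically with the number of candidate values, so a coarse grid followed by a finer one around the best point is a common refinement.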

##### 5.3. Accuracy of the Long-Term Simulation of Delay Propagation

The values of $S$, $I$, and $R$ for October 1, 2015, in Figure 10 are the actual values, which are counted based on the database. The values of $S$, $I$, and $R$ in Figure 11 are the simulations of the ASIRS model for that day.

Comparing Figure 10 with Figure 11, it can be seen that the long-term trends of the actual and predicted values of $S$, $I$, and $R$ are similar, although there are some differences between local values. Thus, from a qualitative point of view, we can conclude that the ASIRS model is reasonable and can describe the real situation of airport delay propagation to a certain extent.

To further examine the application of the ASIRS model, we also study the characteristics of airport delay propagation derived from the degree distribution.

First, the ASIRS models for the peak period of 12:00–19:00, October 1, 2015, and the nonpeak period of 00:00–07:00, October 1, 2015, are constructed. The parameter values of the two models are determined by trial calculation using the actual data:(1) 12:00–19:00: (2) 00:00–07:00:

The values of 12 : 00–19 : 00 are bigger than those of 00 : 00–07 : 00, and the values of 12 : 00–19 : 00 are smaller than those of 00 : 00–07 : 00. According to the parameter analysis mentioned above, the conclusion can be obtained that airport delays propagate faster and wider in 12 : 00–19 : 00 than in 00 : 00–07 : 00.

The relative cure rate $\rho$ is the ratio of the cure rate $\gamma$ to the infection rate $\beta$:

$$\rho = \frac{\gamma}{\beta}, \tag{13}$$

where $\rho$ reflects the recovery speed of delay propagation. The larger $\rho$ is, the more the delay propagation tends to alleviate. The value of $\rho$ for 12:00–19:00 is 0.625; the value of $\rho$ for 00:00–07:00 is 0.737. The relative cure rate of delay propagation in 00:00–07:00 is higher than that in 12:00–19:00. This is consistent with the actual situation.

Next, the degree and the degree distribution of the Chinese airport network in the above two periods are counted. The degree of airport A represents the number of airports with flights to or from A in the statistical time period. Thus, the same airport may have different degrees at different time periods. Partial results of the statistics are shown in Table 7.
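The degree count for a time window can be sketched as below. The input layout, a list of `(origin, destination, departure_hour)` tuples, and the function names are hypothetical; the degree itself follows the definition in the text, namely the number of distinct airports with flights to or from the airport within the window.

```python
from collections import Counter, defaultdict


def airport_degrees(flights, start_hour, end_hour):
    """Degree of each airport for flights departing in [start_hour, end_hour)."""
    neighbors = defaultdict(set)
    for origin, destination, hour in flights:
        if start_hour <= hour < end_hour:
            # A flight links both endpoints, whatever its direction.
            neighbors[origin].add(destination)
            neighbors[destination].add(origin)
    return {airport: len(linked) for airport, linked in neighbors.items()}


def degree_distribution(degrees):
    """Empirical P(k): share of airports that have degree k."""
    counts = Counter(degrees.values())
    n_airports = len(degrees)
    return {k: c / n_airports for k, c in sorted(counts.items())}
```

Airports without any flight in the window do not appear in the result, which is one reason the same airport can have different degrees in different periods.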

As seen in Table 7, the maximum airport degrees of the nonpeak period and the peak period are 92 and 275, respectively. Figure 12 shows the relationship between the degree distribution $P(k)$ and the degree $k$, with time periods 00:00–07:00 and 12:00–19:00, respectively. A scaling law between $P(k)$ and $k$ can be observed clearly.

**Figure 12:** Degree distribution of the Chinese airport network in the two time periods; panels (a) and (b).

Figures 12(a) and 12(b) illustrate two segments that follow the power laws:(1) 00:00–07:00 (2) 12:00–19:00

The degree distributions are strikingly different from those of random graphs, small-world networks, and scale-free networks. First, the degree distributions of the Chinese airport network display two segments and follow the Double Pareto Law. The degree of the critical airports is approximately 53 for 00:00–07:00 and 83 for 12:00–19:00. The smaller the exponent is, the stronger the heterogeneous characteristics of the network are [44]. The exponents of 12:00–19:00 are smaller than those of 00:00–07:00, which indicates that the heterogeneous characteristics of the Chinese airport network in 12:00–19:00 are stronger than those in 00:00–07:00. Thus, airport delays during 12:00–19:00 spread over the airport network more easily, quickly, and widely.

Obviously, the results from the ASIRS model and the degree distribution are consistent, which further indicates that the ASIRS model is well suited for characterizing airport delay propagation.

##### 5.4. Accuracy of the Short-Term Prediction of Delay Propagation

As mentioned above, the ASIRS model can describe the characteristics of delay propagation in the long term, but its forecast accuracy is not very good (see Figures 10 and 11). We therefore examine the forecast accuracy of the ASIRS model in the short term.

To gauge the forecast accuracy, we introduce the probabilistic prediction method, which is common to delay forecast [45]. Intercepting the delay period (the period is usually no more than 4 hours) on a typical day, we forecast the number of delayed airports by constructing ASIRS models for every 4 hours. The prediction results are shown in Figure 13.

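We use the Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) to compare the accuracy of the two methods:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left(y_t - \hat{y}_t\right)^2}, \qquad \mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left|\frac{y_t - \hat{y}_t}{y_t}\right| \times 100\%,$$

where $y_t$ is the actual value in the time interval $t$, $\hat{y}_t$ is the forecast value in the time interval $t$, and $n$ is the number of time intervals.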
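Both error measures are straightforward to compute. The sketch below uses their standard definitions; the function names are illustrative, and MAPE is reported in percent and requires every actual value to be nonzero.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, forecast))
```

Because MAPE divides by the actual value, intervals with zero delayed airports would have to be excluded or handled separately before this measure is applied.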

Table 8 shows a comparison of the two methods. It can be seen that the MAPE and RMSE of the ASIRS model are smaller than those of the probabilistic model. Thus, the ASIRS model is more accurate than the probabilistic model in forecasting the number of delayed airports in the short term.

#### 6. Conclusion

Understanding the process and evolution of airport delay propagation is very important for both air traffic management and aviation planning. In this study, we investigated the mechanism of delay propagation among airports from a new perspective:

(1) The delay fluctuations of airport networks are studied. To quantify the delay dynamics, we collected the airport delay at 199 Chinese airports and identified the existence of a scaling law, which indicates that the dynamic of airport delay is dominated by the propagation factor.

(2) The ASIRS model for airport delay propagation is presented to reveal the macroscopic appearance of delay propagation. The modeling approach is data driven in the sense that it is based on real performance data of China.

(3) The long-term characteristics of delay propagation are described by building the ASIRS model. The accuracy of the short-term prediction of delay propagation is also examined.

It is worth noting that airport delay is the result of the coupling of different factors, and there is no information on delay factors in the datasets. We therefore cannot determine which factors caused a given delay. Thus, we study delay propagation from the overall delay data and simulate the overall delay without considering the specific factors.

Our ongoing work involves further calibration and validation of the ASIRS model. It is interesting to compare the epidemic model of airport delay propagation in different countries and investigate the practices of the countries. We will come up with insights for mitigating airport delay from such international comparisons.

#### Data Availability

The data used to support the findings of this study have not been made available because the data also form part of an ongoing study.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This study was supported by the National Natural Science Foundation of China (nos. 71801215, 71671014, and U1833103), the Natural Science Foundation of Tianjin (no. 18JCQNJC04300), and the Fundamental Research Funds for the Central Universities (nos. 3122016C009, 3122019129, and 3122018D026).