Abstract

The introduction of complex network concepts in the study of transportation systems has brought about a paradigm shift and has made it possible to understand different transport phenomena as the emergent result of the interactions between the elements composing them. In spite of several notable achievements, lurking pitfalls are undermining our understanding of the topological characteristics of transportation systems. In this study, we analyse four of the most common ones, specifically related to the assessment of the scale-freeness of networks, the interpretation and comparison of topological metrics, the definition of a node ranking, and the analysis of the resilience against random failures and targeted attacks. For each topic we present the problem from both a theoretical and an operational perspective, then review how it has been tackled in the literature, and finally propose a set of solutions. We further use six real-world transportation networks as case studies and discuss the implications of these four pitfalls in their analysis. We conclude by presenting some future lines of work stemming from these pitfalls that will enable a deeper understanding of transportation systems from a complex network perspective.

1. Introduction

In recent years, the topological structure of different transportation systems has become an important topic of research. This is the result of the convergence of two different lines of work. On one hand, the improvement in computational and data storage resources has allowed the transportation research community to gain access to large amounts of real data, enabling the detailed description of those systems at different temporal and spatial scales. On the other hand, there has been a great effort from the statistical physics community in analysing the structure and dynamics of both theoretical and real complex networks [1–3]. It then became clear that most complex systems, i.e., those composed of multiple interacting elements, cannot be fully understood through a reductionist approach, in which the composing elements are studied in an independent fashion. In order to understand and predict the collective (or emergent) dynamics, it is instead necessary to include information about how those elements interact with each other, and about how different connectivity patterns influence such dynamics.

The convergence of both research fields has resulted in a paradigm shift in the way transportation systems are conceptualised and analysed. It became clear that these are complex systems and that the focus ought to be moved from the single transportation unit (e.g., an aircraft, a car, or a bus) to the global structure of interconnections that those units generate. Consequently, the generation and absorption of delays stop being local phenomena, i.e., the result of the dynamics of a single aircraft, and become a propagation process conceptually similar to disease spreading. Similarly, the cancellation of a flight or the closure of an airport can be studied in terms of its global consequences, i.e., the changes in the mobility patterns across the whole system, instead of just quantifying the number of directly affected passengers.

Although fruitful, this convergence also hides pitfalls and difficulties, which come from two fronts. Firstly, complex network theory was not developed with a specific application in mind; it is instead a general framework for understanding interacting systems. A statistical physicist must therefore take into account the fact that not all complex network concepts are applicable to the transportation context and that some adaptation may be required. Secondly, even if prima facie simple, complex network theory is based on a strong mathematical scaffolding that cannot be circumvented. The transportation scientist must then be aware of many theoretical requirements, such as the application of suitable statistical tests, to ensure that meaningful results are obtained.

Among the hundreds of contributions that have appeared in the last decade on the use of complex networks to understand transportation systems, a significant number present one or more problems that make it difficult to interpret their results. These problems are not limited to trivial research works: on the contrary, they can be found in recent publications and in highly respected journals. In this work we aim at fostering a debate around these problems, raising awareness in the scientific community, and eventually helping to develop novel solutions. For the sake of compactness, this debate is focused on the topological properties of transportation systems, as these represent the most basic and easily understandable application of complex network theory. The problems have been organised around two major topics: (i) the assessment of topological properties of networks, including scale-freeness (Section 2) and other basic characteristics (Section 3), and (ii) the study of the robustness and resilience of transportation systems, in terms of both the metrics used (Section 4) and the terminology (Section 5). Six real-world datasets are further used to illustrate these pitfalls. We finally draw conclusions in Section 6.

2. Assessing the Scale-Freeness of Transport Networks

2.1. Common Pitfalls and Misleading Interpretations

Originally, two types of graphs were extensively studied: regular ones, in which all nodes have the same degree (i.e., the same number of connections), and random graphs, whose connectivity is completely random and in which node degrees thus follow a Poisson distribution. One of the most important discoveries in complex network theory, and the one that distinguishes it from classical graph theory, is the realisation that nodes in real-world networks are not homogeneous: on the contrary, they usually display richer connectivity patterns. Specifically, it has been found that many nodes only have a handful of connections, while a few of them (called hubs) may be connected with the majority of their peers. The result is a scale-free distribution of node degrees, which can be approximated by a power law [4].

Such heterogeneity in node importance is also present in transport networks. Nodes are not all the same, with some of them being much more important than others. This may be due to the way the network is constructed, with some nodes designed to connect different parts of the system; to economic reasons, as in the case of airports serving big cities and thus attracting a larger demand; or to historical reasons, as in the case of ports or specific maritime routes that are important because of their past [5]. It is thus natural to ask whether transportation networks are also scale-free.

An open topic of discussion within statistical physics is when a network can confidently be defined as scale-free (see, for instance, [6, 7]) and how this translates to fields like, for instance, biology [8–10]. Historically, such an analysis has been performed by plotting the degree distribution on a log-log scale and verifying that the distribution approximately follows a straight line. This may nevertheless be misleading, as a log-log scale flattens most perturbations, so that many different distributions may seem to follow a power law. A more statistically sound analysis instead requires two conditions: a network size large enough to span several orders of magnitude in the node degrees, and the execution of a statistical test, as will be discussed in Section 2.2.

With respect to the size requirement, it is easy to see that most air transport networks do not fulfil it, as the number of airports in a country, or even in a supranational region, seldom reaches the thousands. In spite of this, scale-freeness has been claimed for the Italian [11], Indian [12], Brazilian [13], and Chinese [14] networks. The situation is even worse in the case of road networks, in which the physical nature of the graph implies that the degree of each node is limited: for instance, it is difficult to plan a crossroad where more than six streets converge. In spite of this, [15] compares two fits of the degree distribution, according, respectively, to a power law and an exponential function, even though the degrees in the network span an extremely narrow range.

In order to confirm the presence of a scale-free distribution of the degrees, the most common approach has been to resort to a graphical representation. Plenty of examples can be found in the literature, for maritime [16–18], road [19–21], and rail networks [8, 22–29]. Beyond such graphical fits, some interesting examples may also be highlighted. Specifically, [30], while analysing the evolution of global liner shipping networks between the years 1996 and 2006, reports a varying exponent without describing how these values were obtained. Reference [31] concludes that the maritime network is scale-free without any calculation at all: “nearly […] of nodes account for less than […] of the respective accumulative values of the degree of the nodes, just like scale-free properties”. In the analysis of urban street networks, [32] states that “the investigation of how well the fat-tailed distribution can fit power law in comparing with other distributions (e.g., log-normal and exponential) shows that no significant evidence is found for scale-free feature in the dual space”; nevertheless, no statistical evidence of any kind is provided. Finally, [33] identifies several street networks as scale-free and reports a goodness-of-fit value; yet there is no explanation of how this last metric is computed, thus making it impossible to reproduce these results.

Not all research works suffer from this bias towards scale-freeness, and some noteworthy examples can be found. For instance, [34] correctly discards the scale-free structure in favour of an exponential distribution of degrees for the air transportation network. Reference [35], when analysing the temporal evolution of the Brazilian air network, states that “a reasonable fit is obtained by using a stretched exponential”, although no statistical analysis is provided. Finally, [36] correctly recognises that, even though there is a “suggestive scaling behavior” in the distribution of node degrees in maritime networks, “simple models for generating scale-free statistics are not sufficient to describe these empirical networks”; similar careful observations have been made for travel demand networks at the urban scale [37–40] and for location-based analyses of social media data [41].

It is clear that the claim of a scale-free nature for many transportation networks has not been supported by suitable statistical tests. It is nevertheless undeniable that nodes are not homogeneous and that some of them attract most of the connections and traffic. Thus, even if these networks are not scale-free, they still present a scale-free-like structure and display a long-tailed degree distribution. How does this impact the operational analysis of the system? In other words, how do the conclusions of the previously mentioned papers have to be changed if the networks are long-tailed instead of scale-free? In simple terms, no effect is to be expected.

In order to understand this point, one has to take into account the fact that scale-free networks are a mathematical simplification, or model, of real-world networks. Defining an exact law for the degree distribution allows finding analytical solutions to problems like the dynamics of diseases [42] or voters [43], through a heterogeneous mean field approximation. These problems can nevertheless still be analysed, by means of numerical simulations, when networks are not exactly scale-free. Furthermore, as node degrees are indeed heterogeneous and follow a long-tailed distribution, all subsequent conclusions will still hold, such as the importance of central airports for delay propagation or for the robustness of the system.

In synthesis, assessing the scale-freeness of a transportation network requires a solid statistical validation. If such validation cannot be performed, for instance, because of the limited network size, it is better to avoid any mention of a scale-free topology, as this would largely be irrelevant. Putting it simply, and in spite of its lure, there is more life beyond scale-freeness.

2.2. Recommended Solution

As previously introduced, there are two problems preventing an easy assessment of the scale-freeness of real-world networks: their limited size and the fact that statistical validations of the fits are seldom performed.

As for the first issue, it has been found that even perfect fits cannot be accepted as statistically significant when the number of samples (in this case, of nodes) is too small [44]; and, as a rule of thumb, scale-freeness should be accepted only when the degrees span several orders of magnitude. Therefore, not even the best statistical analysis can support the scale-free hypothesis for the Italian air transport network, composed of a small number of nodes [11], nor for a road network whose degrees span an extremely narrow range [15].

Regarding the second issue, i.e., the design of a statistical test, we here tackle it through three different techniques. To illustrate, these techniques are applied to the airport and bus networks described in the Appendix. Power-law and exponential fits of the degree distributions of both networks are reported in Figure 1, while the results of the statistical tests are reported in Table 1.

First of all, one may be tempted to use the goodness-of-fit R², a metric which is conceptually simple, easy to compute, and well understood in the case of linear models. It is nevertheless known that this metric is unreliable for nonlinear models, as in that case it does not hold that the total sum-of-squares is equal to the regression sum-of-squares plus the residual sum-of-squares. Negative results may then appear, indicating that the nonlinear fit is worse than a simple average [45], as is the case in Table 1 for the bus network. Also, linearising the model prior to its evaluation, for instance, by taking the logarithm of the node degree, is not a good solution: the resulting R² would represent the goodness of the linearised model and not of the original (nonlinear) one [46].

A second option entails resorting to the Akaike Information Criterion (AIC), a metric estimating the relative quality of a statistical model given some empirical data [47]. The AIC is based on calculating the Kullback-Leibler Divergence (KLD) between the values yielded by the model and the real available data, and then adjusting the value to compensate for the number of free parameters, in order to avoid overfitting. It is important to note that, while effective, the AIC returns a relative value, i.e., a value that can be used to compare different models (and decide which one is preferable) but not to assess the quality of a single model. The AIC can thus be used to choose between different types of nonlinear fits, but not to assess the statistical significance of any one of them.
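To make this concrete, the following Python sketch compares a power-law and an exponential fit of a degree sequence through their AIC values. It assumes continuous maximum-likelihood approximations of both distributions, and all function names are illustrative rather than taken from any of the cited works.

```python
import numpy as np

def aic_power_law(degrees, k_min=1.0):
    # Continuous power law p(k) = ((alpha - 1) / k_min) * (k / k_min)**(-alpha)
    k = np.asarray([d for d in degrees if d >= k_min], dtype=float)
    n = len(k)
    alpha = 1.0 + n / np.sum(np.log(k / k_min))  # MLE, cf. [44]
    log_l = n * np.log((alpha - 1.0) / k_min) - alpha * np.sum(np.log(k / k_min))
    return 2 * 1 - 2 * log_l                     # AIC = 2k - 2 ln L, one free parameter

def aic_exponential(degrees, k_min=1.0):
    # Shifted exponential p(k) = lam * exp(-lam * (k - k_min))
    k = np.asarray([d for d in degrees if d >= k_min], dtype=float)
    lam = 1.0 / np.mean(k - k_min)               # MLE
    log_l = len(k) * np.log(lam) - lam * np.sum(k - k_min)
    return 2 * 1 - 2 * log_l

# The fit with the smaller AIC is (relatively) preferable:
# print(aic_power_law(degrees), aic_exponential(degrees))
```

Note that, consistently with the discussion above, the comparison only tells which of the two fits is relatively better, not whether either is statistically acceptable.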

The third and best solution requires performing a full statistical test on the model, in order to obtain a p value, which is then used to accept or reject the fit. Let us suppose that a model has already been fitted, such that a function p(k) (yielding the probability of finding a node of degree k) is available. As a first step, one needs to define a distance between the fitted model and the real data, i.e., a measure of how dissimilar they are; this can easily be done through a Kolmogorov-Smirnov (KS) statistic. Afterwards, it is necessary to generate a large number of synthetic datasets using the fitted model and to calculate the corresponding KS statistic for each of them. Finally, one should count the fraction of times the synthetic statistic is larger than the value obtained for the real dataset: such fraction will be the final p value. As can be seen from Table 1, no considered fit, be it power law or exponential, succeeds in passing this statistical test, with the obtained p values being in all cases very close to zero.
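The procedure just described can be sketched in a few lines of Python. This is a minimal illustration of the test of [44], under the simplifying assumptions of a continuous power law and a fixed minimum degree (the full method also re-estimates the minimum degree on every synthetic sample); all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def ks_statistic(sample, cdf):
    # Maximum distance between the empirical CDF and the model CDF
    x = np.sort(sample)
    ecdf = np.arange(1, len(x) + 1) / len(x)
    return np.max(np.abs(ecdf - cdf(x)))

def power_law_p_value(degrees, k_min=1.0, n_synthetic=1000):
    k = np.asarray([d for d in degrees if d >= k_min], dtype=float)
    n = len(k)
    alpha = 1.0 + n / np.sum(np.log(k / k_min))          # MLE fit on the real data
    cdf = lambda v, a=alpha: 1.0 - (v / k_min) ** (1.0 - a)
    d_real = ks_statistic(k, cdf)
    worse = 0
    for _ in range(n_synthetic):
        # Draw a synthetic sample from the fitted model (inverse transform)
        synth = k_min * (1.0 - rng.random(n)) ** (-1.0 / (alpha - 1.0))
        a_s = 1.0 + n / np.sum(np.log(synth / k_min))    # refit on the synthetic data
        d_s = ks_statistic(synth, lambda v: 1.0 - (v / k_min) ** (1.0 - a_s))
        worse += d_s >= d_real
    return worse / n_synthetic  # the p value; fits with p close to zero are rejected
```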

One final note should be added. In the previous analysis, the p value has been obtained supposing that the model describes the full distribution of possible degrees. This is nevertheless not always the case, as, for instance, scale-freeness may be detected only within a specific range of degrees, or the best model may be a truncated power law. The creation of the synthetic datasets should then be adapted to take this into account. For instance, let us consider the case of a truncated power law, in which the scale-free nature is observed only above some degree threshold. The synthetic datasets should account for this fact: below the threshold they should mimic the real dataset, while above it they can be created using the fitted model.

The interested reader may find an excellent review of this third solution, along with some practical examples, in [44].

3. Interpreting and Comparing Topological Metrics

3.1. Common Pitfalls and Misleading Interpretations

Once a network is obtained, the next logical step is to calculate a set of topological metrics to assess specific aspects of the structure, including the presence of triangles (i.e., transitivity or clustering coefficient), connectivity, and so forth. It is nevertheless important to understand how these values are affected by the network size, especially when one needs to compare multiple systems.

Let us explore this issue through a simple example. An important metric for a transportation system is the efficiency, defined as the inverse of the harmonic mean of the geodesic distances between nodes [48]:

$$E = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{1}{d_{ij}},$$

d_ij being the distance between nodes i and j, and N the total number of nodes in the network. The efficiency measures how fast information (or any other element) can be transmitted in a network; thus, a value close to one indicates that most passengers can move between two nodes by means of direct connections.
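As an illustration, the following Python sketch computes this quantity with networkx; for unweighted graphs it should match the library's built-in global_efficiency.

```python
import networkx as nx

def efficiency(G):
    # E = 1 / (N (N - 1)) * sum over pairs of 1 / d_ij; unreachable pairs contribute 0
    n = G.number_of_nodes()
    inv_sum = 0.0
    for source, lengths in nx.all_pairs_shortest_path_length(G):
        for target, d in lengths.items():
            if source != target:
                inv_sum += 1.0 / d
    return inv_sum / (n * (n - 1))

G = nx.erdos_renyi_graph(100, 0.05, seed=1)
print(efficiency(G), nx.global_efficiency(G))  # the two values should coincide
```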

It is straightforward to see how this metric is influenced by the number of links present in the network. Increasing the number of flights in an air transport network would also increase the number of passengers able to reach their destination directly. In the limit of all airports being connected with all other ones, E would become one, indicating perfect transport efficiency. It is important to note that a given value of E is the result of the interaction between two aspects: the internal structure of the network and its link density. Therefore, a high value of E does not, by itself, imply an efficient network design.

If most topological metrics suffer from this dependency on the number of links composing the network, some of them are also defined as a function of the number of nodes. This is the case, for instance, of the diameter, defined as the shortest distance between the two most distant nodes in the network, and of the average path length [49]. Clearly, larger networks will, in principle, have larger diameters and path lengths than smaller ones.

How does this map to the problem of analysing a transportation network? First of all, conclusions cannot be drawn from the values of topological metrics unless these are properly normalised, i.e., transformed to account for the number of nodes and links in the network. Secondly, comparisons can only be made on normalised values. To illustrate this point, we once again rely on three of the networks described in the Appendix, specifically, the light rail, subway, and tram networks; note that these have been chosen because of their comparable characteristics and sizes. Table 2 reports the values of several topological metrics, both before and after a normalisation using random equivalent networks (i.e., networks composed of the same number of nodes and links, as further described in Section 3.2).

Several interesting observations can be obtained. First of all, the efficiency seems to be substantially higher (about 49% higher) in the light rail (0.0979) than in the tram network (0.0659). This nevertheless does not account for the higher link density of the former: once normalised, the latter network appears as more efficient than the first. Note that the highly negative values of the normalised metric indicate that these networks have not been optimised for direct connections, a message that is difficult to extract from the raw values. The opposite situation can be found in the modularity, i.e., a metric characterising the presence of communities: if the tram network seems to be more modular than the light rail, the situation is reversed once the values are properly normalised.

In synthesis, the values of topological metrics are seldom relevant per se; instead, they need to be normalised, both to simplify their interpretation and to enable comparisons between different networks. It is worth noting that many works published in the transportation context have omitted this step and have thus incurred important interpretation errors.

For instance, [34] states that “the average path length of 2.23 in the [air transportation network of China] is very similar to that of India’s air transport system (2.26) and slightly above that of Italy’s (1.98-2.14), but larger than that of the US (ranging from 1.84 to 1.93)”. Yet, these four networks are completely heterogeneous, in terms of both number of nodes and link density. Obtaining similar average path lengths for China and India, when the latter has almost double the link density, actually indicates that their structures are substantially different. Other nonnormalised comparisons have been reported in [50–52]. A synthesis of this problem in the case of air transport can be found in Table 1 of [53]: among the surveyed papers, only six normalised the average path length, and nine the clustering coefficient. It is also noteworthy that two works normalised the first metric but not the second, even though the problem here described applies to both [54, 55].

Moving to street networks, [33] compares two topological metrics (clustering coefficient and average path length) across six cities and three other networks, in spite of their very heterogeneous link densities and in spite of them being conceptually different networks (representing streets, proteins, or the Internet). A similar problem can be found in [56].

Reference [36] compares the global cargo-ship network to the worldwide air transportation network, considering the unnormalised version of metrics like the diameter or the clustering coefficient. The similar values obtained in the last case lead the authors to highlight “a surprising degree of similarity of both networks”, in spite of the latter having a link density one order of magnitude higher than the former. Subsequent works based on similar data, as, for instance, [57, 58], did not solve the problem.

3.2. Recommended Solution

In a first approximation, normalising a topological metric is not a complex task. In synthesis, one needs to generate a large set of networks (called the null model) that lack the topological structure to be tested, and then see how the real network deviates from this set. Let us suppose that we calculated a topological metric m over a network G, composed of N nodes and L links, obtaining the value m_G. A simple normalisation can be obtained through a Z-Score, defined as

$$Z = \frac{m_G - \mu(M_r)}{\sigma(M_r)},$$

where M_r represents the values yielded by the metric on a large set of null model networks, and μ and σ are the average and standard deviation operators.
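As an illustration, here is a minimal Python sketch of this normalisation, using the Erdős-Rényi null model discussed below; `metric` can be any function mapping a graph to a number.

```python
import networkx as nx
import numpy as np

def z_score(G, metric, n_null=100, seed=0):
    # Compare metric(G) against Erdos-Renyi graphs with the same N and L
    rng = np.random.default_rng(seed)
    n, m = G.number_of_nodes(), G.number_of_edges()
    null_values = np.array([
        metric(nx.gnm_random_graph(n, m, seed=int(rng.integers(1 << 31))))
        for _ in range(n_null)
    ])
    return (metric(G) - null_values.mean()) / null_values.std()

# e.g., normalised transitivity of a network G:
# print(z_score(G, nx.transitivity))
```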

How should this null model be defined? As the standard objective is to compare the real network against something that has no clear structure, the simplest solution entails using random Erdős-Rényi networks with the same number of nodes and links. This may nevertheless yield biased results. To illustrate, suppose one is studying a street network, which is by definition planar; in other words, when two streets intersect, a link between them is necessarily created. Additionally, let us suppose that streets are built at random. Would this result in a lack of structure? Surprisingly, no: triangles would be very common, as any triplet of long, nonparallel streets would sooner or later pairwise intersect and form a triangle. If random networks are then used to normalise the transitivity metric, the result would be a very high Z-Score. Consider also airport networks: while they lack the planar property, their construction is still guided by principles that should be included in the null model, for instance, the fact that airports closer than 300 km are seldom connected by a direct flight. Once again, the use of a set of completely random networks may yield biased results.

In spite of the clear shortcomings associated with the use of an Erdős-Rényi model, no accepted alternative is available for transportation systems, and the topic is still a matter of debate in other scientific disciplines [59, 60].

4. Identifying Node Importance by Arbitrarily Chosen Network Metrics

4.1. Common Pitfalls and Misleading Interpretations

Since the release of the ground-breaking studies on complex networks and their properties, it has often been found that the failure of a small fraction of elements in these networks might lead to a cascade effect which, when related to critical infrastructure, would result in major disruptions to our society. A few examples of such extensive, wide-ranging network failures include large-scale power outages in the United States [61], cross-continental supply-chain shortages in the aftermath of the 2011 Japanese tsunami [62], and the disruption of the European airspace after computer failures at Eurocontrol in April 2018. In all these events, the affected regions suffered extremely high economic costs [50, 63, 64]. Moreover, as infrastructure systems become more densely connected and interdependent, the potential impact of failures is increasing to an unprecedented level. Therefore, analysing the robustness of networks and their interactions is of tremendous importance.

The robustness of a network is usually estimated based on the critical fraction of nodes that, once removed, will cause a sudden disintegration [65]. From the perspective of statistical physics, this process is rather well investigated for random network models [66–70]. Yet, when it comes to the analysis of real-world network instances, it becomes more complicated. The major reason is that, in real-world networks, not all nodes and links perfectly fit a predefined network model; hence, when estimating node importance for robustness, purely statistical measures can go wrong. Over the years, multiple methods have been proposed for measuring the disintegration of a network over time. Perhaps the most frequently used method is to measure the relative reduction in the size of the giant component (or largest connected component) of the network, the rationale being that the functionality of a network strongly correlates with the number of connected nodes. We highlight an example in Figure 2. Since one is often interested in a single quantification measure for the robustness of a network, most related works use the robustness measure R [71]. Given a network composed of N nodes, the value of R is defined as

$$R = \frac{1}{N} \sum_{q=1}^{N} s(q),$$

where s(q) is the size of the giant component, as a fraction of N, after removing q nodes. Essentially, this procedure assesses how many nodes are contained in the giant component after each node deletion, while iterating over all nodes in the network.
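A direct Python implementation of this measure, given a network and a removal sequence, can be sketched as follows (a minimal, unoptimised version).

```python
import networkx as nx

def robustness_R(G, node_order):
    # R = (1/N) * sum_{q=1}^{N} s(q), with s(q) the giant component size
    # (as a fraction of N) after removing the first q nodes of node_order
    H = G.copy()
    n = G.number_of_nodes()
    total = 0.0
    for node in node_order:
        H.remove_node(node)
        gc = max((len(c) for c in nx.connected_components(H)), default=0)
        total += gc / n
    return total / n
```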

While trying to quantify the robustness of a network, it is critical to understand that there exists no single robustness value R. For the computation of R, we need a node ranking as input, defining the sequence in which nodes are removed from the network. Different sequences, in general, induce different network disintegration patterns. Thus, an inappropriate choice of node sequence leads to unfounded conclusions regarding the actual robustness of a network. The design of such an order is far from trivial, given the large number of possible node orderings in real-world networks: a network with N nodes has N! different node orderings. Therefore, existing studies often choose node sequences based on heuristics. Perhaps best known outside the core complex network research area are so-called network metrics, which assign scores to nodes depending on properties derived at the micro/meso/macroscale. All nodes are then ranked by metric value (usually in decreasing order of importance). We discuss a few of these network metrics below; a sketch of the resulting attack orderings follows the list.

DEG attacks the nodes in order of their decreasing degree, i.e., the number of direct neighbors. The degree is only recorded once, at the beginning, and not updated during the disruption process.

BETW (betweenness centrality [72]) measures the number of times a node appears on the shortest paths between all pairs of nodes in the network. Nodes are removed with decreasing centrality scores.

CLOS (closeness centrality) measures the average shortest path distance of a node to all other nodes in the network. Nodes are removed with increasing centrality scores, given that a smaller closeness value indicates a closer relationship to all nodes in the network.

EIG (eigenvector centrality) measures the centrality of a node based on the centrality of its neighbors (see [73] for discussion on the concept). Nodes are removed with decreasing centrality scores.

PR (PageRank [74]) was originally designed as an algorithm to rank websites based on their link structure. In our experiments we use a variant for undirected networks. Nodes are removed with increasing centrality scores.

KATZ (Katz centrality [75]) measures the centrality of a node based on the relative influence of nodes regarding direct neighbors and also all other nodes in the network that connect to the node through these direct neighbors. Nodes are removed with increasing centrality scores.
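As announced above, the following Python sketch generates these static removal sequences with networkx; the sort directions follow the descriptions above. One caveat: networkx's closeness is the reciprocal of the average distance, so removing nodes with the smallest average distance first corresponds to the highest networkx value first. The function name is illustrative.

```python
import networkx as nx

def static_attack_order(G, metric="DEG"):
    # Scores are computed once on the intact network and never recomputed
    if metric == "DEG":
        scores, high_first = dict(G.degree()), True
    elif metric == "BETW":
        scores, high_first = nx.betweenness_centrality(G), True
    elif metric == "CLOS":
        # networkx closeness is 1 / (average distance): most central = highest value
        scores, high_first = nx.closeness_centrality(G), True
    elif metric == "EIG":
        scores, high_first = nx.eigenvector_centrality(G, max_iter=1000), True
    elif metric == "PR":
        scores, high_first = nx.pagerank(G), False      # increasing, as stated above
    elif metric == "KATZ":
        scores, high_first = nx.katz_centrality_numpy(G), False
    return sorted(scores, key=scores.get, reverse=high_first)
```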

These metrics (and similar ones) have been used in many existing studies in order to analyse the robustness of transportation networks. In Figure 3, we show an example of a DEG-based attack on a network. Some studies [76, 77] only evaluate the DEG of nodes for designing a targeted attack, e.g., stating that “for the selective attack strategy, we remove some nodes with higher degrees according to their degree order from high to low” [77]. Others compare a set of a few static network metrics, but without sufficiently considering interactive/iterative/dynamic metrics, which yield much stronger attacks [26, 78–82] (see Section 4.2 for further discussion). For a few studies, the authors do not reveal which kind of targeted attack they use: “We have simulated an attack on every network in our database by blocking travel through targeted stations” [83]. There are only a few notable exceptions, which correctly use interactive betweenness as a reference for network disruption simulation, e.g., [84–87]. Papers published in transportation journals rarely consider the advanced network dismantling methods that have emerged throughout the last 2–3 years. It is interesting, however, that the papers introducing novel network dismantling methods, which rather appear in the complex network community, often take the worldwide airport network as a real-world case study [88, 89].

4.2. Recommended Solution

The metrics above are based on an initial estimation of node importance in the original network. Yet, throughout the dismantling process, the roles of nodes in a network can change significantly: with the elimination of a (critical) node, shortest paths between other nodes often change completely. Therefore, it is recommended to recompute a network metric throughout the dismantling process. In the literature, this is referred to as interactive (or dynamic) attack generation. In Figure 4, we visualise the process of attacking the tram network based on interactive betweenness (BETWI); i.e., the values of BETW are recomputed after each node removal. BETWI always attacks the largest remaining giant component and chooses very vulnerable nodes in each step, making the attack rather disruptive to the network. In order to further address the problem of network dismantling, the complex network community has recently started to tackle it more rigorously, by designing specific dismantling methods. We introduce a few of the relevant methods below.
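Before turning to those methods, the interactive attack just described can be sketched as follows (a minimal, deliberately unoptimised Python version; the function name is illustrative).

```python
import networkx as nx

def interactive_betweenness_attack(G):
    # BETWI: recompute betweenness on the remaining graph after each removal
    H = G.copy()
    order = []
    while H.number_of_nodes() > 0:
        bc = nx.betweenness_centrality(H)
        node = max(bc, key=bc.get)   # currently most central node
        H.remove_node(node)
        order.append(node)
    return order
```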

CI (collective influence [90]) can be seen as an extension of the degree-based attack, taking into account the so-called ball, i.e., the neighbors which are k steps away. Originally designed for efficiently attacking hierarchical networks, CI has now been used in several research studies on general graphs.

KSHELL (K-shell iteration factor [91]) is based on the coreness of nodes in a network [92]. A large value indicates that the node has a strong ability to spread information. The algorithm combines shell decomposition and iterative node removal.

CHD (CoreHD attacks [93]) combines interactive degree and k-core [92] to achieve a decycling of networks. It iteratively removes the highest-degree node among the network's 2-core until no 2-core remains, and then treats the remaining part through tree-breaking (a simplified sketch follows this list).

APTA [88] finds articulation points (or cut vertices) in a network. In each step, the articulation points with the highest estimated impact are attacked first, based on an estimation of the size of the giant component after the attack. This process is repeated until the whole network is dismantled. Whenever the network has no articulation point, the node with the highest degree is removed.

GND (generalised network dismantling [89]) computes a node sequence based on spectral properties of a novel node-weighted Laplacian operator. It also supports nonunit costs for node weights.
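Among these methods, the core loop of CHD is simple enough to sketch compactly. The following simplified Python fragment illustrates the idea only; it omits the final tree-breaking stage of [93] and is not the reference implementation.

```python
import networkx as nx

def corehd_order(G):
    # Iteratively remove the highest-degree node of the 2-core until none remains
    H = G.copy()
    order = []
    while True:
        core2 = nx.k_core(H, k=2)
        if core2.number_of_nodes() == 0:
            break
        node = max(core2.degree, key=lambda nd: nd[1])[0]
        H.remove_node(node)
        order.append(node)
    return order  # the remaining graph is a forest; tree-breaking not shown
```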

In Figure 5, we compare the robustness curves of the introduced network metrics and dismantling methods, grouped by dataset. The curves show a rather large deviation, particularly for the tram and the bus networks. In all cases, BETWI identifies the best attack, while EIG is usually the worst strategy. In order to further compare the quality and applicability of the methods, Figure 6 reports the obtained R values and run times for all network metrics and dismantling methods in this study, grouped by dataset. We find that BETWI is always the method with the smallest R value, but it also takes the longest time to compute (note that the y-axis is log-scaled). Interestingly, APTA is often 2-3 orders of magnitude faster than BETWI and still identifies attacks with quite good R values. GND is often much slower than APTA but achieves a smaller R value, with the apparent exception of the logistics network. This example highlights that, leaving aside BETWI and its computational cost, no single method is the best: designing an effective attack for a network is a trade-off between expected quality and computation time. If the network is small, BETWI is still the best one can get; with increasing network size, one should preferably select specific dismantling methods, such as APTA and GND.

5. Networks Are Robust against Random Failures but Vulnerable to Targeted Attacks

5.1. Common Pitfalls and Misleading Interpretations

Since the choice of a node sequence significantly affects the level of disruption to a network, it is common to distinguish two classes of disruptions: random failures and targeted attacks. While the former have no driving force controlling the node sequence (which is thus completely random), the latter are specifically tuned to create the maximum damage to a network.

Existing studies often conclude with statements that the network is rather resilient to random failures, but more vulnerable to targeted attacks. These claims can be found for all kinds of transportation networks, including air transportation [94–96], railway-based systems [26, 79, 83, 85, 97], and others [98–100]. We only highlight two representative statements here; others follow very similar structures: “This scale-free structure has proved to be robust to random failure but vulnerable to targeted attack” [94]. “This study indicates that the subway network is robust against random attacks but fragile for malicious attacks” [79].

The general conclusion that random failures are less hazardous than targeted attacks is inherent to the definition of both node orderings, given that targeted attacks are specifically designed for the network at hand. Conversely, if a targeted attack, for instance, one induced by a specific network metric, turns out to be less damaging than a random failure strategy, this simply means that the metric does not represent node importance well for that specific network.

5.2. Recommended Solution

The pure statement that a network is more vulnerable to targeted attacks does not provide real, novel insights. A more interesting question is how much more vulnerable a network is to a targeted attack, compared to a set of random failures. One way to measure this difference in vulnerability is to consider representative attacks obtained from an envelope of random attacks [101]. Essentially, the idea is not to identify the (rather obvious) fact, but to quantify the difference in attack efficiency. In general, one can start with the value R_t of the best targeted attack and compare it to the value R_r of the representative random attack. The larger R_r is compared to R_t, the stronger the effect of using targeted attacks. Formally, we can introduce a measure defined as the ratio ρ = R_t / R_r. Moreover, it can be insightful to take the width of the random attack envelope into account, since random attacks on their own can still show a rather large variation in their induced R values.
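A sketch of how such an envelope and the ratio ρ can be computed follows; the min/median/max aggregation mirrors the procedure described below, and all names are illustrative.

```python
import networkx as nx
import numpy as np

def gc_curve(G, order):
    # Giant component fraction after each removal in `order`
    H, n, curve = G.copy(), G.number_of_nodes(), []
    for node in order:
        H.remove_node(node)
        gc = max((len(c) for c in nx.connected_components(H)), default=0)
        curve.append(gc / n)
    return curve

def random_envelope(G, n_attacks=50, seed=0):
    rng = np.random.default_rng(seed)
    nodes = list(G.nodes())
    curves = np.array([
        gc_curve(G, [nodes[i] for i in rng.permutation(len(nodes))])
        for _ in range(n_attacks)
    ])
    envelope = {"min": curves.min(axis=0),
                "median": np.median(curves, axis=0),
                "max": curves.max(axis=0)}
    r_random = float(np.median(curves.mean(axis=1)))  # representative random R
    return envelope, r_random

# rho = R_t / R_r, e.g., with R_t from the BETWI sketch of Section 4.2:
# rho = robustness_R(G, interactive_betweenness_attack(G)) / random_envelope(G)[1]
```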

In Figure 7, we visualise a set of random attacks for the six transportation networks in our study, as described in the Appendix. For each network, we generated 50 random attacks. Given these random attacks and their robustness curves, we compute the robustness envelope as follows: a minimum, maximum, and median curve are derived by computing the corresponding aggregation function over all giant component sizes at a given fraction of disrupted nodes. In addition, we plot the robustness curve obtained by BETWI, the best known attacking strategy. In Figure 8, we show the results of comparing ρ for the real-world networks and for equivalent ER random networks with the same number of nodes and links. The values obtained for the real networks can be found to the left of the random network peak, which can be explained by the fact that random network instances have no (topological) properties that can be exploited by targeted attacks; thus, targeted attacks on the real network are usually stronger. Yet, the distance between the real-world and random network values varies significantly between network types. The logistics network, for instance, is much more vulnerable to targeted attacks than its random counterpart. Intuitively, the presence of a few hubs makes the network much more vulnerable to hub-targeted attacks, compared to random network instances. For the subway network, on the other hand, targeted attacks are almost as strong as in the random counterparts. This means that the subway network does not have an inherent structural property that can be further exploited by targeted attacks.

6. Discussion and Conclusions

In this work we have revisited some common problems that can be found in papers applying complex network theory to the study of the topology of transportation systems, analysed their impact, in terms of how they can mislead our understanding of the underlying system, and presented a set of solutions. Four specific topics have been covered:

(1) One of the most important topological properties of a network is scale-freeness, i.e., the fact that the degree distribution of nodes follows a power law. Such a theoretical model has been the foundation of many studies in complex network theory, and there has been a lot of interest in assessing whether real-world networks, including transportation ones, actually follow it. Yet, assessing scale-freeness is not a trivial task, as it requires both large enough networks and the application of suitable statistical tests. We presented a review of some common errors and some potential solutions, including an analysis of which statistical tests are actually tailored to this problem.

(2) Beyond scale-freeness, the first step in the analysis of a complex network is usually its description through a series of topological metrics, i.e., metrics assessing some aspects of its structure. An important problem stems from the fact that such topological metrics are usually influenced by the number of nodes and links in the network; if these are not taken into account, comparing different networks may yield unreliable results. We presented some examples of misinterpretations of network metrics and suggested a simple solution based on the creation of null models.

(3) We would like to increase awareness of the fact that network metrics do not lead to optimal attacks. In fact, there is no single metric that always outranks all others. Empirically, the interactive variant of betweenness is the best approach for analysing the robustness of a complex network. This high quality of attack sequence, however, comes at a price: computing betweenness requires computation time cubic in the number of nodes, and for large networks the run time becomes unacceptable. Therefore, we point out recent developments in network dismantling, a novel research direction specifically targeting the robustness analysis of networks. Several of these methods provide an interesting trade-off between quality and run time.

(4) Comparisons between random and targeted attacks have to be performed with care. By definition, a targeted attack is more disruptive than a random attack. The interesting case, however, is to explore this problem with respect to a reference random network with the same number of nodes and links. Analysing the obtained ρ gives an indication of how much more vulnerable a specific real-world network is compared to its random counterpart. Using this measure, a classification of vulnerability with baseline values could be derived.

It is the authors’ belief that these problems must be taken seriously by the scientific community, for two main reasons. First of all, they introduce the risk of obtaining biased (if not totally wrong) results; this may, in the long term, reduce the credibility associated with complex network analysis and hence create a burden on future ideas. Secondly, the problems here discussed are neither old nor limited to second-tier journals. On the contrary, it is easy to find examples in papers published in this same year [29, 31] or in very prestigious journals, both in the statistical physics and in the transportation communities.

In spite of this, this work also has an important bright side, as there is much ground for hope. While the number of papers that have fallen into these pitfalls is indeed large, one can also find many examples of technically sound and statistically robust analyses; see, for instance, [20, 36, 102]. These pitfalls can also represent the motivation for opening new lines of research and hence improving our understanding of transportation systems. Among others, the following are worth exploring:

(1) It was suggested that exact scale-freeness is not an essential requirement for subsequent analyses, as the fundamental point is the presence of a long tail in the distribution of degrees. At the same time, we reported that theoretical dismantling strategies developed on the scale-free model may not work efficiently on real networks. One may thus ask what the effect of not following a perfect scale-free distribution is, or, in other words, what the consequences are of having real, as opposed to theoretical, networks.

(2) Metric normalisation requires the development of suitable null models, able to create networks without any specific structure, but still constrained by the characteristics of the system under study. A completely random network may not be a good null model for the airport network, as very short flights have no economic meaning. This has been partially solved in other scientific fields, for instance, for protein networks [103], and should probably be tackled also for transportation systems.

(3) Most transportation studies on complex network robustness are performed on undirected, unweighted networks with unit costs for dismantling nodes/links. Clearly, all these assumptions are simplifications made to keep computation feasible and to cope with a limited amount of available data. We foresee the need for a generalised transportation network robustness framework which, given a variable set of data (passenger data, schedules, etc.), computes a realistic measure of robustness for a transportation system. While there exist a number of studies tailored specifically to regional transportation systems at a high level of detail, there is no agreement on a common model for transportation network robustness. Such a benchmark model would help to push our understanding of network robustness further and eventually improve our critical transportation infrastructure.

As a final note, we would like to highlight that the same caution one should devote to the previously discussed pitfalls should also be applied to avoid misleading generalisations. Any network method applied to a transportation problem is very much dependent on the available data and the problem at hand. Just as one should carefully investigate the applicability of previously published methods, rather than simply borrowing them from other disciplines, the solutions here proposed should similarly be judged according to the context. To illustrate, some theoretical models may require an exact scale-free distribution to yield meaningful results, and the characteristics of a null model should be consistent with (and adapted to) the system under analysis. In synthesis, it is important to keep in mind that “one size does not fit all”.

Appendix

In order to better introduce and illustrate the pitfalls in transportation network analysis discussed in this work, a set of exemplary transportation networks is used as case studies. These networks cover a wide range of transportation modes, including air, bus, light rail, subway, and tram. The networks and their setup are described below.

Airport network: the first case study represents the worldwide airport network for August 2014. The data was obtained from Sabre Airline Solutions (http://www.sabreairlinesolutions.com). We use monthly data encoding the number of passengers booking a flight (both direct and with intermediate connections, up to a maximum of three) between two airports. Based on these data, we derived an airport network, where each airport is represented by a node, and a link between pairs of them is created if at least 1000 passengers took a direct flight between these two airports in August 2014. Given that airport networks are usually symmetric, we converted the network into an undirected network before further processing and analysis. The obtained network consists of 1,808 nodes and 11,191 links; see Figure 9(a) for a graphical representation.

Bus network: the second case study represents the bus network in the greater area of Berlin. The data were downloaded from the official website of the operator VBB, where they are provided in GTFS format (http://www.vbb.de/de/datei/GTFS_VBB_Feb2018_MitteDez2018.zip). Note that the data represent the future planning of the service and are therefore valid from February 2018 to the end of 2018. The file provides information on stop locations, connections, and schedules for all transportation modes in Berlin. We have extracted all bus routes (GTFS route type 700) and created a complex network with stations being nodes and two nodes being connected if there is at least one bus service between them. The obtained network consists of 12,272 nodes and 19,584 links; we have converted it to an undirected network for the analysis in our study. The network is visualised in Figure 9(b).

Light rail network: the third case study represents the light rail network in the greater area of Berlin. The data were downloaded from the official website of operator VBB (see above). We have extracted all light rail routes (GTFS code route type 109) and created a complex network with stations being nodes and two nodes being connected if there is at least one light rail service between them. The obtained network consists of 166 nodes and 184 links; we have converted it to an undirected network for the analysis in our study. The network is visualised in Figure 9(c).

Logistics network: the Australia Post problem (http://people.brunel.ac.uk/~mastjjb/jeb/orlib/files/phub1.txt) is a standard dataset for testing the efficiency of hub location solvers. We have downloaded the dataset and computed an optimal assignment for the incomplete hub location problem, with a given number of hubs and given fixed and variable costs, using an enhanced Benders decomposition [104]. The result is an optimal assignment of hub links, access links, and direct links, minimising the transportation costs in the network. The network is visualised in Figure 9(d).

Subway network: the subway network in the greater area of Berlin was downloaded from the official website of operator VBB (see above). We have extracted all subway routes (GTFS code route type 400) and created a network in the customary way. The obtained network consists of 163 nodes and 165 links; we have converted it to an undirected network for the analysis in our study. The network is visualised in Figure 9(e).

Tram network: the tram network in the greater area of Berlin was downloaded from the official website of operator VBB (see above). We have extracted all tram routes (GTFS code route type 900) and constructed the network accordingly. The obtained network consists of 420 nodes and 489 links; we have converted it to an undirected network for the analysis in our study. The network is visualised in Figure 9(f).
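For reference, the network construction used for the VBB case studies above can be sketched as follows in Python (pandas and networkx), assuming a standard GTFS feed with routes.txt, trips.txt, and stop_times.txt; the directory and function names are illustrative.

```python
import pandas as pd
import networkx as nx

def gtfs_network(gtfs_dir, route_type):
    # Nodes are stops; two stops are linked when some service of the given
    # route_type connects them directly (i.e., they are consecutive in a trip)
    routes = pd.read_csv(f"{gtfs_dir}/routes.txt")
    trips = pd.read_csv(f"{gtfs_dir}/trips.txt")
    stop_times = pd.read_csv(f"{gtfs_dir}/stop_times.txt")
    route_ids = set(routes.loc[routes.route_type == route_type, "route_id"])
    trip_ids = set(trips.loc[trips.route_id.isin(route_ids), "trip_id"])
    st = stop_times[stop_times.trip_id.isin(trip_ids)]
    st = st.sort_values(["trip_id", "stop_sequence"])
    G = nx.Graph()  # undirected, as in the case studies above
    for _, trip in st.groupby("trip_id"):
        stops = trip.stop_id.tolist()
        G.add_edges_from(zip(stops, stops[1:]))  # consecutive stop pairs
    return G

# e.g., tram network (GTFS route type 900, as described above):
# G_tram = gtfs_network("vbb_gtfs", 900)
```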

Data Availability

The network data used to support the findings of this study are available from the corresponding author upon request. The airport network data is available at http://www.sabreairlinesolutions.com.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grants no. 61650110516 and no. 61601013, no. 71731001, and no. 61521091).