Journal of Advanced Transportation

Volume 2018, Article ID 3156137, 17 pages

https://doi.org/10.1155/2018/3156137

## Studying the Topology of Transportation Systems through Complex Networks: Handle with Care

^{1}Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, 28223 Madrid, Spain

^{2}Faculdade de Ciências e Tecnologia, Departamento de Engenharia Electrotécnica, Universidade Nova de Lisboa, 2829-516 Lisboa, Portugal

^{3}National Key Laboratory of CNS/ATM, School of Electronic and Information Engineering, Beihang University, 100191 Beijing, China

^{4}National Engineering Laboratory for Integrated Transportation Big Data Center, 100191 Beijing, China

^{5}Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, 100083 Beijing, China

Correspondence should be addressed to Xiaoqian Sun; sunxq@buaa.edu.cn

Received 11 June 2018; Accepted 8 August 2018; Published 19 August 2018

Academic Editor: Samiul Hasan

Copyright © 2018 Massimiliano Zanin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

The introduction of complex network concepts in the study of transportation systems has brought about a paradigm shift and has made it possible to understand different transport phenomena as the emergent result of the interactions between the elements composing them. In spite of several notable achievements, lurking pitfalls are undermining our understanding of the topological characteristics of transportation systems. In this study, we analyse four of the most common ones, specifically related to the assessment of the scale-freeness of networks, the interpretation and comparison of topological metrics, the definition of a node ranking, and the analysis of the resilience against random failures and targeted attacks. For each topic we present the problem from both a theoretical and an operational perspective, then review how it has been tackled in the literature, and finally propose a set of solutions. We further use six real-world transportation networks as case studies and discuss the implications of these four pitfalls in their analysis. We present some future lines of work that stem from these pitfalls and that will allow a deeper understanding of transportation systems from a complex network perspective.

#### 1. Introduction

In recent years, the topological structure of different transportation systems has become an important topic of research. This is the result of the convergence of two different lines of work. On one hand, the improvement in computational and data storage resources has allowed the transportation research community to gain access to large amounts of real data, enabling the detailed description of those systems at different temporal and spatial scales. On the other hand, there has been a great effort from the statistical physics community in analysing the structure and dynamics of both theoretical and real complex networks [1–3]. It then became clear that most *complex* systems, i.e., those composed of multiple interacting elements, cannot be fully understood by a reductionist approach, in which the composing elements are studied independently. In order to understand and predict the collective (or *emergent*) dynamics, it is instead necessary to include information about how those elements interact with each other, and about how different connectivity patterns influence such dynamics.

The convergence of both research fields has resulted in a paradigm shift in the way transportation systems are conceptualised and analysed. It became clear that these are complex systems and that the focus ought to be moved from one transportation unit (e.g., an aircraft, a car, or a bus) to the global structure of interconnections that those units generate. Consequently, the generation and absorption of delays stop being local phenomena, i.e., the result of the dynamics of a single aircraft, and become a propagation process conceptually similar to disease spreading. Similarly, the cancellation of a flight or the closure of an airport can be studied for their global consequences, i.e., the changes in the mobility patterns across the whole system, instead of just quantifying the number of directly affected passengers.

Although fruitful, this convergence also hides pitfalls and difficulties. These come from two fronts. Firstly, complex network theory was not developed with a specific application in mind; it is instead a general framework for understanding interacting systems. A statistical physicist must then take into account the fact that not all complex network concepts are applicable to the transportation context and that some adaptation may be required. Secondly, even if *prima facie* simple, complex network theory is based on a strong mathematical scaffolding that cannot be circumvented. The transportation scientist must then be aware of many theoretical requirements, such as the application of suitable statistical tests, in order to obtain meaningful results.

Among the hundreds of contributions that have appeared in the last decade about the use of complex networks to understand transportation systems, a significant number present one or more problems that make it difficult to interpret their results. These problems are not limited to minor research works: on the contrary, they can be found in recent publications and in highly respected journals. In this work we aim at fostering a debate around them, by raising awareness in the scientific community and eventually helping to develop novel solutions. For the sake of compactness, this debate has been focused on the topological properties of transportation systems, as these are the most basic and easily understandable application of complex network theory. These problems have been organised around two major topics: (*i*) the assessment of topological properties of the networks, including scale-freeness (Section 2) and other basic characteristics (Section 3), and (*ii*) the study of the robustness and resilience of transportation systems, in terms of both the used metrics (Section 4) and terminology (Section 5). Six real-world datasets are further used to illustrate these pitfalls. We finally draw conclusions in Section 6.

#### 2. Assessing the Scale-Freeness of Transport Networks

##### 2.1. Common Pitfalls and Misleading Interpretations

Originally, two types of graphs were extensively studied: regular ones, in which all nodes have the same degree (i.e., number of connections), and random graphs, whose connectivity is completely random and in which node degrees thus follow a Poisson distribution. One of the most important discoveries in complex network theory, and the one that distinguished it from mathematical graph theory, is the realisation that nodes in real-world networks are not homogeneous: on the contrary, they usually display richer connectivity patterns. Specifically, it has been found that many nodes only have a handful of connections, while a few of them (called *hubs*) may be connected with the majority of their peers. The result is a *scale-free* distribution of the degree of nodes, which can be approximated by a power law [4].
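The homogeneity of random graphs can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `random_graph_degrees`, the network size, the connection probability, and the seed are all illustrative assumptions. It builds a random graph in which every pair of nodes is connected independently with probability `p` and compares the mean and the variance of the resulting degrees, which should be close to each other if the degrees are approximately Poisson distributed.

```python
import random
import statistics


def random_graph_degrees(n, p, seed=0):
    """Degrees of a random graph where each node pair is linked with probability p.

    Hypothetical helper for illustration only; n, p and seed are assumptions.
    """
    rng = random.Random(seed)
    degrees = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degrees[i] += 1
                degrees[j] += 1
    return degrees


degrees = random_graph_degrees(n=1000, p=0.01)
mean_k = statistics.mean(degrees)
var_k = statistics.pvariance(degrees)

# For a Poisson distribution the variance equals the mean, so the ratio
# below should be close to one; no node should behave like a hub.
print(f"mean degree = {mean_k:.2f}, variance = {var_k:.2f}")
print(f"variance / mean = {var_k / mean_k:.2f}, max degree = {max(degrees)}")
```

In a network with hubs, by contrast, the variance of the degrees would be expected to exceed the mean by a wide margin.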

Such heterogeneity in the nodes’ importance is also present in transport networks. Nodes are not all the same, with some of them being much more important than others. On one hand, this may be due to the way the network is constructed, with some nodes designed to connect different parts of the system. On the other hand, it can also be the result of economic reasons, e.g., in the case of airports serving big cities and thus collecting a larger demand, and of historical reasons, as in the case of ports or of specific maritime routes that are important because of their past [5]. It is thus natural to pose the question of whether transportation networks are also scale-free.

An open topic of discussion within statistical physics is when we can confidently define a network to be scale-free (see, for instance, [6, 7]) and how this can be translated to other fields such as biology [8–10]. Historically, such analysis has been performed by plotting the degree distribution in a log-log scale and by verifying that such distribution approximately follows a straight line. This may nevertheless be misleading, as a log-log scale flattens most perturbations, so that many different distributions may *seem* to be power laws. A more statistically sound analysis requires two conditions: a network size large enough to span several orders of magnitude in the node degrees and the execution of a statistical test, as will be discussed in Section 2.2.
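To illustrate how a log-log representation can mislead, the following minimal sketch evaluates a log-normal density, which is not a power law, over degrees from 1 to 100 and measures how well a straight line fits it on log-log axes. The function names and the parameters `mu` and `sigma` are assumptions chosen for illustration; they do not describe any of the networks discussed here.

```python
import math


def lognormal_pdf(k, mu=0.0, sigma=2.0):
    """Density of a log-normal distribution at k > 0 (mu, sigma are assumptions)."""
    z = (math.log(k) - mu) / sigma
    return math.exp(-0.5 * z * z) / (k * sigma * math.sqrt(2.0 * math.pi))


def loglog_r2(ks, ps):
    """R^2 of an ordinary least-squares line fitted to the points (log k, log p)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in ps]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


degrees = range(1, 101)
r2 = loglog_r2(degrees, [lognormal_pdf(k) for k in degrees])

# A value close to one means the curve looks like a straight line on
# log-log axes, even though the underlying law is not a power law.
print(f"R^2 of the straight-line fit on log-log axes: {r2:.3f}")
```

A high coefficient of determination in such a fit therefore says little about whether the distribution is actually a power law, which is why a visual inspection cannot replace a statistical test.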

With respect to the size requirement, it is easy to see that most of the air transport networks do not fulfil it, as the number of airports in a country or even in a supranational region seldom reaches the thousands. In spite of this, scale-freeness has been claimed for the Italian [11] ( airports), Indian [12] ( airports), the Brazilian [13] ( airports), or the Chinese network [14] ( airports). The situation is even worse in the case of road networks, in which the physical nature of the graph implies that the degree of each node is limited, as, for instance, it is difficult to plan a crossroad where more than six streets converge. In spite of this, [15] compares two fits for the degree distribution, according, respectively, to a power law and an exponential function, even though the maximum degree in the network is and the minimum is .

In order to confirm the presence of a scale-free distribution of the degrees, the most common approach has been to resort to a graphical representation. Plenty of examples can be found in the literature, for maritime [16–18], road [19–21], and rail networks [8, 22–29]. Beyond such graphical fits, some further examples deserve to be highlighted. Specifically, [30], while analysing the evolution of global liner shipping networks between years 1996 and 2006, reports an exponent varying from to without describing how these values were obtained. Reference [31] concludes that the maritime network is scale-free without any calculation at all: “*nearly ** of nodes account for less than ** of the respective accumulative values of the degree of the nodes, just like scale-free properties*”. In the analysis of urban street networks, [32] states that “*the investigation of how well the fat-tailed distribution can fit power law in comparing with other distributions (e.g., log-normal and exponential) shows that no significant evidence is found for scale-free feature in the dual space*”; nevertheless no statistical evidence of any kind is provided. Finally, [33] identifies several street networks as scale-free and reports a goodness-of-fit: yet there is no explanation of how this last metric is computed, thus making it impossible to reproduce these results.

Not all research works suffer from this bias towards scale-freeness, and some noteworthy examples can be found. For instance, [34] correctly discards the scale-free structure in favour of an exponential distribution of degrees for the air transportation network. Reference [35], when analysing the temporal evolution of the Brazilian air network, states that “*a reasonable fit is obtained by using a stretched exponential*”, although no statistical analysis is provided. Finally, [36] correctly recognises that, even though there is a “*suggestive scaling behavior*” in the distribution of node degrees in maritime networks, “*simple models for generating scale-free statistics are not sufficient to describe these empirical networks*”; similar careful observations have been made for travel demand networks at the urban scale [37–40] and location-based analysis of data from social media [41].

It is clear that the claim of the scale-free nature of many transportation networks has not been supported by suitable statistical tests. It is nevertheless undeniable that nodes are not homogeneous and that some of them attract most of the connections and traffic. Thus, even if these networks are not scale-free, they still present a *scale-free-like* structure and display a long-tailed degree distribution. How does this impact the operational analysis of the system? In other words, how do the conclusions of the previously mentioned papers have to be changed if the networks are long-tailed instead of scale-free? In simple terms, no effect is to be expected.

In order to understand this point, one has to take into account the fact that scale-free networks are a mathematical simplification, or model, of real-world networks. Defining an exact law for the degree distribution allows finding analytical solutions to problems like the dynamics of diseases [42] or voters [43], through a heterogeneous mean field approximation. These problems can nevertheless still be analysed when networks are not exactly scale-free by means of numerical simulations. Furthermore, as node degrees are indeed heterogeneous and follow a long-tailed distribution, all subsequent conclusions will still hold, like the importance of the central airports for the delay propagation or for the robustness of the system.

In summary, assessing the scale-freeness of a transportation network requires a solid statistical validation. If such validation cannot be performed, for instance, because of the limited network size, it is better to avoid any mention of a scale-free topology, as this would largely be irrelevant. Put simply, and in spite of its lure, there is more life beyond scale-freeness.

##### 2.2. Recommended Solution

As previously introduced, there are two problems preventing an easy assessment of the scale-freeness of real-world networks: their limited size and the fact that statistical validations of the fits are seldom performed.

As for the first issue, it has been found that even perfect fits cannot be accepted as statistically significant when the number of samples (in this case, of nodes) is below [44]; and, as a rule of thumb, scale-freeness should be accepted only when the degrees span several orders of magnitude. Therefore, not even the best statistical analysis can support the scale-free hypothesis for the Italian air transport network, composed of nodes [11] nor for a network whose maximum degree is [15].
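The rule of thumb on the range of degrees can be turned into a quick screening step before any fit is attempted. The sketch below is only an illustration: the function names, the example degree sequence, and in particular the threshold `min_decades=2.0` are assumptions, since "several orders of magnitude" does not correspond to a fixed number.

```python
import math


def degree_span_decades(degrees):
    """Orders of magnitude between the smallest and the largest positive degree."""
    positive = [k for k in degrees if k > 0]
    if not positive:
        raise ValueError("at least one node with positive degree is required")
    return math.log10(max(positive) / min(positive))


def spans_enough_decades(degrees, min_decades=2.0):
    """True if the degrees cover at least min_decades orders of magnitude.

    The default threshold is an assumption made for this sketch.
    """
    return degree_span_decades(degrees) >= min_decades


# Hypothetical degree sequence of a street network, where a crossroad
# rarely joins more than six streets.
street_like = [1, 3, 3, 4, 4, 4, 6]
span = degree_span_decades(street_like)
print(f"span = {span:.2f} decades, enough range: {spans_enough_decades(street_like)}")
```

A sequence like the one above covers less than one order of magnitude, so any claim of scale-freeness would have to be discarded before reaching the statistical test.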

Regarding the second issue, i.e., the design of a statistical test, we here tackle it through three different techniques. To illustrate, these techniques are applied to the airport and bus networks described in the Appendix. Power law and exponential fits of the degree distribution of both networks are reported in Figure 1, while the values of the statistical tests are reported in Table 1.
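As a generic illustration of what comparing a power law and an exponential fit may look like in practice, the sketch below estimates both models by maximum likelihood and computes a log-likelihood ratio with an approximate two-sided p-value. This is not necessarily one of the three techniques used here: it is a common textbook approach, it treats degrees as continuous values (an approximation), and the function name, the parameter `k_min`, and the synthetic samples are all assumptions.

```python
import math
import random


def compare_power_law_vs_exponential(degrees, k_min=1.0):
    """Fit both models by maximum likelihood to values >= k_min and compare them.

    Returns (alpha, lam, ratio, p_value); ratio > 0 favours the power law.
    Continuous approximation of the degrees; illustrative sketch only.
    """
    xs = [float(k) for k in degrees if k >= k_min]
    n = len(xs)
    log_ratios = [math.log(x / k_min) for x in xs]
    if n < 2 or sum(log_ratios) == 0.0:
        raise ValueError("need at least two values, not all equal to k_min")
    alpha = 1.0 + n / sum(log_ratios)        # power-law exponent
    lam = 1.0 / (sum(xs) / n - k_min)        # exponential rate above k_min
    # Point-wise difference between the two log-likelihoods.
    diffs = [
        (math.log((alpha - 1.0) / k_min) - alpha * lr)
        - (math.log(lam) - lam * (x - k_min))
        for x, lr in zip(xs, log_ratios)
    ]
    ratio = sum(diffs)
    mean = ratio / n
    sigma = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)
    # Probability of a ratio this large arising by chance fluctuation alone.
    p_value = math.erfc(abs(ratio) / (math.sqrt(2.0 * n) * sigma))
    return alpha, lam, ratio, p_value


rng = random.Random(42)
# Synthetic samples: a Pareto tail with exponent 2.5 and a shifted exponential.
heavy = [rng.paretovariate(1.5) for _ in range(2000)]
light = [1.0 + rng.expovariate(1.0) for _ in range(2000)]

for name, sample in (("heavy-tailed", heavy), ("exponential", light)):
    alpha, lam, ratio, p_value = compare_power_law_vs_exponential(sample)
    print(f"{name}: alpha={alpha:.2f}, lambda={lam:.2f}, "
          f"ratio={ratio:.1f}, p={p_value:.3g}")
```

A positive ratio with a small p-value would support the power law over the exponential, a negative one the opposite, and a large p-value would indicate that the data cannot discriminate between the two models.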