People are the most important factor in the economy and the primary carriers of social culture. Cross-border migration has economic and cultural impacts on both origin and destination and also reflects the international relations of the countries involved. Migration relationships among countries are complex and multilateral, yet most traditional migration models are bilateral. Network theory can describe global migration more fully and show its structure and statistical characteristics more clearly. Based on estimated migration data and the disparity filter algorithm, we extract networks describing the global multilateral migration relationships among 200 countries over fifty years. The results show that the global migration networks during 1960–2015 are highly clustered and disassortative, implying globalized and multipolarized changes in migration during these years. The networks were embedded into a Poincaré disk, yielding a typical hierarchical “core-periphery” structure; the associated angular density distribution is used to describe the “multicentering” trend since the 1990s. Analysis of the correlation and evolution of communities indicates that most communities are stable, although some structural changes have occurred since the 1990s, which suggests that important historical events contribute to regional and even global migration patterns.

1. Introduction

In the tide of globalization, the scale and diversity of international migration are increasing substantially [1, 2]. In 2015, approximately 244 million people, or 3.3% of the world’s population, lived in a country other than their birthplace [3, 4], and this value is forecast to double by 2050 [5]. Population migration can have important effects on both receiving and sending countries [6–8], and some scholars have used quantitative models to analyze the influencing factors, evolution patterns, and trends of global population migration [2, 9].

Some early studies examined the mechanism of population migration using, for example, the conventional gravity model [10–12], the random utility maximization (RUM) model [13, 14], and the self-selection method [15–17]. However, they mostly focus on the bilateral migration flow and relations between two countries. In reality, potential migrants usually face multiple possible destinations at the same time and have to make a decision after comparing the advantages of all of them. In some cases, these decisions may even be random or probabilistic [2]. Methods based on bilateral relationships therefore lose and distort information, which has promoted the development of multilateral models and theories. Subsequent scholars put forward the definition of the multilateral migration barrier and introduced the structural gravity model [18–20] and the multilateral probability model [2, 21, 22]. Such improvements from bilateral to multilateral analysis are meaningful yet still insufficient, because the imprecise definitions of multilateral barriers and coupling parameters hinder subsequent quantitative work and cause some controversy in estimation [23]. Further methodological improvement is therefore still needed.

The asymmetric and multilateral flow data show the complexity of the global migration system, which needs a systematic model to describe individual choice based on multilateral relationships, where the influences of other countries should not be ignored when discussing the migration flow between any two countries [2, 21]. The complex network is a powerful framework for understanding multilateral relationships in the real world, where nodes are countries (or other elements) and links represent interaction channels between them. It can show the overall structure and statistical characteristics of a system more clearly, as in biological systems [24], cortical circuits [25], and geographic maps [26]. In recent years, some researchers have used the complex network method to study international migration and have achieved some preliminary results [5].

The statistical characteristics of the network, such as degree correlation, clustering coefficient, and connectivity [27], can describe the preference of migrants when selecting a destination and also indicate the global/local connectivity and topological structure of the migration network [28–30]. Fagiolo and Mastrorillo were the first scholars to study migration from a complex network perspective; by analyzing the statistical characteristics, they showed that the global migration network was organized as a small-world binary pattern displaying disassortativity and high clustering [31]. This finding was later confirmed by other scholars [29, 30]. Furthermore, the identification of communities can reveal the hierarchical structure, relationships, and similarities of different countries [32]. Porat and Benguigui analyzed the degree distribution and connectivity of global migration networks and classified 145 destination countries into three classes [33]. Some scholars decomposed the world migration network into communities and analyzed its structural evolution [5], along with the glocalization, polarization, and globalization of the network [34]. In addition, the network method can help conventional models describe multilateral relations more rigorously: Tranos analyzed the topology of a migration network and identified the pull and push factors behind international migration flows between OECD countries by combining the network method with the gravity model [35]. This paper comprehensively analyzes the statistical characteristics, topological structure, and evolution trend of the global migration networks, which are composed of 200 countries/regions and span more than 50 years.

In addition, in recent years, scholars have found hyperbolic features in some real-world networks [36, 37]. Here, we study the geometric features of population migration networks. Beyond establishing the hyperbolic characteristics of the population migration network, the geometric configuration is also helpful for intuitively analyzing the regional and global structures of the whole system.

This paper is organized as follows: Section 2 introduces the data source and method to extract the backbone network of global bilateral migration, which is called GMN (global migration network) in the following sections. Section 3 analyzes the skeletal construction and community dynamics of GMNs, including the changes in network statistical characteristics from 1960 to 2015, and the structure evolution. The results confirm that the GMN is a disassortative network with high clustering coefficient, exhibiting globalized and multipolarized changes during 1960–2015. Additionally, the network represents the hyperbolic and hierarchical characteristics of international migration by embedding the countries on a Poincaré disk. The positions on the disk show the status of each country/region in the global migration network, and the hyperbolic distance can indicate the migration relations between countries. Section 4 provides the conclusions and discussion.

2. Materials and Methods

2.1. Data Source

The analysis requires the data of bilateral migration flow between countries. However, from the perspective of statistics, authoritative institutions generally only provide data on the composition of immigrants (immigration stock data), such as the “UN Global Migration database” [38] and “World Bank Global Bilateral Migration database” [39], which cover most of the countries in the world. Some existing global migration networks are directly based on the immigrant stock data, which could represent past flow quantities [5, 29, 31].

In addition, there are three common methods to estimate the bilateral migration flows based on the immigrant stock data published by the World Bank or United Nations [3]: (1) use the differences in successive bilateral stocks to estimate the corresponding migration flows [2, 30, 33]; (2) approximate the migration flow rates, which are then multiplied by additional data to obtain the estimated global migration flows [40]; and (3) frame the changes in migrant stocks as the residuals in a global demographic account [1, 41]. The third method, called “demographic accounting,” estimates migration flows so that they match the increases or decreases in the reported bilateral stocks together with births and deaths during the period. Some scholars consider “demographic accounting” with a pseudo-Bayesian method to be the most effective estimating method [3, 42]. This paper uses the estimates produced with the third method by Abel for 200 countries/regions during 1960–2015 [43].

To reduce the impact of contingency, we separate the data into 6 periods: 1960–1969, 1970–1979, 1980–1989, 1990–1999, 2000–2009, and 2010–2015. Since there are only five complete years of data after 2010, we have doubled the estimated flow data of 2010–2015 to match other periods.

2.2. Global Migration Network (GMN)

From the data in the previous section, we construct an undirected complex network based on estimated bilateral migration flows for each period. Here, the nodes are the countries/regions, and the connection between nodes depends on whether there are migrant flows between them. The weight of the edge indicates the volume of migrants, which is the sum of emigrant and immigrant flows.
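The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_undirected_network`, the tuple layout of the input, and the toy flows are hypothetical and are not taken from the paper's data.

```python
from collections import defaultdict

def build_undirected_network(flows):
    """Build an undirected weighted network from directed flows.

    flows: iterable of (origin, destination, migrants) tuples (assumed layout).
    Returns {frozenset({a, b}): weight}, where the weight is the sum of the
    flows in both directions between a and b.
    """
    weights = defaultdict(float)
    for origin, dest, migrants in flows:
        if origin == dest or migrants <= 0:
            continue  # no self-loops; no edge when there is no flow
        weights[frozenset((origin, dest))] += migrants
    return dict(weights)

# Hypothetical toy flows for one period (not real estimates)
toy_flows = [
    ("MEX", "USA", 900.0),
    ("USA", "MEX", 100.0),
    ("FRA", "DZA", 50.0),
    ("USA", "CAN", 0.0),  # zero flow: no edge is created
]
network = build_undirected_network(toy_flows)
```

Using a `frozenset` as the edge key makes the pair order irrelevant, so the emigrant and immigrant flows between two countries accumulate on the same edge.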

Global migration is a complex system with complicated microstructure and evolutionary characteristics [30, 31, 33]. The backbone of the network offers a perspective from which the structural characteristics of the network are more prominent. There are many ways to extract the backbone network; we apply the disparity filter algorithm [44] (with details in Appendix A).

We first assess the effect of inhomogeneity at the local level: for each country with migration routes, we calculate the Herfindahl–Hirschman index [37, 44] (in Appendix A), where $w_{ij}$ is the weight of the edge connecting countries $i$ and $j$. The local heterogeneity in the distribution of migration reveals that not all migration channels are equally significant (in Figure 1(a), most blue triangles lie below but near the red reference line), and thus the disparity filter can be applied to select only the migration channels that are significant to at least one of the countries at the ends of the channel.
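As a rough illustration of this step, the sketch below computes the local Herfindahl–Hirschman index and applies the disparity filter in its commonly published form, in which an edge of weight w at a node with degree k and strength s has p-value (1 - w/s)^(k-1). The function names, the edge-dictionary layout, the treatment of degree-one nodes, and the toy weights are assumptions made for the example; the paper's exact procedure is given in its Appendix A.

```python
from collections import defaultdict

def node_stats(edges):
    """Return (strength, degree) dictionaries for an undirected edge list."""
    strength, degree = defaultdict(float), defaultdict(int)
    for (i, j), w in edges.items():
        for node in (i, j):
            strength[node] += w
            degree[node] += 1
    return strength, degree

def herfindahl(edges, node):
    """Sum of squared weight shares of the edges attached to `node`."""
    strength, _ = node_stats(edges)
    return sum((w / strength[node]) ** 2
               for (i, j), w in edges.items() if node in (i, j))

def disparity_filter(edges, alpha):
    """Keep edges that are significant for at least one endpoint.

    edges: {(i, j): weight}, each undirected pair listed once.
    """
    strength, degree = node_stats(edges)

    def p_value(node, w):
        k = degree[node]
        if k <= 1:
            return 1.0  # assumption: a lone edge is never significant by itself
        return (1.0 - w / strength[node]) ** (k - 1)

    return {(i, j): w for (i, j), w in edges.items()
            if min(p_value(i, w), p_value(j, w)) < alpha}

# Hypothetical toy network: one dominant channel and three weak ones
toy_edges = {("H", "A"): 100.0, ("H", "B"): 1.0, ("H", "C"): 1.0, ("H", "D"): 1.0}
backbone = disparity_filter(toy_edges, alpha=0.05)
```

In this toy case only the dominant channel between H and A survives, because it carries almost all of H's strength, while the three weak channels are indistinguishable from a uniform allocation.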

We compare the number of nodes, the number of links, and the total weight in the backbone network with the corresponding quantities in the original flow network. As the significance level changes from 0 to 0.1, the fraction of remaining nodes and the fraction of remaining weights gradually decrease, and the absolute value of the decreasing slope becomes increasingly large. To keep more countries, more weights, and fewer links in the backbone network, we choose the significance level (marked in Figure 1(b)) that minimizes the number of remaining links while retaining all nodes. The extracted backbone network, which is called the global migration network (GMN) in the following sections, is shown in Figure 2. The color of the node indicates the community of the country/region, which is consistent with Sections 3.3 and 3.4. The results show that the GMN became denser in the 2010s.

2.3. Hyperbolic Geometry and Embedding Methods

In a hyperbolic disk, the circumference and area of a circle grow exponentially with its radius, and the distance between two points depends not only on the length of the line connecting the two points but also on the angle difference. The hyperbolic space can capture the centrality and hierarchy of the network easily, and some scholars have proposed that many real-world networks exhibit natural hyperbolic geometry [37, 45], ranging from biology to economics, finance, and trade [36, 37, 45–49]. Some machine learning methods, for instance, graph embeddings such as latent space embeddings, Node2vec, and Deepwalk, have found important applications in community detection and link prediction in social networks. Nickel and Kiela presented an efficient algorithm to learn the embeddings based on Riemannian optimization [50]. This method is carried out on the Poincaré ball model, as it is well suited for gradient-based optimization (details in Appendix B). Here, we embedded the GMNs in a 2-dimensional Poincaré disk, with a learning rate of 0.1 and a negative sample size of 30. This setup produced hyperbolic embeddings in which each node $i$ (a country in the migration embedding) has radius $r_i$ and angle $\theta_i$. Nodes with small radius hold central positions in the circularly arrayed hierarchy. The hyperbolic distance (depending on angle and radius) between two nodes quantifies their migration relation. The comparative evaluation of hyperbolic embedding and Euclidean embedding is given in Section 3.2.
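To make the role of radius and angle concrete, the following sketch evaluates the standard distance formula of the Poincaré disk model for two embedded points. The helper names `from_polar` and `poincare_distance` are hypothetical, and the coordinates are illustrative rather than taken from the fitted embeddings.

```python
import math

def from_polar(r, theta):
    """Convert (radius, angle) on the Poincaré disk to Cartesian coordinates."""
    return (r * math.cos(theta), r * math.sin(theta))

def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit disk."""
    diff2 = (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    gap_u = 1.0 - (u[0] ** 2 + u[1] ** 2)
    gap_v = 1.0 - (v[0] ** 2 + v[1] ** 2)
    return math.acosh(1.0 + 2.0 * diff2 / (gap_u * gap_v))

# The same Euclidean gap of 0.1 is much longer near the rim than near the core
near_core = poincare_distance(from_polar(0.0, 0.0), from_polar(0.1, 0.0))
near_rim = poincare_distance(from_polar(0.8, 0.0), from_polar(0.9, 0.0))
```

Because distances stretch toward the boundary, nodes placed near the origin are close to everything, which is why small radius corresponds to a central position in the hierarchy.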

3. Results and Discussion

3.1. Basic Statistical Characteristics of GMNs

Figure 3(a) shows the evolution of the strength and numbers of edges for GMNs from 1960 to 2015. The number of edges and the sum of weights both grow over time, which can also be observed in Figure 2. We believe that this growth implies that global population migration has become more frequent. See Appendix C for the specific statistics of each network. Figure 3(b) shows the degree distribution of the backbone networks for all periods in gray and blue. We use the backbone network of 2010–2015 as an example to perform power-law fitting of the degree distribution with nonequidistant bins, shown in Figure 3(b) with red dots and a line. The network essentially conforms to a power-law distribution. In other words, most countries have migration links with only a few countries, while a small number of countries exchange population with many countries; this means that population migration exhibits local concentration.

The clustering coefficient measures how densely the neighbors of a node are connected to one another; the average shortest path reflects the difficulty of one node connecting to another node in the network. The clustering coefficients and the average shortest path (in the maximum connected subnet) of GMNs are shown in Figure 3(c). The clustering coefficient exhibits an upward trend, while the shortest path follows a downward trend, which indicates a strengthening of small-world attributes. In our view, this suggests that global migration relations became closer during 1960–2015, consistent with the increase in network clustering.
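Both quantities can be computed directly from an adjacency structure. The sketch below is a minimal, unweighted version, assuming the graph is stored as a dictionary of neighbor sets and that the average shortest path is taken over a connected component; the function names are hypothetical.

```python
from collections import deque

def average_clustering(adj):
    """Mean local clustering coefficient; adj maps node -> set of neighbors."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute zero
        nbr_list = list(nbrs)
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if nbr_list[b] in adj[nbr_list[a]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def average_shortest_path(adj):
    """Mean hop distance over all reachable ordered pairs, via BFS."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

A triangle is maximally clustered with all pairs one hop apart, whereas the three-node chain has no closed triangles and a longer average path, which is the contrast behind the small-world reading of Figure 3(c).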

The degree correlation in complex networks reflects the connection preference of nodes in the network and is defined as follows: $k_{nn}(k) = \sum_{k'} k' P(k' \mid k)$, with scaling hypothesis $k_{nn}(k) \sim k^{\mu}$. Here $k_{nn}(k)$ indicates the average degree of the first neighbors of nodes with degree $k$. If $\mu > 0$, then this is an assortative network; similarly, $\mu = 0$ indicates a neutral network, while $\mu < 0$ indicates a disassortative network. We calculate the degree correlation of each network with the degree correlation function, and the values are all negative during 1960–2015. The degree correlation of the 2010–2015 GMN is shown in Figure 3(d). The GMNs in other years exhibit similar patterns. This shows that all of the networks are disassortative, indicating that nodes with lower degrees are more likely to be connected with nodes with higher degrees. The reason may be that migration from most small countries moves preferentially to several large countries that have survival advantages or economic advantages. In fact, the disassortative characteristic of the international migration network has been verified in some existing studies [30, 31]. In addition to the migration networks, the international oil trading network shows a disassortative feature in which countries with fewer trading partners tend to develop oil trading relations with countries with more trading partners [51].
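The degree correlation function and the sign of its scaling exponent can be illustrated with a short sketch. It assumes an unweighted graph stored as neighbor sets, at least two distinct degree classes, and a plain least-squares fit in log-log space; the exponent is written `mu` here, and the function names are hypothetical.

```python
import math
from collections import defaultdict

def knn_by_degree(adj):
    """Average neighbor degree for each degree class k."""
    sums, counts = defaultdict(float), defaultdict(int)
    for nbrs in adj.values():
        k = len(nbrs)
        if k == 0:
            continue  # isolated nodes have no neighbors to average
        sums[k] += sum(len(adj[n]) for n in nbrs) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}

def correlation_exponent(knn):
    """Least-squares slope of log k_nn(k) against log k (needs >= 2 classes)."""
    xs = [math.log(k) for k in knn]
    ys = [math.log(v) for v in knn.values()]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# A star is the extreme disassortative case: leaves attach only to the hub
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
knn = knn_by_degree(star)
mu = correlation_exponent(knn)
```

For the star, low-degree nodes see only a high-degree neighbor and the hub sees only degree-one neighbors, so the fitted exponent is negative, the same sign reported for the GMNs.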

3.2. Hyperbolic Characteristics of GMNs

Some network structures can be better described by hyperbolic space [36, 37]. To assess the advantages of hyperbolic embedding, we embed the GMNs separately into the Euclidean and hyperbolic planes. We consider the least-squares error function used in [52]. After unified measurement, the errors in the two spaces are shown in Figure 4 with dotted lines. In addition, we also compute the embedding (details in Appendix B) in both spaces (considering the error function performance, we only embed the network in Euclidean space by nonclassical MDS). The result is shown in Figure 4 with solid lines. The figure shows that all errors in hyperbolic embedding are lower than those in Euclidean embedding, and hyperbolic embedding also better expresses the size relations in the data. This result indicates that the GMNs exhibit a significant hyperbolic characteristic during 1960–2015. The Poincaré disk embedding results of GMNs in the 1960s and the 2010s are visualized in Figure 5.

Each node represents a country/region. The size of the node expresses its degree in the GMN, the hyperbolic distances represent relations in the global migration network, and the weights of edges are the same as those in the backbone network (after normalization). For clarity, names are shown only for nonperipheral countries whose distance from the origin of the coordinates is less than 0.97. The color of the node indicates the geographical location of the country/region. The figure indicates that the GMNs present an obvious “core-periphery” structure. In the 1960s, the network is sparse, while in the 2010s, the 200 countries have closer and more complex migration relations, which is consistent with the general pattern of global integration. Population migration in the 1960s primarily occurred within regions or communities, and far fewer countries occupy the center of the Poincaré disk (Figure 5(a)). The Democratic Republic of the Congo (COD) is the second largest country in Africa. Once a Belgian colony, it became independent in 1960 and became a link in the global migration network between France (FRA) in Europe and African countries such as Sudan (SUD) and Madagascar (MDG).

Furthermore, for the period of 2010–2015, the network structure is more complex, and the hierarchy is more obvious. Here, the United States and Canada, both with large degrees and with closer population migration relations with countries in various regions of the world, are more centrally located in the disk. France and the United Kingdom, which mainly connect local communities such as European and some African countries, have been slightly more marginal since the 2010s. Although the degrees of Portugal and Yemen are not large, their locations indicate their contributions to connecting the migration of several continents.

Additionally, Figure 5 shows that the hyperbolic distance between countries/regions is not entirely determined by geographical location. The correlation between embedding distances and geographic distances is significant throughout 1960–2015. This means that hyperbolic distance is positively correlated with geographical distance but encodes more than purely geographical information.

To describe the distribution of the countries on the hyperbolic plane more clearly, we define the angular density. First, we take a set of sample points on the hyperbolic plane (the points marked in polar coordinates in Figure 6(a)); then we draw a neighborhood with the same hyperbolic radius around each point. As these points move away from the core, their neighborhoods look smaller and the center of each circle appears to shift toward the core, but on the hyperbolic plane these circles have the same area. The green, blue, orange, yellow, and purple circles in Figure 6(a) therefore all have the same area. We define the density of a point as the number of countries located in its neighborhood.

Then, we sum the densities along each angle to obtain the angular density distribution. Angular density shows the distribution of countries in hyperbolic space more intuitively. Figure 6(b) shows the angular density for the 1990s–2000s. The blue dotted line indicates the angular density at each angle, and the labeled countries are representatives that hold relatively central positions and belong to the corresponding peaks. The color of the country indicates its region.
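One possible reading of this construction is sketched below: probe points are placed along each direction, the countries within a fixed hyperbolic distance of each probe are counted, and the counts are summed per direction. The probe radii, the neighborhood radius `R`, the number of angles, and the function names are all assumptions chosen for illustration and are not the paper's settings.

```python
import math

def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit disk."""
    diff2 = (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    gap_u = 1.0 - (u[0] ** 2 + u[1] ** 2)
    gap_v = 1.0 - (v[0] ** 2 + v[1] ** 2)
    return math.acosh(1.0 + 2.0 * diff2 / (gap_u * gap_v))

def angular_density(points, n_angles=36, radii=(0.3, 0.6, 0.8, 0.9), R=0.5):
    """Sum, per direction, of the point counts around each probe.

    points: embedded coordinates (x, y) inside the unit disk.
    radii:  Euclidean radii of the probes along each direction (assumed).
    R:      hyperbolic radius shared by every neighborhood (assumed).
    """
    densities = []
    for step in range(n_angles):
        theta = 2.0 * math.pi * step / n_angles
        total = 0
        for r in radii:
            probe = (r * math.cos(theta), r * math.sin(theta))
            total += sum(1 for p in points if poincare_distance(probe, p) <= R)
        densities.append(total)
    return densities

# Two hypothetical countries embedded close together near angle zero
toy_points = [(0.6, 0.0), (0.62, 0.01)]
density = angular_density(toy_points, n_angles=4)
```

Peaks of the resulting list mark directions in which countries cluster, which is how the aggregation and “multicentering” patterns are read off the angular density curve.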

Angular density helps us see the aggregation of the global migration networks more intuitively: (1) In the 2010s, aggregation has clearly increased, which is reflected in the increasing angular density, and more countries have gathered together. (2) Most of the countries in the center are located in North America or Europe, mainly because these countries have shorter migration distances to other countries and are therefore naturally placed closer to the center of the hyperbolic plane. (3) In the 1990s, African countries were more aggregated, while in the 2010s, Latin American countries were more aggregated. (4) The global migration networks have shown a trend of “multicentering” since the 1990s.

3.3. Communities Characteristics of GMNs

There are many ways to explore the communities of a complex network, such as the GN algorithm [53] based on network topology and the Potts model [54] based on network dynamics. In this paper, we adopt the Louvain algorithm [55], which is based on modularity, runs quickly, and produces a clear clustering. The algorithm divides each round of calculation into two steps. In the first step, the algorithm scans all nodes, traverses all neighbors of each node, and measures the modularity gain of moving the node to the community of each neighbor; it then moves the node to the neighboring community with the highest modularity gain. In the second step, the communities found are aggregated into single nodes to form a new, smaller network. This process is repeated until the results are stable. During 1960–2015, the modularity value is within 0.66–0.76, which supports the validity of the clustering (green line in Figure 7(d)).
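The quantity that the Louvain algorithm optimizes can be written out explicitly. The sketch below evaluates the standard weighted modularity of a given partition; it does not implement the Louvain search itself, and the function name, input layout, and toy graph are assumptions for illustration.

```python
from collections import defaultdict

def modularity(edges, community):
    """Weighted modularity of a partition of an undirected network.

    edges:     {(i, j): weight}, each undirected pair listed once.
    community: {node: community label}.
    """
    m = sum(edges.values())                # total edge weight
    internal = defaultdict(float)          # weight inside each community
    strength = defaultdict(float)          # total strength of each community
    for (i, j), w in edges.items():
        strength[community[i]] += w
        strength[community[j]] += w
        if community[i] == community[j]:
            internal[community[i]] += w
    return sum(internal[c] / m - (strength[c] / (2.0 * m)) ** 2
               for c in strength)

# Two disconnected triangles (hypothetical toy graph)
toy_edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
             ("x", "y"): 1.0, ("y", "z"): 1.0, ("x", "z"): 1.0}
split = {"a": 0, "b": 0, "c": 0, "x": 1, "y": 1, "z": 1}
merged = {node: 0 for node in split}
q_split = modularity(toy_edges, split)
q_merged = modularity(toy_edges, merged)
```

Separating the two triangles scores higher than lumping all nodes together, which is the kind of gain the node-moving step searches for.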

Figure 8 shows the communities of global migration networks during 1960–2015. Obviously, the result of clustering is relatively stable over the most recent fifty years. According to the composition of members, we define ten typical communities: (1) America: including the countries in North America, Central America, South America, and the Caribbean; (2) French related: including France and other neighboring European countries, as well as French territories and former colonies around the world; (3) British Commonwealth: including some original and current Commonwealth countries, including Australia, New Zealand, and Canada; (4) Indian Ocean: centered on India, including South Asia, North Africa, Southeast Asia, and other countries close to the Indian Ocean; (5–7) most sub-Saharan African countries divided into three communities: East-Middle, Western, and Southern Africa; (8) Europe: some countries in Western, Central, and Eastern Europe; (9) the former Soviet Union: including Russia and some former Soviet countries; and (10) East Asia: mainly East Asian countries, including some Southeast Asian countries in early times and Russia and some former Soviet countries in recent years.

After grouping, we could analyze the “globalization” or “polarization” trends based on comparing the global and local connectivity in migration communities. The cumulative distribution of degrees and weights (Figures 7(a) and 7(b)) shows that, on the whole, they both tend to be relatively flat. This means that many top countries are reducing their proportions of edges and flows, while other countries with low flows are experiencing relatively rapid development. The Gini coefficients of degree and weight both exhibit an overall downward trend, and the entire network becomes more balanced over time (Figure 7(c)).
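The Gini coefficient used here can be computed from the sorted degrees or weights. The following is a minimal sketch using the standard rank-based formula; the function name and the sample values are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # convention for empty or all-zero input
    # Rank-weighted sum: larger values receive larger ranks
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

equal_degrees = gini([5, 5, 5, 5])
concentrated_degrees = gini([0, 0, 0, 20])
```

A falling Gini coefficient over the periods therefore means that edges and flows are spread more evenly across countries, matching the flattening of the cumulative distributions.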

In recent years, some scholars have proposed the “globalization of migration” hypothesis and emphasized both the progressively increasing number of countries involved in global migration and the diversification of origins and destinations [34, 56]. Some other scholars offered an alternative understanding, suggesting that, in recent years, globalization did not contribute to an overall increase in mobility possibilities but instead widened the gap between rich and poor countries [57–59], leading to polarization of the global migration networks.

Here, we use the external-internal index (E-I index), which is widely used to study group embeddedness [34, 60, 61], to compare local and global cohesion. We define the E-I index of GMNs from the external and internal edges of the communities, separately for degrees and for weights.

An “internal” edge connects two nodes in the same community, and an “external” edge connects nodes from different communities. The index is computed from the sums of external degrees and weights over all nodes and the sums of internal degrees and weights over all nodes. The E-I index ranges from 0 to 1. Smaller E-I index values indicate stronger connectivity between communities; larger E-I index values indicate stronger connectivity within the community and show that the community is more independent.

Figure 7(d) shows a downward trend of the E-I index with respect to both degrees and weights. This indicates continuous growth in cross-community connections and, to some extent, supports a significant trend of globalization in GMNs over the past fifty years.

Furthermore, Figure 9 shows E-I index values for ten typical communities; the countries/regions possessing the largest degree in the communities are listed, which could be regarded as the center of communities. For most communities, the central country/region is relatively stable and unchanged for these 50 years or is only adjusted between neighboring countries. It is worth noting that there are mergers and splits of communities 8–10, Europe, the former Soviet Union, and East Asia, which will be described with details in Section 3.4. Here, dark green indicates the smaller E-I index values for the community in the corresponding time period.

The results show that, over the past 50 years, the migration relations between different communities have become closer, which is reflected in the overall decline in E-I index values presented in Figure 9. In particular, the two communities of the former Soviet Union and the Indian Ocean are typically introverted, with most of the migration flow coming from the “internal edges” connecting the community members; the communities of America, French related, and Europe (since the 2000s) are extroverted, with cross-community migration relations stronger than those within the communities. On the whole, the number of extroverted communities is increasing over time, which also shows that the possible routes among communities have become more abundant for potential immigrants. The number of communities with E-I index values below 50% (dark green grids) has increased from one to four, which indicates that the GMNs became more globalized and multipolarized from the 1960s to the 2010s.

3.4. Structural Evolution of GMNs

To assess the network structure more clearly, we analyze the correlation of the network communities with time. Figure 10 shows the matrix of Jaccard similarity coefficients of ten typical communities covering 90–97% of the countries/regions. In general, the composition of the members of each community is stable, and the characteristics related to geographical location are shown (Figure 11). Over more than 50 years, the central countries/regions of most communities have not changed. The green color indicates greater correlation, along with the higher coincidence of the members for the cluster between two eras. In contrast, the yellow color indicates that the structure of the communities changed greatly during this time.
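The Jaccard similarity coefficient between the member sets of a community in two periods is the size of their intersection divided by the size of their union. A minimal sketch follows; the function name and the member lists are hypothetical and do not reproduce the paper's communities.

```python
def jaccard(members_a, members_b):
    """Jaccard similarity of two member sets (1.0 means identical)."""
    a, b = set(members_a), set(members_b)
    union = a | b
    if not union:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(union)

# Hypothetical membership of one community in two periods
period_1 = {"RUS", "UKR", "KAZ"}
period_2 = {"RUS", "KAZ", "CHN"}
similarity = jaccard(period_1, period_2)
```

Evaluating this for every pair of periods gives one matrix per community, in which values near 1 correspond to a stable membership and low values to a structural change.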

Focusing on the yellow grids in Figure 10, combined with the specific composition of each community in Figures 7 and 11, we found some structural evolution of global migration networks during the past 50 years.

3.4.1. Community of the Former Soviet Union

The ninth community, centered on Russia, was an independent cluster in the 1960s-1970s; in the 1980s-1990s, it merged into the community of Europe centered on Germany. Such structural changes may be related to the collapse of the former Soviet Union in 1991, when it pursued the policy of deporting the nonnative population, together with the boom of immigrants from the East into Western Europe [62, 63]. After 2000, the former Soviet Union cluster left the Germany group and merged into the community of East Asia. In fact, since then, Russia gradually replaced Hong Kong SAR as the new center of the community. The map also shows that Russia and these former Soviet Union countries have had a closer relationship with East Asia in GMNs since 2000 (Figure 11).

3.4.2. Country of Canada

Once belonging to the British Commonwealth community, Canada had a close relationship with Hong Kong SAR, and they were in the same community in the 1960s-1970s. During 2000–2010, Canada joined the community of America, which contained most of the American countries (Figure 11). In fact, some scholars have documented the relationships among the countries in Latin America and North America, including the United States [64, 65]. However, beginning in 2010, Canada left the United States community and became the new center of the British Commonwealth community; it is also the third closest country to the center on the hyperbolic plane, after the United States and French Guiana (Figure 5).

3.4.3. Community Centered on France

The structure of the second community centered on France also substantially changed. In the 1990s, 68% of its members belonged to European countries or their territories, but in the 2010s, the European members only accounted for 58%. In contrast, in the 1960s, African countries accounted for only 15% of this community, but after 2010, the proportion of African countries increased to 42%, which also shows that France, as the representative and center of the community, became increasingly close to African countries in the global migration network. This trend should be closely related to the influence of language and historical colonies [2, 66].

3.4.4. Countries including Malaysia, Singapore, and Indonesia

In the 1960s-1970s, Malaysia, Singapore, and Indonesia were in the East Asia community, with Hong Kong SAR as the center. But since the 1980s, these three countries have been transferred to the community centered on India, which has greatly impacted the East Asia community and greatly reduced the number of its members. After 2000, the former Soviet Union community was merged and the center of the cluster was adjusted to Russia, which greatly changed the structure of the East Asia community again.

4. Conclusion

Global population migration is a typical complex system. At the microlevel, each potential migrant makes a rational decision on “whether” and “where” to migrate according to the diversity utility function. Although the individuals are heterogeneous, specific migration patterns and evolution rules are continually emerging on the macrolevel.

The migration relationship between countries is complex and multilateral, and network theory can describe it better and exhibit its structure and statistical characteristics more clearly. This paper constructs undirected global migration networks (GMNs) based on estimated bilateral migration flows during 1960–2015. The GMNs display disassortativity and high clustering with a typical power-law degree distribution. In the most recent fifty years, the network density and clustering have been increasing; the Gini coefficients of the degree and weight both exhibit an overall downward trend; and the entire network has become more balanced and more connected with time.

From the network perspective, we analyze the evolution trend of international migration by comparing the global and local migration connectivity in communities. On the whole, the number of extroverted communities is increasing over time. This observation indicates the continuous growth trend of cross-community connection and, to some extent, proves the significant trends of “globalization” and “multipolarization” in the global migration network since the 1960s.

The existing literature does not discuss the geometric characteristics of the population migration network. This paper shows that the GMNs exhibited significant hyperbolic characteristics and a hierarchical structure during 1960–2015, both of which became more pronounced over time. We embed the GMNs into hyperbolic space and obtain the locations of 200 countries/regions on a 2-dimensional Poincaré disk. The angular density distribution of these locations reveals a trend toward multicentering in the global migration networks since the 1990s.

Finally, we analyze the correlation between network communities and the structural evolution of GMNs with time. In general, from 1960 to 2015, the membership of each community remained stable, and the central countries/regions of most communities did not change. Nevertheless, we find some changes. The former Soviet Union community merged into the community centered on Germany during the 1980s-1990s, which could be related to the collapse of the former Soviet Union; after 2000, it left the German group and replaced Hong Kong SAR as the center of the East Asian community. In the community centered on France, the proportion of European members declined, and more African countries, especially those with colonial relations and labor contracts with France, gradually joined the group beginning in the 1960s. Some Southeast Asian countries, such as Singapore, Malaysia, and Indonesia, were in the East Asian community but have moved to the Indian Ocean community centered on India since the 1980s.

This paper provides a creative way to analyze the structural, statistical, and geometric characteristics and hierarchical structure of the population migration network. With respect to complex human migration behavior, it is far from sufficient to analyze only the migration flow data. In future research, we will consider the economic, social, and policy factors that affect the decision-making of potential migrants and research the features and evolution of population migration patterns more comprehensively and scientifically.


Appendix

A. Inhomogeneities and Disparity Filter Algorithm

To calculate inhomogeneities at the local level, for each country $i$ with migration routes, we calculate the Herfindahl-Hirschman index (HHI) $Y_i$, which is extensively used in economics as a standard indicator of market concentration and is also known as the disparity measure in the complex networks literature:

$$Y_i = \sum_{j} \left( \frac{w_{ij}}{s_i} \right)^2, \tag{A.1}$$

where $w_{ij}$ is the total flow between countries $i$ and $j$ and $s_i = \sum_j w_{ij}$ is the strength (aggregated migration) of country $i$. If country $i$ distributes its migration homogeneously among its $k_i$ migration partners, then $Y_i = 1/k_i$; in the opposite case, if all its migration is concentrated on a single link, then $Y_i = 1$. For a network with such inhomogeneities, we can use the disparity filter to extract the backbone.
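As a minimal sketch of the disparity measure, assuming it takes the standard HHI form $Y_i = \sum_j (w_{ij}/s_i)^2$, the following Python function computes it from the list of flows on one country's links. The function name `disparity_measure` and the toy flows are illustrative assumptions, not part of the original analysis.

```python
def disparity_measure(flows):
    """Return Y_i = sum_j (w_ij / s_i)^2 for the link weights of one country.

    `flows` is a hypothetical list holding the weight w_ij of every
    migration link of country i; s_i is their sum.
    """
    strength = sum(flows)
    if strength <= 0:
        raise ValueError("the country must have a positive total flow")
    return sum((w / strength) ** 2 for w in flows)


# Four partners with equal flows: Y_i equals 1 / k_i = 0.25.
homogeneous = disparity_measure([10.0, 10.0, 10.0, 10.0])

# All migration on a single link: Y_i equals 1.
concentrated = disparity_measure([40.0])
```

Values between these two extremes indicate how unevenly a country spreads its migration over its partners.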

The disparity filter proceeds as follows. We first normalize the weights of the edges linking node $i$ with its neighbors $j$ as $p_{ij} = w_{ij}/s_i$, with $s_i = \sum_j w_{ij}$ being the strength of node $i$ and $w_{ij}$ being the weight of the edge connecting $i$ and $j$. For each migration channel of a given country $i$, we compute the probability $\alpha_{ij}$ that the link takes the observed value according to the purely random null model. By imposing a significance level $\alpha$, we can determine the statistical significance of a given migration channel by comparing $\alpha_{ij}$ to $\alpha$. Therefore, if $\alpha_{ij} > \alpha$, the flow through that migration channel can be considered compatible with a random distribution (with the chosen significance level $\alpha$) and is thus discarded. The statistically relevant channels are those that satisfy

$$\alpha_{ij} = 1 - (k_i - 1) \int_0^{p_{ij}} (1 - x)^{k_i - 2} \, dx < \alpha \tag{A.2}$$

for at least one of the two countries $i$ and $j$, where $k_i$ represents the degree of node $i$.
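The selection rule can be sketched in a few lines of Python. The sketch assumes the usual form of the null-model probability, $\alpha_{ij} = 1 - (k_i - 1)\int_0^{p_{ij}} (1 - x)^{k_i - 2}\,dx$, whose integral evaluates to the closed form $(1 - p_{ij})^{k_i - 1}$. The function name, the edge dictionary layout, and the treatment of degree-1 nodes (their own test is never significant, so the link survives only if the other endpoint passes) are assumptions made for illustration.

```python
from collections import defaultdict


def disparity_backbone(edges, alpha):
    """Keep the edges that are significant for at least one endpoint.

    `edges` maps an undirected pair (i, j), listed once, to its weight w_ij.
    """
    strength = defaultdict(float)
    degree = defaultdict(int)
    for (i, j), w in edges.items():
        for node in (i, j):
            strength[node] += w
            degree[node] += 1

    def alpha_ij(node, w):
        k = degree[node]
        if k <= 1:
            return 1.0  # assumption: a single link is never significant by itself
        p = w / strength[node]
        return (1.0 - p) ** (k - 1)  # closed form of the integral above

    return {
        edge: w
        for edge, w in edges.items()
        if min(alpha_ij(edge[0], w), alpha_ij(edge[1], w)) < alpha
    }


# Toy star network: country A sends almost all of its flow to B.
toy_edges = {("A", "B"): 100.0, ("A", "C"): 1.0, ("A", "D"): 1.0, ("A", "E"): 1.0}
backbone = disparity_backbone(toy_edges, alpha=0.05)
```

In this toy network only the A–B channel survives, because it carries far more of A's strength than a random split among four links would suggest.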

By applying this selection rule to all of the links in the network, we obtain the backbone, a new graph containing, in general, fewer links and nodes, which serves as the GMN in this paper. However, the number of links and nodes removed depends on the value of the significance level $\alpha$. To find an appropriate value of $\alpha$, it is convenient to plot the fraction of remaining nodes and the fraction of remaining weights in the backbone versus the fraction of remaining links for different values of $\alpha$. As the filter becomes more restrictive, the number of links decreases while almost all nodes are kept until a certain critical point, after which the number of nodes begins a steep decay. To retain more countries and more weights with fewer links, we choose the point where the number of nodes begins to fall below its initial value as the specific indicator for extracting the backbone networks [44].
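The sweep over significance levels can be illustrated as follows. This is only a sketch under stated assumptions: each edge is assumed to come with its weight and the smaller of its two endpoint probabilities, the names `backbone_fractions` and `strictest_alpha_keeping_all_nodes` are hypothetical, and picking the most restrictive level that still keeps every node is one possible reading of the selection criterion described above.

```python
def backbone_fractions(edge_stats, alpha):
    """Return (node fraction, weight fraction, link fraction) kept at `alpha`.

    `edge_stats` maps (i, j) to (weight, smallest endpoint probability).
    """
    nodes_total = {node for edge in edge_stats for node in edge}
    weight_total = sum(weight for weight, _ in edge_stats.values())
    kept = {e: w for e, (w, a_min) in edge_stats.items() if a_min < alpha}
    nodes_kept = {node for edge in kept for node in edge}
    return (
        len(nodes_kept) / len(nodes_total),
        sum(kept.values()) / weight_total,
        len(kept) / len(edge_stats),
    )


def strictest_alpha_keeping_all_nodes(edge_stats, alpha_grid):
    """Smallest alpha in the grid whose backbone still contains every node."""
    ok = [a for a in alpha_grid if backbone_fractions(edge_stats, a)[0] == 1.0]
    return min(ok) if ok else None


# Toy input with made-up weights and probabilities.
toy_stats = {
    ("A", "B"): (100.0, 0.00002),
    ("A", "C"): (5.0, 0.03),
    ("C", "D"): (2.0, 0.2),
    ("B", "D"): (1.0, 0.6),
}
curve = {a: backbone_fractions(toy_stats, a) for a in (0.9, 0.5, 0.1, 0.01)}
chosen = strictest_alpha_keeping_all_nodes(toy_stats, (0.9, 0.5, 0.1, 0.01))
```

Plotting the first two entries of each tuple in `curve` against the third gives the kind of diagnostic curve described in the text.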

B. Hyperbolic Embedding Method and Evaluation Index

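The method proposed by Maximilian Nickel and Douwe Kiela is based on the Poincaré ball model, as it is well suited for gradient-based optimization [50]. In particular, let $\mathcal{B}^d = \{x \in \mathbb{R}^d : \|x\| < 1\}$ be the open $d$-dimensional unit ball, where $\|\cdot\|$ denotes the Euclidean norm. The Poincaré ball model of hyperbolic space then corresponds to the Riemannian manifold $(\mathcal{B}^d, g_x)$, that is, the open unit ball equipped with the Riemannian metric tensor

$$g_x = \left( \frac{2}{1 - \|x\|^2} \right)^2 g^E, \tag{B.1}$$

where $x \in \mathcal{B}^d$ and $g^E$ denotes the Euclidean metric tensor. Furthermore, the distance between points $u, v \in \mathcal{B}^d$ is given as

$$d(u, v) = \operatorname{arcosh}\left( 1 + 2 \frac{\|u - v\|^2}{(1 - \|u\|^2)(1 - \|v\|^2)} \right). \tag{B.2}$$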

Note that equation (B.2) is symmetric and that the hierarchical organization of the space is solely determined by the distance of nodes to the origin. Due to this self-organizing property, equation (B.2) is applicable in an unsupervised setting where the hierarchical order of objects such as text and networks is not specified in advance. Remarkably, equation (B.2) therefore allows us to learn embeddings that simultaneously capture the hierarchy of objects (through their norms) as well as their similarity.
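To make the role of the norm concrete, the Poincaré distance of equation (B.2) can be evaluated directly. The sketch below uses only the Python standard library; the function name `poincare_distance` and the sample points are illustrative assumptions.

```python
import math


def poincare_distance(u, v):
    """Distance between two points of the open unit ball in the Poincaré model.

    Computes arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Two pairs of points separated by the same right angle, at different radii.
near_center = poincare_distance((0.1, 0.0), (0.0, 0.1))
near_boundary = poincare_distance((0.9, 0.0), (0.0, 0.9))
```

The pair close to the boundary is much farther apart than the pair close to the origin, which is why nodes with small norms act as hubs of the hierarchy while peripheral nodes are mutually distant.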

We embedded the GMNs for all time periods (1960–2015) into a 2-dimensional Poincaré disk, with a learning rate of 0.1 and a negative sample size of 30. To evaluate the embedding quality, we use an error function whose general form, equation (B.3), compares the dissimilarity between each pair of nodes with their embedded distance. Equation (B.3) is a general form from which several special embedding error functions can be obtained by substituting appropriate values of its three constants [52]. In our calculation, we set all of the constants equal to 1 for simplicity. To convert the migration matrix into a dissimilarity matrix, we replace every weight with a dissimilarity value defined relative to the maximum weight in the matrix. In Euclidean space, we use two kinds of regular MDS (multidimensional scaling) methods, namely, the nonmetric MDS (hereinafter referred to as NMM) and nonclassical MDS (hereinafter referred to as NCM), to embed the data for comparison.

The error function described in the previous section measures the cumulative difference between the embedded distances and the actual data. A separate question is whether two countries with a closer relationship are actually closer to each other after embedding than countries with a weaker relationship. Here, we propose a scoring scheme to assess this. We randomly select two edges that exist in the network, each with its migration weight and its corresponding embedding distance, and check whether the edge with the larger weight also has the smaller embedding distance.

We repeat this random selection n times and record the share of selections for each outcome. Since almost no two pairs of countries have identical migration data or identical embedding distances, ties are almost nonexistent. Thus, the share of selections that satisfy the rule indicates the probability that any two links meet it. The closer this score is to 100, the better the embedding distances reproduce the ordering of the original data.
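One plausible reading of this scoring scheme is sketched below. It is an assumption, not the exact formula of the original analysis: two distinct edges are drawn at random, a draw counts as consistent when the edge with the larger migration weight has the smaller embedding distance, and the score is the percentage of consistent draws. The function name `ordering_score`, its arguments, and the toy data are hypothetical.

```python
import random


def ordering_score(weights, distances, n, seed=0):
    """Return (percent consistent, percent tied) over n random edge pairs.

    `weights` and `distances` map the same edge keys to the migration
    weight and to the embedding distance of that edge.
    """
    rng = random.Random(seed)
    edges = list(weights)
    consistent = tied = 0
    for _ in range(n):
        e1, e2 = rng.sample(edges, 2)  # two distinct edges
        product = (weights[e1] - weights[e2]) * (distances[e1] - distances[e2])
        if product < 0:
            consistent += 1  # larger weight goes with smaller distance
        elif product == 0:
            tied += 1
    return 100.0 * consistent / n, 100.0 * tied / n


# Toy data: heavier edges are embedded closer together, so every draw agrees.
toy_weights = {"ab": 3.0, "ac": 2.0, "bd": 1.0}
toy_distances = {"ab": 0.1, "ac": 0.5, "bd": 0.9}
score, ties = ordering_score(toy_weights, toy_distances, n=1000)
```

With real data the score falls below 100 whenever some heavily connected country pairs are embedded farther apart than weakly connected ones.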

C. Statistical Characteristics of GMN

Table 1 shows the statistical characteristics and evolution trends of the global migration networks (APL indicates average path length; CC indicates clustering coefficient; ND indicates node degree; NS indicates node strength). We observe an increase in the number of edges and in the weights, along with increased network density and connectivity, which means that many countries have become closer to each other in the GMNs.

D. Results of Community Division

We use the Louvain algorithm to detect the communities of all networks. As an example, the complete community results for 2010–2015 are presented in Table 2.

Data Availability

The bilateral migration data used in this paper can be downloaded from https://figshare.com/collections/Bilateral_international_migration_flow_estimates_for_200_countries/4470464 or obtained by contacting Dr. Xiaomeng Li.

Conflicts of Interest

The authors declare no conflicts of interest.


Acknowledgments

The authors appreciate the comments and helpful suggestions from Professors Honggang Li, Handong Li, and Yougui Wang. This work was supported by the Chinese National Natural Science Foundation (71701018 and 61673070); Humanities and Social Sciences Foundation of Ministry of Education of China (20YJAZH010); the National Social Sciences Fund, China (14BSH024); China Scholarship Council; and the Beijing Normal University Cross-Discipline Project.