Abstract

Globalization and informatization have significantly reshaped the map of the global economy. Mega cities and regions have become the battlegrounds in the interplay between globalization and localization, with megaregions becoming the most globally significant spatial configurations in this regard. However, academics and government departments disagree on how to define the spatial boundaries of megaregions. In this study, using highway traffic flow data between cities, we integrate the community detection and core-periphery profile algorithms to characterize the city networks in China. We first identify the city groups and then delineate their core structures, which are the underlying megaregional structures in China. On this basis, we identify 21 megaregions among city groups in China, including the Yangtze River Delta, Pearl River Delta, Beijing-Tianjin-Hebei, and Chengdu-Chongqing megaregions, and preliminarily delineate their spatial boundaries. China’s megaregions differ spatially: central and eastern China have numerous, large, and densely distributed megaregions, while the western region has relatively few. The western megaregions also differ notably from mature megaregions in terms of rank sizes, urban systems, and functional divisions of labor. Overall, this study develops a novel analytical framework for identifying the functional regions of megaregional space in China from the perspective of relational geography, with methodological implications for other fields of inquiry.

1. Introduction

Over the past two decades, globalization and informatization have greatly reshaped global economic geography and given rise to a network society, and mega cities and regions have become the spatial units that host the fiercest interplay between globalization and localization [1–3]. As globalization and urbanization have progressed, urban competition has gone beyond individual cities; it is increasingly about position and functional connections in divisions of labor, especially competition and cooperation in urban networks. Because of this, megaregions have become globally significant spatial configurations [4–6]. Since China’s 11th Five-Year Plan period (2006–2010), the development of megaregions has been elevated to the status of a national strategy, and megaregions have been seen as the primary entities for promoting China’s “new type of urbanization.” Megaregions are currently a popular topic in both academia and government. The concept appears repeatedly in various settings, and it has become an important development strategy in regional economic and spatial planning in China [7]. As an objective geographical phenomenon, megaregions have always been an important topic of research in urban geography and urban planning, but for a long time the term has had ambiguous connotations because the academic community has failed to reach a consensus on its definition and spatial boundary. This, in turn, has led to ambiguity surrounding megaregions in practice [8, 9].

Defining the spatial boundary of a megaregion has always been a fundamental task of research on these spatial entities, and it is a prerequisite for understanding their formation, development, and evolution. Given that megaregions are very large, complex, dynamic, and open systems [10], their spatial boundaries can be ambiguous and changing. Defining and understanding the spatial boundaries of megaregions can also be tedious and difficult work for both geographers and planners. Based on the definition of megalopolis [10], combined with criteria for distinguishing metropolitan areas and regions, scholars have proposed criteria for identifying the spatial boundaries of megaregions (or megalopolises) from various perspectives [7, 11, 12]. Along with the “relational turn” in economic geography [13, 14], many scholars have emphasized the importance of the perspectives of relationality and connectivity in understanding the regional delineation of bounded geographic areas [15, 16]. Recently, Nelson and Rae used a large data set of commuter flows to identify megaregions in the United States [17]. Others integrate big data with emerging methods, shedding light on the division of multilevel functional regions as a geographic issue through visualization and spatial analysis [18–20].

With regard to identifying megaregions in China, substantial changes have taken place in both research data and methods, especially with the advancement of modern geographic information technology and the advent of the era of big data. In addition to traditional urban area-based spatial identification methods, some scholars have begun to try new methods of geographic information analysis, such as the gravity model [21] and network analysis [22]; to explore emerging data types, including night-time light [23] and point of interest (POI) data [24]; or to combine qualitative and quantitative analyses in frameworks that recognize the spatial boundaries of China’s megaregions, or of a single megaregion, from a multidimensional perspective [25, 26]. Depending on their research approaches, scholars have defined and divided megaregions into different numbers, types, and ranges, and they have produced various megaregion boundary schemes [27].

In this era of globalization and informatization, cities are increasingly connected to global production networks as locations and nodes. Cities and regions are being linked through various flows, networks, and relationships, forming urban functional region systems of various spatial scales [28, 29]. As large-scale urbanized regional landscapes, megaregions are huge complex systems that integrate economic, social, political, cultural, and other elements [1, 30]. Economic connections and functional divisions of labor are the most important spatial characteristics of these functional region systems, and they are key indicators that reflect their essence. As a result, defining the spatial boundaries of megaregions requires consideration from the perspective of relational geography.

The field of network science has shown that real-world networks tend to consist of various mesoscale structures, such as community structures [31] or core-periphery structures [32]. Thus, national-scale city networks also have mesoscale structures, such as community and core-periphery structures, within the overall network. These mesoscale structures provide new perspectives for a comprehensive and scientific understanding of urban and regional structural systems within the national urban system. City networks, which are rooted in geographical spatial relationships, have typical mesoscale structural foundations that can provide a methodology for empirically classifying and identifying the spatial boundaries of megaregional space. Highway passenger transportation mainly covers short distances and shows significant spatial dependence and distance decay, which has made it useful in analyzing intercity functional relationships at the city and regional scales. For these reasons, intercity highway flows are crucial indicators for analyzing regional economic systems at the megaregional scale.

Against this background, adopting the city network perspective, we integrate the community detection and core-periphery profile algorithms to identify the city groups of the city networks and delineate the megaregions among the city groups in China, so as to provide a methodological framework based on relational geography for determining the spatial boundaries of megaregions.

2. Methodology and Data

2.1. Analytical Framework

To capture the spatial characteristics of megaregions, we adopt a city network perspective to identify their spatial boundaries. Specifically, we integrate the community detection and core-periphery profile algorithms into an analytical framework and then spatially recognize the boundaries of megaregions in China based on intercity connectivity. First, using highway flow data among cities at or above the prefectural level, we characterize the city network structures in China and then employ the community detection algorithm to partition China’s city networks and identify the city groups in functional region systems. Second, based on these city groups, we use the core-periphery profile algorithm to identify the core structures of the city groups and extract the core nodes. These are the most influential and most closely connected groups of nodes and correspond to the most cohesive agglomerations of cities in the city groups, which we regard as the “megaregional” component of the city groups. Third, taking into consideration natural surface conditions and socioeconomic development backgrounds, we delineate the spatial boundaries of megaregions in China.

2.2. Methods
2.2.1. Community Detection

In the field of network science, a community refers to a subset of a network. The nodes of a network can be grouped into sets so that each set is densely connected internally, with sparser connections between sets. The identification of such densely connected groups, based on network attributes, is referred to as community detection [33]. Community detection, which is crucial for understanding group structures in networks, has long been one of the most important issues in network science. To identify the community structures in real-world networks, many efficient algorithms have been developed, such as the Girvan-Newman [34], Walktrap [35], Fast-greedy [36], and Infomap [31] algorithms.

In real-world networks, node weight, edge weight, and linkage direction are important features for describing community structures. Although community detection can be carried out with different algorithms, many of the popular ones cannot take directed and weighted networks into consideration. The Infomap algorithm, currently one of the most robust, fills this gap: it can account for topological properties including node and edge weights and link directions, and it adapts well to different networks [5, 37, 38]. Therefore, based on the intercity highway passenger flows, we employ the Infomap algorithm of community detection to identify the city groups from the city networks in China.

In brief, the Infomap algorithm identifies communities within directed and weighted networks via the combined use of random walks and compression principles [31]. According to Shannon’s source coding theorem, if we use n codewords to describe the n states of a random variable X that occur with frequencies $p_i$, the average length of a codeword can be no less than the entropy of the random variable X itself:

$$H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i$$

This expression provides a lower bound on the average length of codewords in each codebook. To calculate the average length of the code describing a step of the random walk, we need only to weight the average length of codewords from the index codebook and the module codebooks by their rates of use. This is the map equation:

$$L(M) = q_{\curvearrowright} H(Q) + \sum_{i=1}^{m} p_{\circlearrowright}^{i} H(P^{i})$$

In this expression, L(M) denotes the expected description length, per step, of a random walk on the network under the partition M into m modules; $q_{\curvearrowright}$ is the probability that the random walker switches modules on any given step; H(Q) is the frequency-weighted average length of codewords in the index codebook; $p_{\circlearrowright}^{i}$ is the rate at which module codebook i is used, that is, the probability that the random walker visits any node in module i plus the probability that it exits module i; and $H(P^{i})$ is the frequency-weighted average length of codewords in module codebook i.
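As an illustration, the map equation can be evaluated directly for a given partition of a small network. The following is a minimal sketch rather than the Infomap implementation used in this study: it assumes an undirected weighted network, for which the visit probability of a node is proportional to its strength, and it only scores a partition without searching for the best one. The function names and the toy network are hypothetical.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def map_equation(weights, modules):
    """Two-level description length L(M) for an undirected weighted network.

    weights: symmetric matrix (list of lists) of edge weights, zero diagonal.
    modules: list of lists of node indices forming a partition of the nodes.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]
    total = sum(strength)
    # Assumption: undirected network, so the random walker visits each node
    # with stationary probability proportional to its strength.
    visit = [s / total for s in strength]

    exit_probs = []
    for nodes in modules:
        members = set(nodes)
        # Flow leaving the module: visit[a] * w_ab / strength[a] = w_ab / total.
        q_i = sum(weights[a][b] for a in nodes
                  for b in range(n) if b not in members) / total
        exit_probs.append(q_i)
    q = sum(exit_probs)

    # Index codebook term: q * H(Q).
    length = q * entropy([q_i / q for q_i in exit_probs]) if q > 0 else 0.0

    # Module codebook terms: p_i * H(P^i), where p_i = q_i + visits in module i.
    for nodes, q_i in zip(modules, exit_probs):
        p_i = q_i + sum(visit[a] for a in nodes)
        codebook = [q_i / p_i] + [visit[a] / p_i for a in nodes]
        length += p_i * entropy(codebook)
    return length


# Toy network: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
W = [[0.0] * 6 for _ in range(6)]
for a, b in edges:
    W[a][b] = W[b][a] = 1.0

one_module = map_equation(W, [[0, 1, 2, 3, 4, 5]])
two_modules = map_equation(W, [[0, 1, 2], [3, 4, 5]])
print(round(one_module, 3), round(two_modules, 3))
```

In this toy case the two-triangle partition yields a shorter description length than the single-module partition, which is the sense in which the map equation favors partitions that trap the random walker inside densely connected groups.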

2.2.2. Core-Periphery Profile

The portrayal of a network as divided into a dense core and a sparse periphery, referred to as a core-periphery structure, originated in the social sciences in the 1990s, and this paradigm has since been extended to other disciplines [32]. To identify the core-periphery structures in networks, several algorithms have been proposed, such as block-modeling [39], k-shell decomposition [40], and centrality-based methods [41]. However, most of these algorithms are unable to deal with weighted networks, and their robustness still needs to be verified. Against this background, Della Rossa et al. proposed the core-periphery profile algorithm [42], which discloses the overall network structure and the peculiar roles of specific nodes.

In a network with an ideal core-periphery structure, peripheral nodes (p-nodes) are allowed to link to core nodes only; namely, no connectivity exists among p-nodes. In most real-world networks, however, the structure is not ideal although the core-periphery structure is evident: a weak (but not null) connectivity exists among the peripheral nodes. This calls for the generalized definition of α-periphery, which denotes the largest subnetwork S with persistence probability $\varphi_S \le \alpha$, where $\varphi_S$ is the probability that a random walker currently in S remains in S at the next step.

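We define the core-periphery profile $\alpha_k$, k = 1, 2, …, n, of the network by the following algorithm. Let $\pi_i$ denote the stationary probability that the random walker visits node i and $m_{ij}$ the transition probability from node i to node j, so that the persistence probability of a set S of nodes is $\varphi_S = \sum_{i,j \in S} \pi_i m_{ij} / \sum_{i \in S} \pi_i$. At step k, the node added to the current set $P_{k-1}$ is the one attaining the minimum in

$$\alpha_k = \min_{h \notin P_{k-1}} \varphi_{P_{k-1} \cup \{h\}} = \min_{h \notin P_{k-1}} \frac{\sum_{i,j \in P_{k-1} \cup \{h\}} \pi_i m_{ij}}{\sum_{i \in P_{k-1} \cup \{h\}} \pi_i}$$

We start from the node i with the weakest connectivity, so that $P_1 = \{i\}$, and generate a sequence of sets $P_1 \subset P_2 \subset \cdots \subset P_n$ by adding, at each step, the node attaining the minimal increase in the persistence probability. Correspondingly, we obtain the core-periphery profile, that is, the sequence of the persistence probabilities of the sets $P_k$.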
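The greedy construction described above can be sketched in a few lines. This is an illustrative sketch under simplifying assumptions, not the code used in this study: the network is taken to be undirected with no isolated nodes, in which case the persistence probability of a set reduces to its internal weight divided by its total strength, and ties are broken by taking the lowest node index rather than at random. A directed network, such as the asymmetric flow matrix used here, would additionally require the stationary distribution of the random walk. The function name is hypothetical.

```python
def core_periphery_profile(weights):
    """Greedy core-periphery profile for an undirected weighted network.

    weights: symmetric matrix (list of lists), every node with positive strength.
    Returns (order, profile): order[k] is the node inserted at step k + 1 and
    profile[k] is the persistence probability of the set built so far.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]

    def persistence(nodes):
        # Undirected case: pi_i * m_ij = w_ij / (total strength), so the
        # persistence probability is internal weight over strength of the set.
        internal = sum(weights[i][j] for i in nodes for j in nodes)
        return internal / sum(strength[i] for i in nodes)

    # Step 1: start from a node of minimal strength (lowest index on ties).
    first = min(range(n), key=lambda i: strength[i])
    order, profile = [first], [persistence([first])]

    # Step k: add the node that yields the smallest persistence probability.
    while len(order) < n:
        candidates = [h for h in range(n) if h not in order]
        best = min(candidates, key=lambda h: persistence(order + [h]))
        order.append(best)
        profile.append(persistence(order))
    return order, profile


# Star network with 5 nodes: node 0 is the hub, nodes 1-4 are leaves.
star = [[0.0] * 5 for _ in range(5)]
for leaf in range(1, 5):
    star[0][leaf] = star[leaf][0] = 1.0

order, profile = core_periphery_profile(star)
print(order, profile)  # [1, 2, 3, 4, 0] [0.0, 0.0, 0.0, 0.0, 1.0]
```

For the star network the leaves are inserted first with zero persistence probability and the hub is inserted last, which is the ideal core-periphery case.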

The above algorithm provides, as byproducts, two other important tools of analysis: centralization and coreness. We define the centralization C of a core-periphery profile as the complement to 1 of the normalized area under the $\alpha_k$-curve; namely,

$$C = 1 - \frac{2}{n-2} \sum_{k=1}^{n-1} \alpha_k$$

The most centralized network is the star network, in which a single core node is linked to n - 1 peripheral nodes that are not connected to one another. We can therefore quantify the similarity of a given network to the star network by measuring the area between the $\alpha_k$-curve of the given network and that of the star network and normalizing to assign C = 1 to the star network itself (maximal centralization) and C = 0 to the complete network (no centralization). If a network displays a definite core-periphery structure (large C), then the sequence $\alpha_k$ naturally provides a measure of coreness of each node. We have $\alpha_k = 0$ for all p-nodes (the periphery in the strict sense), whereas the coreness of the last inserted node is maximal and equal to $\alpha_n = 1$.
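As a small numerical check of the two limiting cases, the sketch below computes the centralization from a given profile. It assumes the normalization C = 1 - 2/(n - 2) × Σ α_k (summed over k = 1, …, n - 1), which is one form that satisfies the two stated conditions (C = 1 for the star network and C = 0 for the complete network). The example profiles are those of a star network and a complete network with five nodes, and the function names are hypothetical.

```python
def centralization(profile):
    """Centralization C = 1 - 2/(n - 2) * sum of alpha_k for k = 1..n-1.

    profile: list of persistence probabilities alpha_1..alpha_n, with n >= 3.
    """
    n = len(profile)
    return 1.0 - 2.0 / (n - 2) * sum(profile[:-1])


def coreness(order, profile):
    """Map each node to the profile value recorded when it was inserted."""
    return {node: alpha for node, alpha in zip(order, profile)}


star_profile = [0.0, 0.0, 0.0, 0.0, 1.0]        # star network, n = 5
complete_profile = [0.0, 0.25, 0.5, 0.75, 1.0]  # complete network, n = 5

print(centralization(star_profile))      # maximal centralization
print(centralization(complete_profile))  # no centralization (up to rounding)

# Coreness of the hub (node 0, inserted last) in the star network.
print(coreness([1, 2, 3, 4, 0], star_profile)[0])
```

The complete-network profile follows from the fact that a set of k nodes in a complete network with equal weights has persistence probability (k - 1)/(n - 1).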

2.3. Data Sources

This study uses original data on highway flows between 289 administrative units at or above the prefectural level in China. The data form an asymmetric 289 × 289 matrix that characterizes the strength of spatial connectivity among cities in China based on highway flows. We primarily used a vehicle services website (checi.cn) for web page retrieval and then extracted intercity highway flow data by circular queries. In addition, we used commercial service websites, such as the China Highway Ticket Network, provincial and municipal highway ticket networks, Ctrip.com, and Changtu.com, to verify and correct the data manually through cross-checks to ensure completeness and accuracy. Data capture was carried out using Microsoft’s C# programming language. The data were collected in June 2017.
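To make the data structure concrete, the sketch below assembles a directed, weighted origin-destination matrix from individual flow records. It is only an illustration of the matrix layout, not the collection pipeline of this study (which was written in C#); the record format, the city labels, and the flow values are invented for the example.

```python
def build_flow_matrix(records):
    """Assemble a directed, weighted flow matrix from (origin, dest, flow) rows.

    Returns (cities, matrix), where matrix[i][j] is the total flow from
    cities[i] to cities[j]. The matrix is generally asymmetric.
    """
    cities = sorted({city for origin, dest, _ in records for city in (origin, dest)})
    index = {city: k for k, city in enumerate(cities)}
    matrix = [[0.0] * len(cities) for _ in cities]
    for origin, dest, flow in records:
        if origin != dest:  # ignore intra-city services
            matrix[index[origin]][index[dest]] += flow
    return cities, matrix


# Hypothetical records: (origin city, destination city, number of services).
records = [("A", "B", 12), ("B", "A", 9), ("A", "C", 3), ("C", "B", 5)]
cities, flows = build_flow_matrix(records)
for city, row in zip(cities, flows):
    print(city, row)
```

Keeping the two directions of each city pair in separate cells preserves the asymmetry of the flows, which the directed and weighted community detection step relies on.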

3. Results

3.1. Identifying the City Groups in China

Based on the highway flow data between 289 cities at or above the prefectural level in China, we employ the Infomap community detection algorithm [31] to partition the cities into groups, identifying 19 city groups in total. Figure 1 shows the spatial network patterns of city groups in China based on the intercity highway flow data.

In terms of spatial patterns, these 19 city groups exhibit strong spatial dependence and diverse spatial organization patterns, with an apparent multilevel, networked regional configuration. In terms of spatial form, there are obvious hierarchical structures of city networks developing within city groups, which reflect the spatial imbalance toward core cities. Most city groups show varying degrees of core-periphery structures, with core cities occupying dominant positions within the network structures. Peripheral cities have a relatively weak connection with the network structures. This is the major reason that we use the core-periphery structures in city groups to identify megaregions in this study.

The 19 city groups have different spatial compositions. Within the city groups, cities have obvious differences in rank size and distribution, with a general trend of a slightly decreasing gradient. The specific parameters are shown in Figure 2. City Group 1, City Group 2, and City Group 3 are the three largest city groups, corresponding to the Jiangsu-Zhejiang-Shanghai city group in the Yangtze River Delta, which contains 25 cities and has a total of 8,357 connections; the Guangdong-Guangxi city group, which contains 37 cities and has a total of 9,519 connections; and the Beijing-Tianjin-Hebei-Shandong city group, which contains 32 cities and has a total of 5,993 connections. These city groups far exceed the other city groups in terms of the number of cities they incorporate and their number of connections, indicating significant agglomerative economic effects.

3.2. Extracting the Core Structures of City Groups

Based on the above city groups, we employ the core-periphery profile algorithm [42], which is based on random walkers, to extract the most cohesive structures of city groups, which are embryonic structures of megaregions among China’s regional economies. City Group 19 contains only two prefectural-level cities, Urumqi and Karamay, which are geographically distant and have relatively weak economic connections, so it lacks the basic conditions to develop city networks and economic integration. We do not believe that Xinjiang currently has the natural and economic conditions for a megaregion to develop, so only the other 18 city groups are used as the basis of this study.

Using the core-periphery profile algorithm described above, we calculate the polarization effects of the core-periphery structures as well as the position and importance of nodes in the networks. The centralization results are shown in Figure 3. Most city groups in China have centralization values greater than 0.5, indicating that they have obvious core-periphery structures, that core cities have a clear agglomerative effect within the city groups, and that some nodes have important positions and roles in the networks. The centralization coefficients of City Group 1 and City Group 16 are less than 0.5, which means that the core-periphery structures of those two groups are weaker than the others, not that they are absent. Specifically, City Group 1 (Jiangsu-Zhejiang-Shanghai) has a relatively balanced network structure, so the polarization effect of the core-periphery structure is less pronounced than in other city groups. City Group 16 (central Inner Mongolia) has a low overall network density and no node cities with obvious advantages. The gap in strength between the cities is not notable, and the city network has relatively low connectivity. The city group does not have obvious core and periphery components, so its centralization coefficient is relatively low.

Figure 4 compares the “coreness” of the core-periphery structures of city groups in China. The x-axis ranks the city groups according to their coreness, and the y-axis is the coreness value. To investigate core-periphery structure features of each city group, we set the node order and coreness of the city groups to the same level, assuming that the number of nodes is 100, which allows comparison of the core-periphery structures of different city groups. Overall, the core-periphery structures of city groups in China are roughly similar. The number of core nodes in the network is generally greater than the number of periphery nodes, with the ratio of core to periphery nodes relatively stable at around 6 : 4. The curves in Figure 4 reveal that the watershed between core and periphery nodes is at approximately the point of 40%. After that, the coreness of the city groups begins to diverge, displaying four types.
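One plausible way to place profiles of city groups of different sizes on a common 100-point axis is to interpolate each profile linearly over its relative node rank. The sketch below shows this reading; it is an assumption about the normalization step, not a description of the exact procedure used for Figure 4, and the function name is hypothetical.

```python
def rescale_profile(profile, points=100):
    """Linearly interpolate a profile onto `points` evenly spaced ranks.

    profile: list of at least two coreness values ordered by insertion rank.
    points: number of ranks on the common axis (at least two).
    """
    n = len(profile)
    rescaled = []
    for t in range(points):
        pos = t * (n - 1) / (points - 1)  # fractional index into the profile
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        rescaled.append(profile[lo] * (1 - frac) + profile[hi] * frac)
    return rescaled


# A three-node profile stretched onto five ranks for illustration.
print(rescale_profile([0.0, 0.5, 1.0], points=5))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

After rescaling, position 40 on the common axis corresponds to the 40% mark discussed in the text, regardless of how many cities a group contains.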

The first type comprises the city groups with centralization coefficients greater than 0.8, which have notable core-periphery structures. Their core structures are situated after the 50% mark. This type includes City Group 15 (Guizhou), City Group 17 (Gansu, Ningxia, Qinghai, and Tibet), and City Group 18 (Yunnan). The second type comprises the city groups with centralization coefficients between 0.6 and 0.8, which have clear core-periphery structures. This is the most numerous type, including City Groups 2, 4, 5, 6, 8, 9, 11, 12, 13, and 14. The core structures of this type are situated at over 40%. The third type comprises the city groups with centralization coefficients between 0.4 and 0.6, with core nodes at or above 30%. This type includes City Groups 1 (Jiangsu-Zhejiang-Shanghai), 3 (Beijing-Tianjin-Hebei-Shandong), 7 (Fujian), and 10 (Jiangxi). The fourth type is City Group 16 (central Inner Mongolia), which has a centralization coefficient of 0.32 and a relatively weak core-periphery structure, with little distinction between core and periphery nodes.
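The four-way typology can be expressed as a simple threshold rule on the centralization coefficient. In the sketch below, the treatment of values falling exactly on a boundary (0.4, 0.6, or 0.8) is an assumption, since the text does not specify it, and the function name is hypothetical.

```python
def centralization_type(c):
    """Assign a city group to one of the four types by its centralization."""
    if c > 0.8:
        return 1  # notable core-periphery structure
    if c >= 0.6:  # assumption: boundary values fall into the higher type
        return 2  # clear core-periphery structure
    if c >= 0.4:
        return 3  # moderate core-periphery structure
    return 4      # weak core-periphery structure


# City Group 16 (central Inner Mongolia) has a reported coefficient of 0.32.
print(centralization_type(0.32))  # 4
```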

Based on the above descriptions of coreness and node order, we separate the core structures and periphery structures of the city groups and extract the core-periphery structures of city groups. The core structures are the embryonic structures of megaregions, so they provide a foundation for identifying the spatial boundaries of megaregions in China. The results are shown in Figure 5.

On the whole, the city groups in China exhibit the core-periphery structures embedded in the city networks. They essentially form ringed nested structures, with regional central cities at the core and regional peripheral cities at the edge. The core structures are larger than the periphery structures. In addition, the core structures have a variety of irregular spatial patterns. The different-color core structures in Figure 5 represent the core structures of corresponding city groups. For example, Core Structure 1 corresponds to City Group 1, and the other core structures are similarly shown. From the composition of spatial structures, the core structures of China’s city groups are mainly based around a regional administrative or economic center that forms a spatially adjacent and compact urban agglomeration with surrounding cities. Periphery structures, on the other hand, are mainly composed of cities located on the edges of city groups and a small number of cities in city groups that lack connections to other cities in the region. The ratios of the core structures to the periphery structures of city groups differ, but the number of cities in core structures is mostly greater than the number of cities in corresponding periphery structures. Of course, there is also a small number of periphery structures that contain more cities than corresponding core structures, which is determined by the inner structure of city networks of city groups.

Moreover, the core structures of the various city groups have a variety of irregular spatial distribution patterns, which can be roughly divided into three types: a spatially adjacent and compact clustered structure; a spatially separate or discontinuous multigroup structure; and a belt structure with obvious directionality. Due to the spatial interactions and superpositions during the development of city networks [5], the core-periphery structures of some city groups display notable spatial fragmentation. Core structures of the spatially adjacent and compact clustered type exhibit geographical distance decay, whereas those of the spatially separate or discontinuous multigroup type are largely unaffected by it. This is due to the nature of the “space of flows” and city networks.

3.3. Delineating the Spatial Boundaries of Megaregions

Based on the above analysis, we separate the core and periphery structures in the city networks of city groups and extract the core structures of the city groups. These core structures are the embryonic structures of megaregions, so they provide a foundation upon which to delineate the spatial boundaries of megaregions. The results show that the core structures of city groups are largely spatially coupled with the distribution of megaregions in China, which provides a new approach and understanding for comprehensively defining the spatial boundaries of megaregions.

With reference to the basic conditions for the development of megaregions and the above analysis of the core-periphery structures of city groups, we initially identify 21 underlying megaregions and delineate their corresponding spatial boundaries. The 21 megaregions are the Yangtze River Delta (YRD), Pearl River Delta (PRD), Beijing-Tianjin-Hebei (BTH), Chengdu-Chongqing (CCQ), Shandong Peninsula (SDP), Central Plains (CPL), Central and Southern Liaoning (LNP), Western Taiwan Strait (WTS), Guanzhong Plain (GZP), Changsha-Zhuzhou-Xiangtan (CZT), Hohhot-Baotou-Ordos-Yulin (CIM), Lanzhou-Xining (LXN), Central Shanxi (CSX), Wuhan (WUH), Central Anhui (CAH), Northern Jiangxi (NJX), Southern Guangxi (SGX), Central Guizhou (CGZ), Central Yunnan (CYN), Harbin-Daqing-Qiqihar (HAB), and Central Jilin (CJL) megaregions. Their specific geographic locations and boundaries are shown in Figure 6.

In terms of the foundations and current state of socioeconomic development of megaregions in China, the central and eastern megaregions have high population density, large economies, and deepening production networks, which promote functional divisions of labor and collaborative development between cities. They have good foundations for development in terms of their size, flows of factors of production, and city networks, which provide excellent conditions for the formation and development of megaregions. There are relatively few megaregions in western China, however. With the exception of the Chengdu-Chongqing megaregion, megaregions in the western region are smaller, with issues such as lower population density and relatively small economies. Moreover, the cities in some megaregions are scattered and spatially separated, so the intercity connections require crossing sizable geographic distances, which restricts the development of regional city networks. The megaregions in western China also suffer from uneven development. Most of their core cities are in the agglomeration stage of development, with a high degree of urban primacy, and significant polarization of central cities is relatively common. The megaregions in western China are also still developing in terms of their spatial structures, organizational patterns, and functional divisions of labor.

4. Discussion

Overall, the results reveal the spatial differentiation among megaregions in China. Central and eastern China have numerous, large, and densely distributed megaregions, while the western region has relatively few. The western megaregions also differ notably from mature megaregions in terms of rank sizes, urban systems, and functional divisions of labor, and some do not yet qualify as megaregions. Some regions are disadvantaged by their basic conditions, such as geographical location, resource and environmental carrying capacity, and socioeconomic development, which hampers the development of megaregions.

The megaregions identified in this study both overlap with and differ from the 19 megaregions proposed in the 13th Five-Year Plan (2016–2020) of China. The main differences are that the Middle Yangtze River, Harbin-Changchun, Northern Ningxia, and Northern Xinjiang megaregions are included in the 13th Five-Year Plan but excluded from this study, whereas the Central Anhui megaregion is included in this study but excluded from the 13th Five-Year Plan.

Regarding the Middle Yangtze River megaregion, there are natural geographical obstacles between the three provinces of Hubei, Hunan, and Jiangxi, including the Mufu, Jiuling, and Luoxiao mountain ranges. Thus, the three provinces have so far failed to achieve functional integration of their urban areas. Moreover, the cities of the three provinces have tended to develop spatial connections of city networks in isolation, producing three independent city groups in the functional region system. The Harbin-Changchun megaregion also shows obvious spatial separation and scattered distribution. In the process of dividing the city groups, several cities in central Jilin showed closer spatial connections with Liaoning Province, and the Harbin-Changchun megaregion does not show a tendency toward integrated development. As for the Northern Ningxia megaregion, some cities in Ningxia are grouped with Gansu, Qinghai, and Tibet during the division of city groups, which means that the Northern Ningxia megaregion does not form an independent unit. For the Northern Xinjiang megaregion, there are only two prefectural-level cities, which are spatially distant, so we could not identify a basis for the development of an independent megaregion. The Central Anhui megaregion identified in this article is not mentioned in the 13th Five-Year Plan.

5. Conclusions

As large-scale agglomerative landscapes of urbanized regions, megaregions are huge and complex systems that integrate multiple elements. From the perspective of spatial connotation, megaregions are an important type of functional region system. Economic connections and functional divisions of labor, as the most important spatial characteristics that reflect the nature of megaregions, are the most crucial foundations for their development. Therefore, adopting the city network perspective, we integrate the community detection and core-periphery profile algorithms to identify the city groups of city networks in China and delineate the megaregions among city groups in China, so as to provide a methodological framework based on relational geography and functional connections for determining the spatial boundaries of megaregions.

In the end, we identify the following 21 megaregions among city groups in China and tentatively delineate the corresponding spatial boundaries. Specifically, they are the Yangtze River Delta, Pearl River Delta, Beijing-Tianjin-Hebei, Chengdu-Chongqing, Shandong Peninsula, Central Plains, Central and Southern Liaoning, Western Taiwan Strait, Guanzhong Plain, Changsha-Zhuzhou-Xiangtan, Hohhot-Baotou-Ordos-Yulin, Lanzhou-Xining, Central Shanxi, Wuhan, Central Anhui, Northern Jiangxi, Southern Guangxi, Central Guizhou, Central Yunnan, Harbin-Daqing-Qiqihar, and Central Jilin megaregions.

As complex systems of urban-regional development, megaregions are the functional regions of large-scale urbanized agglomerative areas. Therefore, the delineation of the spatial boundaries of megaregions requires a comprehensive evaluation of multiple dimensions, rather than reliance on a single aspect. In this study, we introduce network science to the delineation of spatial boundaries of megaregions from the city network perspective. This analytical framework shows great potential for the identification of functional regions, which is one of the contributions of this article.

Although we use highway traffic flows to delineate the spatial boundaries of megaregions in China at the scale of cities at or above the prefectural level, the intercity traffic flow used in this study represents only one type of intercity flow, and the results are still preliminary. In the future, we need to combine multiple-source data on networks and flows for complementing and cross-checking the results [43, 44] and then comprehensively evaluate the spatial boundaries of megaregions based on knowledge of the regional background.

Data Availability

All related traffic flow data in this paper are available upon request.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant Nos. 41901154 and 42130508), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA20010102), and the Second Tibetan Plateau Scientific Expedition and Research Program (Grant No. 2019QZKK1007).