Abstract

Hierarchy of cities reflects the ubiquitous structure frequently observed in the natural world and social institutions. Where there is a hierarchy with cascade structure, there is a Zipf’s rank-size distribution, and vice versa. However, we have no theory to explain the spatial dynamics associated with Zipf’s law of cities. In this paper, a new perspective is proposed to find the simple rules dominating complex systems and the regular patterns behind the random distribution of cities. The hierarchical structure can be described with a set of exponential functions that are identical in form to Horton-Strahler’s laws on rivers and Gutenberg-Richter’s laws on earthquake energy. From the exponential models, we can derive four power laws, including Zipf’s law, indicative of fractals and scaling symmetry. A card-shuffling model is built to interpret the relation between Zipf’s law and hierarchy of cities. This model can be expanded to illuminate the general empirical power-law distributions across the individual physical and social sciences, which are hard to comprehend within any single scientific domain. This research is useful for understanding how complex systems such as networks of cities are self-organized.

1. Introduction

The well-known Zipf’s law is a very basic principle for city-size distributions, and empirically, the Zipf distribution is always associated with hierarchical structure of urban systems. Hierarchy is frequently observed within the natural world as well as in social institutions, and it is a form of organization of complex systems which depend on or produce a strong differentiation in power and size between the parts of the whole [1]. A system of cities in a region is always organized as a hierarchy with cascade structure [2]. Where mathematical models are concerned, a hierarchy of cities always bears an analogy to network of rivers [3, 4], while the latter has an analogy with earthquake energy distribution. There seems to be hidden order behind random distributions of cities, and the similar order can be found behind river networks and earthquake phenomena. Studies on urban hierarchies will be helpful for us to understand the general natural laws which dominate both physical and human systems.

Urban evolution takes on two prominent properties: one is the Zipf distribution at the large scale [5–8], the other is the hierarchical scaling relations between different scales and measures (e.g., [2, 9–14]). If a study area is large enough, the size distribution of cities in the area always follows Zipf’s law. The Zipf distribution, that is, the rank-size distribution, is one of the ubiquitous general empirical observations across the individual sciences (e.g., [15–18]), which cannot be understood with the set of references developed within the specific scientific domain [19]. In fact, the Zipf distribution and hierarchical structure are two sides of the same coin. Hierarchy can provide a new angle of view for us to understand Zipf’s law and allometric scaling of cities, and vice versa. Both Zipf’s law and the allometric growth law are related to fractals (e.g., [6, 20–23]), and fractal theory is one of the powerful tools for researching complexity and regularity of urban development.

In this paper, Zipf’s law, allometric scaling, and fractal relations will be integrated into the same framework based on hierarchy of cities, and, then, a model of playing cards will be proposed to explain the Zipf distribution and hierarchical scaling. From this framework, we can gain an insight into cities in the new perspective. Especially, this theoretical framework and model can be generalized to physical scientific fields. The rest of this paper is organized as follows. In Section 2, three exponential models associated with four power laws on hierarchy of cities are presented, and an analogy between cities, rivers, and earthquake energy is drawn to show the ubiquity of hierarchical structure. In Section 3, two case analyses based on large-scale urban systems are made to lend further support to power laws and exponential laws of cities. In Section 4, a theory of shuffling cards on urban evolution is illustrated to interpret the spatial patterns and hidden rules of city distributions. Finally, the discussion is concluded with several simple comments.

2. Cities, Rivers, and Earthquakes: Analogous Systems?

2.1. The Scaling Laws of Cities

First of all, the mathematical description of hierarchies of cities should be presented here. Grouping the cities in a large-scale region into $M$ classes in a top-down order, we can define an urban hierarchy with cascade structure. The hierarchy of cities can be modeled with a set of exponential equations

$$N_m = N_1 r_n^{m-1}, \tag{2.1}$$

$$P_m = P_1 r_p^{1-m}, \tag{2.2}$$

$$A_m = A_1 r_a^{1-m}, \tag{2.3}$$

where $m$ denotes the top-down ordinal number of city class ($m = 1, 2, \ldots, M$), $N_m$ refers to the number of cities of a given size, and, correspondingly, $P_m$ and $A_m$ refer to the mean population size and urban area in the $m$th class. As for the parameters, $N_1$ is the number of the top-order cities, and $P_1$ and $A_1$ are the mean population and urban area of the first-order cities. In theory, we take $N_1 = 1$. The common ratios are defined as follows: $r_n = N_{m+1}/N_m$ denotes the interclass number ratio of cities, $r_p = P_m/P_{m+1}$ the population size ratio, and $r_a = A_m/A_{m+1}$ the urban area ratio. In fact, (2.1) and (2.2) are just the generalized Beckmann-Davis models [7, 24, 25]. According to Davis [25], if $r_n = 2$ as given, then it will follow that $r_p \to 2$, where the arrow denotes “approach” or “be close to.” If so, (2.1) and (2.2) express the $2^n$ rule; otherwise they express the generalized $2^n$ rule.

Several power-law relations can be derived from the above exponential laws. Rearranging (2.2) yields $r_p^{m-1} = P_1/P_m$; taking the logarithm of this equation to the base $r_n$ and substituting the result into (2.1) then yields a power function

$$N_m = \mu P_m^{-D}, \tag{2.4}$$

where $\mu = N_1 P_1^{D}$ and $D = \ln r_n / \ln r_p$. Equation (2.4) can be termed the size-number scaling relation of cities, and $D$ is just the fractal dimension of urban hierarchies measured with population [2]. By analogy, the area-number scaling relation of cities can be derived from (2.1) and (2.3) in the following form

$$N_m = \eta A_m^{-d}, \tag{2.5}$$

in which $\eta = N_1 A_1^{d}$ and $d = \ln r_n / \ln r_a$. Here $d$ can be regarded as the fractal dimension of urban hierarchies measured with urban area. It is easy to derive an allometric scaling relation between urban area and population from (2.2) and (2.3), such as

$$A_m = a P_m^{b}, \tag{2.6}$$

where $a = A_1 P_1^{-b}$ denotes the proportionality coefficient, and $b = \ln r_a / \ln r_p = D/d$ is the scaling exponent. In light of the dimensional consistency, the allometric scaling exponent is actually the ratio of the fractal dimension of urban form to that of urban population [26].
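
The derivation above can be checked numerically. The following is a minimal sketch, assuming the three exponential laws take the forms $N_m = N_1 r_n^{m-1}$, $P_m = P_1 r_p^{1-m}$, and $A_m = A_1 r_a^{1-m}$. The function names and all parameter values (nine classes, $r_n = r_p = 2$, $r_a = 1.8$, and the first-order population and area) are illustrative assumptions, not data from this paper.

```python
import math


def build_hierarchy(levels, r_n=2.0, r_p=2.0, r_a=1.8, n1=1.0, p1=1.0e7, a1=5.0e3):
    """Return (N_m, P_m, A_m) for m = 1..levels from the assumed exponential laws."""
    orders = range(1, levels + 1)
    numbers = [n1 * r_n ** (m - 1) for m in orders]
    populations = [p1 * r_p ** (1 - m) for m in orders]
    areas = [a1 * r_a ** (1 - m) for m in orders]
    return numbers, populations, areas


def loglog_slope(x, y):
    """Ordinary least-squares slope of ln(y) regressed on ln(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    sxx = sum((u - mx) ** 2 for u in lx)
    return sxy / sxx


numbers, populations, areas = build_hierarchy(levels=9)

D = -loglog_slope(populations, numbers)  # size-number scaling exponent
d = -loglog_slope(areas, numbers)        # area-number scaling exponent
b = loglog_slope(populations, areas)     # allometric scaling exponent

print(D, d, b, D / d)
```

With these assumed ratios the regression returns $D = 1$, $d$ of roughly 1.18, and $b$ of roughly 0.85, and the last two printed values agree, which illustrates $b = D/d$.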

In theory, the size-number scaling relation, (2.4), is mathematically equivalent to the three-parameter Zipf-type model on size distribution [7, 22, 27]. The latter can also be derived from (2.1) and (2.2), and the result is

$$P(\rho) = C(\rho - \varsigma)^{-q}, \tag{2.7}$$

where $\rho$ is the rank of cities in decreasing order of size, and $P(\rho)$ is the population of the $\rho$th city. As for the parameters, we have the constant of proportionality $C = P_1 [r_n/(r_n - 1)]^{1/D}$, the small parameter $\varsigma = 1/(1 - r_n)$, and the power exponent $q = 1/D = \ln r_p / \ln r_n$ [7]. If we omit the small parameter from (2.7), we have the common two-parameter Zipf model

$$P(\rho) = P_1 \rho^{-q}, \tag{2.8}$$

where $P_1$ is the population size of the largest city, and the Zipf exponent is $q = 1/D$. If $q = 1$ as given, then we will have the one-parameter Zipf model

$$P(\rho) = \frac{P_1}{\rho}, \tag{2.9}$$

which is the well-known rank-size rule, equivalent to the $2^n$ rule on cities. The rank-size distribution suggests self-similarity behind random patterns, and the fractal dimension $D$ is an important parameter for understanding urban hierarchy [7, 22, 28].
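
The link between the hierarchy and the three-parameter Zipf model can be illustrated with a short sketch. It assumes $N_1 = 1$ and the parameter expressions $C = P_1[r_n/(r_n-1)]^{1/D}$, $\varsigma = 1/(1-r_n)$, and exponent $1/D = \ln r_p/\ln r_n$. Under these assumptions, the city holding the last rank of class $m$ should have the class population $P_1 r_p^{1-m}$. The function names and the values $r_n = 3$, $r_p = 2.5$ are hypothetical.

```python
import math


def three_parameter_zipf(rank, p1, r_n, r_p):
    """Population at a given rank under the assumed three-parameter Zipf model."""
    inv_d = math.log(r_p) / math.log(r_n)   # power exponent 1/D
    c = p1 * (r_n / (r_n - 1.0)) ** inv_d   # proportionality constant C
    varsigma = 1.0 / (1.0 - r_n)            # small adjusting parameter
    return c * (rank - varsigma) ** (-inv_d)


def last_rank_of_class(m, r_n):
    """Cumulative number of cities in classes 1..m when N_1 = 1 (integer r_n)."""
    return (r_n ** m - 1) // (r_n - 1)


P1, R_N, R_P = 1.0e7, 3, 2.5  # illustrative values only

for m in range(1, 7):
    rank = last_rank_of_class(m, R_N)
    from_zipf = three_parameter_zipf(rank, P1, R_N, R_P)
    from_hierarchy = P1 * R_P ** (1 - m)
    print(m, rank, from_zipf, from_hierarchy)
```

The two printed population columns coincide up to rounding. The check covers only the class boundaries, where the cumulative rank is known in closed form.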

2.2. Analogy of Cities with Rivers and Earthquake

The hierarchy of cities reflects the cascade structure which is ubiquitous in both physical and human systems. To provide a general pattern for understanding how evolutionary systems are self-organized, we can draw an analogy between cities, rivers, and earthquake energy distributions (Figure 1). In fact, (2.1), (2.2), and (2.3) have the property of “mirror symmetry.” That is, if we transpose the order $m$, the structure of the mathematical models will not vary, but the exponents will change sign. Thus the three exponential laws can be rewritten as follows:

$$N_m = N_1 r_n^{1-m}, \tag{2.10}$$

$$P_m = P_1 r_p^{m-1}, \tag{2.11}$$

$$A_m = A_1 r_a^{m-1}, \tag{2.12}$$

where $m$ denotes the bottom-up ordinal number ($m = 1, 2, \ldots, M$), $N_m$, $P_m$, and $A_m$ fulfill the same roles as in (2.1), (2.2), and (2.3), $N_1$, $P_1$, and $A_1$ represent the city number, population size, and urban area of the bottom order, respectively, and $N_1 > 1$ now. As regards the ratio parameters, we have $r_n = N_m/N_{m+1}$, $r_p = P_{m+1}/P_m$, and $r_a = A_{m+1}/A_m$.
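
The “mirror symmetry” can be illustrated with a minimal sketch. It assumes the top-down laws $N_m = N_1 r_n^{m-1}$ and $P_m = P_1 r_p^{1-m}$, then re-indexes the same hierarchy from the bottom, so that the first-order values are those of the bottom class and the exponents change sign. All numerical values are illustrative assumptions.

```python
import math

M, R_N, R_P = 9, 2.0, 2.0   # illustrative values only
N_TOP, P_TOP = 1.0, 1.0e7   # top-class number and mean population (assumed)

orders = range(1, M + 1)

# Top-down indexing: class 1 holds the largest cities.
top_down_n = [N_TOP * R_N ** (m - 1) for m in orders]
top_down_p = [P_TOP * R_P ** (1 - m) for m in orders]

# Bottom-up indexing: class 1 holds the smallest cities, so the first-order
# values are taken from the bottom class and the exponents change sign.
n_bottom, p_bottom = top_down_n[-1], top_down_p[-1]
bottom_up_n = [n_bottom * R_N ** (1 - m) for m in orders]
bottom_up_p = [p_bottom * R_P ** (m - 1) for m in orders]

# The bottom-up sequences should be the top-down sequences read in reverse.
same_numbers = all(math.isclose(x, y) for x, y in zip(bottom_up_n, reversed(top_down_n)))
same_sizes = all(math.isclose(x, y) for x, y in zip(bottom_up_p, reversed(top_down_p)))

print(same_numbers, same_sizes)
```

Both checks hold, so the two indexings describe one and the same hierarchy.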

These exponential models can be employed to characterize river networks and hierarchies of the seismic activities of a region (say, Japan) over a period of time (say, 30 years). Equations (2.10), (2.11), and (2.12) bear an analogy to Horton-Strahler’s laws in geomorphology [29–31] and Gutenberg-Richter’s laws in geology and seismology [32, 33]. If the three exponential laws on cities, Horton-Strahler’s laws on rivers, and Gutenberg-Richter’s laws on earthquakes are tabulated for comparison, they are identical in form to one another (Table 1). According to Horton [29], Schumm [30], and Strahler [31], the scaling relations of a network of rivers can be measured with river branch length (L), the number of tributary rivers of a given length (B), and drainage areas (S). According to Gutenberg and Richter [32], a hierarchy of seismic activities can also be described with three measurements: the size of released energy (E), the frequency/number of earthquakes of a certain magnitude (f), and rupture area (). The ordinal number indicative of the class of cities or rivers corresponds to the moment magnitude scale (MMS) of earthquakes. Thus, the similarity between (2.10), (2.11), and (2.12) and Horton-Strahler’s laws as well as Gutenberg-Richter’s laws is based on the corresponding measurement relations as follows: (1) city number ↔ river branch number ↔ earthquake frequency (), (2) city population size ↔ river branch/segment length ↔ earthquake energy (), (3) urbanized area ↔ drainage/catchment area ↔ fault break area ().

Despite all these similarities, there are clear differences among cities, rivers, and earthquake energy distributions as hierarchies. Actually, hierarchies can be divided into two types: one is the real hierarchy with physical cascade structure, such as a system of rivers, and the other is the dummy hierarchy with mathematical cascade structure, such as earthquake energy in a given period and region. For river systems, the rivers of order $m$ have a direct connection with those of order $m + 1$. However, for earthquakes, the quake energy sizes in the $m$th class have no fixed relation to those in the $(m + 1)$th class. For example, if the MMS of a main shock in a place is 7, the MMS of its foreshocks and aftershocks is usually 3 to 5 rather than 6. The earthquakes of order 6 and 8 often occur in another place and time and cannot be directly related to the shock of order 7. Generally speaking, the interclass relation in a dummy hierarchy holds in the mathematical sense rather than the physical sense. Cities come between rivers and earthquakes. It is hard to bring to light the physical cascade structure of a hierarchy of cities, but it is convenient to research its mathematical structure.

Typically, Horton-Strahler’s laws are on real hierarchies, while Gutenberg-Richter’s laws are on dummy hierarchies (Table 2). There are many empirical analyses of Horton-Strahler’s laws and Gutenberg-Richter’s laws [33, 34]. As for the exponential laws of cities, preliminary empirical evidence has been provided by Chen and Zhou [35]. In the next section, two new cases will be presented to validate (2.1) to (2.8), lending further support to the suggestion that hierarchies of cities are identical in cascade structure to networks of rivers and size distributions of earthquake energy.

3. Empirical Evidence for Urban Scaling Laws

3.1. Cascade Structure of USA’s Hierarchy of Cities

The theoretical regularity of city size distributions can be empirically revealed at large scale [36, 37]. The cities in the United States of America (USA) in 2000 are taken as the first example to make an empirical analysis. According to (2.1), (2.2), and (2.3), in which the number ratio is taken as $r_n = 2$, the 452 US cities with population more than 50,000 can be grouped by population size into 9 levels in the top-down way ($m = 1, 2, \ldots, 9$). The population size is measured by urbanized area (UA). The 9 classes compose a hierarchy of cities with cascade structure. The number of cities ($N_m$), the average population size ($P_m$), and the mean urbanized area ($A_m$) in each class are listed in Table 3. The bottom level, namely, the 9th class ($m = 9$), is what is called the “lame-duck class” by Davis [25] due to the absence of data from the small cities (less than 50,000). Then, the scaling relations between city number and urban population, between city number and urban area, and the allometric relation between urban area and population can be mathematically expressed with power functions and displayed with double logarithmic plots (Figure 2).
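
The grouping step can be sketched as follows. This is one plausible reading of the procedure, assuming a number ratio of 2 and a top class of one city, so that the classes hold 1, 2, 4, ... cities in decreasing order of size and the last class takes whatever remains. The function name is hypothetical and the sizes are synthetic (an exact rank-size sequence), not the US data of Table 3.

```python
def group_into_classes(sizes, r_n=2):
    """Split sizes into top-down classes holding 1, r_n, r_n**2, ... cities."""
    ordered = sorted(sizes, reverse=True)
    classes = []
    start, count = 0, 1
    while start < len(ordered):
        # The final slice may be shorter than `count`: the "lame-duck" class.
        classes.append(ordered[start:start + count])
        start += count
        count *= r_n
    return classes


# Synthetic stand-in for 452 city sizes following the pure rank-size rule.
synthetic_sizes = [1.0e7 / rank for rank in range(1, 453)]

classes = group_into_classes(synthetic_sizes)
class_counts = [len(c) for c in classes]
class_means = [sum(c) / len(c) for c in classes]

print(class_counts)
print([round(class_means[i] / class_means[i + 1], 3) for i in range(len(classes) - 2)])
```

For 452 cities this gives nine classes, the last of which is incomplete (197 cities instead of 256). The second line prints the size ratios between consecutive complete classes.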

The least squares calculations involved in the data in Table 3 yield a set of mathematical models taking the form of power function. The urban size-number scaling relation is The goodness of fit is about , and the fractal dimension is estimated as around (Figure 2(a)). The urban area-number scaling relation is The goodness of fit is about , and the fractal parameter is around (Figure 2(b)). The area-population allometric relation is The goodness of fit is around , and the allometric scaling exponent is about (Figure 2(c)). The hat of symbols and denotes the estimated values differing to some extent from the observed and theoretical values.

The fractal parameters and related scaling exponents can also be estimated by the common ratios. As mentioned above, the number ratio is given ad hoc as . Accordingly, the average size ratio is about , and the average area ratio is around . Thus, consider the formulae given above, , , , we have

According to the mathematical relationships between different models illuminated in Section 2.1, the power-law relations suggest that the hierarchical structure can also be described with a set of exponential functions, that is, (2.1), (2.2), and (2.3). The number law expressed by (2.1) is known, that is, . The models of the size law and the area law are in the following forms: which correspond to (2.2) and (2.3). The hat of symbols and indicates the estimated values. The goodness of fit is and , respectively. The fractal parameters and scaling exponents are estimated as , , and .

Theoretically, the fractal parameters or scaling exponents of a hierarchy of cities obtained in different ways, including power laws, exponential laws, and common ratios, should be identical to one another. However, in practice, the results based on different approaches are always close to but different from one another due to uncontrollable factors such as random noise, spatial scale, and degree of system development. The average values of the fractal dimension and allometric scaling exponent can be calculated as , , and .

3.2. Cascade Structure of PRC’s Hierarchy of Cities

Another large-scale urban system is in the People’s Republic of China (PRC). By a similar method, the 660 cities of China in 2005 can be classified by population size into 10 levels ($m = 1, 2, \ldots, 10$). Different from US cities, the urban area of China’s cities is not UA, but the “built-up area (BA),” which is also called “surface area of built district.” The city number ($N_m$), the average population size ($P_m$), and the average urban area ($A_m$) in each class are tabulated as follows (Table 4). The bottom level, namely, the 10th class ($m = 10$), is also a lame-duck class because of undergrowth of small cities. The scaling relations can be expressed with three power functions and are illustrated with log-log plots (Figure 3). For the first two scaling relations, it is better to remove the data point of the lame-duck class, which can be regarded as an outlier, from the least squares computation in the regression analysis. As is often the case, the power-law relations break down when the scale of observation or systems is too large or too small [19].

Analogous to the US case, the least squares computations of the quantities listed in Table 4 give a set of power-law models and exponential models. The urban size-number scaling relation is The goodness of fit is , and the fractal dimension is estimated as (Figure 3(a)). The urban area-number scaling relation is The goodness of fit is , and the fractal parameter is (Figure 3(b)). The area-population allometric relation is The goodness of fit is , and the allometric scaling exponent is (Figure 3(c)).

The scaling exponents can also be estimated by number, size, and area ratios. The number ratio is given as (Table 4). Correspondingly, the average size ratio is , and the average area ratio is . In this case, the fractal parameters are estimated as follows:

The above results imply that (2.1), (2.2), and (2.3) can also be employed to characterize the hierarchical structure of China’s cities. The number law is . The models of the size and area laws can be expressed as The goodness of fit is and , respectively. The fractal parameters are estimated as , , and . Now, the average values of the fractal parameters or scaling exponents of the hierarchy of the PRC cities from three different ways can be calculated as , , and .

3.3. Interpretation of the Fractal Parameters of Urban Hierarchies

The fractal property and fractal dimension of a hierarchy of cities can be understood by analogy with regular fractals such as the Cantor set, Koch curve, and Sierpinski carpet. A fractal process is a typical hierarchy with cascade structure, and we can model it using the above-mentioned exponential functions and power laws, for example, (2.1) to (2.6). There are three approaches to estimating the fractal parameters. The first is the regression analysis based on a power law, the second is the least squares calculation based on a pair of exponential laws, and the third is numerical estimation based on the common ratios. In theory, the results from these different methods are identical in value to one another. However, in empirical analysis, they differ to some extent from each other because of the chance factors of urban evolution and local irregularities of hierarchical structure (Table 5). In practice, the method based on the power laws is in common use as it can reflect the scaling relations directly, but the one based on the common ratios is simpler and more convenient. As for the method based on the exponential functions, it can show further information about hierarchical structure. For random fractals, the more regular the cascade structure of cities, the more consistent the results from different approaches are. So, in a sense, the degree of consistency of fractal parameter values from the three different methods implies the extent of self-similarity of an urban system.
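
The three estimation approaches can be compared on synthetic data. The sketch below assumes the relations $N_m = N_1 r_n^{m-1}$, $P_m = P_1 r_p^{1-m}$, and $D = \ln r_n/\ln r_p$. The class populations are generated with $r_p = 2.1$ and a small alternating perturbation that stands in for empirical noise. All names and values are illustrative assumptions, not the data of Table 5.

```python
import math


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    return sxy / sxx


R_N, TRUE_R_P, LEVELS = 2.0, 2.1, 8
orders = list(range(1, LEVELS + 1))
numbers = [R_N ** (m - 1) for m in orders]
# Mean class populations with a small alternating perturbation as mock noise.
populations = [1.0e7 * TRUE_R_P ** (1 - m) * math.exp(0.02 * (-1) ** m) for m in orders]

log_n = [math.log(v) for v in numbers]
log_p = [math.log(v) for v in populations]

# (1) Power law: regress ln N_m on ln P_m; the slope is -D.
d_power = -ols_slope(log_p, log_n)

# (2) Exponential laws: regress ln N_m and ln P_m on m to get ln r_n and ln r_p.
ln_r_n = ols_slope(orders, log_n)
ln_r_p = -ols_slope(orders, log_p)
d_exponential = ln_r_n / ln_r_p

# (3) Common ratios: average the interclass ratios, then take D = ln r_n / ln r_p.
steps = range(LEVELS - 1)
mean_r_n = sum(numbers[i + 1] / numbers[i] for i in steps) / (LEVELS - 1)
mean_r_p = sum(populations[i] / populations[i + 1] for i in steps) / (LEVELS - 1)
d_ratio = math.log(mean_r_n) / math.log(mean_r_p)

print(d_power, d_exponential, d_ratio)
```

The three estimates are close to but not identical with one another, and all lie near the noise-free value $\ln 2/\ln 2.1$.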

The fractal dimensions measured by city sizes (population and area) indicate the equality of the city-size distribution. A higher fractal dimension value of an urban hierarchy suggests a smaller difference between two immediate classes, while a lower dimension value suggests a larger interclass difference. For the fractal dimension measured by city population, $D = \ln r_n/\ln r_p$, if $r_n > r_p$, then we have $D > 1$; otherwise, $D \le 1$. For the dimension measured by urban area, $d = \ln r_n/\ln r_a$, if $r_n > r_a$, then we have $d > 1$; or else, $d \le 1$. As indicated above, the scaling exponent $b$ is the ratio of $D$ to $d$, and it can be treated as an elasticity coefficient. As far as a hierarchy of cities is concerned, the ratio of one dimension to the other (say, $b = D/d$) is more important than the value of either fractal dimension (say, $D$ or $d$). If $b > 1$, that is, $D > d$, urban land area grows at a faster rate than population (positive allometry), and this suggests that per capita land area increases as a city becomes larger; conversely, if $b < 1$, that is, $D < d$, urban land area grows at a slower rate than population (negative allometry), and this implies that per capita land area decreases as a city becomes larger. Evidently, if $b = 1$, that is, $D = d$, urban area and population grow at the same rate (isometry), and per capita land area is constant. Thus it can be seen that the scaling exponent can reflect the different types of urban land use: intensive or extensive, economical or wasteful.
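
The three allometric regimes can be made concrete with a small sketch. It assumes the allometric relation $A = aP^{b}$ with $b = \ln r_a/\ln r_p$, so that per capita area is $A/P = aP^{b-1}$. The function names, the tolerance, and the numerical values are hypothetical.

```python
import math


def allometric_exponent(r_p, r_a):
    """Scaling exponent b = ln(r_a) / ln(r_p) from the two size ratios."""
    return math.log(r_a) / math.log(r_p)


def classify_allometry(b, tol=1e-9):
    """Name the regime implied by the scaling exponent b."""
    if b > 1 + tol:
        return "positive allometry"
    if b < 1 - tol:
        return "negative allometry"
    return "isometry"


def per_capita_area(population, a, b):
    """Per capita land area A / P implied by A = a * P**b."""
    return a * population ** (b - 1)


b = allometric_exponent(r_p=2.0, r_a=1.8)
print(b, classify_allometry(b))
print(per_capita_area(1.0e5, 1.0, b), per_capita_area(1.0e6, 1.0, b))
```

With these assumed ratios the exponent is below 1, so the larger city has the smaller per capita land area.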

Generally speaking, for the cities in the real world, we have , . If as given, then . Thus, . Both USA’s cities and PRC’s cities satisfy this rule. The similarities and differences between the cities of USA and those of PRC can be found from the parameter values estimated in Table 5. The consistency of fractal parameter values from different approaches is good for the two countries. The fractal dimension value based on city population is less than that based on urban area, that is, . Accordingly, the scaling exponents are less than 1, that is, . For the USA’s cities, , thus, ; for the PRC’s cities, , consequently, . The different values seem to suggest that the land use of USA’s cities is more efficient than that of PRC’s cities. However, it should be noted that the differences of parameter values partially result from different measures (say, for urban area, UA differs from BA). Especially, different countries have different definitions about urban area and population size. Anyway, as a whole, the cascade structure of USA cities is more regular than that of the PRC cities since the value of USA’s cities is closer to 1, and this conforms to Zipf’s law.

4. Cards Shuffling Process of Urban Evolvement

4.1. A Metaphor of Shuffling Cards for City Distributions

Much evidence shows that urban evolution complies with some empirical laws which dominate physical systems. The economic institutions, system of political organization, ideology, and history and phase of social development in the PRC differ to a great extent from those in the USA. However, where the statistical average is concerned, the cities in the two countries follow the same scaling laws. Of course, the similarity at the large scale admits differences at the small scale; thus stability at the macro level can coexist with variability at the micro level of cities [5]. For self-organized systems, the mathematical models are always based on the macro level, while the model parameters can reflect information from the micro level. Notwithstanding the difference at the micro level displayed by the parameters, the hierarchy of USA cities is the same as that of the PRC cities at the macro level shown by the mathematical equations.

All in all, the hierarchy of cities can be described with three exponential models, or four power-law models including Zipf’s law. The exponential models reflect the “longitudinal” or “vertical” distribution across different classes, while the power-law models reflect “latitudinal” or “horizontal” relation between two different measurements (say, urban area and population size) (see Appendix A). The empirical analysis based on both America’s and China’s cities gives support to the argument that, at least at large scale, the hierarchical structure of urban systems satisfies the exponential laws such as (2.1), (2.2), and (2.3), or the power laws such as (2.4), (2.5), and (2.6). This suggests that the cascade structure of hierarchies of cities can be modeled by the empirical laws which are identical in mathematical form to Horton-Strahler’s laws on networks of rivers and Gutenberg-Richter’s laws on spatio-temporal patterns of seismic activities.

Urban hierarchy represents the ubiquitous structure frequently observed in physical and social systems. Studies on the cascade structure with fractal properties will be helpful for us to understand how a system is self-organized in the world. In the spatiotemporal evolution of cities in a region, there are at least two kinds of the unity of opposites. One is the global target and local action, and the other is determinate rule (at the macro level) and the random behavior (at the micro level). To interpret the mechanism of urban evolution and the emergence of rank-size patterns, a deck-shuffling theory is proposed here. A regional system (a global area) consists of many subsystems (local areas), and each subsystem can be represented by a card. The card-shuffling process symbolizes the introduction of randomicity or chance factors into evolution of regions and cities. The model of shuffling cards is only a metaphor, and the logical relation between this model and real systems of cities is not very significant.

Suppose there are many blank cards. We can play a simple “game” step by step as follows (Figure 4).

Step 1 (Put these blank cards in “Apple-Pie” order to form a rectangle array). For simplicity, let the number of cards in the array be , where and are positive integers. There is no interspace or overlap between any two cards (Figure 4(a)). As a sketch map, let us take for instance.

Step 2 (Fix these ordered blank cards for the time being). Then draw a hierarchy of “cities” to form a regular network with cascade structure in light of (2.1), (2.2), and (2.3). Let the size distribution of cities follow Zipf’s law with (Figure 4(b)). In this instance, both the mathematical structure and physical structure can be described with the exponential laws or power laws given above.

Step 3 (Shuffle cards). Note that these cards are not blank and form a deck now. Unfix and mix these cards together, then riffle these cards again and again at your pleasure (Figure 4(c)). Finally, the cards are all jumbled up so that the spatial order disappears completely.

Step 4 (Rearrange the cards closely). Take out cards at random one by one from the deck, and place them one by one to form a array again (Figure 4(d)). The result is very similar to the map of real cities.
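
The four steps can be mimicked with a toy simulation. This is only a simplified sketch of the metaphor: it places a single “city” on each card rather than drawing a full network across the cards, and it assumes an 8 by 8 array, a largest size of $10^6$, and the pure rank-size rule for the sizes. None of these values come from the paper.

```python
import random

ROWS, COLS = 8, 8   # hypothetical array dimensions
P1 = 1.0e6          # hypothetical size of the largest "city"

# Steps 1-2: one "city" per card, sizes following the rank-size rule P1 / rank,
# laid out in rank order (a stand-in for the regular network drawn in Step 2).
ordered_deck = [P1 / rank for rank in range(1, ROWS * COLS + 1)]

# Step 3: shuffle the deck; a fixed seed keeps the sketch reproducible.
shuffled_deck = ordered_deck[:]
random.Random(42).shuffle(shuffled_deck)


def deal(deck, rows, cols):
    """Step 4: lay the cards out row by row into a rows x cols array."""
    return [deck[r * cols:(r + 1) * cols] for r in range(rows)]


before = deal(ordered_deck, ROWS, COLS)
after = deal(shuffled_deck, ROWS, COLS)

# The spatial layout changes, but the rank-size sequence does not.
rank_size_before = sorted(ordered_deck, reverse=True)
rank_size_after = sorted(shuffled_deck, reverse=True)

print(before != after, rank_size_before == rank_size_after)
```

Shuffling destroys the spatial order of the array while leaving the size distribution untouched, which is the point of the metaphor.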

Examining these shuffled cards in the array, you will find no ordered network structure of “cities” anymore. The physical structure of the network of “cities” may no longer follow the exponential laws and power laws. To reveal the hidden order, we must reconstruct the hierarchy according to a certain scaling rule. Thus the physical cascade structure changes to the mathematical cascade structure, and the regular physical hierarchy can be replaced with the dummy hierarchy (Table 6). The central place models presented by Christaller [38] represent the regular hierarchy, while the real cities in a region, say, America or China, can be modeled by a dummy hierarchy. In particular, in Step 2, the cities are arranged by the ideas of recursive subdivision of space and cascade structure of network [39]. The spatial disaggregation and network development can be illustrated by Figure 5 [6]. After shuffling “cards,” the regular geometric pattern of network structure is destroyed, but the mathematical pattern is preserved and can be disclosed by statistical average analysis at large scale.

4.2. Zipf’s Law as a Signature of Hierarchical Structure

After shuffling “cards,” the regularity of network structure will be lost, but the rank-size pattern will persist and never fade away. In this sense, Zipf’s law is in fact a signature of hierarchical structure. This can be verified by the empirical cases. Since the scaling relation of size distributions often breaks down when the scale is too large or too small [9, 40], we should investigate the scaling range between certain limits of sizes. The size distribution of the 482 American cities shows no tail on the double logarithmic paper, but the distribution of the 660 Chinese cities has a long tail on the log-log plot. According to the general rule of scaling analysis [22], the tail should be truncated in terms of logarithmic linearity, and only the 594 cities coming out top are kept for the parameter estimation. For the 482 US cities with population over 50,000, a least-squares calculation yields such a model The goodness of fit is about , and the fractal dimension of urban hierarchy is estimated as around . For the 594 PRC cities with population size over 100,000, which approximately form a line on log-log plot (Figure 6), the rank-size model is The goodness of fit is , and the fractal dimension is estimated as about . Please note that the sample size for the rank-size analysis here differs to a degree from that for the hierarchical analysis in Section 3.2. Despite some errors of parameter estimation, the mathematical structure of urban hierarchy is indeed consistent with the Zipf distribution.
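
The effect of truncating the tail before estimation can be sketched as follows. The sketch assumes the two-parameter model $P(\rho) = P_1\rho^{-q}$ fitted by least squares on the log-log scale, with $D = 1/q$. The data are synthetic: 594 sizes following the pure rank-size rule plus an artificial drooping tail of 66 smaller cities. Only the counts 594 and 660 are borrowed from the text; the function name, the decay factor, and $P_1$ are assumptions.

```python
import math


def fit_rank_size(sizes, keep=None):
    """Least-squares fit of ln P = ln P1 - q ln(rank) on the top `keep` cities."""
    ordered = sorted(sizes, reverse=True)[:keep]
    x = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    y = [math.log(size) for size in ordered]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    intercept = my - slope * mx
    return math.exp(intercept), -slope  # estimates of (P1, q)


P1 = 1.0e7
body = [P1 / rank for rank in range(1, 595)]                          # exact rank-size rule
tail = [P1 / rank * 0.97 ** (rank - 594) for rank in range(595, 661)]  # drooping tail
sizes = body + tail

p1_full, q_full = fit_rank_size(sizes)
p1_trunc, q_trunc = fit_rank_size(sizes, keep=594)

print(q_full, 1.0 / q_full)
print(q_trunc, 1.0 / q_trunc)
```

On these synthetic data the truncated fit recovers $q = 1$, whereas the untruncated fit is pulled toward a steeper slope by the tail.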

4.3. Symmetry Breaking and Reconstruction of Urban Evolution

The idea from shuffling cards can be employed to interpret urban phenomena such as the relationship between central place models and spatial distribution of human settlements in the real world. The central place models suggest the ideal hierarchies of human settlements with cascade structure [38], while the spatial patterns of real cities and towns are of irregularity and randomicity. If the actual systems of cities are as perfect as the models of central places, they will yield no new information for human evolution. Urban systems can be regarded as the consequences of the standard central place systems after “shuffling cards”. After the cards with central place patterns are shuffled, the ordered network patterns are thrown into confusion, but the rank-size pattern never changes. To reveal the regularity from urban patterns with irregularity, we have to model hierarchy of cities and then construct a dummy network (Figure 7).

The process of shuffling cards is a metaphor for symmetry breaking of an a priori ordered network. Owing to symmetry breaking, chance factors are introduced into the determinate systems, and thus randomicity or uncertainty comes forth [41, 42]. In a sense, it is symmetry breaking that leads to complexity. Precisely because of this, we have inexhaustible information and innovation from complex systems. The question is how to disclose the simple rules behind the complex behaviors of complex physical and social systems. A possible way out is to reconstruct symmetry by modeling hierarchies (Figure 7).

A hierarchy with cascade structure can be treated as a “mathematical transform” from real cities to the regular cities (Figure 8). Suppose that there is a random pattern reflecting the spatial distribution of cities (Figure 8(a)). This pattern represents the systems of cities after “shuffling cards” (Figure 4(d)). The city size distribution of this system follows Zipf’s law. Let the number ratio . Then we can construct a hierarchy with cascade structure (Figure 8(b)). This hierarchy is in fact a dummy network of cities. By the principle of recursive subdivision of geographical space [6, 39], we can reconstruct an ordered network of cities (Figure 8(a)). This model of systems of cities can represent the regular network before “shuffling cards” in the a priori world (Figures 4(b) and 5(d)).

5. Discussion and Conclusions

In urban studies, Zipf’s law includes three forms: the first is the one-parameter Zipf’s law, that is, the pure form of Zipf’s law; the second is the two-parameter Zipf’s law, that is, the general form of Zipf’s law; the third is the three-parameter Zipf’s law, that is, the more general form of Zipf’s law [7]. If the small adjusting parameter, ς, equals zero, the three-parameter Zipf’s model will change to the two-parameter Zipf’s model, and if the scaling exponent, q, equals 1, the two-parameter Zipf’s model will reduce to the one-parameter Zipf’s model. Zipf’s law is associated with the principle of least effort, while the law of least action can be interpreted with the entropy-maximizing principle and spatial correlation analysis [20, 40]. The one-parameter Zipf’s law can be derived from the postulate of global entropy maximization, while the two- or three-parameter Zipf’s law can be derived from the postulate of local entropy maximization [40]. The population data of American cities can be roughly fitted to the one-parameter Zipf’s law. However, if we fit the size data of Chinese cities to the one-parameter Zipf’s model, the fit is unsatisfactory. This suggests that US cities are consistent with global entropy maximization, but the PRC cities are dominated by the principle of local entropy maximization. This also suggests that the single Zipf’s law with only one parameter cannot explain the complexity and diversity of urban evolution. Incidentally, if we fit the dataset of American cities or Chinese cities to the Bradford distribution model derived by Leimkuhler [43], the goodness of fit is high. Leimkuhler’s version of Bradford’s law came from the Zipf-Mandelbrot distribution [44, 45], and it is equivalent to the three-parameter Zipf’s model (Appendix B). However, the physical meaning of the Leimkuhler model is not yet clear [46], and it remains to be researched in the future.

Zipf’s law used to be considered to contradict the hierarchy with cascade structure. Many people think that the inverse power law implies a continuous distribution, while the hierarchical structure seems to suggest a discontinuous distribution. In urban geography, the rank-size distribution of cities takes on a continuous frequency curve, which is not consistent with the hierarchical step-like frequency distribution of cities predicted by central-place theory [38]. However, the problem of the contradiction between the Zipf distribution and the hierarchies of central places has been resolved by different theories and methods (e.g., [7, 41, 42, 47]). In fact, the size distributions of urban places in the real world always appear as approximately unbroken frequency curves rather than stair-like curves. The step-like hierarchical structure of central places is based on spatial symmetry, but according to dissipative structure theory, such a regular hierarchical distribution as central place patterns is very infrequent in actual cases because the spatial symmetry is always disrupted by historical, political, and geographical factors [42]. What is more, the regular hierarchical structure is not allowed by the nonlinear dynamics of urbanization [7], and the simple fractal structure of urban hierarchies is often replaced with the multifractal structure [47]. The multifractals of urban hierarchies suggest an asymmetrical hierarchy of cities, which differs from the standard hierarchical systems in central place theory.

Therefore, the hierarchical models are mainly based on the idea of statistical average rather than reality or observations. In terms of statistical average, the rank-size distribution can always be transformed into a hierarchy with cascade structure. However, the traditional hierarchical structure predicted by central place theory cannot be transformed into the rank-size distribution. On the other hand, the size distributions in the real world support Zipf’s law and the hierarchical model based on statistical average instead of the step-like hierarchical distribution. Consequently, a conclusion can be drawn that the absolute hierarchy should be substituted by the statistical hierarchy associated with the rank-size distribution. Precisely based on this concept, the metaphor of shuffling cards is proposed to interpret the urban evolution coming between chaos and order.
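The transformation from a rank-size distribution into a hierarchy based on statistical average can be sketched as follows. This is only an illustration under stated assumptions: the number ratio is taken as 2 for convenience, the size of a class is measured by the mean size of its members, and the function name `cascade_from_rank_size` is hypothetical.

```python
def cascade_from_rank_size(sizes, number_ratio=2):
    """Group rank-ordered sizes into classes of 1, r, r**2, ... cities.

    Returns a list of (class_index, city_count, mean_size) tuples.
    An incomplete final class is dropped.
    """
    ordered = sorted(sizes, reverse=True)
    classes = []
    start, count, level = 0, 1, 1
    while start + count <= len(ordered):
        members = ordered[start:start + count]
        classes.append((level, count, sum(members) / count))
        start += count
        count *= number_ratio
        level += 1
    return classes


# Pure Zipf sizes P(k) = P1 / k for 1023 cities, i.e. ten complete classes.
sizes = [1000.0 / k for k in range(1, 1024)]
hierarchy = cascade_from_rank_size(sizes)

# Ratio of mean sizes between each class and the next class below it.
size_ratios = [upper[2] / lower[2] for upper, lower in zip(hierarchy, hierarchy[1:])]
```

For these synthetic data the city counts grow geometrically by construction, and the ratios of average sizes between adjacent classes settle close to 2 in the lower classes. In this sense a continuous rank-size curve and a cascade of classes are two descriptions of the same data.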

To sum up, Zipf’s law is a simple rule reflecting ubiquitous empirical observations in both physical and human fields, but the underlying rationale of the Zipf distribution has not yet been revealed. This paper tries to develop a model to illuminate the theoretical essence of the rank-size distribution: the invariable pattern of an evolving network or hierarchy. The hierarchy with cascade structure provides us with a new way of looking at the rank-size distribution. The hierarchy can be characterized by both exponential laws and power laws from two different perspectives. The exponential models (e.g., the generalized rule) and power-law models (e.g., the rank-size rule) of cities represent the general empirical laws. Studies on the human systems of cities will be instructive for us to understand physical phenomena such as rivers and earthquakes. By analogy with cities, we can understand river networks, earthquake behaviors, and all similar physical and social systems with hierarchical structure from new perspectives.

The theory of shuffling cards is not an underlying rationale or an ultimate principle. As indicated above, it is a useful metaphor. The idea of card shuffling helps us find new windows through which we can research the mechanism of the unity of opposites such as chaos and order, randomness and certainty, and complexity and simplicity. A conjecture or hypothesis is that complex physical and social systems are organized by the principle of dualistic structure. One is the mathematical structure with regularity, and the other is the physical structure with irregularity or randomness. The mathematical structure represents the a priori structure before shuffling cards, while the physical structure indicates the empirical structure after shuffling cards. A real self-organized system always tries to evolve from the physical structure to the mathematical structure for the purpose of optimization. In short, in the process of “shuffling cards” of an urban system, there is an invariable and invisible pattern, namely, the rank-size distribution dominated by Zipf’s law. To bring to light the latent structure and basic rules of urban evolution, further studies should be made on the rank-size pattern through proper approaches in the future.

Appendices

A. Longitudinal Relations and Latitudinal Relations of Hierarchies

The longitudinal relations are the associations across different classes, while the latitudinal relations are the correspondences between different measures such as city population size and urban area. These relations can be illustrated with the following figure (Figure 9).

B. Bradford’s Law of Scattering and City-Size Distributions

If the size distribution of cities follows Zipf’s law, it will always conform to Leimkuhler’s version of Bradford’s “law of scattering” [43, 44]. Bradford’s law is a special case of the Zipf-Mandelbrot “rank frequency” law [45]. The Zipf-Mandelbrot rank-frequency law is mathematically equivalent to the three-parameter Zipf’s law. Leimkuhler [43] gave a distribution model such as $$F(x) = \frac{\ln(1 + \beta x)}{\ln(1 + \beta)}, \quad \text{(B.1)}$$ where $x = k/N$, in which $N$ is the total number of cities, and $F(x)$ is the proportion of the total population in all the cities (cumulative proportions); obviously $0 \le x \le 1$. According to Leimkuhler [45], (B.1) was derived from Bradford’s law and is called “the Bradford distribution.” This suggests that (B.1) is equivalent to the special case of the Zipf-Mandelbrot rank-frequency law. Therefore, it is not surprising that the data of American and Chinese cities can be fitted to Leimkuhler’s version of Bradford’s law. Zipf’s law is an equivalent of Pareto’s density distribution [20], while the Bradford distribution proposed by Leimkuhler [43] is a cumulative distribution. A cumulative distribution can always yield a better goodness of fit than a density distribution.
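The correspondence between the two distributions can be checked numerically with a minimal sketch. It assumes the commonly quoted Leimkuhler form F(x) = ln(1 + βx)/ln(1 + β) with x = k/N, and it uses synthetic one-parameter Zipf data rather than the American or Chinese datasets. The value β = N·exp(γ), where γ is the Euler-Mascheroni constant, is an illustrative choice suggested by the approximation of the harmonic sum by ln k + γ; it is not taken from the source. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def bradford_cdf(x, beta):
    """Leimkuhler's form of the Bradford distribution on 0 <= x <= 1."""
    return math.log(1.0 + beta * x) / math.log(1.0 + beta)


def zipf_cumulative_shares(n_cities):
    """Cumulative population shares of the top k cities under P(k) = P1 / k."""
    running, partial = 0.0, []
    for k in range(1, n_cities + 1):
        running += 1.0 / k
        partial.append(running)
    return [value / running for value in partial]


n_cities = 1000
beta = n_cities * math.exp(EULER_GAMMA)  # illustrative choice, not from the source
shares = zipf_cumulative_shares(n_cities)

# Largest absolute gap between the Zipf cumulative shares and F(k / N).
max_gap = max(
    abs(share - bradford_cdf(k / n_cities, beta))
    for k, share in enumerate(shares, start=1)
)
```

For this synthetic case the gap stays below one percent of the total population at every rank, which agrees with the statement that data following Zipf’s law can also be fitted to the Bradford distribution. Empirical data would require estimating β by regression instead of fixing it in advance.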

Acknowledgments

This research was sponsored by the National Natural Science Foundation of China (Grant No. 41171129). The support is gratefully acknowledged. Many thanks to the anonymous reviewers whose interesting comments were helpful in improving the quality of this paper.