#### Abstract

An evolutionary model of the city size distribution is presented that explains the size of a city from the reproduction process and the migration of humans between cities. The model suggests that the city size distribution is a lognormal distribution with a power law tail in agreement with empirical results and computer simulations. The main idea of the model is that the competition between cities in the migration process is the origin of Gibrat's law. While growth rate fluctuations generate the lognormal branch of the size distribution, the power law tail for large cities is caused by a small mean growth rate.

#### 1. Introduction

The paper aims at deriving the size distribution of cities from the idea that the evolution of a city is a self-organized process. The term city is used here very loosely to mean any human settlement, without distinguishing between villages, towns, cities, or megacities. The size is usually characterized by the urban area or the human population. Ranking the cities of a finite region, for example, a country, from largest to smallest, Zipf observed in 1949 that the probability that the size $s$ of a city is greater than some value $S$ is given by $P(s > S) \sim S^{-\alpha}$, with a Zipf exponent $\alpha \approx 1$. This power law relationship is known as Zipf’s law [1].
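As an illustration of the tail relation stated above, the following minimal sketch draws synthetic sizes from a Pareto distribution and recovers the exponent with the Hill (maximum likelihood) estimator. The sample size, the seed, the minimum size `x_min`, and the function name `hill_exponent` are assumptions made for illustration only; no empirical city data are used.

```python
import numpy as np

def hill_exponent(sizes, x_min):
    """Maximum likelihood (Hill) estimate of alpha in P(s > S) ~ S**(-alpha)."""
    tail = np.asarray(sizes, dtype=float)
    tail = tail[tail >= x_min]               # keep only the power law tail
    return tail.size / np.log(tail / x_min).sum()

# Synthetic sizes with a known exponent (hypothetical values, not city data).
rng = np.random.default_rng(0)
alpha_true, x_min = 1.0, 1.0
# Generator.pareto samples a Lomax variable; adding 1 and scaling by x_min
# gives a classical Pareto variable with minimum x_min.
sizes = x_min * (1.0 + rng.pareto(alpha_true, size=50_000))
alpha_hat = hill_exponent(sizes, x_min)
print(f"estimated Zipf exponent: {alpha_hat:.3f}")
```

For real data the choice of `x_min` matters, because only the upper tail is expected to follow a power law.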

Using appropriate definitions of the city size, the validity of Zipf’s law was confirmed by investigations of large cities [2–10]. For smaller cities, empirical investigations established considerable deviations from this law. Eeckhout [11, 12] showed that, taking all settlements into account, the size distribution for small cities is better described by a lognormal distribution. Empirical investigations therefore suggest that the size distribution is lognormal with a power law tail [13–15].

Several theoretical models have been established to understand these empirical facts [7, 16–22]. Since classic economic theories fail to explain the size distribution [23], most models have in common that they apply Gibrat’s law of proportionate effects [24, 25] and invoke geographic and socioeconomic forces in their reasoning.

The presented evolutionary theory goes beyond previous research by deriving Gibrat’s law from the two main processes governing the population size of a city: the reproduction of humans and their migration between cities. The key idea is to regard cities as entities of a self-organized system. The dynamics of self-organized systems is governed by positive feedback processes [26, 27]. Both the reproduction of humans and their migration are self-amplifying processes, while the migration flow is shown to be the result of an evolutionary competition between cities. Growth rate fluctuations of these processes are the origin of Gibrat’s law, as in other competing social systems [28–30]. Gibrat’s law together with a small mean growth rate generates the power law tail for large cities. A model of the self-organized evolution of cities is established in the next section. It allows the derivation of the empirically found city size distribution. The relation of the model to Zipf’s law is discussed in the conclusion.

#### 2. The Model

We want to study the city size evolution of a country in a time interval $\Delta t$. The size of a city is characterized by its population. The population of the $i$th city at time step $t$ is denoted by $X_i(t)$, and the total population of a country is the sum over all cities:

$$X(t) = \sum_{i=1}^{N(t)} X_i(t), \tag{1}$$

where $N(t)$ indicates the total number of cities of the country. In order to establish a continuous model, the size of a city is scaled by a large number $\Gamma$ such that

$$x_i(t) = \frac{X_i(t)}{\Gamma} \tag{2}$$

is a positive real number and the total population is determined by

$$x(t) = \sum_{i=1}^{N(t)} x_i(t) \tag{3}$$

(the parameter $\Gamma$ can be interpreted as the total number of humans at the last time step of the time period $\Delta t$).

Three processes changing the city population in time are taken into account:

(i) the reproduction process of the population of a city,

(ii) the migration flux of humans between cities,

(iii) the restructuring of existing cities (e.g., by mergers and the generation of new cities).

We want to take advantage of the fact that the restructuring process of cities (iii) is usually much slower than the first two processes (i) and (ii). We choose the time interval $\Delta t$ such that the number of cities of a country can be regarded as constant:

$$\frac{dN(t)}{dt} \approx 0. \tag{4}$$

The corresponding number of cities is denoted by $N$.

The reproduction process increases the population of a city by

$$\frac{dx_i(t)}{dt} = r_i(t)\,x_i(t). \tag{5}$$

The effective growth rate of a city is determined by $r_i = b_i - d_i$, where $b_i$ indicates the birth rate and $d_i$ the mortality rate of the inhabitants of the $i$th city.

The evolution of the $i$th city size is further determined by the migration of humans between cities, given by the following balance:

$$\frac{dx_i(t)}{dt} = m_i(t) = I_i(t) - O_i(t), \tag{6}$$

where $I_i$ is the inflow rate and $O_i$ is the outflow rate of humans. A positive $m_i$ is related to an effective inflow and a negative $m_i$ to an effective outflow of inhabitants. Under the condition that migration takes place essentially within a country, we obtain the following for this exchange process:

$$\sum_{i=1}^{N} m_i(t) = 0, \tag{7}$$

and hence $\sum_i I_i = \sum_i O_i$. Since both processes take place independently, the growth of the population of a city is determined by

$$\frac{dx_i(t)}{dt} = r_i(t)\,x_i(t) + m_i(t). \tag{8}$$

Taking advantage of (7), the growth of the total population of the country is determined by

$$\frac{dx(t)}{dt} = \langle r(t) \rangle\, x(t), \tag{9}$$

with the mean growth rate as follows (the average over the population size is indicated by brackets):

$$\langle r(t) \rangle = \frac{1}{x(t)} \sum_{i=1}^{N} r_i(t)\,x_i(t). \tag{10}$$

##### 2.1. The Migration Flow

In order to establish the growth dynamics caused by the migration of humans, we have to specify the inflow and outflow rates of humans of a city. For this purpose we denote humans interested in relocation as potential movers. Their total number is indicated by the variable $Z$, respectively, by the real number $z$. As a first approximation we assume that potential movers occur randomly, independent of the city, with a country mean generation rate $g$. The number of potential movers generated per unit time in the $i$th city of size $x_i$ is therefore $g\,x_i$. The amount of potential movers $z_i$ of the $i$th city at a given time step is then determined by the following balance:

$$\frac{dz_i(t)}{dt} = g\,x_i(t) - O_i(t). \tag{11}$$

This relation means that the number of potential movers of a city increases with the generation rate $g$ and decreases as a result of the migration outflow with the rate $O_i$. Under the condition that $z_i$ relaxes sufficiently fast, we obtain the following for the stationary solution:

$$O_i(t) = g\,x_i(t), \tag{12}$$

which characterizes the outflow rate of inhabitants of a city.

The inflow rate of humans of a city is determined by the number of decisions per unit time of potential movers to migrate to the $i$th city. This rate will be zero if there are no potential movers. Under the condition that migrants prefer existing cities, the decision rate will also be zero if the size of a city vanishes. Hence, the inflow rate of a city is a function of both parameters, the number of potential movers $z$ and the size of the city $x_i$: $I_i = I_i(z, x_i)$. As a first approximation the inflow rate can be expanded near $z = 0$ and $x_i = 0$ as a product of both variables. We obtain the following for the first nonzero contribution:

$$I_i(t) = p_i(t)\,z(t)\,x_i(t), \tag{13}$$

where the rate $p_i$ characterizes the chance that potential movers prefer just the $i$th city for migration. It is therefore termed preference rate and can be regarded as a function of various location factors, not further specified here. From the constraint equation (7), with the outflow rate $O_i = g\,x_i$ set by the generation rate $g$ of potential movers, we further demand that

$$\sum_{i=1}^{N} \left( f_i - g \right) x_i = 0, \tag{14}$$

where

$$f_i = p_i\,z. \tag{15}$$

From (14) it follows that

$$\langle f \rangle = \frac{1}{x} \sum_{i=1}^{N} f_i\,x_i = g, \tag{16}$$

and the migration flow of the $i$th city becomes

$$m_i = \left( f_i - \langle f \rangle \right) x_i. \tag{17}$$
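The construction of the migration flow can be checked numerically. The sketch below assumes, as described in this section, an inflow proportional to the preference rate, the number of potential movers, and the city size, and an outflow proportional to the city size with a common generation rate `g`. The function name `migration_flow` and all numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

def migration_flow(x, p, g):
    """Net migration per city for sizes x, preference rates p, generation rate g.

    Assumes inflow_i = p_i * z * x_i and outflow_i = g * x_i, with the number
    of potential movers z fixed by total inflow = total outflow.
    """
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    z = g * x.sum() / np.dot(p, x)       # closes the balance over all cities
    f = p * z                            # city fitness
    mean_f = np.dot(f, x) / x.sum()      # population-weighted mean fitness
    return (f - mean_f) * x, f, mean_f

# Three hypothetical cities: sizes, preference rates, and generation rate.
x = np.array([5.0, 2.0, 1.0])
p = np.array([0.3, 0.2, 0.1])
m, f, mean_f = migration_flow(x, p, g=0.05)
print("net migration per city:", m)
print("sum of net migration:", m.sum())
```

In this sketch the net flows cancel over the country and the mean fitness equals the generation rate, so a city gains inhabitants only if its fitness exceeds the mean.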

##### 2.2. Evolutionary Dynamics

Taking advantage of (17), we obtain for the city growth dynamics given by (8) the following (note that both processes, migration and reproduction, are self-amplifying):

$$\frac{dx_i(t)}{dt} = r_i(t)\,x_i(t) + \left( f_i(t) - \langle f(t) \rangle \right) x_i(t). \tag{18}$$

While the first term indicates population growth, the second term expresses a replicator dynamics. It suggests that the city evolution is governed by an evolutionary competition in the migration process. The preferential growth process of a city is determined by the parameter $f_i$. Since this parameter is usually termed fitness in a replicator dynamics, we denote $f_i$ given by (15) as city fitness. It is essentially characterized by the migration preference rate $p_i$.

The replicator dynamics implies that cities with a fitness above the mean fitness increase their size in time at the expense of cities with a lower fitness. The effective city growth rate due to migration is determined by the fitness advantage as follows:

$$\Delta f_i(t) = f_i(t) - \langle f(t) \rangle. \tag{19}$$

##### 2.3. The City Size Distribution

In this section we want to derive the city size distribution based on the dynamics of city growth given by (18). The size distribution of cities is determined by the probability $P(x, t)\,dx$ to find the population of a city in the interval $x$ and $x + dx$. The main idea to derive the city size distribution is to regard both growth rates, the reproduction rate $r_i$ and the fitness advantage $\Delta f_i$, as consisting of a mean growth rate averaged over the time interval $\Delta t$ and growth rate fluctuations. Hence, we write the reproduction growth rate of a city as follows (for a simpler notation the subscript indicating the city is omitted; the average over the time interval is indicated by brackets with index $t$):

$$r(t) = \langle r \rangle_t + \delta r(t), \tag{20}$$

where the first term is the mean growth rate averaged over the time interval and the growth rate fluctuations are indicated by $\delta r(t)$. Equivalently, we write the following for the fitness advantage:

$$\Delta f(t) = \langle \Delta f \rangle_t + \delta f(t). \tag{21}$$

The urban growth dynamics of cities given by (18) turns into

$$\frac{dx(t)}{dt} = \rho\,x(t) + \xi(t)\,x(t), \tag{22}$$

where the time average growth rates of the cities are

$$\rho = \langle r \rangle_t + \langle \Delta f \rangle_t, \tag{23}$$

and the fluctuating variable is determined by

$$\xi(t) = \delta r(t) + \delta f(t). \tag{24}$$

As a first approximation the fluctuating variable is treated as an independent, identically distributed (iid) random function with mean value $\langle \xi(t) \rangle_t = 0$ and time correlation

$$\langle \xi(t)\,\xi(t') \rangle_t = D\,\delta(t - t'), \tag{25}$$

where $D$ is a white noise amplitude. The Dirac delta function indicates that the fluctuating variable is regarded as uncorrelated in time.

The evolution of the city size depends on the relation between the magnitudes of the mean growth rate $|\rho|$ and the growth rate fluctuations $|\xi|$. We can distinguish between two cases.

For the case $|\rho| \gg |\xi|$, the mean growth rate dominates over growth rate fluctuations in the considered time interval $\Delta t$. In this case it is shown in Appendix A that the growth dynamics of cities has the form of a Fisher-Eigen law [27]. The competition between cities leads to a rapid replacement process of cities accompanied by a concentration of inhabitants in the cities with the highest fitness advantage, while cities with lower fitness disappear. In this case the stationary distribution reduces to a single megacity. However, this evolution is in contradiction with (4), which states that the number of cities remains nearly constant in the time interval $\Delta t$.

Therefore we confine the discussion here to the other case, $|\rho| \ll |\xi|$. This relation between the mean growth rate and growth rate fluctuations can be taken into account by regarding the mean city growth rate as a small quantity. Introducing a small parameter $\varepsilon$ with $\varepsilon \ll 1$, we demand that the mean growth rates of the cities are of the order $\varepsilon$:

$$\rho \sim \varepsilon. \tag{26}$$

In this case, however, the city growth dynamics depends on the size of a city. This becomes clear when we consider (22) for small cities with a size $x \sim \varepsilon$. Equation (26) suggests that the first term in (22) is then of the order $\varepsilon^2$ and can be neglected compared to the second term. For small cities with $x \le \varepsilon$, (22) reduces to

$$\frac{dx(t)}{dt} \cong \xi(t)\,x(t). \tag{27}$$

Because $\xi(t)$ is a fluctuating variable, this relation describes a multiplicative stochastic growth process known as Gibrat's law of proportionate effects [24]. In this model Gibrat's law is a consequence of the competition between cities in the migration process and of fluctuations of the reproduction process. With (25) the central limit theorem implies that the size distribution is given by a time-dependent lognormal distribution of the following form:

$$P(x, t) = \frac{1}{\sqrt{2\pi t}\,\omega\,x} \exp\left( -\frac{\left( \ln(x/x_0) - u\,t \right)^2}{2\,\omega^2 t} \right), \tag{28}$$

where $u$ and $\omega$ are free parameters and $x/x_0$ is the city size scaled by the size at $t = 0$.
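The multiplicative growth process described above can be illustrated with a short simulation. This is a minimal sketch under assumptions not stated in the text: discrete time steps, Gaussian growth rate fluctuations with a hypothetical standard deviation `sigma`, and identical initial sizes for all cities. It shows only that the logarithm of the size becomes approximately normally distributed with a variance that grows linearly in time.

```python
import numpy as np

def simulate_gibrat(n_cities=20_000, n_steps=200, sigma=0.05, seed=0):
    """Return ln(x/x0) after n_steps of proportionate growth without drift."""
    rng = np.random.default_rng(seed)
    log_x = np.zeros(n_cities)                    # every city starts at x0
    for _ in range(n_steps):
        # x <- x * exp(xi): the change is proportional to the current size
        log_x += rng.normal(0.0, sigma, size=n_cities)
    return log_x

log_x = simulate_gibrat()
mean, std = log_x.mean(), log_x.std()
skew = np.mean(((log_x - mean) / std) ** 3)       # close to 0 for a normal law
print(f"mean={mean:.3f}, std={std:.3f}, skewness={skew:.3f}")
print(f"expected std: {0.05 * np.sqrt(200):.3f}")
```

Because `log_x` is approximately normal, the sizes `np.exp(log_x)` are approximately lognormal in this sketch.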

For large cities with $x > \varepsilon$, however, the first term comes into play. As derived in Appendix B, (22) can be interpreted as a generalized Langevin equation [31]. It yields as a stationary solution a power law city size distribution of the following form:

$$P(x) \sim x^{-(1 + \alpha)}, \tag{29}$$

with a Zipf exponent $\alpha$. For large cities the city size distribution therefore approaches a power law (Pareto) distribution. At a given time step in the city size evolution the size distribution separates into two branches, a lognormal distribution for small cities ($x \le \varepsilon$) and a Pareto tail for large cities ($x > \varepsilon$).

#### 3. Conclusion

In agreement with empirical results and numerical simulations, the model suggests that the size distribution of cities is lognormal with a power law tail if the mean growth rate of cities can be regarded as small ($\rho \sim \varepsilon$). Neglecting restructuring processes of cities (e.g., by mergers), the stationary size distribution for all cities would be a Pareto distribution over a very long time interval. The size distribution separates at an intermediate time step into two branches because small cities ($x \le \varepsilon$) approach this stationary Pareto distribution much more slowly than large cities ($x > \varepsilon$).

The lognormal contribution is in this theory essentially a consequence of a preferential growth process of cities. While the population size of small cities is mainly influenced by the competition between cities in the migration process, large cities benefit from their size and can grow faster due to the reproduction of their inhabitants. This prevents them from extinction and generates the Pareto tail in the size distribution. That large cities grow on average faster than small cities is confirmed empirically, for example, by Rosen and Resnick [32].

Gabaix discussed the stationary power law city size distribution and argued that the power law distribution has its origin in Gibrat’s law [17]. He suggested that for a stationary city growth rate the Zipf exponent must be $\alpha = 1$. Empirical investigations found a Zipf exponent that is somewhat greater than one, depending on various characteristics (these characteristics can be interpreted as the location factors mentioned above) [32]. The model suggests that the Zipf exponent is determined by the relation between the mean growth rate and the magnitude of growth rate fluctuations. As shown in Appendix C, the constraint that the mean growth rate is small satisfies the condition of Gabaix's argument. Following Gabaix, the Pareto tail established in this model is therefore governed by Zipf’s law.

Cities play in this approach the same role as species in biological evolution. They adapt to the preferences of potential movers in a self-organized process. However, city preferences depend on a variety of location factors fluctuating in time [33]. Therefore the growth of small cities is dominated by growth rate fluctuations (Gibrat’s law), generating the lognormal part of the city size distribution. As mentioned above, large cities benefit from their size in the growth process. In the economic literature this effect is known as “economies of scale.” Since only cities with $x > \varepsilon$ profit from their size, the parameter $\varepsilon$ characterizes the minimum size of a city to exploit economies of scale. The Pareto tail of the city size distribution can therefore be interpreted as the result of economies of scale.

The uneven shape of the city size distribution is in view of the presented evolutionary model universal and a direct consequence of the reproduction and migration of humans.

#### Appendices

#### A. Replacement Process

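We want to discuss the case that the mean growth rate of cities is large compared to growth rate fluctuations, $|\rho| \gg |\xi|$. Equation (22) suggests that in this case the growth of a city can be written as

$$\frac{dx_i(t)}{dt} \cong \rho_i\,x_i(t). \tag{A.1}$$

Let us consider, next to the city with index $i$, an arbitrary second city with index $j$. The time evolution of the proportion $q_{ij} = x_i/x_j$ of the two cities is determined by the quotient rule as follows:

$$\frac{dq_{ij}(t)}{dt} = \Delta\rho_{ij}\,q_{ij}(t),$$

with

$$\Delta\rho_{ij} = \rho_i - \rho_j.$$

The time-dependent solution is

$$q_{ij}(t) = q_{ij}(0)\,e^{\Delta\rho_{ij} t},$$

with the initial proportion of the city sizes at $t = 0$ as follows:

$$q_{ij}(0) = \frac{x_i(0)}{x_j(0)}.$$

Hence, if the $i$th city has a growth rate advantage $\Delta\rho_{ij} > 0$, the other city is replaced in time. Introducing the population share of the $i$th city as

$$s_i(t) = \frac{x_i(t)}{x(t)}$$

and taking the time derivative, we obtain the following:

$$\frac{ds_i}{dt} = \frac{1}{x}\frac{dx_i}{dt} - \frac{s_i}{x}\frac{dx}{dt},$$

with

$$\frac{dx}{dt} = \sum_{j=1}^{N} \frac{dx_j}{dt}.$$

Applying (A.1), population shares are also governed by a replicator equation as follows:

$$\frac{ds_i}{dt} = \left( \rho_i - \langle \rho \rangle \right) s_i, \qquad \langle \rho \rangle = \sum_{j=1}^{N} \rho_j\,s_j.$$

The stationary solution of the replacement process is determined by $ds_i/dt = 0$. This condition can be satisfied by setting either $s_i = 0$ or $\rho_i = \langle \rho \rangle$. Since $\sum_i s_i = 1$, the stationary solution is a megacity with $s_i = 1$, while $s_j = 0$ for all other cities $j \ne i$. Hence, the case of large mean growth rates of the cities during $\Delta t$ corresponds to a rapid replacement process. It is associated with the disappearance of cities while a single megacity survives.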
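The replacement process can be illustrated with the closed-form solution of exponential growth at constant, city-specific rates. The sketch below is an illustration only: the growth rates, initial shares, and the function name `evolve_shares` are hypothetical, and fluctuations are neglected as assumed in this appendix.

```python
import numpy as np

def evolve_shares(rho, s0, t):
    """Population shares at time t for dx_i/dt = rho_i * x_i.

    Uses s_i(t) proportional to s_i(0) * exp(rho_i * t); the largest exponent
    is subtracted before exponentiating to avoid numerical overflow.
    """
    rho = np.asarray(rho, dtype=float)
    s0 = np.asarray(s0, dtype=float)      # initial shares, must be positive
    w = np.log(s0) + rho * t
    w -= w.max()
    s = np.exp(w)
    return s / s.sum()

rho = [0.01, 0.02, 0.03]                  # hypothetical mean growth rates
s0 = [0.6, 0.3, 0.1]                      # hypothetical initial shares
for t in (0.0, 100.0, 2000.0):
    print(t, np.round(evolve_shares(rho, s0, t), 6))
s_late = evolve_shares(rho, s0, 2000.0)
```

In this sketch the city with the highest growth rate ends up with nearly the whole population even though it starts with the smallest share.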

#### B. Power Law Distribution of the City Size

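For $x > \varepsilon$, (22) can be interpreted as a generalized Langevin equation of the following form:

$$\frac{dx}{dt} = F(x) + G(x)\,\xi(t), \tag{B.1}$$

with $F(x) = \rho\,x$ and $G(x) = x$. Following Richmond and Solomon [31], this multiplicative stochastic relation can be transformed into a relation with additive noise by introducing the functions $h(x)$ and $V(h)$ as follows:

$$\frac{dh}{dx} = \frac{1}{G(x)}, \qquad \frac{dV}{dh} = -\frac{F(x)}{G(x)},$$

respectively. Inserting these relations into (B.1), we obtain a Langevin equation of the following form:

$$\frac{dh}{dt} = -\frac{dV}{dh} + \xi(t).$$

For uncorrelated fluctuations as given by (25), this relation describes a random walk in the potential $V(h)$. For a sufficiently long time the probability distribution approaches the distribution

$$P(h) = \frac{1}{N_0} \exp\left( -\frac{2\,V(h)}{D} \right),$$

where $N_0$ is a normalization constant. In terms of the original variable, we get

$$P(x) = P(h)\,\frac{dh}{dx} = \frac{1}{N_0\,G(x)} \exp\left( -\frac{2\,V(h(x))}{D} \right),$$

which yields, with $h(x) = \ln x$ and $V(h) = -\rho\,h$ as the corresponding functions for the above $F$ and $G$, a distribution of the following form:

$$P(x) \sim x^{-(1 + \alpha)}.$$

The stationary distribution of a multiplicative process is therefore a power law distribution.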

#### C. Zipf’s Law

Write the average growth rate of the cities as Using (25), we obtain by approximating (22) by a difference equation:

It was shown by Gabaix [17] that for the stationary power law distribution has the form: with a Zipf exponent . Equation (26) implies, however, that . Hence we can write (C.2) as Taking the usual time interval (year), we obtain for the average growth rate for sufficiently small . Following Gabaix, Zipf’s law (with ) applies to the Pareto tail of the city size distribution.

#### Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.