We study empirically and analytically growth and fluctuation of firm size distribution. An empirical analysis is carried out on a US data set on firm size, with emphasis on one-time distribution as well as growth-rate probability distribution. Both Pareto's law and Gibrat's law are often used to study firm size distribution. Their theoretical relationship is discussed, and it is shown how they are complementable with a bimodal distribution of firm size. We introduce economic mechanisms that suggest a bimodal distribution of firm size in the long run. The mechanisms we study have been known in the economic literature since long. Yet, they have not been studied in the context of a dynamic decision problem of the firm. Allowing for these mechanism thus will give rise to heterogeneity of firms with respect to certain characteristics. We then present different types of tests on US data on firm size which indicate a bimodal distribution of firm size.

1. Introduction

Recent statistical studies of firm size distribution in different countries and time periods (see Gibrat [1], Fujiwara et al. [2] and Mantegna and Stanley [3].) indicate that the distribution of firm size obeys a power law for various measures of firm sizes, such as total asset and number of employees. The nonnormal firm size distribution has been a long standing controversial issue in industrial organization literature. An important early study on firm size distribution has found that it follows Gibrat's Law [1].

Looking at the empirical data, one observes that firms grow unevenly over time. Firm size changes or fluctuates. One quantitative way to look at this is to look at the probability distribution function (pdf) of the growth rate, which is defined as the ratio of the second year's value and the first year's value. Several studies on this aspect [4], again on the upper tail, lead to confirmation of Gibrat's law [1], which states that the growth rate pdf conditional on the first year's value is independent from the first year's value. In other words, the growth rate is independent from the first year's value, when the latter is large.

The fact that these two laws, Pareto's law and Gibrat's law, were observed to coexist suggested some deep relations between them. In fact, just such a relation was found when the law of detailed balance, which states that the simultaneous pdf of the first year's value and the second year's value is symmetric under the exchange of the two variables. This also leads to a theoretical relationship of the positive-growth side of the growth-rate pdf and the negative-growth side of the same, which were confirmed by the data.

Overall recent empirical research has found that, first, for the upper tail of the size distribution of firms the Pareto law holds (which may also be consistent with Gibrat's law that above a certain minimum size the growth rate of firms is independent of size), second, the variance of firms' growth rates is independent of size over and above a certain minimum size and, third, the frequency of the moving up and down in size class is also roughly the same above certain size classes.

Yet, all three results appear to be consistent with the fact that the upper size classes behave distinctively from lower size classes and that the middle size classes are “thinned out.” Thus, there might be a tendency toward a twin-peak distribution with different behavior of the upper and lower size classes. Empirical work may have to deal with the fact that groups of firm size classes behave differently and that there is a bimodal distribution of size classes in the long run. (Other recent literature has also discovered bimodal firm size distribution; see Bottazzi and Secchi [5] and Bottazzi et al. [6].) We also provide an analytical discussion of the relationship of Pareto's law and Gibrat's law. But since the derivation of this relationship is rather technical, details of this study are presented in Appendix A. The analysis of the upper and lower tail of the firm size distribution via Pareto`s law and Gibrat`s law is complemented by an empirical study that addresses the issue of the “thinning out” of the middle and the issue of a bimodal distribution of firm size.

Since the statistical study of the size distribution of firms does not provide us with much insight into underlying mechanisms, we also focus on the dynamics of firm size distribution to give the statistical results some theoretical underpinning. We introduce and study economic mechanisms that may explain the dynamics of firm size and firm growth over time. We show that there is a long tradition in economic theory and, in particular, in the industrial organization literature of the last fifty years that has pointed out some major mechanisms why one indeed would expect a bimodal size distribution of firms in the long run. In Sections 2 and 3, we will demonstrate those mechanisms in the context of a dynamic investment model of the firms. In our theoretical study, we will argue that firms may be heterogeneous in size, because they are exposed to some of the subsequent mechanisms.

We want to exclude some arbitrary behavior of firms, and thus presume that all firms pursue the same dynamic investment strategy by aiming at maximizing the present value of their future pay off. Of course, there are also other influences on firm growth such as growth of overall demand, industry demand, cost and technology shocks and elaborate financing practices. Yet we want to focus only on some major mechanisms that have been studied in economic theory since long and confront their implied predictions for firm size dynamics with the results of our empirical study.

The remainder of the paper is organized as follows. In Sections 2 and 3, we introduce the mechanisms that may give rise to certain properties of the size distribution of firms, in particular a twin-peak distribution of firm size. In Section 4, then our empirical results will be shown demonstrating that there is indeed a tendency toward a twin-peak size distribution of firms in the long run. Section 5 concludes the paper. Appendix A presents the technical derivations of the relationship of the Pareto law and Gibrat law pertaining to the upper and lower size classes.

2. Modeling the Dynamics of Firm Size Distribution

We introduce here some mechanisms that may give rise to some dynamics toward a twin-peak size distribution of firms. The first important mechanism that we want to study is based on locally increasing returns to scale. This idea of locally increasing returns to scale exist in the literature since Marshall [7]. Recently, it has extensively been employed by Arthur [810] who demonstrates that a variety of positive feedbacks such as learning by using, scale economies, and increasing returns to information and skills are set in motion if one firm enjoys, for example, by historical accident, some size advantage. Hereby, locally increasing returns may be approximated by a convex-concave production function as proposed by Skiba [11] and Brock and Milliaris [12].

Our second mechanism is based on the adjustment cost of capital which has been employed in investment theory since Eisner and Stroz [13], Lucas [14], and Gould (1968). We will explore quadratic as well as more general adjustment cost of capital and study to what extent it will generate a size-dependent dynamics. We are in particular interested in size effects on adjustment costs that will be advantages for larger firms. (A further discussion and survey of the adjustment cost literature is given in Kato et al. [15].)

A third mechanism can be seen to originate in cost and ease (or tightness) of credit. Important contribution to this line of research can be found in Townsend [16], Bernanke et al. [17], and Grüne et al. [18]. This literature assumes asymmetric information and agency costs in borrowing and lending relationships. This may, as Gertler and Gilchrist [19] have shown, in particular hold true for small firms. We here draw on the insight of the literature on costly state verification (This literature originates in the seminal work by Townsend [16].) in which lenders must pay a cost in order to observe the borrower's realized returns. This motivates the use of collateralized credit. Uncollateralized borrowing is assumed to pay a larger risk premium than collateralized borrowing or self-financing. The risk premium arises from the threat of bankruptcy namely from the cost constituted by auditing, accounting, legal cost, as well as loss of assets arising from asset liquidation. So, one would expect for different types of firms different risk premiums and thus a credit spread across firms as already Merton [20] has demonstrated.

It is presumed that the risk premium drives a wedge between the expected return of the borrower and the risk-free interest rate and becomes zero, in the limit, when debt approaches zero. A suitable function for this premium, will be introduced below. Yet, even if the credit cost spread is endogenized, there might be borrowing constraints for the firm that finances investment externally.

We may specify a general model of a firm 𝑖 which is potentially exposed to the above mechanisms. Let us define the dynamic decision problem that firm 𝑖 faces. Leaving aside the index 𝑖, the decision problem is proposed to be as follows: 𝑉(𝑘)=Max𝑗0𝑒𝜃𝑡̇̇𝑓(𝑘(𝑡),𝑗(𝑡))𝑑𝑡,(2.1)𝑘(𝑡)=𝑗(𝑡)𝜎𝑘(𝑡),𝑘(0)=𝑘,(2.2)𝐵(𝑡)=𝐻(𝑘(𝑡),𝐵(𝑡))(𝑓(𝑘(𝑡),𝑗(𝑡))𝑐(𝑡)),𝐵(0)=𝐵0.(2.3)

The decision problem of the firm is to maximize its present value 𝑉(𝑘), of (2.1), using the investment 𝑗 as choice variable. (Subsequently, we leave aside the time index.) The term under the integral of (2.1) is the discounted payoff, 𝑓(𝑘,𝑗), the net income of the firm, discounted at a discount rate 𝜃. Equation (2.2) represents the law of motion of the capital stock, 𝑘, with 𝜎 a coefficient representing the depreciation of the capital stock. Equation (2.3) denotes the evolution of the firm's debt, 𝐵. Note that we allow for negative investment rates 𝑗<0 that is reversible investment for simplicity. (The model can also be interpreted as written in efficiency labor; therefore, 𝜎 can represent the sum of the capital depreciation rate, and rate of exogenous technical change.) In (2.3), the term 𝐻(𝑘,𝐵) represents credit cost depending on net worth of the firm, and 𝑐 can be viewed as dividend payment (or consumption stream of the asset holders), which is treated here as exogenous. (How it can be endogenized is further specified in Grüne et al. [18].)

Allowing for adjustment cost, 𝑗𝛽𝑘𝛾, the firm's net income is determined by𝑓(𝑘,𝑗)=𝑦(𝑘)𝑗𝑗𝛽𝑘𝛾,(2.4) whereby the income of the firm 𝑦(𝑘) is generated from capital stock, through a production function, to be defined below. So, overall, investment, 𝑗, is undertaken so as to maximize the present value of net income of (2.4) given the adjustment cost of capital 𝑗𝛽𝑘𝛾 in (2.4). Note that 𝜎>0,  𝛼>0,  𝛽>1, and 𝛾>0 are constants. For 𝛽=𝛾=𝑧, the adjustment cost is quadratic.

As to the production function, 𝑦(𝑘), we may take a convex-concave production function as proposed in Skiba [11] and specified below giving us the incorporation of the first mechanism into our model. We also could use a Cobb-Douglas production function 𝑦(𝑘)=𝑎𝑘𝛼 and stress adjustment cost of capital with size effects. This will deliver us a second variant of our model.

Since net income in (2.4), less the dividend stream 𝑐, can be negative, the temporary budget constraint requires further borrowing from credit markets, and if there is positive net income, less dividend, debt can be retired.

Note that in the above general case of adjustment cost 𝑗𝛽𝑘𝛾 in (2.4), if we take 𝛽=2 and 𝛾=0, we have the standard model with quadratic adjustment cost of investment. When we employ the locally increasing returns production function, a convex-concave production function, we will drop the adjustment cost term 𝑗𝛽𝑘𝛾, as also done in Brock and Milliaris [12].

For representing our third mechanism, we assume that the credit cost 𝐻(𝑘,𝐵) in (2.3) may be state dependent, depending on the capital stock, 𝑘, and the level of debt 𝐵 with 𝐻𝑘<0 and 𝐻𝐵>0. Note, however, that if we assume that the credit spread depends inversely on net worth and the net worth is equal to the value of the capital stock, we get a special case of our model when only the risk-free interest rate determines the credit cost. We then have a credit cost that is only depending on an exogenous interest rate and a state equation for the evolution of debt such aṡ𝐵(𝑡)=𝜃𝐵(𝑡)𝑓(𝑘(𝑡),𝐵(𝑡)),𝐵(0)=𝐵0.(2.5) In this case, we only would have to consider the transversality condition lim𝑡𝑒𝜃𝑡𝐵(𝑡)=0, as the nonexplosiveness condition for debt, to close the model (2.1)–(2.3).

In general, we define the limit of 𝐵(𝑡) equal to 𝑉(𝑘) which represents the present value borrowing constraint which is here defined as curve and not as a point. The present value 𝑉(𝑘) represents the debt capacity of the firm. In all of the above cases, the problem to be solved is then how to compute the optimal investment strategy and the present value, 𝑉(𝑘), of the firm.

If the interest rate 𝜃=𝐻(𝑘,𝐵)/𝐵 is constant (as aforementioned in computing the present value of the future net income, we do not have to assume a particular fixed interest rate, but the present value, 𝑉(𝑘), will, for the optimal investment decision, enter as argument in the credit cost function 𝐻(𝑘(𝑡),𝑉(𝑘(𝑡))) as in (2.5), then as is easy to see, 𝑉(𝑘) is in fact the present value of 𝑘𝑉(𝑘)=Max𝑗0𝑒𝜃𝑡̇𝑓(𝑘(𝑡),𝑗(𝑡))𝑑𝑡,(2.6)s.t.𝑘(𝑡)=𝑗(𝑡)𝜎𝑘(𝑡),𝑘(0)=𝑘0̇,(2.7)𝐵(𝑡)=𝜃𝐵(𝑡)(𝑓(𝑘(𝑡),𝐵(𝑡))𝑐(𝑡)),𝐵(0)=𝐵0,(2.8) with 𝑘(0) and 𝐵(0) the initial value of 𝑘 and 𝐵.

The case, however, when the credit cost is endogenous, thus, when we have 𝐻(𝑘,𝐵), then the present value itself becomes difficult to treat. Pontryagin's maximum principle is not suitable to solve the problem with endogenous credit cost, and we thus need to use dynamic programming to solve for the present value and investment strategy of the firm. One step toward using dynamic programming is to formulate our models of (2.1)–(2.3) and (2.6)–(2.8) as Hamilton-Jacobi-Bellman (HJB) equation. An example of how this can be done is shown in Appendix B.

In the context of the model that explores the role of risk premiums and credit spread as a cause for unequal firm size, we can also study the impact of “ceilings” in debt contracts and their impact on firm size. Indeed, credit restrictions may affect the investment decisions. (Suppose the “ceiling” is of the form 𝐵(𝑡)<𝐶, with 𝐶 a constant, for all 𝑡. The ceiling has clearly an effect on the investment dynamics. Either 𝐶>𝑉(𝑘), then the ceiling is too high, because the debtor might be tempted to move close to the ceiling and then goes bankrupt if 𝐵>𝑉(𝑘). If 𝐶<𝑉(𝑘), then the firm may not be able to develop its full growth potentials through optimal investment decisions.)

In all three cases—locally increasing returns to scale, nonlinearity in adjustment costs of capital, and credit spread and credit constraints (the latter two arising from credit risk)—the optimal investment strategy and the growth of the firm depend on the initial size of the firm. We permit firms to be heterogeneous with respect to the way how they are exposed to the above three mechanisms.

We will show that there can be thresholds that separate the solution paths for 𝑉(𝑘) to different domains of attraction. For firms with lower capital stock—and being exposed to one of the above mechanisms—but below some threshold it will be optimal for the firms to contract, whereas large firms above some threshold may choose an investment strategy to expand. Moreover, we can admit in our study various paths for dividend payment, 𝑐, and their impact on the investment strategy and the present value curve 𝑉(𝑘) for our different model variants. (The latter aspect is studied in Grüne et al. [18].)

3. Numerical Study on the Dynamics of Firm Size Distribution

Next, we present numerical results obtained for our different specification of a production function, adjustment costs of capital, and imperfect capital markets. Throughout this section, we specify the parameter 𝜎=0.15 and 𝛾=0.3. The other parameters will be model specific and specified below. (Note that we, of course, could choose another source of heterogeneity of firms, namely, by assuming different technology parameters for firms. This might be another line of research which we will not pursue here.) Unless otherwise noted, we use for the dividend stream 𝑐(𝑡)0 in our experiments which are relaxed in Grüne et al. [18].

As for the numerical procedure, we use two algorithms. The first is a dynamic programming algorithm as presented in Grüne et al. [18] and applied in Grüne and Semmler [21]. The examples below were computed for different 𝑘𝑠 in the compact interval [0,2] with control range 𝑗[0,0.25]. (In all our experiments, larger control ranges did not yield different results.) To solve the problem (2.6)–(2.8), a dynamic algorithm has been used with the numerical time step =0.05 and an initial grid with 39 nodes. The final adapted grid consisted of 130 nodes. The range of control values was discretized using 101 equidistributed values.

For a second algorithm that computes domains of attraction of the problem (2.1)–(2.3), we used the time step =0.5, and in order to generate the discrete time model, we used a highly accurate extrapolation method. For this algorithm, the range of control values was discretized using 51 equidistributed values. The domain covered by the grid was chosen to be [0,2]×[0,3], where the upper value 𝐵=3 coincides with the value 𝑐=3 used in order to implement the restriction sup𝑡0𝐵(𝑡)<. The initial grid was chosen with 1024 cells, while the final adapted grids consisted of about 100000 up to 500000 cells depending on the example. For this algorithm, the figures below always show the set 𝐸Γ which approximates the present value curve 𝑉(𝑘). Recall that the width of this set gives an estimate for the spatial discretization error.

We start our numerical examples first with the common case of quadratic adjustment costs of investment and constant returns to scale. This gives us the usual case of a unique (positive) steady state and about which firm size is predicted to be normally distributed. In the next examples, we have built in the above stated three mechanisms giving rise to multiple domains of attraction and thus multimodal distribution of firm size.

3.1. Constant Returns to Scale with Quadratic Adjustment Costs

For the common case usually found in the literature, and as represented in (2.6)–(2.8), we assume a concave production function 𝑦(𝑘)=𝑎𝑘𝛼, with 0<𝛼<1 and quadratic adjustment cost. As model parameters, we specify 𝛼=0.5, 𝛽=2, 𝑏=0.5, 𝑎=0.29, and 𝜃=0.1. This specifies the most simplest variant of a dynamic decision problem with adjustment costs which has often been employed in economics and which can be shown to exhibit solely one positive steady state equilibrium 𝑘 and a normally distributed firm size. The present value of the representative firm is here simply given by the present value of the net income stream of the firm, since we here assume a constant credit cost as shown in (2.8).

To compute the optimal investment strategy and firm value, we can use our first procedure, our dynamic programming (DP) algorithm. The firm value is given by the optimal value function in Figure 1 and the solution path of the dynamic decision problem, (As, for example, presented in Grüene and Semmler [21].) the investment decision, and thus the optimal growth of (or contraction) of the firm is given by the optimal control in Figure 1. The (unique) positive steady state is about 𝑘=0.95 for the capital stock, below which the firm would grow and above which it would be optimal for the firm to shrink its capital stock. Allowing for shocks to the size of the firms, a normal distribution of firm size would be expected.

3.2. Increasing Returns to Scale and Adjustment Costs with Size Effects

Next, we numerically solve for the case of increasing returns to scale and adjustment costs with size effects. Although, as shown in Grüne et al. [18], these are two separate causes for generating multiple domains of attraction, we demonstrate both mechanisms in a single numerical exercise. We assume that (2.1)–(2.3) hold, but we still presume that the firm faces constant credit cost 𝐻(𝑘,𝐵)=𝜃𝐵 as in (2.8). We again can use the DP algorithm in order to solve the discounted infinite horizon problem (2.1)–(2.3). Figure 2 shows the corresponding optimal value function representing the present value of the prototypical firm, 𝑉(𝑘), (upper graph) and the related optimal control, the investment decision, in feedback form (lower graph); see Figure 2.

For our parameters, the model does not necessarily have an unique positive steady state equilibrium. There can be multiple domains of attraction for firms that are exposed to either increasing returns and/or adjustment costs with size effects. The fate of a particular firm, when it is exposed to increasing returns to scale and/or to adjustment costs with size effects, then will depend on the initial capital stock size, 𝑘. There is a threshold, 𝑠, at 𝑘+=0.267 which is clearly visible in the optimal investment strategy, which is discontinuous at this point.

Thus, the dynamic decision problem of the firm faces a discontinuity. For firms with initial values of the capital stock 𝑘(0)<𝑘+, it is optimal to shrink the capital stock to 𝑘=0, for initial values of the capital stock 𝑘(0)>𝑘+ the optimal investment will lead to the domain of attraction 𝑘=0.996. In sum, investment for a firm to the left of 𝑘+ is lower than 𝜎𝑘 and makes the capital stock shrink, whereas investment for a firm to the right of 𝑘+ is larger than 𝜎𝑘 and lets the capital stock increase. At 𝑘+, investment for the firm then jumps. In this case, then multiple domains of attraction exist and a bimodal distribution of firms with different growth rates of different size classes would be expected. (The upper tail of the distribution may then exhibit the Pareto distribution (Power law), whereas the lower part may just show a (log-)normal distribution.)

3.3. Capital Markets and Credit Spread

A similar result can be obtained if there is a distinct credit spread, and firms have to pay idiosyncratic risk premiums, or there may be credit constraints on firms' investment. This may also result in multiple domains of attraction. Let us presume that credit cost 𝐻(𝑘,𝐵) is endogenous depending on net worth.

Before, we had postulated that a risk premium is positively related to the default cost and inversely related to the borrowers net worth. Net worth is defined as the firm's collateral value of the (illiquid) capital stock less the agent's outstanding obligations. Following Bernanke et al. [17], we measure the inverse relationship between the risk premium and net worth in a function such as𝛼𝐻(𝑘(𝑡),𝐵(𝑡))=1𝛼2+𝑁(𝑡)/𝑘(𝑡)𝜇𝜃𝐵(𝑡),(3.1) with 𝐻(𝑘(𝑡),𝐵(𝑡)) the credit cost depending on net worth, 𝑁(𝑡)=𝑘(𝑡)𝐵(𝑡), with 𝑘(𝑡) as capital stock and 𝐵(𝑡) as debt. The parameters are 𝛼1,𝛼2,𝜇>0, and 𝜃 is the risk-free interest rate. In the analytical and numerical study of the model below, we presume that the risk premium will be zero for 𝑁(𝑡)=𝑘(𝑡), and thus, in the limit, for 𝐵(𝑡)=0, the borrowing rate is the risk-free rate. Although this could occur for small scale firms, a borrowing rate closer to the risk-free rate is more likely to hold for large-scale firms. (See Gertler and Gilchrist [19].)

In general, it is not possible to transform the above problem into a standard infinite horizon optimal control problem for our prototype of firm; hence, we will use our second procedure, the algorithm for the computation of domains of attractions from Grüne et al. [18], and undertake experiments for different shapes of the credit cost function.

For the endogenous credit cost function (3.1), we specify 𝜇=2. Taking into account that we want 𝜃 to be the risk-free interest rate, we obtain the condition 𝛼1/(𝛼2+1)2=1, and thus 𝛼1=(𝛼2+1)2. Note that for 𝛼2 and 0𝐵𝑘, one obtains 𝐻(𝑘,𝐵)=𝜃𝐵, that is, the model from the previous section. In order to compare these two model variants we use the formula 𝐻(𝑘,𝐵)=(𝛼1/𝛼22)𝜃𝐵 for 𝐵>𝑘. (For small values of 𝛼2, it turns out that the present value curve satisfies 𝑉(𝑘)<𝑘; hence, this change of the formula has no effect on 𝑉(𝑘).)

Figure 3 shows the respective present value curves 𝑉(𝑘) under the condition sup𝑡0𝐵(𝑡)< for 𝛼2=100,10,1 (from top to bottom) and the corresponding 𝛼1=(𝛼2+1)2.

For 𝛼2=100, the trajectories on the curve 𝑉(𝑘) show almost the same behavior as the optimal trajectories in the previous section: there exists a threshold (now at 𝑘+=0.32) and two stable domains of attraction at 𝑘=0 and 𝑘=0.99. Our numerical solutions have revealed that for decreasing values of 𝛼2100, the threshold value 𝑘+ first increases (i.e., moves to the right) and the stable domain of attraction 𝑘 decreases (i.e., moves to the left) until they meet at about 𝛼2=31. For all smaller values of 𝛼2, there exists just one equilibrium at 𝑘=0 for all capital stock sizes which is stable.

The latter is illustrated for a discrete value of 𝛼210, where there is no threshold observable, and there exists only one domain of attraction at 𝑘=0 which is stable. The reason for this behavior lies in the fact that for such a small 𝛼2, credit becomes more expensive; hence, for this small 𝛼2, it is no longer optimal for the firm—with any size of the capital stock—to borrow large amounts and to increase the capital stock for a given initial size of the firm; instead, it is optimal to shrink the capital stock to zero. Thus, with small 𝛼2, and thus large borrowing cost, it is for any firm size, that is for any initial capital stock, optimal to shrink the capital stock, and the firm will exhibit negative growth rates.

3.4. Capital Market and Credit Constraints

On the other hand, often one has to impose for the investment decision of the firm a debt ceiling, defined as a fraction of the capital stock. This is the case when a firm's borrowing is constrained. For a particular 𝐻(𝑘,𝐵) from (3.1), we can test what impact a debt ceiling has on firm growth and firm value. We impose the restriction 𝐵(𝑡)/𝑘(𝑡)𝑐 for some constant 𝑐. Again, we use the algorithm for computing domains of attraction to solve this problem numerically. Figure 4 shows the respective present value curves for 𝛼=100. With only a low risk premium but a credit constraint, a firm with 𝑐=0.6 would not grow sufficiently.

The potentially high-value curve can be reached with no credit constraint with 𝐵/𝐾=1.2. Yet, for 𝑐=0.6 the present value curve 𝑉(𝑘) coincides with the “restriction curve” 𝐵(𝑘)=𝑐𝑘. In this case, the curve (𝑘,𝑉(𝑘)) is no longer invariant for the dynamics; that is, each trajectory 𝐵(𝑡) with 𝐵(𝑡)𝑉(𝑘(𝑡)) leaves the curve (𝑘,𝑉(𝑘)) and eventually 𝐵(𝑡) tends to 0. Debt is controllable, but firm value moves down, and growth is limited for 𝑐=0.6; that is, the corresponding trajectories leave the curve 𝑉(𝑘) and eventually 𝐵(𝑡) tends to zero. (The simulations are halted at zero, but we would like to report if continued the 𝐵(𝑡) curve becomes negative.)

This means credit constrained firms are not able to undertake sufficiently high investments and will thus grow at a lower rate. They will not be able to realize their growth potentials, and thus their potential present value that they would be able to obtain without credit constraint. We can observe that with borrowing constraints, even for an optimal investment strategy, firms of different size classes would be expected if there are strict borrowing limits on firms.

4. Dynamics of Firm Size Distribution in the Long Run

The above mechanisms predict that one might expect in the long-run movements of growth rates and a size distribution of firms characterized by a clustering of firms in the upper region of size classes as well as at the lower end of size classes. Moreover, one would expect that the middle size classes are “thinned out,” possibly giving rise to a long-run twin-peak distribution of firm size.

We want to present some empirical evidence that may confirm some dynamics toward a long-run twin-peak distribution of firm size which is implied by the above model variants. To address this issue empirically, we concentrate on firm size distribution in the US manufacturing industry.

The data of the following empirical study are taken from the pstar dataset used in B. W. Hall and R. E. Hall [22]. The data set contains 23 variables which quantify certain characteristics such as investment, stock price, or assets' value of US firms in the manufacturing industry for the time period 1960 to 1991. We use the variable netcap which is defined as book value of assets, adjusted for the effects of inflation to represent capital stock. In the following net capital which is normalized by the average net capital of all firms will be used as a measure of size.

The pstar data set gives us a set of observed data points or capital stocks, respectively, which can be interpreted as a sample of an unknown probability density function for several years. To analyze certain characteristics of this density, one has to determine the unknown density. If, for example, the density function has changed from being unimodal to a bimodal one, it can be regarded as a hint that the middle size classes have been thinned out, supporting the above stated theoretical ideas. The graphs in Figure 5 indeed seem to support the hypothesis that over time, there is a heavy tail in the upper size classes arising and that the middle size classes get thinned out.

In order to discuss this in a more quantitative manner, we first fit the probability density function (pdf) with a log-normal distribution.

Figure 6 gives some of the rank-size plot with the corresponding fits with log-normal distribution (to be more exact, the cumulative distribution 𝑃>(𝑥) defined in (A.1) obtained by log-normal 𝑃(𝑥)). The overall agreement is good although there are small but distinct differences. This difference is plotted for the same years in Figure 7, where it is apparent that they share a common feature.

This means that they are not random fluctuations around the log-normal. Most notable feature is that that they dip between −1 and 0, which means that since the derivative of cumulative probability is the negative of pdf, the actual pdf is given by log-normal pdf plus a twin peak pdf with its first peak at around −1 and second peak at a little less than 1. The strength of the twin-peak can be measured by fitting these differences with a function such as 𝑎𝑒𝑥sin(𝑘𝑥+𝑚) (with 𝑥 being the horizontal coordinate) and measuring the “amplitude” 𝑎. We have chosen to do this in the range 𝑥[1.5,0.3] in order to avoid large fluctuation for large 𝑥 and the resulting curves are plotted in solid black curves. We find that the agreement with the actual difference is in a reasonable range.

The resulting amplitude 𝑎 is plotted in Figure 8 together with a linear fit. The systematic tendency for 𝑎 to increase confirms that in fact the middle thins out. This follows from the estimated change of the Sinus curves for the years 1970, 1975, 1980, 1985, and 1990 represented in Figure 8 where it is visible that the amplitude is rising over time indicating that the distribution of firms becomes concentrated at the lower and upper end. (Another way to estimate a bimodal distribution of firms size data is to use Kernel estimation and the Markow chain approach; see Kato et al. [15].)

5. Conclusions

In this paper, we have first studied the two scaling laws, Pareto's law and Gibrat's law. Both seem to hold for large firms. Their relations under the law of detailed balance was also studied. This discussion is done purely kinematically, that is, independent from any models one might put forward. Yet, this kind of study helps one to isolate dynamical features, which remains to be explained. This is the strength and the power of this kind of analysis. As a result, we now have a specific shape of the growth-rate pdf in the framework of Gibrat’s law, which should be the target of the modern analysis of firm size dynamics. The analysis of the upper and lower tail of the firm size distribution via Pareto’s law and Gibrat`law is empirically complemented by a statistical study of what happens in the middle. Our study empirically addresses the issue of the “thinning out” of the middle and the issue of a bimodal distribution of firm size.

In order to theoretical motivate such an enlarged study, namely, a study of the bimodal size distribution of firm classes in the long run, we have introduced a dynamic model of firm behavior, where firms might be exposed to locally increasing returns, nonlinear adjustment costs of capital, credit spread, determined by risk premiums, and credit constraints. Using dynamic programming, we compute the dynamics of firm size and their long run size distribution. A bimodal distribution is predicted. Empirically, this means that over time, there is a “thinning out” of firm size classes in the middle. Our statistical analysis then has supported the theoretical model's predictions.

Finally, we want to note that, of course, the evolution of firm size distribution in industries is presumably more complex than characterized by our above statistical and analytical studies. Numerous empirical studies on the dynamics of the firm size distribution over the life cycle of an industry have documented the complexity of the forces affecting the size distribution of firms such as growth of overall demand, industry demand, cost and technology shocks, financing practices, and industry regulation. (For a survey of the variety of forces of growth in certain stages of the life cycle of the industry, see Mazzucato and Semmler [23].) Yet, we would like to stress that our paper might give a theoretical rationale and some empirical evidence that may help to highlight some major mechanisms possibly leading to bimodal distribution and fluctuation of firm size in the long run.


A. Scaling Laws

We, here, give an analysis of Pareto’s law and the Gibrat's law, which are often observed in the data. We present a theoretical discussion of the relationship between these two laws. They are also called scaling laws.

Pareto's Law and Gibrat's Law (For more details of the following and an empirical verification using a large EU dataset, see Fujiwara et al. [4].)
The cumulative probability distribution function 𝑃>(𝑥) can be defined as follows in terms of the pdf 𝑃(𝑥): 𝑃>(𝑥)=𝑥𝑃𝑥𝑑𝑥.(A.1) The distribution 𝑃>(𝑥) and thus 𝑃(𝑥) obey a power law for large 𝑥 if 𝑃(𝑥)𝑥𝜇1,(A.2) which is the famous Pareto-Zipf's law (“Pareto's law” hereafter). (See Pareto [24].) Further, the Pareto index 𝜇 usually fluctuates around 1 for firm sizes and varies around 2 for personal income [4]. A curious fact is that in addition to the difference between the central values 1 and 2, the range of variation for the latter is often wider than that of the former.
The growth rate 𝑅 is defined as 𝑅=𝑥2/𝑥1, where 𝑥1 is the first year's log firm size and 𝑥2 the next year's. We can observe that the pdf for different bin (𝑛) of 𝑥1 often overlap with each other, which is Gibrat's law. This law is expressed in terms of the conditional probability 𝑄(𝑅𝑥1) of the growth rate for fixed 𝑥1 as 𝑄𝑅𝑥1=𝑄(𝑅).(A.3)
In the following, we discuss the relationship between these two laws.

Relations between the Two Laws
All the data mentioned above satisfy the law of detailed balance to a certain degree. In terms of the simultaneous pdf 𝑃12(𝑥1,𝑥2), this law is written as follows: 𝑃12𝑥1,𝑥2=𝑃12𝑥2,𝑥1.(A.4) The accuracy of (A.4) can be checked with the two-dimensional Kolmogorov-Smirnov test.
The following can be proven under the law of detailed balance [2]. (1)If Gibrat's law holds, Pareto's law is satisfied. (2)Under the same condition as the above, the “Reflection law,” 𝑄(𝑅)=𝑅𝜇2𝑄𝑅1,(A.5) is satisfied.
The proof goes as follows: we denote the joint pdf of 𝑥1 and the growth rate 𝑅=𝑥2/𝑥1 is denoted by 𝑃1𝑅(𝑥1,𝑅). Since 𝑃12(𝑥1,𝑥2)𝑑𝑥1𝑑𝑥2=𝑃1𝑅(𝑥1,𝑅)𝑑𝑥1𝑑𝑅 under the change of variables from (𝑥1,𝑥2) to (𝑥1,𝑅), these two pdf's are related to each other as follows: 𝑃1𝑅𝑥1,𝑥2𝑥1=𝑥1𝑃12𝑥1,𝑥2.(A.6)
We define conditional probability 𝑄(𝑅𝑥1) as follows: 𝑃1𝑅𝑥1,𝑅=𝑃1𝑥1𝑄𝑅𝑥1(A.7)=𝑃𝑅(𝑥𝑅)𝑆1𝑅,(A.8) Both 𝑃1(𝑥1) and 𝑃𝑅(𝑅) are marginal 𝑃1𝑥1=0𝑃1𝑅𝑥1=,𝑅𝑑𝑅0𝑃12𝑥1,𝑥2𝑑𝑥2𝑃,(A.9)𝑅(𝑅)=0𝑃1𝑅𝑥1,𝑅𝑑𝑥1,(A.10)
The empirical facts corresponding to this are observed in Fujiwara et al. [4]. They can be described in terms of these pdf's as follows. (A) Detailed Balance
The joint pdf 𝑃12(𝑥1,𝑥2) is a symmetric function 𝑃12𝑥1,𝑥2=𝑃12𝑥2,𝑥1.(A.11)
(B) Pareto's Law
The pdf 𝑃1(𝑥) obeys power law for large 𝑥𝑃1(𝑥)𝑥𝜇1,(A.12) for 𝑥 with 𝜇>0.
(C) Gibrat's Law
The conditional probability 𝑄(𝑅𝑥) is independent of 𝑥𝑄(𝑅𝑥)=𝑄(𝑅).(A.13) Let us first rewrite the detailed balance condition (A) in terms of 𝑃1𝑅(𝑥1,𝑅)𝑃1𝑅𝑥1,𝑅=𝑥1𝑃12𝑥1,𝑥2=𝑥1𝑃12𝑥2,𝑥1=𝑥1𝑥2𝑥2𝑃12𝑥2,𝑥1=𝑅1𝑃1𝑅𝑥2,𝑅1,(A.14) where (A.11) was used in the second line and (A.6) was used in the first and the third line. The above relation may be rewritten as follows by the use of the conditional probability 𝑄(𝑅𝑥1) in (A.7): 𝑄𝑅1𝑥2𝑄𝑅𝑥1𝑃=𝑅1𝑥1𝑃1𝑥2.(A.15)

In passing, it should be noted that (A.14) leads to the following: 𝑃𝑅(𝑅)=0𝑃1𝑅𝑥1,𝑅𝑑𝑥1=0𝑅1𝑃1𝑅𝑥2,𝑅1𝑑𝑥1=0𝑅2𝑃1𝑅𝑥2,𝑅1𝑑𝑥2=𝑅2𝑃𝑅𝑅1,(A.16) where (A.14) was used in the second line and the third line is merely a change of integration variable. This relation between the marginal growth-rate pdf 𝑃𝑅(𝑅) for positive growth (𝑅>1) and negative growth (𝑅<1) leads to the following relation, as it should: 1𝑃𝑅(𝑅)𝑑𝑅=10𝑃𝑅(𝑅)𝑑𝑅.(A.17)
Let us prove that the properties (A) and (C) lead to (B). By substituting Gibrat's law (A.13) in (A.15), we find the following: 𝑃1𝑥1𝑃1𝑥2=1𝑅𝑄𝑅1.𝑄(𝑅)(A.18) This relation can be satisfied only by a power-law function (A.12).
Proof. Let us rewrite (A.18) as follows: 𝑃1(𝑥)=𝐺(𝑅)𝑃1(𝑅𝑥),(A.19) where 𝑥 denotes 𝑥1, and 𝐺(𝑅) denotes the right-hand side of (A.18); that is, 𝐺1(𝑅)𝑅𝑄𝑅1.𝑄(𝑅)(A.20) We expand this equation around 𝑅=1 by denoting 𝑅=1+𝜖 with 𝜖1 as 𝑃1(𝑥)=𝐺(1+𝜖)𝑃1=((1+𝜖)𝑥)1+𝐺𝑃(1)𝜖+1(𝑥)+𝑃1(𝑥)𝜖𝑥+=𝑃1𝐺(𝑥)+𝜖(1)𝑃1(𝑥)+𝑥𝑃1𝜖(𝑥)+𝑂2,(A.21) where we used the fact that 𝐺(1)=1. We also assumed that the derivatives 𝐺(1) and 𝑃1(𝑥) exists in the above, whose validity should be checked against the results. From the above, we find that the following should be satisfied: 𝐺(1)𝑃1(𝑥)+𝑥𝑃1(𝑥)=0,(A.22) whose solution is given by 𝑃1(𝑥)=𝐶𝑥𝐺(1).(A.23) This is the desired result, Pareto's law, and is consistent with the assumption made earlier that 𝑃1(𝑥) exists. By substituting the result (A.23) in (A.20) and (A.18), we find that 𝐺(𝑅)=𝑅𝐺(1),(A.24) which is consistent with the assumption that 𝐺(1) exists.
From (A.20) and (A.24), we find the following relation: 𝑄(𝑅)=𝑅𝜇2𝑄𝑅1,(A.25) which should be in contrast to (A.16). If Gibrat's law (A.13) holds for all 𝑥[0,], then 𝑃𝑅(𝑅)=𝑄(𝑅) from (A.10). If so, (A.25) contradicts (A.16), since 𝜇>0. Besides, Pareto's law we derived from Gibrat's law is not normalizable if it holds for any 𝑥. Therefore, Gibrat's law should hold only for a limited range of 𝑥.
The result (A.25) shows that the function 𝑄(𝑅) is continuous at 𝑅=1, as easily seen by substituting 𝑅=1+𝜖 with 𝜖>0 on both hand side and taking the limit 𝜖+0.

B. The HJB Equation

In the general case of (2.1)–(2.3) with endogenous credit cost and a finance premium as stated in (3.1), we have the following HJB equation:𝐻𝑘,𝐵(𝑘)=max𝑗𝑓(𝑘,𝑗)+𝑑𝐵(𝑘)𝑑𝑘(𝑗𝜎𝑘).(B.1)

Note that in the limit case, where there is no borrowing and 𝑁=𝑘, and thus the constant discount rate 𝜃 holds, we obtain the HJB equation𝑉(𝑘)=max𝑗𝑓(𝑘,𝑗)+𝑑𝑉(𝑘)𝑑𝑘(𝑗𝜎𝑘).(B.2)

In this case, 𝑉(𝑘) is the creditworthiness; in other words, the maximum amount the firm can borrow is equal to the asset price 𝑉(𝑘).

The HJB-equation (B.1) can be written as𝐵(𝑘)=max𝑗𝐻1𝑓(𝑘,𝑗)+𝑑𝐵(𝑘)𝑑𝑘(𝑗𝜎𝑘),(B.3) which again is a standard dynamic form of the HJB equation. Next, for the purpose of an example, let us specify 𝐻(𝑘,𝐵)=𝜃𝐵𝜅 where, with 𝜅>1, the interest payment is solely convex in 𝐵. We then have𝐵(𝑘)=max𝑗𝑓(𝑘,𝑗)+𝑑𝐵𝑑𝑘(𝑗𝜎𝑘)1/𝜅𝜃1/𝜅.(B.4)

The equilibria of the HJB equation (B.4), with 𝜅>1, are shown below. The algorithm to study the more general problem of (B.3) or (B.4) is summarized in Grüne et al. [18]. Note that if 𝜅=1, we have a standard form of the HJB equation.

If the HJB equation (B.4) holds with 𝐻(𝐵)=𝜃𝐵𝜅, the finance premium depends on the debt of the firm. This extension is presented in Grüne et al. [18]. For 𝐻(𝐵)=𝜃𝐵𝜅 for 𝜅1, it leads to the following equation for candidates of equilibrium steady states:1+2𝑗𝑘𝛾=𝛼𝑘𝛼1𝜎𝜎2(2𝛾)𝑘1𝛾𝑘𝜃𝜅𝛼𝜎𝑘𝜎2𝑘2𝛾(𝜅1)/𝜅.(B.5)

Here, again, for 𝜅=1, the steady state candidates, leaving aside consumption, are the same as for a standard HJB equation, that is, when 𝐻(𝑘,𝐵(𝑘))=𝜃𝐵, and thus in (B.3) and (B.4), 𝜅=1 holds. For details of the solution methods for the problem of (2.1)–(2.3), see Grüne et al. [18], and for further application of the algorithm, see Grüne and Semmler [21].


The authors would like to thank Dr. Masanao Aoki, Dr. Mauro Gallegati, Dr. Corrado Di Guilmi, and Dr. Taisei Kaizouji for collaborative works at various stages of this research. They would like to thank two referees of the journal for their extensive comments on the paper.