Abstract

In this paper we use large deviation theory to determine the equilibrium distribution of a basic droplet model that underlies a number of important models in material science and statistical mechanics. Given $b \in \mathbb{N} \cup \{0\}$ and $c \in (b, \infty)$, $K$ distinguishable particles are placed, each with equal probability $1/N$, onto the $N$ sites of a lattice, where $K/N$ equals $c$. We focus on configurations for which each site is occupied by a minimum of $b$ particles. The main result is the large deviation principle (LDP), in the limit $K \to \infty$ and $N \to \infty$ with $K/N = c$, for a sequence of random, number-density measures, which are the empirical measures of dependent random variables that count the droplet sizes. The rate function in the LDP is the relative entropy $R(\theta \mid \rho^*)$, where $\theta$ is a possible asymptotic configuration of the number-density measures and $\rho^*$ is a Poisson distribution with mean $c$, restricted to the set of positive integers $n$ satisfying $n \ge b$. This LDP implies that $\rho^*$ is the equilibrium distribution of the number-density measures, which in turn implies that $\rho^*$ is the equilibrium distribution of the random variables that count the droplet sizes.

1. Introduction

This paper is motivated by a natural question for a basic model of a droplet. Given $b \in \mathbb{N} \cup \{0\}$ and $c \in (b, \infty)$, $K$ distinguishable particles are placed, each with equal probability $1/N$, onto the $N$ sites of a lattice $\Lambda_N$. Under the assumption that $K/N = c$ and that each site is occupied by a minimum of $b$ particles, what is the equilibrium distribution, as $N \to \infty$, of the number of particles per site? We prove in Corollary 3 that this equilibrium distribution is a Poisson distribution, with mean $c$, restricted to the set of positive integers $n$ satisfying $n \ge b$. As we explain near the end of the Introduction, this equilibrium distribution has important applications to technologies using sprays and powders.

As in many other models in statistical mechanics, we can identify the equilibrium distribution by exhibiting it as the unique minimum point of a rate function in a large deviation principle (LDP). Other models for which this procedure can be implemented are discussed at the end of the Introduction.

For the droplet model we prove the LDP for a sequence of random probability measures, called number-density measures, which are the empirical measures of a sequence of dependent random variables that count the droplet sizes. This LDP is stated in Theorem 1. Our proof is self-contained and starts from first principles, using techniques that are familiar in applied mathematics and statistical mechanics. For example, the proof of the local large deviation estimate in Theorem 5, a key step in the proof of the LDP for the number-density measures, is based on combinatorics, Stirling’s formula, and Laplace asymptotics.

Our use of combinatorial methods goes back to Boltzmann in his work on the discrete ideal gas. He calculated the Maxwell-Boltzmann equilibrium distribution for this system by analyzing the asymptotic behavior of a particular multinomial coefficient [1]. Starting with Boltzmann’s work, combinatorial methods have remained an important tool in both statistical mechanics and in the theory of large deviations, offering insights into a wide variety of physical and mathematical phenomena via techniques that are elegant, powerful, and often elementary. In applications to statistical mechanics, this state of affairs is explained by the observation that “many fundamental questions are inherently combinatorial, including the Ising model, the Potts model, monomer-dimer systems, self-avoiding walks and percolation theory” [2]. For the two-dimensional Ising model and other exactly soluble models, [3, 4] are recommended.

A similar situation holds in the theory of large deviations. For example, [5] discusses combinatorial techniques for finite alphabets and points out that, because of the concreteness of these applications, the LDPs are proved under much weaker conditions than the corresponding results in the general theory, into which the finite-alphabet results give considerable insight. The text [6] devotes several early sections to large deviation results for i.i.d. random variables having a finite state space and proved by combinatorial methods, including a sophisticated, level-3 result for the empirical pair measure.

In order to formulate the LDP for the number-density measures in our droplet model, a standard probabilistic model is introduced. The configuration space is the set $\Omega_N = \Lambda_N^K$ consisting of all $\omega = (\omega_1, \omega_2, \ldots, \omega_K)$, where $\omega_i$ denotes the site in $\Lambda_N$ occupied by the $i$th particle. The cardinality of $\Omega_N$ equals $N^K$. Denote by $P_N$ the uniform probability measure that assigns equal probability $1/N^K$ to each of the $N^K$ configurations $\omega \in \Omega_N$. The asymptotic analysis of the droplet model involves two random variables, which are functions of the configuration $\omega$: for $\ell \in \Lambda_N$, $K_\ell(\omega)$ denotes the number of particles occupying the site $\ell$ in the configuration $\omega$; for $j \in \mathbb{N} \cup \{0\}$, $N_j(\omega)$ denotes the number of sites $\ell \in \Lambda_N$ for which $K_\ell(\omega) = j$.

We focus on the subset of $\Omega_N$ consisting of all configurations $\omega$ for which every site of $\Lambda_N$ is occupied by at least $b$ particles. Because of this restriction $N_j(\omega)$ is indexed by $j \in \mathbb{N}_b = \{n \in \mathbb{Z} : n \ge b\}$. It is useful to think of each particle as having one unit of mass and of the set of particles at each site as defining a droplet. With this interpretation, for each configuration $\omega$, $K_\ell(\omega)$ denotes the mass or size of the droplet at site $\ell$. The $j$th droplet class has $N_j(\omega)$ droplets and mass $j N_j(\omega)$. Because the number of sites in $\Lambda_N$ equals $N$ and the sum of the masses of all the droplet classes equals $K$, the following conservation laws hold for such configurations: $$\sum_{j \in \mathbb{N}_b} N_j(\omega) = N \quad \text{and} \quad \sum_{j \in \mathbb{N}_b} j N_j(\omega) = K. \tag{1}$$ In addition, since the total number of particles is $K$, it follows that $\sum_{\ell \in \Lambda_N} K_\ell(\omega) = K$. These equality constraints show that the random variables $N_j$ and $K_\ell$ are not independent.

In order to carry out the asymptotic analysis of the droplet model, we introduce a quantity $m = m(N)$ that converges to $\infty$ sufficiently slowly with respect to $N$; specifically, we require that $m(N)^2/N \to 0$ as $N \to \infty$. In terms of $b$ and $m$ we define the subset $\Omega_{N,b,m}$ of $\Omega_N$ consisting of all configurations $\omega$ for which every site of $\Lambda_N$ is occupied by at least $b$ particles and at most $m$ of the quantities $N_j(\omega)$ are positive. This second condition is a useful technical device that allows us to control the errors in several estimates. In an appendix of [7] we present evidence supporting the conjecture that this condition can be eliminated. The discussion in that appendix involves a number of interesting topics including Stirling numbers of the second kind (see [8, pp. 96-97] and [9, §5.4]) and their asymptotic behavior [10].

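The random quantities in the droplet model for which we formulate an LDP are the number-density measures $\Theta_{N,b}$. For $\omega \in \Omega_{N,b,m}$ these random probability measures assign to $j \in \mathbb{N}_b$ the probability $N_j(\omega)/N$, which is the number density of the $j$th droplet class. Because of the two conservation laws in (1) and because $K/N = c$, for $\omega \in \Omega_{N,b,m}$, $\Theta_{N,b}(\omega)$ is a probability measure on $\mathbb{N}_b$ having mean $c$. Thus $\Theta_{N,b}$ takes values in $\mathcal{P}_{\mathbb{N}_b,c}$, which is defined to be the set of probability measures on $\mathbb{N}_b$ having mean $c$.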

The probability measure $P_{N,b,m}$ defining the droplet model is obtained by restricting the uniform measure $P_N$ to the set of configurations $\Omega_{N,b,m}$. Thus $P_{N,b,m}$ equals the conditional probability $P_N(\,\cdot \mid \Omega_{N,b,m})$. In the language of statistical mechanics $P_{N,b,m}$ defines a microcanonical ensemble that incorporates the conservation laws for number and mass expressed in (1).

A natural question is to determine two equilibrium distributions: the equilibrium distribution of the number-density measures and the equilibrium distribution of the droplet-size random variables . These distributions are defined by the following two limits: for any , any , and all where denotes the open ball with center and radius defined with respect to an appropriate metric on . As we prove, the equilibrium distributions of and coincide. As in many models in statistical mechanics, an efficient way to determine the equilibrium distribution is to prove an LDP for , which we carry out in Theorem 1. This theorem is the main result in the paper.

The content of Theorem 1 is the following: as , the sequence of number-density measures satisfies the LDP on with respect to the measures . The rate function is the relative entropy of with respect to the Poisson distribution on having components for . In this formula is the normalization that makes a probability measure, and equals the unique value for which has mean [Theorem A.2]. Using the fact that equals 0 at the unique measure , we apply the LDP for to conclude in Theorem 2 that is the equilibrium distribution of . Corollary 3 then implies that is also the equilibrium distribution of .

The space is the most natural space on which to formulate the LDP for in Theorem 1. Not only is the smallest convex set of probability measures containing the range of for all , but also the union over of the range of is dense in . As we explain in part (a) of Theorem 4, is not a complete, separable metric space, a situation that prevents us from directly applying general results in the theory of large deviations that require the setting of a complete, separable metric space.

The droplet model is defined in Section 2. Step in the proof of the LDP for is to derive the local large deviation estimate in part (b) of Theorem 5. This local estimate, one of the centerpieces of the paper, gives information not available in the LDP for , which involves global estimates. Step is to lift the local large deviation estimate to the large deviation limit for lying in open balls and certain other subsets of while Step is to lift the large deviation limit for open balls and certain other subsets to the LDP for stated in Theorem 1. Steps and are explained in Section 4.

Details of Steps and as well as other routine proofs are omitted from the present paper. They appear in the unpublished companion paper [7], which also contains additional background material. The paper [1] explores how our work on the droplet model was inspired by the work of Ludwig Boltzmann on a simple model of a discrete ideal gas. The main connection is via the local large deviation estimate in part (b) of Theorem 5. When , the LDP for a path version of with and varying appears in [11, 12].

The main application of the results in this paper is to technologies using sprays and powders, which are ubiquitous in many fields, including agriculture, the chemical and pharmaceutical industries, consumer products, electronics, manufacturing, material science, medicine, mining, paper making, the steel industry, and waste treatment. In this paper we focus on sprays; our theory also applies to powders with changes only in terminology [13]. The behavior of sprays can be complex, depending on parameters such as evaporation, temperature, and viscosity. Our goal here is to consider the simplest model, in which the only assumption concerns the average size of the droplets in the spray. In many situations it is important to have good control over the sizes of the droplets, a requirement that can be translated into properties of probability distributions. The size distributions are important because they determine reliability and safety in each particular application.

Interestingly, there does not seem to be a rigorous theory that predicts the equilibrium distribution of droplet sizes, analogous to the Maxwell-Boltzmann distribution of energy levels in a discrete ideal gas [14, 15]. Our goal in the present paper is to provide such a theory. We do so by focusing on one aspect of the problem related to the relative entropy, an approach that characterizes the equilibrium distribution of droplet sizes as being a Poisson distribution restricted to . We expect that this distribution will be important in experimental observations. A full understanding of droplet behavior under dynamic conditions requires treating many other aspects and is beyond the scope of this paper. We plan to apply the ideas in this paper to understand the entropy of dislocation networks [16].

The importance of predicting droplet size can be seen from the wide range of applications utilizing sprays [17, 18]. Because of the importance of this problem, novel approaches for measuring the droplet-size distribution in sprays have been developed [19–23]. What makes the problem of predicting droplet size particularly interesting is the complexity of the droplet-size distribution, which is attributed to many factors such as temperature and viscosity. As [24] shows, even the nozzle plays a significant role in the outcome. Among the theoretical tools used to understand the distribution of droplet size in sprays is entropy [25], which also plays a key role in the present paper.

We end the Introduction by expanding on a comment made at the beginning of this section. This comment concerns one of the main applications of large deviation theory in statistical mechanics, which is to identify the equilibrium distribution or distributions of a model as the minimum point(s) of the rate function in an LDP for the model. This procedure is also useful for studying phase transitions in the model, which concern how the structure of the set of equilibrium distributions changes as the parameters defining the model change. There are numerous other models for which this procedure has been used. They include the following three lattice spin models: the Curie-Weiss spin system, the Curie-Weiss-Potts model, and the mean-field Blume-Capel model, which is also known as the mean-field BEG model. As explained in the corresponding sections of [26], the large deviation analysis shows that each of these three models has a different phase transition structure. Details of the analysis for the three models are given in [6, §IV.4] and [27–29]. Reference [30] outlines how large deviation theory can be applied to determine equilibrium structures in statistical models of two-dimensional turbulence. Details of this analysis are given in [31].

2. Definition of Droplet Model and Main Theorem

After defining the droplet model, we state the main theorem in the paper, Theorem 1. The content of this theorem is the LDP for the sequence of random, number-density measures, which are the empirical measures of a sequence of dependent random variables that count the droplet sizes in the model. As we show in Theorem 2 and in Corollary 3, the LDP enables us to identify a Poisson distribution as the equilibrium distribution both of the number-density measures and of the droplet-size random variables. In Theorem 4 we prove a number of properties of two spaces of probability measures in terms of which the LDP for the number-density measures is formulated.

We start by fixing parameters and . The droplet model is defined by a probability measure parameterized by and the nonnegative integer . The measure depends on two other positive integers, and , where . Both and are functions of in the large deviation limit . In this limit we take and , where , the average number of particles per site, equals . Thus . In addition, we take sufficiently slowly by choosing to be a function satisfying and as ; for example, for some . Throughout this paper we fix such a function . The parameter and the function first appear in the definition of the set of configurations in (3), where these quantities will be explained.

Because and are integers, must be a rational number. This in turn imposes a restriction on the values of and . If is a positive integer, then along the positive integers and along the subsequence . If , where and are relatively prime, positive integers with , then along the subsequence for and along the subsequence . Throughout this paper, when we write or , it is understood that and satisfy the restrictions discussed here.
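The restriction described above can be illustrated with a minimal sketch. It assumes that the average occupancy is a positive rational number x/y in lowest terms and that the admissible (sites, particles) pairs are the multiples (yn, xn); the function name `admissible_pairs` is hypothetical and is not taken from the paper.

```python
from fractions import Fraction

def admissible_pairs(c, count):
    """Return the first `count` pairs (N, K) of positive integers with K / N == c.

    Writing c = x / y in lowest terms, the pairs are
    (N, K) = (y * n, x * n) for n = 1, 2, ..., count.
    """
    c = Fraction(c)
    if c <= 0:
        raise ValueError("c must be positive")
    x, y = c.numerator, c.denominator  # Fraction always stores lowest terms
    return [(y * n, x * n) for n in range(1, count + 1)]

# Example: average occupancy 5/2 forces the number of sites to be even.
pairs = admissible_pairs(Fraction(5, 2), 3)
print(pairs)
```

When the average occupancy is an integer, the denominator is 1 and every positive number of sites is admissible.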

In the droplet model distinguishable particles are placed, each with equal probability , onto the sites of the lattice . This simple description corresponds to a simple probabilistic model. The configuration space is the set consisting of all sequences , where denotes the site in occupied by the th particle. Let be the measure on that assigns equal probability to each site in , and let be the product measure on with equal one-dimensional marginals . Thus is the uniform probability measure that assigns equal probability to each of the configurations ; for subsets of we have , where card denotes cardinality.

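The asymptotic analysis of the droplet model involves two random variables. For $\omega \in \Omega_N$ and $\ell \in \Lambda_N$, $K_\ell(\omega)$ denotes the number of particles occupying site $\ell$ in the configuration $\omega$. For $\omega \in \Omega_N$ and $j \in \mathbb{N} \cup \{0\}$, $N_j(\omega)$ denotes the number of sites $\ell \in \Lambda_N$ for which $K_\ell(\omega) = j$. The dependence of $K_\ell(\omega)$ and $N_j(\omega)$ on $N$ is not indicated in the notation. Because the distributions of both random variables depend on $N$, both $K_\ell$ and $N_j$ form triangular arrays.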

We now specify the role played by the nonnegative integer , first focusing on the case where is a positive integer. The case where is discussed later. For , in general there exist sites for which ; that is, sites that are occupied by 0 particles. The next step in the definition of the droplet model is to restrict to a subset of configurations for which every site is occupied by at least particles and the following constraint holds: for any configuration at most of the components are positive, where and as . Because for every site is occupied by at least particles, we have and is indexed by . We denote by the sequence and define . In terms of this notation

The constraint restricting the number of positive components of the sequence of droplet-class counts is a useful technical device that allows us to control the errors in several estimates. In an appendix of [7] we give evidence supporting the conjecture that this restriction can be eliminated.

When is a positive integer, for each , each site in is occupied by at least particles. In this case it is useful to think of each particle as having one unit of mass and of the set of particles at each site as defining a droplet. With this interpretation, for each configuration , denotes the mass or the size of the droplet at site . The th droplet class has droplets and mass . Because the number of sites in equals and the sum of the masses of all the droplet classes equals , it follows that the quantities satisfy the two conservation laws in (1) for all .

We now consider the modifications that must be made in these definitions when . In this case the first constraint in the definition of disappears because we allow sites to be occupied by 0 particles, and therefore is indexed by . On the other hand, we retain the second constraint in the definition of , which requires that for any configuration at most of the components for are positive. When , the definition of becomes . Because the choice allows sites to be empty, we lose the interpretation of the set of particles at each site as being a droplet. However, for the two conservation laws in (1) continue to hold.

For the remainder of this paper we work with any fixed nonnegative integer . The probability measure defining the droplet model is obtained by restricting the uniform measure to the set . Thus equals the conditional probability . For subsets of , takes the form

Having defined the droplet model, we introduce the random probability measures whose large deviations we will study. For these measures are the number-density measures that assign to the probability . This ratio represents the number density of droplet class . Thus for any subset of By the two formulas in (1) and . Thus is a probability measure on having mean .
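The definitions above can be made concrete with a small simulation. The sketch below is illustrative only: it samples from the unconditioned uniform measure (minimum occupancy 0) and ignores the cap on the number of positive droplet-class counts, so it is not a sampler for the conditioned droplet model. All function and variable names are hypothetical.

```python
import random
from collections import Counter

def sample_configuration(N, K, rng):
    """Place K distinguishable particles independently and uniformly on sites 0..N-1."""
    return [rng.randrange(N) for _ in range(K)]

def occupation_numbers(omega, N):
    """Return the list whose entry l is the number of particles on site l."""
    counts = [0] * N
    for site in omega:
        counts[site] += 1
    return counts

def class_counts(occupation):
    """Return a Counter mapping j to the number of sites holding exactly j particles."""
    return Counter(occupation)

def number_density(occupation):
    """Return the empirical measure j -> (number of sites with j particles) / N."""
    N = len(occupation)
    return {j: n_j / N for j, n_j in class_counts(occupation).items()}

rng = random.Random(0)
N, c = 2000, 3
K = c * N
omega = sample_configuration(N, K, rng)
occ = occupation_numbers(omega, N)
N_j = class_counts(occ)
theta = number_density(occ)

# Both conservation laws hold for every configuration, not just on average.
print(sum(N_j.values()) == N, sum(j * n for j, n in N_j.items()) == K)
```

The empirical measure `theta` is a probability measure with mean exactly `c`, as the two conservation laws require.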

We next introduce several spaces of probability measures that arise in the large deviation analysis of the droplet model. denotes the set of probability measures on . Thus has the form , where the components satisfy and . We say that a sequence of measures in converges weakly to , and write , if, for any bounded function mapping into , as . is topologized by the topology of weak convergence. There is a standard technique for introducing a metric structure on for which we quote the main facts. Because is a complete, separable metric space with metric , there exists a metric on called the Prohorov metric with the following two properties: convergence with respect to the Prohorov metric is equivalent to weak convergence [32, Thm. 3.3.1]; with respect to the Prohorov metric, is a complete, separable metric space [32, Thm. 3.1.7].

We denote by the set of measures in having mean . Thus has the form , where the components satisfy , , and . The number-density measures defined in (5) take values in .

According to part (a) of Theorem 4, is not a closed subset of . Hence it is natural to introduce the closure of in . As we prove in part (b) of Theorem 4, the closure of in equals , which is the set of measures in having mean lying in the closed interval . Being the closure of the relatively compact, separable metric space , is a compact, separable metric space with respect to the Prohorov metric. This space appears in the formulation of the large deviation upper bound in part (c) of Theorem 1.

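We next state Theorem 1, which is the LDP for the sequence of distributions $P_{N,b,m}(\Theta_{N,b} \in d\theta)$ on $\mathcal{P}_{\mathbb{N}_b,c}$ as $N \to \infty$. The rate function in the LDP is the relative entropy $R(\theta \mid \rho_{b,\alpha_b(c)})$ of $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ with respect to the Poisson distribution $\rho_{b,\alpha_b(c)}$ defined in (7), where each $\rho_{b,\alpha_b(c);j} > 0$. Thus any $\theta \in \mathcal{P}_{\mathbb{N}_b}$ is absolutely continuous with respect to $\rho_{b,\alpha_b(c)}$. For $\theta \in \mathcal{P}_{\mathbb{N}_b}$ the relative entropy of $\theta$ with respect to $\rho_{b,\alpha_b(c)}$ is defined by $$R(\theta \mid \rho_{b,\alpha_b(c)}) = \sum_{j \in \mathbb{N}_b} \theta_j \log\!\left(\frac{\theta_j}{\rho_{b,\alpha_b(c);j}}\right).$$ If $\theta_j = 0$, then $\theta_j \log(\theta_j/\rho_{b,\alpha_b(c);j}) = 0$. For $j \in \mathbb{N}_b$ the components of the measure $\rho_{b,\alpha_b(c)}$ appearing in the LDP have the form $$\rho_{b,\alpha_b(c);j} = \frac{1}{Z_b(\alpha_b(c))} \cdot \frac{[\alpha_b(c)]^j}{j!}, \tag{7}$$ where $\alpha_b(c) \in (0,\infty)$ is chosen so that $\rho_{b,\alpha_b(c)}$ has mean $c$ and $Z_b(\alpha_b(c))$ is the normalization making $\rho_{b,\alpha_b(c)}$ a probability measure; thus $Z_0(\alpha_0(c)) = e^{\alpha_0(c)}$ and, for $b \in \mathbb{N}$, $Z_b(\alpha_b(c)) = e^{\alpha_b(c)} - \sum_{j=0}^{b-1} [\alpha_b(c)]^j/j!$. As we show in Theorem A.2, there exists a unique value of $\alpha_b(c)$.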
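The relative entropy just defined is easy to evaluate for measures with finite support. The following sketch assumes that measures are represented as dictionaries mapping integers to probabilities; the function name `relative_entropy` is hypothetical. It uses the convention that terms with zero mass contribute 0 and returns infinity when absolute continuity fails.

```python
import math

def relative_entropy(theta, rho):
    """Relative entropy of theta with respect to rho for finitely supported measures.

    theta, rho: dicts mapping support points to probabilities.
    Terms with theta_j == 0 contribute 0; if theta puts mass where rho
    does not, the relative entropy is +infinity.
    """
    total = 0.0
    for j, t in theta.items():
        if t == 0:
            continue
        r = rho.get(j, 0.0)
        if r == 0:
            return math.inf
        total += t * math.log(t / r)
    return total

theta = {0: 0.5, 1: 0.5}
rho = {0: 0.25, 1: 0.75}
print(relative_entropy(theta, rho))   # strictly positive because theta != rho
print(relative_entropy(rho, rho))     # zero at the unique minimum point
```

The second call illustrates why the minimum point of the rate function identifies the equilibrium distribution: the relative entropy vanishes only when the two measures coincide.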

As a consequence of the fact that is not closed in , the large deviation upper bound takes two forms depending on whether the subset of is compact or whether is closed. When is compact, in part (b) we obtain the standard large deviation upper bound for . When is closed, in part (c) we obtain a variation of the standard large deviation upper bound, which, when is compact, coincides with the upper bound in part (b). The refinement in part (c) is important. It is applied in the proof of Theorem 2 to show that is the equilibrium distribution of the number-density measures . In turn, Theorem 2 is applied in the proof of Corollary 3 to show that is the equilibrium distribution of the droplet-size random variables .

In the next theorem we assume that is the function appearing in the definition of in (3) and satisfying and as . The assumption that is used to control error terms in Lemmas 6 and 7 in the present paper and in Lemma in [7]. This assumption on is optimal in the sense that it is a minimal assumption guaranteeing that error terms in parts (a) and (b) of Lemma in [7] converge to 0. In the next theorem, for a subset of or we denote by the infimum of over .

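Theorem 1. Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. Let $\rho_{b,\alpha_b(c)} \in \mathcal{P}_{\mathbb{N}_b,c}$ be the distribution having the components defined in (7). Then as $N \to \infty$, with respect to the measures $P_{N,b,m}$, the sequence $\Theta_{N,b}$ satisfies the LDP on $\mathcal{P}_{\mathbb{N}_b,c}$ with rate function $R(\theta \mid \rho_{b,\alpha_b(c)})$ in the following sense.
(a) $R(\theta \mid \rho_{b,\alpha_b(c)})$ maps $\mathcal{P}_{\mathbb{N}_b,c}$ into $[0,\infty]$ and has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$; that is, for any $M < \infty$ the set $\{\theta \in \mathcal{P}_{\mathbb{N}_b,c} : R(\theta \mid \rho_{b,\alpha_b(c)}) \le M\}$ is compact.
(b) For any compact subset $F$ of $\mathcal{P}_{\mathbb{N}_b,c}$ we have the large deviation upper bound $$\limsup_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in F) \le -R(F \mid \rho_{b,\alpha_b(c)}).$$
(c) For any closed subset $F$ of $\mathcal{P}_{\mathbb{N}_b,c}$, let $\overline{F}$ denote the closure of $F$ in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$. We have the large deviation upper bound $$\limsup_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in F) \le -R(\overline{F} \mid \rho_{b,\alpha_b(c)}).$$
(d) For any open subset $G$ of $\mathcal{P}_{\mathbb{N}_b,c}$ we have the large deviation lower bound $$\liminf_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in G) \ge -R(G \mid \rho_{b,\alpha_b(c)}).$$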

The properties of in part (a) are proved in [33, Lem. 1.4.1] and part (a) of Theorem A.1. The basic step in proving the large deviation bounds in parts (b)–(d) is the local large deviation estimate in part (b) of Theorem 5. As explained in Section 4, this local estimate is lifted to large deviation limits involving open balls stated in Theorem 8, which in turn are used to derive the bounds in parts (b)–(d) of Theorem 1.

In the next theorem we use the large deviation upper bound in part (c) of Theorem 1 to prove that the Poisson distribution is the equilibrium distribution of the number-density measures . In this theorem denotes the complement in of the open ball . denotes the complement in of the open ball .

Theorem 2. We assume the hypotheses of Theorem 1. The following results hold for any .
(a) The quantity is strictly positive.
(b) For any number in the interval and all sufficiently large This upper bound implies that, as , and for any bounded, continuous function mapping into These two limits allow us to interpret the Poisson distribution as the equilibrium distribution of the number-density measures with respect to .

Proof. The starting point is the large deviation upper bound in part (c) of Theorem 1 applied to the closed set , which is a subset of . We denote the closure of in by . Since , the large deviation upper bound in part (c) of Theorem 1 takes the form We now prove part (a) of Theorem 2. Since is lower semicontinuous on and has compact level sets in [33, Lem. 1.4.3(b)–(c)], it attains its infimum on the closed set . If , then there would exist such that . But on , attains its infimum of 0 at the unique measure [33, Lem. 1.4.1]. This contradicts the fact that , completing the proof of part (a). The inequality in part (b) is an immediate consequence of part (a) and the large deviation upper bound (13). This inequality yields the limit , which in turn implies (12). The proof of Theorem 2 is complete.

We now apply Theorem 2 to prove that is also the equilibrium distribution of the random variables , which count the droplet sizes at the sites of . This is the content of the next corollary. A fact needed in the proof is that is the empirical measure of these random variables; that is, for , assigns to subsets of the probability . This representation is valid because both and the empirical measure assign to the probability .

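Corollary 3. We assume the hypotheses of Theorem 1. Then for any site $\ell \in \Lambda_N$ and any $j \in \mathbb{N}_b$ $$\lim_{N \to \infty} P_{N,b,m}(K_\ell = j) = \rho_{b,\alpha_b(c);j} = \frac{1}{Z_b(\alpha_b(c))} \cdot \frac{[\alpha_b(c)]^j}{j!}.$$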

Proof. Since the random variables are identically distributed, it suffices to prove the corollary for . For fixed , the limit (12) with yields This completes the proof.

The last theorem in this section proves several properties of and with respect to the Prohorov metric that are needed in the paper.

Theorem 4. Fix a nonnegative integer and a real number . The metric spaces and have the following properties.(a), the set of probability measures on having mean , is a relatively compact, separable subset of . However, is not a closed subset of and thus is not a compact subset or a complete metric space.(b), the set of probability measures on having mean lying in the closed interval , is the closure of in . is a compact, separable subset of .

Proof. (a) For satisfying let denote the compact subset of , and let denote its complement. For any It follows that is tight; that is, for any there exists such that for all . Prohorov’s theorem implies that is relatively compact [32, Thm. 3.2.2]. The separability of is proved in Corollary in [7].
We now prove that is not a closed subset of by exhibiting a sequence having a weak limit that does not lie in . Let be any measure in with mean ; thus . The sequence has the property that and that . This completes the proof of part (a).
(b) Since is a separable subset of and is dense in , it follows that is separable. We prove that is the closure of in . Let be a sequence in converging weakly to . Since implies that for each , Fatou’s lemma implies that , where and denote the means of and . Since for any we have , it follows that . This shows that the closure of in is a subset of . We next prove that is a subset of the closure of in by showing that for any there exists a sequence such that . If , then we choose for all . If , then we use the sequence in (17), which converges weakly to . We conclude that lies in the closure of and thus that is a subset of the closure of in . This completes the proof of part (b). The proof of Theorem 4 is done.
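The failure of closedness in part (a) comes from mass escaping to infinity. The sketch below shows one standard construction of such a sequence. Because the explicit sequence in the proof did not survive in this text, the mixture used here is an assumption, and all names are hypothetical. Starting from a measure whose mean is smaller than the target mean, a small atom is placed at a far point n so that the mean is restored exactly, while the weight of the atom tends to 0.

```python
from fractions import Fraction

def mean(measure):
    """Mean of a finitely supported measure given as {point: probability}."""
    return sum(j * p for j, p in measure.items())

def mixture_with_far_atom(theta, c, n):
    """Mix theta with a point mass at n so that the result has mean exactly c."""
    beta = mean(theta)
    if not (beta < c < n) or n in theta:
        raise ValueError("need mean(theta) < c < n and n outside the support")
    w = Fraction(c - beta) / Fraction(n - beta)  # weight of the far atom
    mixed = {j: (1 - w) * p for j, p in theta.items()}
    mixed[n] = w
    return mixed

def total_variation(p, q):
    """Total variation distance between two finitely supported measures."""
    support = set(p) | set(q)
    return sum(abs(p.get(j, 0) - q.get(j, 0)) for j in support) / 2

theta = {1: Fraction(1, 2), 2: Fraction(1, 2)}  # mean 3/2, below the target
c = 3
sequence = [mixture_with_far_atom(theta, c, n) for n in (10, 100, 1000)]
distances = [total_variation(mu, theta) for mu in sequence]
print(distances)
```

Every measure in the sequence has mean 3, yet the sequence converges in total variation, and hence weakly, to a limit whose mean is 3/2.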

In the next section we present the local large deviation estimate that will be used in Section 4 to prove the LDP for in Theorem 1.

3. Local Large Deviation Estimate Yielding Theorem 1

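The main result needed to prove the LDP in Theorem 1 is the local large deviation estimate stated in part (b) of Theorem 5. The first step is to introduce a set $A_{N,b,m}$ that plays a central role in this paper. Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Given $N \in \mathbb{N}$ define $K = Nc$ and let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. Define $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$; thus $\mathbb{N}_0$ is the set of nonnegative integers. Let $\nu$ be a sequence $\{\nu_j, j \in \mathbb{N}_b\}$ for which each $\nu_j \in \mathbb{N}_0$; thus $\nu \in \mathbb{N}_0^{\mathbb{N}_b}$. We define $A_{N,b,m}$ to be the set of $\nu \in \mathbb{N}_0^{\mathbb{N}_b}$ satisfying $$\sum_{j \in \mathbb{N}_b} \nu_j = N, \quad \sum_{j \in \mathbb{N}_b} j\nu_j = K, \quad \text{and} \quad |\nu|_+ \le m, \tag{18}$$ where $|\nu|_+ = \mathrm{card}\{j \in \mathbb{N}_b : \nu_j \ge 1\}$. Because $|\nu|_+ \le m$, the two sums involve only finitely many terms.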

For the components of the number-density measure defined in (5) are for , where denotes the number of sites in containing particles in the configuration . We denote by the sequence . By definition, for every each site is occupied by at least particles, and . It follows that is the range of for ; the two sums involving in (18) correspond to the two sums involving in (1).

Since the range of is , for the range of is the set of probability measures whose components for have the form for . By (18) takes values in , the set of probability measures on having mean . It follows that the set is the range of for .

In part (b) of the next theorem we state the local large deviation estimate for the event . In part (a) we introduce the Poisson distribution that appears in the local estimate; is defined in terms of a parameter guaranteeing that it has mean .

In part (a) of Theorem in [7] we give the straightforward proof of the existence of for . The proof of the existence of for general is much more subtle than the proof for . The proof for general is given in Theorem A.2 in the present paper.

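Theorem 5. (a) Fix a nonnegative integer $b$ and a real number $c \in (b,\infty)$. For $\alpha \in (0,\infty)$ let $\rho_{b,\alpha}$ be the measure on $\mathbb{N}_b$ having components $$\rho_{b,\alpha;j} = \frac{1}{Z_b(\alpha)} \cdot \frac{\alpha^j}{j!}$$ for $j \in \mathbb{N}_b$, where $Z_0(\alpha) = e^{\alpha}$, and, for $b \in \mathbb{N}$, $Z_b(\alpha) = e^{\alpha} - \sum_{j=0}^{b-1} \alpha^j/j!$. Then there exists a unique value $\alpha_b(c) \in (0,\infty)$ such that $\rho_{b,\alpha_b(c)}$ lies in the set $\mathcal{P}_{\mathbb{N}_b,c}$ of probability measures on $\mathbb{N}_b$ having mean $c$. If $b = 0$, then $\alpha_0(c) = c$. If $b \in \mathbb{N}$, then $\alpha_b(c)$ is the unique solution in $(0,\infty)$ of $\alpha Z_{b-1}(\alpha)/Z_b(\alpha) = c$.
(b) Fix a nonnegative integer $b$ and a rational number $c \in (b,\infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. For any $\nu \in A_{N,b,m}$ we define $\theta_{N,b,\nu} \in \mathcal{P}_{\mathbb{N}_b,c}$ to have the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. Then $R(\theta_{N,b,\nu} \mid \rho_{b,\alpha_b(c)})$ is finite because it involves only finitely many components of $\theta_{N,b,\nu}$, and $$\frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha_b(c)}) + \varepsilon_N(\nu),$$ where $\varepsilon_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.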
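The parameter in part (a) of Theorem 5 can be computed numerically. The sketch below is a hedged illustration rather than the method of the paper: it assumes the restricted Poisson weights are proportional to alpha**j / j! for j at least b, truncates the tail after a fixed number of terms, and locates the parameter by bisection. Bisection is legitimate here because the mean of this exponential family increases with the parameter, and because the mean is never smaller than the parameter, so the root lies in (0, c]. All function names are hypothetical.

```python
import math

def restricted_poisson(b, alpha, terms=400):
    """Truncated pmf on {b, ..., b+terms-1} with weights proportional to alpha**j / j!."""
    weights = [1.0]  # weight at j = b, rescaled to 1 to avoid underflow
    for j in range(b + 1, b + terms):
        weights.append(weights[-1] * alpha / j)
    total = sum(weights)
    return {b + i: w / total for i, w in enumerate(weights)}

def mean(pmf):
    """Mean of a pmf given as {point: probability}."""
    return sum(j * p for j, p in pmf.items())

def solve_alpha(b, c, iterations=100):
    """Find alpha in (0, c] for which the restricted Poisson pmf has mean c."""
    if c <= b:
        raise ValueError("need c > b")
    lo, hi = 0.0, float(c)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean(restricted_poisson(b, mid)) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = solve_alpha(1, 2.0)          # minimum occupancy 1, average occupancy 2
rho = restricted_poisson(1, alpha)
print(alpha, mean(rho))
```

For minimum occupancy 1 the mean equation reduces to alpha / (1 - exp(-alpha)) = c, which gives an independent check of the computed value. For minimum occupancy 0 the solver returns c itself, up to rounding.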

We now prove the local large deviation estimate in part (b) of Theorem 5. This proof is based on a combinatorial argument that is reminiscent of and is as natural as the combinatorial argument used to prove Sanov’s theorem for empirical measures defined in terms of i.i.d. random variables having a finite state space [1, 3]. Part (b) of Theorem 5 is proved by analyzing the asymptotic behavior of the product of two multinomial coefficients that we now introduce.

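Given $\nu \in A_{N,b,m}$, our goal is to estimate the probability $P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu})$, where $\theta_{N,b,\nu}$ has the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. A basic observation is that $\{\omega \in \Omega_{N,b,m} : \Theta_{N,b}(\omega) = \theta_{N,b,\nu}\}$ coincides with $$\Delta_{N,b,m;\nu} = \{\omega \in \Omega_{N,b,m} : N_j(\omega) = \nu_j \text{ for } j \in \mathbb{N}_b\}.$$ It follows that $$P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = \frac{\mathrm{card}(\Delta_{N,b,m;\nu})}{\mathrm{card}(\Omega_{N,b,m})}.$$ Our first task is to determine the asymptotic behavior of $\mathrm{card}(\Delta_{N,b,m;\nu})$. In determining the asymptotic behavior of $\mathrm{card}(\Omega_{N,b,m})$, we will use the fact that $\Omega_{N,b,m}$ can be written as the disjoint union $$\Omega_{N,b,m} = \bigcup_{\nu \in A_{N,b,m}} \Delta_{N,b,m;\nu}.$$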

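Let $\nu \in A_{N,b,m}$ be given. We start by expressing the cardinality of $\Delta_{N,b,m;\nu}$, the set of configurations $\omega \in \Omega_{N,b,m}$ satisfying $N_j(\omega) = \nu_j$ for all $j \in \mathbb{N}_b$, as a product of two multinomial coefficients. For each configuration $\omega \in \Delta_{N,b,m;\nu}$, $K$ particles are distributed onto the $N$ sites of the lattice $\Lambda_N$ with $j$ particles going onto $\nu_j$ sites for $j \in \mathbb{N}_b$. We carry this out in two stages. In stage one $K$ particles are placed into $N$ bins, $\nu_j$ of which have $j$ particles for $j \in \mathbb{N}_b$. The number of ways of making this placement equals the multinomial coefficient $K!/\prod_{j \in \mathbb{N}_b} (j!)^{\nu_j}$. This multinomial coefficient is well-defined since $\sum_{j \in \mathbb{N}_b} j\nu_j = K$. Given this placement of $K$ particles into $N$ bins, the number of ways of moving the particles from the bins onto the sites of the lattice $\Lambda_N$ equals the multinomial coefficient $N!/\prod_{j \in \mathbb{N}_b} \nu_j!$. This second multinomial coefficient is well-defined since $\sum_{j \in \mathbb{N}_b} \nu_j = N$. We conclude that the cardinality of $\Delta_{N,b,m;\nu}$ is given by the product of these two multinomial coefficients: $$\mathrm{card}(\Delta_{N,b,m;\nu}) = \frac{N!}{\prod_{j \in \mathbb{N}_b} \nu_j!} \cdot \frac{K!}{\prod_{j \in \mathbb{N}_b} (j!)^{\nu_j}}. \tag{24}$$ Since $\nu \in A_{N,b,m}$, at most $m$ of the components $\nu_j$ are positive. Such a product of multinomial coefficients is well known in combinatorial analysis [8, Thm. 2.10]. A related version of this formula is derived in an example in [34]. See also [35, p. 115] and the formula in [36, p. 36].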
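The counting argument can be checked by brute force on a tiny lattice. The sketch below assumes the count is the product of the two multinomial coefficients described above (one for assigning occupancy levels to sites, one for assigning distinguishable particles to those occupancies) and compares it with a direct enumeration of all configurations. The function names and the dictionary representation of the class counts are hypothetical.

```python
from collections import Counter
from itertools import product
from math import factorial

def count_by_formula(nu, N, K):
    """Product of two multinomial coefficients for class counts nu = {j: nu_j}."""
    if sum(nu.values()) != N or sum(j * n for j, n in nu.items()) != K:
        raise ValueError("nu must satisfy both conservation laws")
    site_ways = factorial(N)       # N! / prod(nu_j!)
    particle_ways = factorial(K)   # K! / prod((j!)**nu_j)
    for j, n in nu.items():
        site_ways //= factorial(n)
        particle_ways //= factorial(j) ** n
    return site_ways * particle_ways

def count_by_enumeration(nu, N, K):
    """Count configurations in {0..N-1}**K whose class counts equal nu."""
    target = {j: n for j, n in nu.items() if n > 0}
    count = 0
    for omega in product(range(N), repeat=K):
        occupation = Counter(omega)
        per_site = [occupation.get(site, 0) for site in range(N)]
        if dict(Counter(per_site)) == target:
            count += 1
    return count

N, K = 3, 5
nu = {1: 2, 3: 1}  # two sites with one particle, one site with three
print(count_by_formula(nu, N, K), count_by_enumeration(nu, N, K))
```

With three sites and five particles the enumeration visits only 243 configurations, so the comparison is immediate; the same check works when empty sites are allowed, for example with class counts `{0: 1, 2: 1, 3: 1}`.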

The next two steps in the proof of the local estimate given in part (b) of Theorem 5 are to prove the asymptotic formula for in Lemma 6 and the asymptotic formula for in part (b) of Lemma 7. The proof of Lemma 6 is greatly simplified by a substitution in line 4 of (34). This substitution involves a parameter , which, we emphasize, is arbitrary in this lemma. The substitution in line 4 of (34) allows us to express the asymptotic behavior of both in Lemma 6 and in Lemma 7 directly in terms of the relative entropy , where is the probability measure on having the components defined in part (a) of Theorem 5. One of the major issues in the proof of part (b) of Theorem 5 is to show that the arbitrary parameter appearing in Lemmas 6 and 7 must take the value , which is the unique value of guaranteeing that [Theorem 5(a)]. We show that must equal after the statement of Lemma 7.

Lemma 6. Fix a nonnegative integer and a rational number . Let be any real number in , and let be the function appearing in the definition of in (3) and satisfying and as . We define For any , we define to have the components for . Then The quantity uniformly for as .

Proof. The proof is based on a weak form of Stirling’s approximation, which states that, for all satisfying and for all satisfying , . We summarize the last formula by writing The term denoted by satisfies .
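The weak form of Stirling's approximation can be checked numerically. The inequality itself is not legible in this text, so the sketch below assumes the following reading: for N at least 2 and 1 ≤ n ≤ N, the quantity log(n!) − (n log n − n) lies between 1 and 2 log N. That reading and the function names are assumptions.

```python
from math import lgamma, log

def stirling_gap(n):
    """log(n!) - (n log n - n), using lgamma(n + 1) = log(n!)."""
    return lgamma(n + 1) - (n * log(n) - n)

def weak_stirling_holds(N):
    """Check 1 <= gap(n) <= 2 log N for every integer 1 <= n <= N (assumed form)."""
    upper = 2 * log(N)
    # A tiny tolerance guards the n = 1 case, where the gap equals 1 up to rounding.
    return all(1.0 - 1e-12 <= stirling_gap(n) <= upper for n in range(1, N + 1))
```

Under this reading the lower bound is attained at n = 1, and the upper bound has room to spare because the gap grows like (1/2) log n.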
To simplify the notation, we rewrite (24) in the form , where denotes the first multinomial coefficient on the right side of (24), and denotes the second multinomial coefficient on the right side of (24). We have The asymptotic behavior of the first term on the right side of the last display is easily calculated. Since , there are positive components . Because of this restriction on the number of positive components of , we are able to control the error in line 3 of (29). We define . For each , since the components satisfy , we have for all . Using the fact that , we obtain where as and . By the inequality noted after (27) and the fact that Since as , we conclude that uniformly for as .
We now study the asymptotic behavior of the second term on the right side of (28). Since , we obtain for all where as . The weak form of Stirling’s formula is used to rewrite the term in the last display, but not to rewrite the terms , which we leave untouched.
Substituting (29) and (31) into (28), we obtain In this formula . As , We conclude that uniformly for as .
Now comes the key step, the purpose of which is to express the sum in the next-to-last line of (32) as the relative entropy , where is arbitrary. To do so, we rewrite the sum as shown in line 4 of the next display: The facts that and are used to derive the next-to-last equality. The proof of Lemma 6 is complete.
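The conclusion of Lemma 6 can be illustrated numerically. The sketch below compares (1/N) log of the product of multinomial coefficients with a relative-entropy expression. It assumes that the restricted Poisson distribution has components proportional to α^j/j! for j ≥ b with normalization Z_b(α), and that the constant in the lemma is log Z_b(α) − c log α + c log K − c. This form follows from the algebra of the key step but is our reading, since the displayed formula is missing here; all names are hypothetical.

```python
from math import exp, lgamma, log

def log_card(nu):
    """log of N!/prod(nu_j!) * K!/prod((j!)**nu_j) for a profile nu: j -> nu_j."""
    N = sum(nu.values())
    K = sum(j * v for j, v in nu.items())
    site_part = lgamma(N + 1) - sum(lgamma(v + 1) for v in nu.values())
    particle_part = lgamma(K + 1) - sum(v * lgamma(j + 1) for j, v in nu.items())
    return site_part + particle_part

def log_Z(b, alpha, terms=200):
    """log of Z_b(alpha) = sum over j >= b of alpha**j / j! (truncated series)."""
    return log(sum(exp(j * log(alpha) - lgamma(j + 1)) for j in range(b, b + terms)))

def relative_entropy(theta, b, alpha):
    """R(theta | rho) for rho_j = alpha**j / (j! Z_b(alpha)), j >= b."""
    lz = log_Z(b, alpha)
    return sum(t * (log(t) - (j * log(alpha) - lgamma(j + 1) - lz))
               for j, t in theta.items() if t > 0)

def lemma6_gap(nu, b, alpha):
    """(1/N) log card minus (-R + assumed constant); should be small for large N."""
    N = sum(nu.values())
    K = sum(j * v for j, v in nu.items())
    c = K / N
    theta = {j: v / N for j, v in nu.items() if v > 0}
    f = log_Z(b, alpha) - c * log(alpha) + c * log(K) - c
    return log_card(nu) / N - (-relative_entropy(theta, b, alpha) + f)

# Illustrative profile: N = 10000 sites, K = 20000 particles, so c = 2, with b = 1.
nu = {1: 4000, 2: 3000, 3: 2000, 4: 1000}
```

The gap does not depend on the value of α, which reflects the fact that the parameter is arbitrary in the lemma; only the Stirling error remains.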

The next step in the proof of the local large deviation estimate in part (b) of Theorem 5 is to prove the asymptotic formula for stated in part (b) of the next lemma. The proof of this lemma uses Lemma 6 in a fundamental way. After the statement of this lemma we show how to apply it and Lemma 6 to prove part (b) of Theorem 5.

Lemma 7. Fix a nonnegative integer and a rational number . The following conclusions hold:
(a) .
(b) Let be the positive real number in Lemma 6, and let be the function appearing in the definition of in (3) and satisfying and as . We define . Then attains its infimum over , and The quantity as .

Before proving Lemma 7, we derive the local large deviation estimate in part (b) of Theorem 5 by applying Lemmas 6 and 7. An integral part of the proof is to show how the arbitrary value of appearing in these lemmas is replaced by the specific value appearing in Theorem 5. As in the statement of part (b) of Theorem 5, let be any vector in and define to have the components for . By (22) Substituting the asymptotic formula for derived in Lemma 6 and the asymptotic formula for given in part (b) of Lemma 7 yields The error term equals ; is the error term in Lemma 6, and is the error term in Lemma 7. As , uniformly for , and . It follows that uniformly for as .

We now consider the first two terms on the right side of (37). By part (b) of Theorem A.1 applied to , for any With this step we have succeeded in replacing the relative entropy with respect to , which appears in Lemma 6, by the relative entropy with respect to , which appears in Theorem 5. Substituting the last equation into (37) gives where uniformly for as . This is the conclusion of part (b) of Theorem 5.
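The cancellation of the arbitrary parameter can be seen explicitly when b = 0. The sketch below is a special-case check under stated assumptions. In that case the restricted Poisson distribution is an ordinary Poisson distribution with normalization e^α, the infimum of the relative entropy over measures with mean c is the relative entropy between Poisson distributions with means c and α, and, if the restriction on the number of positive components is ignored, the configuration space has exactly N^K elements. The form of the constant from Lemma 6 is the same assumed reading as in the text, and the function names are hypothetical.

```python
from math import log

def f_b0(alpha, c, K):
    """Assumed Lemma 6 constant for b = 0, where log Z_0(alpha) = alpha."""
    return alpha - c * log(alpha) + c * log(K) - c

def min_entropy_b0(alpha, c):
    """R(Poisson(c) | Poisson(alpha)): the infimum over mean-c measures when b = 0."""
    return c * log(c / alpha) + alpha - c

def log_count_per_site(alpha, N, K):
    """Constant minus infimum; for b = 0 this should equal (1/N) log(N**K) = c log N."""
    c = K / N
    return f_b0(alpha, c, K) - min_entropy_b0(alpha, c)
```

For every α > 0 the difference reduces to c log N, so the choice of α does not affect the asymptotics of the normalizing count in this case.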

We now complete the proof of part (b) of Theorem 5 by proving Lemma 7.

Proof of Lemma 7. (a) We write . By [8, Cor. 2.5] the number of elements in the set indexed by equals the binomial coefficient . Since by assumption as , for all sufficiently large , the quantities are increasing and are maximal when . Since , it follows that An application of the weak form of Stirling’s formula yields for all and all Since as , we conclude that as . This completes the proof of part (a).
(b) The starting point is (23), which states that . For distinct the sets are disjoint. Hence where It follows from part (a) that as .
We continue with the estimation of . By Lemma 6 As proved in Lemma 6, as . Hence by (42) Under the assumption that attains its infimum over , we define In the last two paragraphs of this proof, we show that as . Given this fact, the last equation yields the asymptotic formula (35) in part (b).
We now prove that as . To do this, we use (45) to write Like the second and third terms on the right side, the first term on the right side is nonnegative because is a subset of . Since and as , it will follow that if we can show that attains its infimum over and that We now prove (48). is lower semicontinuous on [33, Lem. 1.4.3(b)] and thus on . Since has compact level sets in [Theorem A.1(a)], it attains its infimum over at some measure . We apply Theorem B.1 in [7] to , obtaining a sequence with the following properties: for , has components for , where is an appropriate sequence in ; as ; as . The limit in (48) follows from the inequalities and the limit as . This completes the proof of Lemma 7 and thus the proof of the local estimate in part (b) of Theorem 5.

In the next section we explain how the local large deviation estimate in part (b) of Theorem 5 yields the LDP in Theorem 1.

4. Proof of Theorem 1 from Part (b) of Theorem 5

In Theorem 1 we state the LDP for the sequence of number-density measures. This sequence takes values in , which is the set of probability measures on having mean . The purpose of the present section is to explain how the local large deviation estimate in part (b) of Theorem 5 yields the LDP for . All details appear in Section of [7]. The basic idea is first to prove the large deviation limit for lying in open balls in and in other subsets defined in terms of open balls and then to use this large deviation limit to prove the LDP in Theorem 1.

In Theorem 8 we state the large deviation limit for open balls and other subsets defined in terms of open balls. Two types of open balls are considered. Let be a measure in , and take . Part (a) states the large deviation limit for open balls , where denotes the Prohorov metric on . This limit is used to prove the large deviation upper bound for compact subsets of in part (b) of Theorem 1 and the large deviation lower bound for open subsets of in part (d) of Theorem 1. Now let be a measure in . Part (b) states the large deviation limit for sets of the form , where . This limit is used to prove the large deviation upper bound for closed subsets in part (c) of Theorem 1. If , then , and the conclusions of parts (a) and (b) of the next theorem coincide.

Theorem 8. Fix a nonnegative integer and a rational number . Let be the function appearing in the definitions of in (3) and satisfying and as . The following conclusions hold:
(a) Let be a measure in and take . Then for any open ball in , is finite, and one has the large deviation limit
(b) Let be a measure in and take . Then the set is nonempty, is finite, and one has the large deviation limit

We prove Theorem 8 by applying the local large deviation estimate in part (b) of Theorem 5. A key step is to approximate probability measures in and in by appropriate sequences of probability measures in the range of . This procedure allows one to show in part (a) that the infimum can be approximated by the infimum of over lying in the intersection of and the range of ; a similar statement holds for the infimum in part (b). A set of hypotheses that allow one to carry out this approximation procedure is given in Theorem   in [7], a general formulation that yields Theorem 8 as a special case.

Theorem 1 states the LDP for the number-density measures . In order to complete the proof of Theorem 1, we must lift the large deviation limits in Theorem 8 to the large deviation upper bound for compact sets and for closed sets and the large deviation lower bound for open sets. The large deviation lower bound for open sets is immediate from the limit in part (a). To prove the large deviation upper bound for compact sets, we cover the compact set by open balls and use the limit in part (a); the large deviation upper bound for closed sets follows by a similar procedure involving part (b). The details of this procedure are carried out as an application of the general formulation in Theorem   in [7].

In the Appendix we prove two properties of the relative entropy and prove the existence of the quantity appearing in part (a) of Theorem 5.

Appendix

Properties of Relative Entropy and Existence of

We fix a nonnegative integer and a real number . Given a probability measure on , the mean of is denoted by . In Theorem A.1 we present two properties of the relative entropy and for in each of the following three spaces, which are introduced in Section 2: , the set of probability measures on ; , the set of satisfying ; and , the set of satisfying .

We recall that, for , denotes the Poisson distribution on having components for , where , and, for , . According to part (a) of Theorem 5 there exists a unique value for which ; thus lies in . In Theorem A.2 we prove the existence of . In part (a) of the next theorem we show that has compact level sets in , , and . After the statement of Lemma 7 we use part (b) of the next theorem to show that the arbitrary parameter in Lemmas 6 and 7 must have the value .

Theorem A.1. Fix a nonnegative integer and a real number . For any the relative entropy has the following properties:
(a) has compact level sets in , , and .
(b) For any , .

Proof. (a) The fact that has compact level sets in is proved in part (c) of Lemma   in [33]. Since is a compact subset of [Theorem 4(d)], also has compact level sets in . Because is not a closed subset of [Theorem 4(a)], the proof that has compact level sets in is more subtle. If is any sequence in satisfying , then since and has compact level sets in , there exist and a subsequence such that and . To complete the proof that has compact level sets in , we must show that ; that is, . By Fatou’s lemma . In addition, for any Lemma   in [37] shows that the sequence is uniformly integrable, implying that [32, Appendix, Prop. 2.3]. This completes the proof that has compact level sets in . The proof of part (a) is finished.
(b) We define . Step 1 is to prove that for any For any we have and . Hence Since the last two lines equal , the proof of (A.2) is complete. Step 2 is to prove that attains its infimum over at the measure , and Given these two assertions part (b) of the theorem follows by substituting into (A.2).
We now prove the two assertions in Step 2. is lower semicontinuous on [33, Lem. 1.4.3(b)] and thus on . Since has compact level sets in , it attains its infimum over . The relative entropy attains its minimum value of 0 over at the unique measure [33, Lem. 1.4.1]. Hence (A.2) implies that the minimum value of over equals The last equality follows by applying (A.2) with . This display shows that attains its infimum over at and yields (A.4). The proof of part (b) is finished, completing the proof of the theorem.
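The decomposition behind part (b) can be checked numerically in the special case b = 0, where the Poisson distribution with mean c is itself the minimizer. The sketch below assumes our reading of the identity: for a measure θ with mean c, the relative entropy of θ with respect to the Poisson distribution with parameter α equals the relative entropy of θ with respect to the Poisson distribution with parameter c plus the constant c log(c/α) + α − c. The test measure and the function names are hypothetical.

```python
from math import lgamma, log

def rel_entropy_vs_poisson(theta, alpha):
    """R(theta | Poisson(alpha)) for theta given as a dict j -> theta_j on {0, 1, 2, ...}."""
    return sum(t * (log(t) - (j * log(alpha) - lgamma(j + 1) - alpha))
               for j, t in theta.items() if t > 0)

def mean(theta):
    """Mean of a finitely supported probability measure."""
    return sum(j * t for j, t in theta.items())

def decomposition_gap(theta, alpha):
    """Left side minus right side of the assumed identity; should vanish."""
    c = mean(theta)
    constant = c * log(c / alpha) + alpha - c  # R(Poisson(c) | Poisson(alpha))
    return rel_entropy_vs_poisson(theta, alpha) - (rel_entropy_vs_poisson(theta, c) + constant)

# Illustrative measure with mean 2.5; any finitely supported choice would do.
theta = {0: 0.2, 1: 0.1, 3: 0.4, 4: 0.3}
```

Since the constant does not depend on θ, subtracting the infimum over measures with mean c leaves exactly the relative entropy with respect to the Poisson distribution with parameter c.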

We now prove that there exists a unique value of for which . The conclusion of the next theorem is part (a) of Theorem C.1 in [7]. In part (b) of that theorem we derive two sets of bounds on and use these bounds to show that is asymptotic to as . In part (d) of Theorem C.1 in [7] we make precise the relationship between and a Poisson random variable having parameter .

Theorem A.2. Fix a nonnegative integer and a real number . There exists a unique value such that lies in the set of probability measures on having mean . If , then . If , then is the unique solution in of .

According to this theorem, for , is the unique solution of . The heart of the proof of Theorem A.2, and its most subtle step, is to prove that the function satisfies for and thus is monotonically increasing on this interval. This fact is proved in the next lemma.

Lemma A.3. Fix a positive integer and a real number . For the function satisfies .

Proof. For and for , we have . Thus . The key to proving that is to represent in terms of the moment generating function of a probability measure. We do this by first expressing in terms of the upper incomplete gamma function via the formula . As suggested in [38], we now make the change of variables , obtaining the representation The function is the moment generating function of the probability measure on having the density on . For let be the probability measure on having the density on . A straightforward calculation shows that It follows that for all .
Using (A.6) and the formulas and , we calculate This completes the proof of the lemma.
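The monotonicity asserted in Lemma A.3 can be checked numerically by a different route than the one taken in the proof. The sketch below assumes that the function in the lemma is the mean of the distribution with weights α^j/j! on the integers j ≥ b. For such a family the derivative of the mean with respect to α equals the variance divided by α, a standard exponential-family identity rather than the incomplete gamma function argument used above. The function names are hypothetical.

```python
from math import exp, lgamma, log

def moments(b, alpha, terms=300):
    """Mean and variance of the law with weights alpha**j / j! on j = b, b+1, ... (truncated)."""
    logs = [j * log(alpha) - lgamma(j + 1) for j in range(b, b + terms)]
    top = max(logs)  # subtract the largest exponent for numerical stability
    weights = [exp(x - top) for x in logs]
    total = sum(weights)
    m1 = sum((b + i) * w for i, w in enumerate(weights)) / total
    m2 = sum((b + i) ** 2 * w for i, w in enumerate(weights)) / total
    return m1, m2 - m1 * m1

def slope_gap(b, alpha, h=1e-5):
    """Central-difference slope of the mean minus variance / alpha; should be near zero."""
    up, _ = moments(b, alpha + h)
    down, _ = moments(b, alpha - h)
    _, variance = moments(b, alpha)
    return (up - down) / (2 * h) - variance / alpha
```

Because the variance is strictly positive, the identity gives a positive derivative at every α > 0, in agreement with the lemma.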

We are now ready to prove Theorem A.2.

Proof of Theorem A.2. We first consider . In this case is a standard Poisson distribution on having mean . It follows that is the unique value for which has mean and thus lies in . This completes the proof for .
We now consider . In this case is a probability measure on having mean Thus has mean if and only if satisfies , where . We prove the theorem by showing that has a unique solution for all and any . This assertion is a consequence of the following three steps: (1) ; (2) ; (3) for all , . Steps 1 and 2 follow immediately from the definition of , and Step 3 is proved in Lemma A.3.
We have proved the theorem for all . Since we also validated the conclusion of the theorem for , the proof for all nonnegative integers is done.
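Since the mean is continuous and strictly increasing in the parameter, the unique value in Theorem A.2 can be approximated by bisection. The following is a minimal sketch, assuming that the restricted Poisson distribution has components proportional to α^j/j! for j ≥ b and that c > b when b is positive; the function names, the truncation length, and the tolerance are hypothetical choices.

```python
from math import exp, lgamma, log

def restricted_poisson(b, alpha, terms=400):
    """Components proportional to alpha**j / j! for j >= b (tail truncated)."""
    logs = [j * log(alpha) - lgamma(j + 1) for j in range(b, b + terms)]
    top = max(logs)  # subtract the largest exponent for numerical stability
    weights = [exp(x - top) for x in logs]
    total = sum(weights)
    return {b + i: w / total for i, w in enumerate(weights)}

def mean(rho):
    """Mean of a probability measure given as a dict j -> rho_j."""
    return sum(j * p for j, p in rho.items())

def solve_alpha(b, c, tol=1e-12):
    """Bisection for the alpha > 0 at which the restricted Poisson law has mean c."""
    if c <= b:
        raise ValueError("a solution requires c > b")
    lo, hi = 1e-12, 1.0
    # The mean tends to b as alpha tends to 0, so lo starts below the target.
    while mean(restricted_poisson(b, hi)) < c:
        hi *= 2.0  # the mean is unbounded in alpha, so this loop terminates
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mean(restricted_poisson(b, mid)) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For b = 0 the routine returns a value close to c, as the first part of the proof requires. For b = 1 the mean has the closed form α/(1 − e^{−α}), which gives an independent check of the output. The truncation to 400 terms is adequate only for moderate values of c.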

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The research of Shlomo Ta’asan is supported in part by a grant from the National Science Foundation (NSF-DMS-1216433). Richard S. Ellis thanks Jonathan Machta for sharing his insights into statistical mechanics and Michael Sullivan for his generous help with a number of topological issues arising in this paper. Both authors thank the referee for a careful reading of the paper and for suggesting a number of references.