Abstract

This paper studies large deviations properties of vectors of empirical means and measures generated as follows. Consider a sequence of independent and identically distributed random variables partitioned into -subgroups with sizes . Further, consider a -dimensional vector whose coordinates are made up of the empirical means of the subgroups. We prove the following. The sequence of vectors of empirical means satisfies a large deviations principle with rate and rate function , when the sequence is valued, with . Similar large deviations results hold for the corresponding sequence of vectors of empirical measures if ’s, , take on finitely many values. The rate functions for the above large deviations principles are convex combinations of the corresponding rate functions arising from the large deviations principles of the coordinates of and . The probability distributions used in the convex combinations are given by . These results are then used to derive a variational formula for the thermodynamic limit of the pressure of multipopulation Curie-Weiss (I. Gallo and P. Contucci (2008), and I. Gallo (2009)) and mean-field Potts’ models, via a version of Varadhan’s integral lemma for an equicontinuous family of functions. These multipopulation models serve as a paradigm for decision-making contexts where social interaction and other socioeconomic attributes of individuals play a crucial role.

1. Introduction

The early 1970s saw the use of two-population mean-field models in the study of the phase transitions and critical behaviour of antiferromagnetic systems. Such models were used as mean-field approximations of bipartite lattice systems for studying metamagnets [1, 2]. The two-population mean-field approach was used in [3] to investigate Gibbs-non-Gibbs transitions in Gibbs measures for the Curie-Weiss model subjected to Glauber dynamics. There the analysis was based on a complete analysis of the phase diagram of the evolving spins constrained to have a given magnetization. A phase transition in such a constrained system indicates loss of the Gibbs property for the evolving system; otherwise the property is preserved.

Statistical mechanical models have seen applications in the socioeconomic literature. Here the focus is on how the decisions of individuals are influenced by their socioeconomic environment, for instance, how one’s choice of employment, residence, education, etc. is influenced by the social and economic environments. Spin models have appeared as natural models for such discrete choice contexts where social interactions play a crucial role [4–7]. More recently, the authors of [8–10] have introduced two-population mean-field models for a binary choice context where the reference population is partitioned into subgroups of individuals sharing the same socioeconomic attributes. The key assumption here is that individuals with the same attributes tend to behave the same way. In these papers it is assumed that the fractions of the individuals in the subgroups are independent of the size of the reference population. The thermodynamic limit for the pressure of the models was proved for the many-body interaction version of the interacting Curie-Weiss model. However, the variational expression for the pressure and the almost sure factorization of the correlation functions were proved only for the case of one-body and two-body interactions [10].

The aim of this paper is to set up large deviations machinery for obtaining the variational formula for the pressure of the general model introduced in [10], and to extend that model to the case where the fractions of the individuals in the subgroups depend on the size of the reference population. We also employ the tools developed here to derive a variational formula for the pressure of a multibody multipopulation mean-field Potts’ model.

We establish large deviation results for vectors of empirical means associated with a collection of random variables modelling the behaviour of interest of the subgroups that constitute the reference group. These empirical means are derived from uneven numbers of random variables; that is, the vector components are empirical means of different numbers of independent and identically distributed random variables. Because the subgroup sizes vary, the large-population asymptotics of the free energy requires a version of Varadhan’s integral lemma for a sequence of functionals of the vector of empirical measures, instead of the usual case where this functional is fixed throughout the asymptotics. We provide a necessary condition for such a sequence to admit the desired asymptotic result.

The rest of the paper is organized as follows. Section 2 discusses the generalities on large deviations theory and states the main results of the paper. In Section 3 we introduce the multipopulation Curie-Weiss and mean-field Potts’ models that motivate the large deviations problem we address in this paper. The proofs of the results in Section 2 are given in Sections 4 and 5.

2. Generalities on Large Deviations Theory and Main Results

Large deviation theory describes how, on an exponential scale, the probability of an atypical event decays to zero. More formally, large deviations are defined as follows.

Definition 1. Let be a complete separable metric space, the Borel σ-algebra of , and a sequence of probability measures on . (1) is said to have a large deviation property if there exist a sequence of positive numbers which tends to infinity and a function which maps into such that the following hypotheses hold: (a) is lower semicontinuous on . (b) has compact level sets. (c) , for each closed set in . (d) , for each open set in . (2) is called the entropy/rate function of .

In the above definition condition (b) implies that the rate function is good.

2.1. The Set Up

Suppose , is a sequence of independent and identically distributed -valued random variables, for a positive integer . Let be a partition of the set . The partition may be interpreted as the indexing set of the subpopulations in a population of size . Here we denote by the size of the th subpopulation and we assume that for any . For each subpopulation we are interested in the empirical mean and the vector of empirical means for the multipopulation is given by For the case , it is clear that , but it is different from the empirical mean of a sequence of -valued random variables. Note that in the latter each coordinate of the empirical mean is a sum of random variables. In our case the coordinates of the empirical mean vector are made up of sums of uneven numbers of random variables. In what follows we write for the distribution of and the space .

2.2. The -Valued Case

Suppose the sequence of -valued random variables is independent and identically distributed with common distribution . Suppose the logarithmic moment generating function associated with is given by We assume , for all . The Fenchel-Legendre transform of is defined as We now state our first large deviations result for the vector of empirical means . Recall that is the law of .

Theorem 2. The sequence of vectors of empirical means satisfies a large deviations principle with rate and rate function , given by
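The structure of the rate function in Theorem 2 can be illustrated numerically. The sketch below is not taken from the paper: it assumes, in line with the abstract, that the rate function is the convex combination of the one-dimensional Cramér rate functions of the coordinates, weighted by the limiting subgroup fractions. The choice of ±1-valued variables with P(+1) = p, and the names `log_mgf`, `cramer_rate`, `cramer_rate_numeric`, `vector_rate`, and `alpha`, are illustrative assumptions.

```python
import math


def log_mgf(lam, p=0.5):
    """Logarithmic moment generating function of a +/-1 variable with P(+1) = p."""
    return math.log(p * math.exp(lam) + (1.0 - p) * math.exp(-lam))


def cramer_rate(x, p=0.5):
    """Closed-form Fenchel-Legendre transform of log_mgf for x in [-1, 1].

    It equals the relative entropy of Bernoulli((1 + x) / 2) with respect
    to Bernoulli(p).
    """
    q = (1.0 + x) / 2.0
    total = 0.0
    for a, b in ((q, p), (1.0 - q, 1.0 - p)):
        if a > 0.0:
            total += a * math.log(a / b)
    return total


def cramer_rate_numeric(x, p=0.5, lo=-20.0, hi=20.0, iters=200):
    """Numerical sup over lam of lam * x - log_mgf(lam), by ternary search.

    The objective is concave in lam, so ternary search on [lo, hi] applies.
    """
    def objective(lam):
        return lam * x - log_mgf(lam, p)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective((lo + hi) / 2.0)


def vector_rate(x, alpha, p=0.5):
    """Assumed rate function: sum over subgroups of alpha_k * cramer_rate(x_k)."""
    return sum(a * cramer_rate(xk, p) for a, xk in zip(alpha, x))
```

For example, `vector_rate([0.0, 0.0], [0.4, 0.6])` vanishes because both coordinates sit at the common mean, while any other point of the cube has a strictly positive rate.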

2.3. The -Valued Case

In what follows we will consider the large deviations principle for the case of a sequence of independent and identically distributed (i.i.d.) -valued random variables . Here . As before, we denote by , a probability measure on , the common law of the i.i.d. sequence . Here again, we let and Further, we assume that In what follows we will write for every , as In view of the above, we will write the inner product for any pair as follows: We define the -valued vectors and . The large deviations principle for the vector for the case is summarized in the result below.

Theorem 3. Let be defined as above. Then satisfies a large deviations principle with rate and rate function given by

2.4. Large Deviations Principle for a Vector of Empirical Measures

In this section we consider the large deviations principle for a vector of empirical measures of the subgroups when the sequence of i.i.d. random variables takes finitely many values. Specifically, let be a finite set and be an independent and identically distributed sequence of -valued random variables with common law . Here , for any . The empirical measure for the random variables indexed by is given byHere we have put , the indicator function for Note that is the relative frequency of among the random sample .

The vector of empirical measures is then given by Note that is a vector whose coordinates are probability measures on . To see the connection between our vector of empirical measures and the vector of empirical means we considered earlier, we introduce the following sequence of random variables: for positive integer , define Then the sequence is an i.i.d. sequence of -valued random variables with the property that Thus the empirical mean of the -sequence is the same as the empirical measure of the -sequence and the vector of empirical means associated with the -sequence coincides with the vector of empirical measures of the -sequence.

For every , note thatFurther, for any probability measure on , the relative entropy of relative to is given bySuppose is the set of all probability measures on . Then and For any , it can be written in the form , where , for any . The following result is the large deviations principle for the vector of empirical measures .

Theorem 4. The sequence of vectors of empirical measures satisfies a large deviations principle with rate and rate function for every . In particular,
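For a finite state space, the rate function in Theorem 4 can be evaluated directly. The following sketch assumes, as the abstract suggests, that it is the convex combination of the relative entropies of the coordinate measures with respect to the common law, weighted by the limiting subgroup fractions. The function names and the representation of measures as lists of probabilities are hypothetical choices made for illustration.

```python
import math


def relative_entropy(nu, mu):
    """Relative entropy H(nu | mu) of two probability vectors on a finite set.

    Uses the conventions 0 * log 0 = 0 and H = +infinity when nu charges a
    point that mu does not.
    """
    total = 0.0
    for n, m in zip(nu, mu):
        if n > 0.0:
            if m == 0.0:
                return math.inf
            total += n * math.log(n / m)
    return total


def empirical_measure_rate(nus, alpha, mu):
    """Assumed rate function: sum over subgroups of alpha_k * H(nu_k | mu).

    Subgroups with zero weight are skipped so that 0 * infinity never occurs.
    """
    return sum(a * relative_entropy(nu, mu) for a, nu in zip(alpha, nus) if a > 0.0)
```

With `mu` uniform on three points, `relative_entropy([0.5, 0.25, 0.25], mu)` equals log 3 minus the Shannon entropy of the first argument, and the rate vanishes exactly when every coordinate measure coincides with `mu`.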

2.5. Varadhan’s Integral Lemma

In this section we consider Varadhan’s integral lemma for a sequence of equicontinuous functions. Here we put .

Theorem 5. Suppose satisfies a large deviation principle with rate and a good rate function . Further, let the sequence of functions be equicontinuous and converge pointwise to a function . Assume either the tail condition or the following moment generating condition for some , Then
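The conclusion of Theorem 5 can be checked numerically in the simplest setting of a single population of fair ±1 variables, where the law of the empirical mean is an explicit binomial distribution. The sketch below is only an illustration under assumptions not taken from the paper: the limit function `F`, the perturbed family `F_n` (equicontinuous and converging pointwise to `F`), and the parameters `J` and `h` are hypothetical choices.

```python
import math


def rate(x):
    """Cramér rate function of a fair +/-1 variable, for x in [-1, 1]."""
    total = 0.0
    for q in ((1.0 + x) / 2.0, (1.0 - x) / 2.0):
        if q > 0.0:
            total += q * math.log(2.0 * q)
    return total


def F(x, J=1.5, h=0.1):
    """Hypothetical limit function on [-1, 1]."""
    return 0.5 * J * x * x + h * x


def F_n(x, n, J=1.5, h=0.1):
    """Hypothetical equicontinuous family converging pointwise to F."""
    return F(x, J, h) + x / n


def log_expectation(n):
    """(1/n) log E[exp(n F_n(S_n / n))] for n fair +/-1 variables.

    Computed by exact summation over the binomial law, with a log-sum-exp
    step to avoid overflow.
    """
    terms = []
    for k in range(n + 1):
        x = (2 * k - n) / n
        log_prob = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1) - n * math.log(2.0))
        terms.append(log_prob + n * F_n(x, n))
    m = max(terms)
    return (m + math.log(sum(math.exp(t - m) for t in terms))) / n


def variational_value(grid=20001):
    """Grid approximation of sup over x in [-1, 1] of F(x) - rate(x)."""
    points = (-1.0 + 2.0 * i / (grid - 1) for i in range(grid))
    return max(F(x) - rate(x) for x in points)
```

Comparing `log_expectation(n)` for increasing `n` with `variational_value()` shows the two quantities approaching each other, with a gap of order 1/n in this example.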

3. Applications

Let us now introduce the models that motivated the large deviations questions addressed in this paper. We are interested in a model that captures how individual decisions or choices are influenced by the choices of the rest of the people in their reference group. Additionally, individuals have attributes, such as gender, place of residence, level of education, and ethnicity, that also influence their decisions. The Curie-Weiss case introduced in [10] is discussed first and the corresponding mean-field Potts’ case will be discussed after that. The method discussed here could also apply to continuous spin models with compact support. In particular, it will apply to the mean-field versions of the following models: the model [11], the spherical model [12, 13], the liquid crystal model [11, 14, 15], and the Kuramoto model [16]. The detailed analysis of continuous spin models and their phase diagrams will be carried out in a future paper. We present the results for the Curie-Weiss and Potts’ models here because we have already studied the thermodynamic limits of these models in [17, 18].

3.1. Multipopulation Curie-Weiss Model

Suppose each individual in a population of agents chooses a binary action, such as voting YES or NO on some issue at some common time. This binary action is coded by withThe choices made by all the individuals are also coded by The level of satisfaction of the population for deciding on is given by the Curie-Weiss Hamiltonian

The function on the configurations represents the utility of individuals as a result of their choices and the influences on them while making those choices [19]. measures the level of satisfaction of the entire population for making the choice . The higher the value of , the higher the level of satisfaction of the population. It has two parts: the first part models the social incentive of individuals in the population and the second part models the private incentive of individuals. Here measures the influence of individual on individual . When is positive, conformity or imitation is rewarded; when is negative, conformity is not rewarded. controls the part of the utility that is specific to individual .

Next we reparametrize the parameters and as follows: suppose that each individual in the reference population has attributes . For instance, suppose that attributes 1 and 2 are, respectively, employment and marital status , thenand

Therefore, with respect to the attributes, the reference group can be partitioned into nonoverlapping subgroups. Members in a given subgroup share the same attributes and it is therefore reasonable to assume that they also behave the same way. In view of this, we shall assume in what follows that , for all choices of coming from subgroup and all taken from subgroup . Further, we assume that for all individuals in group . In the sequel we will let be the set of individuals in subgroup , for , with =. Therefore, . Note that with , for . Define for every Note that is the proportion of the individuals found in subgroup and is the magnetization of the spins in subgroup .

Therefore, it follows from (25) and the above parametrization of and that Note that if , we get the original Curie-Weiss Hamiltonian. For the case , we end up with Curie-Weiss models on the subgroups , that are interacting with one another. Here we have subgroups because we have attributes that are binary. We could allow the attributes to have any finite number of alternatives and the alternatives for the attributes need not come from the same set. Therefore, in what follows we will assume there are subgroups and that the Hamiltonian takes the form The Hamiltonian in (30) consists of one-body and two-body interactions. In what follows we will extend the number of bodies in the interaction to range from 1 to , where is the number of subgroups [10]. We consider a Hamiltonian of the form Note from (30) and (31) that for , and for , . The are interaction coefficients associated with the -body interaction among individuals coming from the subgroups , respectively. Thus the interaction is defined with the help of a tensor of rank for each of the -body interactions [10].

Further, we assume that there is a probability measure on the set , such that andNote that the model we consider here is more general than the cases in [8, 10], in that for any finite the fractions of the subgroups are dependent on . In [8, 10] these fractions are chosen to be independent of . This simplifies the proofs, especially the existence of the thermodynamic limit.

In the sequel we will use the following notation: let Note that . Further, for every positive integer , define a map as Note that the ’s are uniformly bounded by It is also clear that Since the maps are continuous for every , and is a compact subset of , it follows from Theorems 7.13 and 7.24 of [20] that the sequence is equicontinuous. Further, the Hamiltonian in (31) becomes

Suppose the spins form an independent and identically distributed sequence of random variables with We denote by the corresponding product measure on . The equilibrium state associated with the Hamiltonian in (31) is given by where is the partition function of the model and is the law of the vector of empirical means under . In (38) we have used (35). In what follows , , and shall be as follows:

The pressure function of the model is then given byThe large behaviour of the model is governed by the pressure function. It is known from [18] that the thermodynamic limitexists. The proof of the case was earlier given in [8, 10].

Theorem 6. For any choice of the parameters , , and , the limiting pressure admits the following variational representation: where and is given in (34).

Proof. The proof of this theorem follows from (38), (40), and (41) and Theorems 2 and 5 upon setting , , , , , and , and noting that the ’s form an equicontinuous family and they are uniformly bounded.
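A finite-size check of a variational representation of this type can be sketched for two subgroups with one-body and two-body interactions. Everything specific in the code below is an assumption made for illustration and not the paper's definition: fair ±1 spins as the a priori law, the Gibbs weight exp(H), the Hamiltonian H = N[(1/2) Σ α_k α_l J_kl m_k m_l + Σ α_k h_k m_k] in terms of the subgroup magnetizations m_k and fractions α_k, and the limiting pressure as the supremum of this energy density minus Σ α_k I(m_k), with I the Cramér rate function of a fair spin. The names `energy_density`, `finite_pressure`, and `variational_pressure` are hypothetical.

```python
import math


def rate(x):
    """Cramér rate function of a fair +/-1 spin, for x in [-1, 1]."""
    total = 0.0
    for q in ((1.0 + x) / 2.0, (1.0 - x) / 2.0):
        if q > 0.0:
            total += q * math.log(2.0 * q)
    return total


def log_binom_prob(n, k):
    """log P(k of n fair spins are +1)."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1)
            - math.lgamma(n - k + 1) - n * math.log(2.0))


def energy_density(x, alpha, J, h):
    """Assumed H / N as a function of the subgroup magnetizations x."""
    quad = 0.5 * sum(alpha[k] * alpha[l] * J[k][l] * x[k] * x[l]
                     for k in range(2) for l in range(2))
    lin = sum(alpha[k] * h[k] * x[k] for k in range(2))
    return quad + lin


def finite_pressure(sizes, J, h):
    """(1/N) log E[exp(H)] by exact summation over both magnetizations."""
    n1, n2 = sizes
    N = n1 + n2
    alpha = [n1 / N, n2 / N]
    terms = []
    for k1 in range(n1 + 1):
        for k2 in range(n2 + 1):
            x = ((2 * k1 - n1) / n1, (2 * k2 - n2) / n2)
            terms.append(log_binom_prob(n1, k1) + log_binom_prob(n2, k2)
                         + N * energy_density(x, alpha, J, h))
    m = max(terms)
    return (m + math.log(sum(math.exp(t - m) for t in terms))) / N


def variational_pressure(alpha, J, h, grid=401):
    """Grid approximation of sup_x [energy_density(x) - sum_k alpha_k rate(x_k)]."""
    pts = [-1.0 + 2.0 * i / (grid - 1) for i in range(grid)]
    return max(energy_density((a, b), alpha, J, h)
               - alpha[0] * rate(a) - alpha[1] * rate(b)
               for a in pts for b in pts)
```

For instance, with `J = [[1.2, 0.4], [0.4, 0.8]]`, `h = [0.1, -0.05]`, and subgroup sizes `(240, 360)`, the exact finite-size pressure and the grid value of the variational expression agree to within a few multiples of 1/N.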

3.2. Multipopulation Mean-Field Potts’ Model

Suppose this time that the individuals in the population of agents choose from a finite number of alternatives, say various alternatives of employment, at some common time. This discrete choice action of individual is coded by , where , with The choices made by all the individuals are also coded by The level of satisfaction of the population for deciding on is given by the mean-field Potts’ Hamiltonian Here is the Dirac-delta measure. The function on the configurations represents the utility of individuals as a result of their choices and the influences on them while making a decision [19]. and its parameters have the usual interpretation given for the Curie-Weiss model.

Using the partition of , we define for every Thereforeis the empirical measure for the choices of the individuals in subgroup . It is an element in , the set of probability measures on . Further,is the vector of empirical measures. Note that and is the proportion of the individuals found in subgroup .

Therefore, it follows from (45) and the above parametrization of and that In the second equation above we have used that Note that if , we get the original mean-field Potts’ Hamiltonian. For the case , we end up with mean-field Potts’ models on the subgroups , that are interacting with one another. Here we have subgroups because we have attributes that are binary. We could allow the attributes to have any finite number of alternatives and the alternatives for the attributes need not come from the same set. Therefore, in what follows we will assume there are subgroups and this gives rise to the Hamiltonian The Hamiltonian in (51) consists of one-body and two-body interactions. In what follows we will extend the number of bodies in the interaction to range from 1 to , where is the number of subgroups. We consider a Hamiltonian of the form In the above we have used that

Note from (51) and (52) that for , and . The are interaction coefficients associated with the -body interaction among individuals coming from subgroups , respectively. Thus the interaction is defined with the help of a tensor of rank for each of the -body interactions. The model considered here is the mean-field Potts’ version of the Curie-Weiss model considered in [8, 10, 17]. Further, for any finite the fractions of the subgroups are dependent on . In [8, 10] these fractions were chosen to be independent of , which simplified the proofs, especially the proof of the existence of the thermodynamic limit.

Note that . Further, for every positive integer , define a map as Note that the ’s are uniformly bounded by It is also clear that Since the maps are continuous for every , and is a compact subset of , it follows from Theorems 7.13 and 7.24 of [20] that the sequence is equicontinuous. Further, the Hamiltonian in (52) becomes

Suppose the spins form an independent and identically distributed sequence of -valued random variables with We denote by the corresponding product measure on . The equilibrium state associated with the Hamiltonian in (52) is given by where is the partition function of the model. Here is the law of the vector of empirical measures . The pressure function of the model is then given by It follows from [17] that the thermodynamic limit exists.

The limiting pressure admits the following variational formula representation.

Theorem 7. For any choice of the parameters , , and , the limiting pressure admits the following variational representation:where is the relative entropy of with respect to and is given in (55).

Proof. The proof of this theorem follows from (59) to (61) and Theorems 4 and 5 upon setting , , , , , and , and noting that the ’s form an equicontinuous family and they are uniformly bounded.

4. Proofs

The proofs of the results of this paper are given in this section. In the proofs below we will use the following properties of the functions and . For a proof of these properties we refer the reader to the proof of Lemma 2.2.5 of [21].

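Lemma 8. Write $\Lambda$ for the logarithmic moment generating function, $\Lambda^*$ for its Fenchel-Legendre transform, and $\bar{x}$ for the mean of the common law. (1) $\Lambda$ is a convex function and $\Lambda^*$ is a convex rate function. (2) If $\Lambda$ is only finite at $\lambda = 0$, then $\Lambda^*$ is identically zero. If $\Lambda(\lambda) < \infty$ for some $\lambda > 0$, then $\bar{x} < \infty$ (possibly $\bar{x} = -\infty$), and for all $x \ge \bar{x}$, $$\Lambda^*(x) = \sup_{\lambda \ge 0}\,[\lambda x - \Lambda(\lambda)]$$ is, for $x > \bar{x}$, a nondecreasing function. Similarly, if $\Lambda(\lambda) < \infty$ for some $\lambda < 0$, then $\bar{x} > -\infty$ (possibly $\bar{x} = \infty$), and for all $x \le \bar{x}$, $$\Lambda^*(x) = \sup_{\lambda \le 0}\,[\lambda x - \Lambda(\lambda)]$$ is, for $x < \bar{x}$, a nonincreasing function. When $\bar{x}$ is finite, $\Lambda^*(\bar{x}) = 0$, and always $\inf_{x} \Lambda^*(x) = 0$.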

4.1. Proof of Theorem 2

Proof. The proof comes in two steps. In step one we establish a large deviations upper bound, and the corresponding lower bound is proved in step two. The proof is an adaptation of arguments used to prove the large deviations principle for the empirical mean of an i.i.d. sequence of -valued random vectors.
Step 1. Let be the intended rate function of the problem and define the -rate function as follows: The proof of the upper bound will follow if we can show for every and every closed subset of that We will first prove this inequality for compact subsets and then extend it to closed subsets with the help of an exponential tightness argument. Suppose is a compact subset of . Then for every and , we can find a such that It follows from the definition of that such a exists. For any , define Therefore if , then . For each , choose such that and define where Note that for any since for each Therefore it follows from the exponential Chebycheff inequality that The second equality uses that the sequence is i.i.d. and the second inequality uses that . Since is compact, it has a finite covering consisting of open balls centred at . It follows from the subadditivity property of probability measures and the choice of ’s that Therefore The equality above uses that the quantity is finite, by the finiteness assumption on , which is a concave function of and consequently it is continuous in .
To extend the proof of the above to all closed subsets of , we need to establish exponential tightness of the measures . Let and define . Then and where is the law of the th coordinate of , i.e., the law of . Thus . Therefore for any Therefore, for every , it follows from (64) that Similarly, using (65) we get that It follows from the fact that , for every , and Lemma 2.2.20 of [21] that and Therefore it follows from (78), (79), and (80) that This then implies that the sequence of measures is exponentially tight. Hence the upper bound (76) holds for every closed set of .
Step 2. Suppose and for some , , for any . Define, for any , a probability measure on , as and let be an independent sequence of random variables with the law of being if . Let be the corresponding law for Then This then implies that Therefore where For any , let be such that Then and by the dominated convergence theorem Therefore by the weak law of large numbers, for all , since the th component of converges in probability to Since we have that and by the definition of Hence Therefore Next, suppose that , but there is no such that for all Here we change the common law of the sequence by adding a small normal random variable so that the perturbed will admit logarithmic moment generating function , for which we can find an that satisfies , for every With this we get a lower bound for the large deviations principle, which is later used to deduce the lower bound for the unperturbed case. Formally, let be an i.i.d. sequence given by where is a sequence of independent and identically distributed standard normal random variables that are independent of the sequence. The quantity . Let be the common law of the ’s and , where Then Therefore Recall that Hence by Jensen’s inequality we have that , for all Therefore the function is finite, differentiable and it satisfies Therefore g() attains its supremum over at some , for which This implies that Therefore by the preceding proof we have a large deviations lower bound for the sequence of probability measures , i.e., for every , Note that the process has the same distribution as , where is the standard multivariate normal random vector. Therefore, in the distributional sense and using that we have Therefore Thus

4.2. Proof of Theorem 3

Proof. The proof of this theorem follows from that of the -valued case upon making appropriate substitutions. For instance, every , , , and in the proof of the valued case should be replaced with , , , and , respectively. In particular, in establishing the exponential tightness of the ’s we use that for any , we define . Then andwhere is the th coordinate of the th-component of and is the law of the empirical mean Here is the th coordinate of .

4.3. Proof of Theorem 4

Proof. The proof follows from the large deviations principle for the vector of empirical means of -valued random vectors restricted to probability vectors in . Recall from (17) that is a finite, differentiable convex function. Therefore the map is concave, and the supremum of this map is attained at the value of such that Let . Here is the th component of the vector Then for every and it follows from (109) that Substituting this form of into leads to

5. Proof of Theorem 5

Proof. The proof comes in three steps. In step one we prove a lower bound for (23). Using condition (21), we prove an upper bound for (23) in step two. Step three shows that condition (22) implies condition (21), which completes the proof.
Step One. Fix and . Due to the equicontinuity of the sequence we have that the functions are lower semicontinuous. Thus, there exists a neighbourhood of such that We then have that We get from here that Since and were arbitrarily chosen, we have that
Step Two. Suppose that the functions are and uniformly bounded; i.e., there is a constant , such that . Then condition (21) holds. Let and and define Then is a compact level set since is a good rate function. It follows from the lower semicontinuity property of , the upper semicontinuity property of , and the fact that is a regular topological space that for every , there is a neighbourhood of such that Since is compact, there are finitely many points such that the neighbourhoods cover . Therefore The last inequality follows from (116). Note that and Since the ’s are bounded above, so is the limiting . The upper bound of interest follows upon taking the limits and
If the ’s are not uniformly bounded above, we set . Then ’s are uniformly bounded above and using the arguments for the above proof we get that The result for this case follows from the tail condition (21) as we let .
Step Three. In this step we show that condition (22) implies the tail condition (21). In this regard we let , , and define . Then It follows from here that Therefore condition (22) implies condition (21) upon taking limit.

6. Conclusion

This paper has developed large deviations machinery for the empirical means and measures for partitions of an independent and identically distributed sequence of random variables. The large deviations results are further applied to derive the limiting free energy for multipopulation Curie-Weiss and Potts’ models. The method proposed here can be applied to multipopulation versions of spin models with continuous spins such as the model [11], the spherical model [12, 13], the liquid crystal model [11, 14, 15], and the Kuramoto model [16]. The multipopulation Potts’ model may have applications in discrete choice contexts with more than two alternatives to choose from. This serves as a natural extension to the Ising cases considered in [8–10, 22, 23].

The knowledge gained from the study of the minimizers of the associated minimization problem that leads to the limiting pressure will offer insight into the scaling limit behaviour of the empirical measures associated with the multipopulation Potts’ model. This will be a natural extension of the work in [24].

Data Availability

This paper uses no data.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors thank the staff at the Mathematics Department of University of Energy and Natural Resources for their support and kindness during the period in which this paper was written.