Abstract
We present a class of spherically symmetric random variables defined by the property that as dimension increases to infinity the mass becomes concentrated in a hyperspherical shell, the width of which is negligible compared to its radius. We provide a sufficient condition for this property in terms of the functional form of the density and then show that the property carries through to equivalent elliptically symmetric distributions, provided that the contours are not too eccentric, in a sense which we make precise. Individual components of such distributions possess a number of appealing Gaussian-like limit properties, in particular that the limiting one-dimensional marginal distribution along any component is Gaussian.
1. Introduction
Any spherically symmetric random variable can be represented as a mixture of spherical "shells" with distribution function proportional to. We consider a class of spherically symmetric random variables for which as dimension the effective range of the mixture of "shells" becomes infinitesimal relative to a typical scale from the mixture. We then generalise this class to include a subset of the corresponding elliptically symmetric random variables. This offers a relatively rich class of random variables, the components of which are shown to possess appealing Gaussian-like limit properties.
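The shell decomposition can be sketched numerically: a spherically symmetric draw is a uniform direction scaled by an independent radius from the mixing distribution. The chi radial law below is an illustrative choice (it recovers the standard Gaussian), not an assumption of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def spherical_sample(radial_sampler, d, n):
    """Draw n samples of R * U: U uniform on the unit (d-1)-sphere,
    R an independent radius from the mixing (shell) distribution."""
    z = rng.standard_normal((n, d))
    u = z / np.linalg.norm(z, axis=1, keepdims=True)  # uniform direction
    return radial_sampler(n)[:, None] * u             # scale by shell radius

# Illustrative radial law: chi with d degrees of freedom, which recovers
# a standard d-dimensional Gaussian.
d = 50
x = spherical_sample(lambda n: np.sqrt(rng.chisquare(d, size=n)), d, 20000)
print(x[:, 0].std())  # each component is approximately N(0, 1)
```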
Specifically we consider sequences of spherically symmetric random variables which satisfy for some positive sequence . Here and throughout this paper refers to the Euclidean norm. The set of such sequences includes, for instance, the sequence of standard -dimensional Gaussians, for which ; indeed the Gaussian-like limit properties of the whole class arise from this fact. More generally, we provide a sufficient condition for (1.2) for sequences of random variables with densities of the form
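The Gaussian case can be checked directly; in the simulation below (dimensions and sample sizes are arbitrary choices) the norm of a standard d-dimensional Gaussian concentrates at the scale sqrt(d), with relative spread shrinking as d grows.

```python
import numpy as np

rng = np.random.default_rng(1)

means, spread = {}, {}
for d in (10, 100, 1000):
    # ratio of the norm to its typical scale k_d = sqrt(d)
    ratio = np.linalg.norm(rng.standard_normal((5000, d)), axis=1) / np.sqrt(d)
    means[d] = float(ratio.mean())
    spread[d] = float(ratio.std())
    print(d, means[d], spread[d])  # mean -> 1, spread -> 0
```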
We then consider elliptically symmetric random variables, which are obtained by a sequence of (potentially) random linear transformations of spherically symmetric random variables satisfying either (1.1) or (1.2) and show that the properties (1.1) and (1.2) are unaffected by the transformation provided that the eccentricity of the elliptically symmetric random variable is not too extreme, in a sense which we make precise. Finally we show Gaussian-like limiting behaviour for individual components of a random variable from this class, both in terms of their marginal distribution, and in terms of their maximum.
Section 2 presents the main results, which are briefly summarised and placed in context in Section 3; proofs are provided in Section 4.
2. Results
Our first result provides a class of densities and associated scaling constants that satisfy (1.2).
Theorem 2.1. Let be a sequence of spherically symmetric random variables with density given by (1.3). Let satisfy and let be a solution to Then there is a sequence of solutions which satisfies , where is unique for sufficiently large . Elements of this sequence and together satisfy (1.2).
The class of interest therefore includes the exponential power family, which has densities proportional to , and ; indeed the class includes any density with polynomial exponents.
Heuristically, the mass of must concentrate around a particular radius, , so that the effective width of the support becomes negligible compared to as . Essentially (2.2) ensures that is at least a local mode of the density of , and (2.1) together with the existence of a sequence of solutions forces the curvature (compared to the scale of ) of the log-density of at this sequence of modes to increase without bound.
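This heuristic can be sketched numerically for an exponential-power-type radial form r^(d-1) exp(-r^beta); the values beta = 4 and unit scale below are illustrative assumptions for the sketch, not taken from the theorem. The mode of the radial density sits at ((d-1)/beta)^(1/beta), and the relative width of the shell around it shrinks like 1/sqrt(beta d).

```python
import numpy as np

rng = np.random.default_rng(2)

beta, spread = 4.0, {}
for d in (10, 100, 1000):
    mode = ((d - 1) / beta) ** (1 / beta)     # stationary point of the log-density
    grid = np.linspace(1e-6, 2 * mode, 20000)
    logpdf = (d - 1) * np.log(grid) - grid ** beta
    pdf = np.exp(logpdf - logpdf.max())
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]
    r = np.interp(rng.uniform(size=20000), cdf, grid)  # inverse-transform sampling
    spread[d] = float(r.std() / r.mean())
    print(d, spread[d])  # relative width of the shell shrinks with d
```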
Condition (2.1) fails for densities where the radial mass does not become concentrated, such as the log-normal, . To see this explicitly for the log-normal form for the density of , note that the marginal radial density, that is, the density of , is proportional to which is itself a log-normal density with parameters . Taking we therefore find that for all
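The failure in the log-normal case can be seen in a small simulation (the growth rate of the location parameter with d below is an arbitrary choice; only the fixed shape parameter matters): the relative spread of the radius is the same for every d, so the mass never concentrates in a thin shell.

```python
import numpy as np

rng = np.random.default_rng(3)

sigma, iqr_ratio = 0.5, {}
for d in (10, 100, 1000):
    # location grows with d (arbitrary rate); the shape parameter is fixed
    r = rng.lognormal(mean=np.log(d), sigma=sigma, size=20000)
    q25, q75 = np.percentile(r, [25, 75])
    iqr_ratio[d] = float(q75 / q25)
    print(d, iqr_ratio[d])  # interquartile ratio is constant in d
```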
Theorem 2.1 requires ; however, other functional forms can also lead to the desired convergence, although not necessarily with . For example, if then the marginal radial density is proportional to ; trivially, in this case, the mass therefore concentrates around as .
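One familiar instance of this alternative behaviour, included purely as an assumed illustration, is the uniform distribution on the unit d-ball: its radial density is proportional to r^(d-1) on [0, 1], so the radius can be sampled as V^(1/d) for V ~ Uniform(0, 1), and the mass piles up against the boundary as d grows.

```python
import numpy as np

rng = np.random.default_rng(4)

frac = {}
for d in (5, 50, 500):
    r = rng.uniform(size=50000) ** (1.0 / d)  # radius of a uniform draw from the unit d-ball
    frac[d] = float(np.mean(r > 0.99))
    print(d, frac[d])  # fraction of mass within 1% of the boundary
```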
We next show that (1.1) and (1.2) continue to hold after a linear transformation is applied to each , providing that the resulting sequence of elliptically symmetric random variables is not too eccentric.
Theorem 2.2. Let be a sequence of spherically symmetric random variables and a sequence of positive constants. Further let be a sequence of random linear maps on which are independent of . Denote the eigenvalues of by , and set . If then
The class of elliptically symmetric random variables therefore includes, for example, densities of the form , for symmetric for which the sum of the eigenvalues is much larger than their maximum.
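The eccentricity condition can be sketched numerically (the two spectra below are invented for illustration): concentration of the norm of the transformed variable survives when no eigenvalue dominates the sum, and fails when a single eigenvalue carries most of the mass.

```python
import numpy as np

rng = np.random.default_rng(5)

def rel_spread(scales, n=5000):
    """Relative spread of ||S X|| for standard Gaussian X and S = diag(scales)."""
    y = rng.standard_normal((n, len(scales))) * scales
    norms = np.linalg.norm(y, axis=1)
    return float(norms.std() / norms.mean())

d = 2000
mild = np.linspace(1.0, 2.0, d)   # eigenvalue sum dwarfs the maximum
extreme = np.ones(d)
extreme[0] = 200.0                # one eigenvalue dominates the sum
s_mild, s_extreme = rel_spread(mild), rel_spread(extreme)
print(s_mild, s_extreme)          # small vs order-one relative spread
```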
Our final theorem demonstrates that even if the weaker condition (1.1) is satisfied by a spherically symmetric sequence, then any limiting one-dimensional marginal distribution is Gaussian; it also provides a slightly weaker result for elliptically symmetric sequences as well as a limiting bound on the maximum of all of the components.
Theorem 2.3. Let the sequence of spherically symmetric random variables and the sequence of positive constants satisfy (1.1), and let the sequence of -dimensional linear maps, , satisfy (2.5).
(1) For any sequence of unit vectors , which may be random, but is independent of ,
(2) For any sequence of random unit vectors , with uniformly distributed on the surface of a unit -sphere and independent of and ,
(3) Denote the component of as . Then
It should be noted that the first part of Theorem 2.3 is not simply a standard consequence of the central limit theorem. Rather it results from the fact that the standard -dimensional Gaussian satisfies condition (1.1), and hence any other sequence which satisfies (1.1) becomes in some sense βcloseβ to a -dimensional Gaussian as , close enough that the marginal one-dimensional distributions start to resemble each other.
The resemblance to a standard multivariate Gaussian is sufficient for a similar deterministic limit on the maximum of all of the components (Part 3); however, the well-known limiting Gumbel distribution for the maximum of a set of independent Gaussians (see Section 4.3) is not shared by all members of this class.
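A quick check of the marginal claim in Part 1, using the simplest member of the class as an illustrative choice: the uniform distribution on the sphere of radius sqrt(d), whose radius is exactly concentrated. A single component already matches the standard Gaussian CDF closely at moderate d.

```python
import math
import numpy as np

rng = np.random.default_rng(6)

d, n = 400, 25000
z = rng.standard_normal((n, d))
x = np.sqrt(d) * z / np.linalg.norm(z, axis=1, keepdims=True)  # uniform on the radius-sqrt(d) sphere

def phi(t):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

for t in (-1.0, 0.0, 1.0):
    print(t, float(np.mean(x[:, 0] < t)), phi(t))  # empirical vs N(0,1) CDF
```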
3. Discussion
It is well known (e.g., [1]) that any given spherically (or elliptically) symmetric random variable can be represented as a mixture of Gaussians; the marginal distribution of any given component is therefore also a mixture of Gaussians. The authors in [2] consider spherically symmetric distributions with support confined to the surface of a sphere and show that the limiting distribution of any fixed set of components, as the total number of components grows, is multivariate normal. Further, in [3] they show that for a sequence of independent and identically distributed components, the marginal one-dimensional distribution along all but a vanishingly small fraction of random unit vectors becomes closer and closer to Gaussian as dimension .
In a sense we have presented an intersection of these ideas: a class of spherical and elliptical distributions, which are not confined to a spherical or elliptical surface, but which become concentrated about the surface as , and for which the limiting marginal distribution is Gaussian, not a mixture. Moreover, the maximum component size is bounded in proportion to , in a similar manner to the maximum component size of a high-dimensional Gaussian. A sufficient condition for the functional form has been provided, and this is satisfied, for example, by the exponential power distribution.
The Gaussian-like limit properties are fundamental to results in [4, 5], where it is shown that if the proposal distribution for a random walk Metropolis algorithm is chosen from this class then some aspects of the behaviour of the algorithm can become deterministic and, in particular, that the optimal acceptance rate approaches a known fixed value as .
4. Proofs of Results
4.1. Proof of Theorem 2.1
It will be helpful to define and and to transform the problem to that of approximating a single integral: Here and elsewhere for clarity of exposition we sometimes omit the subscript, .
Theorem 2.1 is proved in three parts.
(i) We first show that, for (for some ), the density attains a unique maximum in for some fixed . We will denote the value at which this maximum occurs as . The required sequence of scalings will turn out to be .
(ii) Convexity arguments are then applied to show that
(iii) It is then shown that for any fixed
Applying this with and provides the required result.
4.1.1. Existence of a Unique Maximum in
Define . Clearly and ; also condition (2.1) is equivalent to Hence, we may define
Lemma 4.1. Subject to condition (4.4), such that for all there is a solution to the equation which is unique in . Moreover, .
Proof. For , . Let be the first positive integer greater than ; then clearly there is a solution to .
If there are two such solutions, and with , then we obtain a contradiction since, by the intermediate value theorem.
Next consider successive solutions, and for and again apply the intermediate value theorem.
for some , since . Therefore, and the sequence is monotonic and therefore must approach a limit. Suppose that this limit is finite, . Then, since is continuous, . This contradicts the fact that , hence .
4.1.2. Convergence in Probability
Lemma 4.2. Let be a sequence of spherically symmetric random variables with density given by (1.3). If and satisfies (2.1), then there is a sequence such that
In proving Lemma 4.2 we consider the log-density (up to a constant) of :
Note that condition (4.4) implies that as , and .
We now assume and consider the integral . This integral must be finite for all greater than some , since otherwise cannot be an infinite sequence of random variables. For a given , the area of integration is partitioned into five separate regions: (i); (ii); (iii); (iv); (v).
It will be convenient to define the respective integrals
Note that where is the density of . The required convergence in probability will therefore be proven if we can show that, by taking large enough, each of , and can be made arbitrarily small compared with either or .
The next three propositions arise from convexity arguments and will be applied repeatedly to bound certain ratios of integrals.
Proposition 4.3. Let have . For any ,
Proof. Define the interval if , and otherwise. By the concavity of , Hence, The result follows on evaluating the right-hand integral.
Proposition 4.4. Let have . For any with and ,
Proof. By the concavity of , Hence, Since is negative, the result follows on evaluating the right-hand integral.
The proof for the following is almost identical to that of Proposition 4.4 and is therefore omitted.
Proposition 4.5. Let have . For any with and ,
Corollary 4.6. One has
Proof. Set and in Propositions 4.3 and 4.4 to obtain But and so the result follows.
Corollary 4.7. One has
Proof. Define
By definition, , and . Let
and note that since , and with . Hence, exists.
Set and in Propositions 4.3 and 4.5 to obtain
But
and so the result follows.
We now consider and use the fact that must exist for all (for some ) for to be an infinite sequence of random variables. Also note that , which is an increasing function for .
Corollary 4.8. If for some and if for all , is an increasing function of , then
Proof. By the monotonicity of ,
By Proposition 4.3 with and
where the last statement follows since for , .
The result follows from combining the two inequalities.
We next combine Corollaries 4.6, 4.7, and 4.8 to prove the sufficient condition for the required convergence in probability. We show that if Condition (4.4) is satisfied, then
By Lemma 4.1, as , and so from Corollary 4.8
Since , given some and any , we may choose a such that, for all and all ,
Taylor expand about , recalling that and :
for some . From Corollary 4.6 we therefore see that
Similarly, from Corollary 4.7
But
and each of the terms on the right-hand side can be made as small as desired by taking large enough.
4.1.3. Convergence of th Moment
Proposition 4.9. Let be the (eventually) unique solution to the equation If satisfies (2.1) then for any fixed
Proof. Without loss of generality assume that . Hence, by the Intermediate Value Theorem, there exists a value such that Thus, and the result follows.
Lemma 4.10. For fixed ,
Proof. Set If satisfies (2.1) then so does . Therefore, from Lemma 4.2, given and there is a such that, for all , Furthermore, by Proposition 4.9, there is a such that, for all , Therefore, since the integrand is positive, for all , Similarly Applying Lemma 4.2 again, there is a such that, for all , Therefore, for all , Hence, The result follows since and can be made arbitrarily small.
4.2. Proof of Theorem 2.2
Any spherically symmetric random variable can be decomposed into a uniform angular component and a radial distribution. We may therefore create an invertible map from any -dimensional spherically symmetric random variable with a continuous radial distribution function to a standard -dimensional Gaussian, . We will apply the following map: set where and are the distribution functions of and , respectively, and then fix
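An empirical version of this quantile coupling can be sketched as follows (the Uniform(0, 1) radial law is an arbitrary test case, not from the paper): each radius is replaced by the matching quantile of the chi_d law, i.e. the Gaussian radial distribution, while the direction is kept fixed.

```python
import numpy as np

rng = np.random.default_rng(7)

def to_gaussian(x, n_ref=100000):
    """Empirical radial quantile coupling: replace each radius by the
    matching quantile of the chi_d (Gaussian radial) law, keeping the
    direction of each sample fixed."""
    n, d = x.shape
    r = np.linalg.norm(x, axis=1)
    ref = np.sort(np.linalg.norm(rng.standard_normal((n_ref, d)), axis=1))
    ranks = (np.argsort(np.argsort(r)) + 0.5) / n   # empirical CDF of the radii
    return x * (ref[(ranks * n_ref).astype(int)] / r)[:, None]

# Test case: radius Uniform(0, 1) with a uniform direction.
d, n = 20, 20000
z = rng.standard_normal((n, d))
u = z / np.linalg.norm(z, axis=1, keepdims=True)
x = rng.uniform(size=(n, 1)) * u
y = to_gaussian(x)
print(y[:, 0].std())  # components of the image are approximately N(0, 1)
```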
This mapping is key to the proofs of both Theorems 2.2 and 2.3. To simplify the exposition in both this section and Section 4.3 we define
The following is therefore equivalent to (2.6).
Lemma 4.11. Define , , , and as in (4.52) and the statement of Theorem 2.2. If (2.5) holds and 1, then
Proof. For some , let
For now fix and , and suppress the subscript . Denote the spectral decomposition of as , where . We will initially consider the Gaussian and define ; since is orthonormal, it follows that .
Define
Then, for fixed ,
Chebyshev's inequality gives
By (2.5) there is a such that, for all , . Thus, for all ,
Hence, . Now
and since each of the three terms converge in probability to 1, so does the product.
We now turn to the proof of convergence in mean square and first show an equivalence of the expected second moments of the norms.
Proposition 4.12. For , , , and to be as defined in (4.52) and the statement of Theorem 2.2,
Proof. For clarity of exposition we suppress the subscript . Since is spherically symmetric we may without loss of generality consider it with axes along the principal components of . Then But, again, is spherically symmetric so this is Turning now to convergence in mean square itself, note that, by Proposition 4.12, But (1.2) implies that , and hence it is sufficient to show that Now, by Lemma 4.11 and Proposition 4.12, We now require Scheffé's Lemma, which states that, for any sequence of random variables , if and , then . Hence . Now (1.2) also implies that , and hence, (4.64) is satisfied.
4.3. Proof of Theorem 2.3
Throughout this section we define and as in Section 4.2. We first prove Part 1.
Given , it will be convenient to define the following event:
Now, for independent of (and ), and so
For any event , and, in particular,
Given , by (1.1) we may define such that, for all , . Thus, for all ,
By taking large enough we can make and as small as desired. Moreover, since is bounded and monotonic, such that with , and hence
To prove Part 2, first note that, whereas , , and so
But a unit vector chosen uniformly at random can be written as for some standard -dimensional Gaussian . Hence, by Theorem 2.2,
We now define the event and the proof follows as for Part 1.
In proving Part 3 we require the following standard result (e.g., Theorem 1.5.3, [4]). Set
Also let be the distribution function of a Gumbel random variable, and let be independent and identically distributed random variables. Then
Replacing with or with () gives
Choose in (4.66) small enough that . Then
Similarly by choosing in (4.66) small enough that for some small ,
In each case the first term tends to 1 and , proving the desired result.
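The deterministic limit used in Part 3 can be seen numerically: for independent standard Gaussians, the maximum M_d drifts toward sqrt(2 log d) as d grows, with the Gumbel fluctuations of smaller order (the dimensions and replication counts below are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(8)

ratio = {}
for d in (100, 10000, 100000):
    m = rng.standard_normal((200, d)).max(axis=1)     # 200 replicates of M_d
    ratio[d] = float((m / np.sqrt(2 * np.log(d))).mean())
    print(d, ratio[d])  # drifts toward 1 as d grows
```

Convergence of the ratio is slow (of order log log d / log d), which is visible in the output even at d = 100000.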