We review the basic hypotheses which motivate the statistical framework used to analyze the cosmic microwave background, and how that framework can be enlarged as we relax those hypotheses. In particular, we try to separate as much as possible the questions of gaussianity, homogeneity, and isotropy from each other. We focus both on isotropic estimators of nongaussianity as well as statistically anisotropic estimators of gaussianity, giving particular emphasis on their signatures and the enhanced “cosmic variances” that become increasingly important as our putative Universe becomes less symmetric. After reviewing the formalism behind some simple model-independent tests, we discuss how these tests can be applied to CMB data when searching for large-scale “anomalies”.

1. Introduction

According to our current understanding of the Universe, the morphology of the cosmic microwave background (CMB) temperature field, as well as all cosmological structures that are now visible, like galaxies, clusters of galaxies, and the whole web of large-scale structure, are probably the descendants of quantum process that took place some 1035 seconds after the Big Bang. In the standard lore, the machinery responsible for these processes is termed cosmic inflation and, in general terms, what it means is that microscopic quantum fluctuations pervading the primordial Universe are stretched to what correspond, today, to cosmological scales (see [13] for comprehensive introductions to inflation.) These primordial perturbations serve as initial conditions for the process of structure formation, which enhance these initial perturbations through gravitational instability. The subsequent (classical) evolution of these instabilities preserves the main statistical features of these fluctuations that were inherited from their inflationary origin–provided, of course, that we restrain ourselves to linear perturbation theory.

However, given that matter has a natural tendency to cluster, and this inevitably leads to nonlinearities (not to mention the sorts of complications that come with baryonic physics), the structures which are visible today are far from ideal probes of those statistical properties. CMB photons, on the other hand, to an excellent approximation experience free streaming since the time of decoupling (𝑧1100) and are therefore exempt from these non-linearities (except, of course, for secondary anisotropies such as the Rees-Sciama effect or the Sunyaev-Zel'dovich effect), which implies that they constitute an ideal window to the physics of the early Universe—see, for example, [46]. In fact, we can determine the primary CMB anisotropies as well as most of the secondary anisotropies on large scales, such as the Integrated Sachs-Wolfe effect, completely in terms of the initial conditions by means of a linear kernel as follows:Θ(̂𝑛)Δ𝑇̂𝑛;𝜂0𝑇𝜂0=𝑑3𝑥𝜂00𝑑𝜂𝑖𝐾𝑖𝑥,𝜂𝑆;̂𝑛𝑖𝑥,𝜂,(1) where 𝜂 is conformal time, and 𝒮𝑖 denote the initial conditions of all matter and metric fields (as well as their time derivatives, if the initial conditions are nonadiabatic). Here 𝐾𝑖 is a linear kernel, or a retarded Green's function, that propagates the radiation field to the time and place of its detection, here on Earth. Since that kernel is insensitive to the statistical nature of the initial conditions (which can be thought of as constants which multiply the source terms), those properties are precisely transferred to the CMB temperature field Θ.

The statistical properties of the primordial fluctuations are, to lowest order in perturbation theory, quite simple; because the quantum fluctuations that get stretched and enhanced by inflation are basically harmonic oscillators in their ground state, the distribution of those fluctuations is Gaussian, with each mode an independent random variable. The Fourier modes of these fluctuations are characterized by random phases (corresponding to the random initial values of the oscillators), with zero mean, and variances which are given simply by the field mass and the mode's wavenumber 𝑘=2𝜋/𝜆. The presence of higher-order interactions (which exist even for free fields, because of gravity) changes this simple picture, introducing higher-order correlations which destroy gaussianity—even in the simplest scenario of inflation [79]. However, since these interactions are typically suppressed by powers of the factor 𝐺𝐻21012, where 𝐺 is Newton's constant and 𝐻 the Hubble parameter during inflation, the corrections are small—but, at least in principle, detectable [1012].

Since these statistical properties are a generic prediction of (essentially) all inflationary models, they can also be inferred from two ingredients that are usually assumed as a first approximation to our Universe. First, since inflation was designed to stretch our Universe until it became spatially homogeneous and isotropic, it is reasonable to expect that all statistical momenta of the CMB should be spatially homogeneous and rotationally invariant, regardless of their general form. Second, in linear perturbation theory [13] where we have a large number of cosmological fluctuations evolving independently, we can expect, based on the central limit theorem, that the Universe will obey a Gaussian distribution.

The power of this program lies, therefore, in its simplicity: if the Universe is indeed Gaussian, homogeneous, and statistically isotropic (SI), then essentially all the information about inflation and the linear (low redshift) evolution of the Universe is encoded in the variance, or two-point correlation function, of large-scale cosmological structures and/or the CMB. As it turns out, the five-year dataset from the Wilkinson Microwave anisotropy probe (WMAP) strongly supports these predictions [11, 14]. Moreover, the measurements of the CMB temperature power spectrum by the WMAP team, alongside measurements of the matter power spectrum from existing survey of galaxies [15, 16] and data from type Ia supernovae [1719], have shown remarkable consistency with a concordance model (ΛCDM), in which the cosmos figures as a Gaussian, spatially flat, approximately homogeneous, and statistically isotropic web of structures composed mainly of baryons, dark matter, and dark energy.

However, while the detection of a nearly scale-invariant and Gaussian spectrum is a powerful boost to the idea of inflation, just knowing the variance of the primordial fluctuations is not sufficient to single out which particular inflationary model was realized in our Universe. For that, we will need not only the 2-point function, but the higher momenta of the distribution as well. Therefore, in order to break this model degeneracy, we must go beyond the framework of the ΛCDM, Gaussian, spatially homogeneous, and statistically isotropic Universe.

Reconstructing our cosmic history, however, is not the only reason to explore further the statistical properties of the CMB. The full-sky temperature maps by WMAP [11, 20] have revealed the existence of a series of large-angle anomalies—which, incidentally, were (on hindsight) already visible in the lower-resolution COBE data [21]. These anomalies suggest that at least one of our cherished hypotheses underlying the standard cosmological model might be wrong—even as a first-order approximation. Perhaps the most intriguing anomalies (described in more detail in other review papers in this volume) are the low value of the quadrupole and its alignment of the quadrupole (=2) with the octupole (=3) [2227], the sphericity [26] (or lack of planarity [28]), of the multipole =5, and the north-south asymmetry [2933]. In the framework of the standard cosmological model, these are very unlikely statistical events, and yet the evidence that they exist in the real data (and are not artifacts of poorly subtracted extended foregrounds—e.g., [34]) is strong.

Concerning theoretical explanations, even though we have by now an arsenal of ad hoc models designed to account for the existence of these anomalies, none has yet quite succeeded in explaining their origin. Nevertheless, they all share the point of view that the detected anomalies might be related to a deviation of gaussianity and/or statistical isotropy.

In this paper, we will describe, first, how to characterize, from the point of view of the underlying spacetime symmetries, both non-gaussianity and statistical anisotropy. We will adopt two guiding principles. The first is that gaussianity and SI, being completely different properties of a random variable, should be treated separately, whenever possible or practical. Second, since there is only one type of gaussianity and SI but virtually infinite ways away from them, it is important to try to measure these deviations without a particular model or anomaly in mind–although we may eventually appeal to particular models as illustrations or as a means of comparison. This approach is not new and, although not usually mentioned explicitly, it has been adopted in a number of recent papers [35, 36].

One of the main motivations for this model-independent approach is the difficult issue of aprioristic statistics; one can only test the random nature of a process if it can be repeated a very large (formally, infinite) number of times. Since the CMB only changes on a timescale of tens of millions of years, waiting for our surface of last scattering to probe a different region of the Universe is not a practical proposition. Instead, we are stuck with one dataset (a sequence of apparently random numbers), which we can subject to any number of tests. Clearly, by sheer chance, about 30% of the tests will give a positive detection with 70% confidence level (CL), 10% will give a positive detection with 90% CL, and so on. With enough time, anyone can come up with detections of arbitrarily high significance—and ingenuity will surely accelerate this process. Hence, it would be useful to have a few guiding principles to inform and motivate our statistical tests, so that we do not end up shooting blindly at a finite number of fish in a small wheelbarrow.

This paper is divided into two parts. We start Part I by reviewing the basic statistical framework behind linear perturbation theory (Section 2). This serves as a motivation for Section 3, where we discuss the formal aspects of non-Gaussian and statistically isotropic models (Section 2.1), as well as Gaussian models of statistical anisotropy (Section 2.2). Part II is devoted to a discussion on model-independent cosmological tests of non-gaussianity and statistical anisotropy and their application to CMB data. We focus on two particular tests, namely, the multipole vectors statistics (Section 3) and functional modifications of the two-point correlation function (Section 4). After discussing how such tests are usually carried out when searching for anomalies in CMB data (Section 6.1), we present a new formalism which generalizes the standard procedure by including the ergodicity of cosmological data as a possible source of errors (Section 6.2). This formalism is illustrated in Section 7, where we carry a search of planar-type deviations of isotropy in CMB data. We then conclude in Section 8.

Part I: The Linearized Universe

2. General Structure

We start by defining the temperature fluctuation field. Since the background radiation is known to have an average temperature of 2.725 K, we are interested only in deviations from this value at a given direction ̂𝑛 in the CMB sky. So let us consider the dimensionless function on 𝑆2 as follows:Θ(̂𝑛)𝑇(̂𝑛)𝑇0𝑇0,(2) where 𝑇0=2.725 K is the blackbody temperature of the mean photon energy distribution—which, if homogeneity holds, is also equal to the ensemble average of the temperature.

In full generality, the fluctuation field is not only a function of the position vector 𝑛, but also of the time in which our measurements are taken. In practice, the time and displacement of measurements vary so slowly that we can ignore these dependences altogether. Therefore, we can equally well consider this function as one defined only on the unit radius sphere 𝑆2, for which the following decomposition holds:Θ(̂𝑛)=,𝑚𝑎𝑚𝑌𝑚(̂𝑛).(3) Since the spherical harmonics 𝑌𝑚(̂𝑛) obey the symmetry 𝑌𝑚(̂𝑛)=(1)𝑚𝑌,𝑚(̂𝑛), the fact that the temperature field is a real function implies the identity 𝑎𝑚=(1)𝑚𝑎,𝑚. This means that each temperature multipole is completely characterized by 2+1 real degrees of freedom.

2.1. From Inhomogeneities to Anisotropies: Linear Theory

The ultimate source of anisotropies in the Universe is the inhomogeneities in the baryon-photon fluid, as well as their associated spacetime metric fluctuations. If the photons were in perfect equilibrium with the baryons up to a sharply defined moment in time (the so-called instant recombination approximation), their distribution would have only one parameter (the equilibrium temperature at each point), so that photons flying off in any direction would have exactly the same energies. In that case, the photons we see today coming from a line-of-sight ̂𝑛 would reflect simply the density and gravitational potentials (the “sources”) at the position 𝑅̂𝑛, where 𝑅 is the radius to that (instantaneous) last scattering surface. Evidently, multiple scatterings at the epoch of recombination, combined with the fact that anisotropies themselves act as sources for more anisotropies, complicate this picture, and in general the relationship of the sources with the anisotropies must be calculated from either a set of Einstein-Boltzmann equations or, equivalently, from the line-of-sight integral equations coupled with the Einstein, continuity, and Euler equations [6].

Assuming for simplicity that recombination was instantaneous, at a time 𝜂𝑅, the linear kernels of (1) reduce to 𝐾𝑖(𝑥,𝜂;̂𝑛)𝛽𝑖𝛿(𝜂𝜂𝑅)𝛿(𝑥̂𝑛𝑅), where 𝑅=𝜂0𝜂𝑅 and 𝛽𝑖 are constant coefficients. The photon distribution that we measure on Earth would therefore be given byΘ(̂𝑛)𝑖0𝑥0200𝑑𝛽𝑖𝑆𝑖𝑥=̂𝑛𝑅,𝜂=𝜂𝑅.(4) We can also express this result in terms of the Fourier spectrum of the sources as follows:Θ(̂𝑛)𝑖0𝑥0200𝑑𝛽𝑖𝑑3𝑘(2𝜋)3𝑒𝑖𝑘̂𝑛𝑅𝑆𝑖𝑘,𝜂𝑅.(5) Now we can use what is usually referred to as “Rayleigh's expansion” (though Watson, in his classic book on Bessel functions, attributes this to Bauer, J. f. Math. LVI, 1859) as follows:𝑒𝑖𝑘𝑥=4𝜋𝑚𝑖𝑗(𝑘𝑥)𝑌𝑚̂𝑘𝑌𝑚(̂𝑥),(6) where 𝑗(𝑧) are the spherical Bessel functions. Substituting (6) into (5) we obtain that𝑎𝑚=𝑑2̂𝑛𝑌𝑚(𝑑̂𝑛)Θ(̂𝑛)3𝑘(2𝜋)3Θ𝑘×4𝜋𝑖𝑗(𝑘𝑅)𝑌𝑚̂𝑘,(7) where we have loosely collected the sources into the term Θ(𝑘)𝑖𝛽𝑖𝑆𝑖(𝑘,𝜂𝑅). This expression conveys well the simple relation between the Fourier modes and the spherical harmonic modes. Therefore, up to coefficients which are known given some background cosmology, the statistical properties of the harmonic coefficients 𝑎𝑚 are inherited from those of the Fourier modes Θ(𝑘) of the underlying matter and metric fields. Notice that the properties of the 𝑎𝑚s under rotations, on the other hand, have nothing to do with the statistical properties of the fluctuations; they come directly from the spherical harmonic functions 𝑌𝑚.

2.2. Statistics in Fourier Space

The characterization of the statistics of random variables is most commonly expressed in terms of the correlation functions. The two-point correlation function is the ensemble expectation value, 𝐶𝑘𝑘,Θ𝑘Θ𝑘.(8) In the absence of any symmetries, this would be a generic function of the arguments 𝑘 and 𝑘, with only two constraints: first, because Θ(𝑥) is a real function, Θ(𝑘)=Θ(𝑘), hence, in our definition 𝐶(𝑘𝑘,𝑘)=𝐶(𝑘,); second, due to the associative nature of the expectation value, 𝑘𝐶(𝑘,𝑘)=𝐶(,𝑘). It is obvious how to generalize this definition to 3, 4, or an arbitrary number of fields at different 𝑘s (or “points”).

Let us first discuss the issue of gaussianity. If we say that the variables Θ(𝑘) are Gaussian random numbers, then all the information that characterizes their distribution is contained in their two-point function 𝑘𝐶(𝑘,). The probability distribution function (pdf) is then formally given by 𝑃Θ𝑘𝑘,ΘΘ𝑘Θ𝑘exp𝑘2𝐶𝑘,.(9) In this case, all higher-order correlation functions are either zero (for odd numbers of points) or they are simply connected to the two-point function by means of Wick's Theorem as follows: Θ𝑘1Θ𝑘2𝑘Θ2𝑁G=𝑁𝑖,𝑗𝛼=1𝐵𝛼𝑖,𝑗Θ𝑘𝑖Θ𝑘𝑗,(10) where the sum runs over all permutations of the pairs of wave vectors and 𝐵𝑖,𝑗 are weights.

Second, let us consider the issue of homogeneity. A field is homogeneous if its expectation values (or averages) do not dependent on the spatial points where they are evaluated. In terms of the 𝑁-point functions in real space, we should have the following Θ𝑥1Θ𝑥2Θ𝑥𝑁Homog.𝐶𝑁𝑥1𝑥2,,𝑥𝑁1𝑥𝑁.(11) Writing this expression in terms of the Fourier modes, we get the followingΘ𝑥1Θ𝑥2Θ𝑥𝑁=𝑑3𝑘1𝑑3𝑘1𝑑3𝑘𝑁(2𝜋)3𝑁𝑒𝑘𝑖1𝑥1𝑒𝑘𝑖2𝑥2𝑒𝑘𝑖𝑁𝑥𝑁×Θ𝑘1Θ𝑘2𝑘Θ𝑁.(12) Homogeneity demands that the expression in (12) is a function of the distances between spatial points only, not of the points themselves. Hence, the expectation value in Fourier space on the right-hand side of this expression must be proportional to 𝑘𝛿(1+𝑘2𝑘++𝑁). In other words, the hypothesis of homogeneity constrains the 𝑁-point function in Fourier space to be of the following form: Θ𝑘1Θ𝑘2𝑘Θ𝑁H=(2𝜋)3𝑁𝑘1,𝑘2𝑘,,𝑁𝑘×𝛿1+𝑘2𝑘++𝑁.(13) Notice that the “(𝑁1)-spectrum” in Fourier space, 𝑁, can still be a function of the directions of the wavenumbers 𝑘𝑖 (it will be, in fact, a function of 𝑁1 such vectors, due to the global momentum conservation expressed by the 𝛿-function.) Models which realize the general idea of (13) correspond to homogeneous but anisotropic universes [3740].

There is a useful diagrammatic illustration for the 𝑁-point functions in Fourier space that enforce homogeneity. Notice that we could use the 𝛿-function in (13) to integrate out any one of the momenta 𝑘𝑖 in (12). Let us instead rewrite the 𝛿-functions in terms of triangles, so for the 4-point function we have 𝛿𝑘1+𝑘2+𝑘3+𝑘4=𝑑3𝑘𝑞𝛿1+𝑘2𝛿𝑘𝑞3+𝑘4,+𝑞(14) whereas for the 5-point function we have 𝛿𝑘1+𝑘2+𝑘3+𝑘4+𝑘5=𝑑3𝑞𝑑3𝑞𝛿𝑘1+𝑘2𝛿𝑘𝑞𝑞+3𝑞×𝛿𝑞+𝑘4+𝑘5,(15) and so on, so that the 𝑁-point 𝛿-function is reduced to 𝑁2 triangles with 𝑁3 “internal momenta” (the idea is nicely illustrated in Figure 1.) Substituting the expression for the 𝑁-point 𝛿-function into (12) and integrating out all external momenta but the first (𝑘1) and last (𝑘𝑁), the result is as followsΘ𝑥1Θ𝑥2Θ𝑥𝑁=1(2𝜋)3𝑁𝑑3𝑘1𝑑3𝑞1𝑑3𝑞𝑁3𝑑3𝑘𝑁×𝑒𝑖𝑘1(𝑥1𝑥2)𝑒𝑖𝑞1(𝑥2𝑥3)𝑒𝑖𝑞𝑁3(𝑥𝑁2𝑥𝑁1)×𝑒𝑖𝑘𝑁(𝑥𝑁1𝑥𝑁)Θ𝑘1Θ𝑞1𝑘1𝑘Θ𝑁.(16) This expression shows explicitly that the real-space 𝑁-point function above does not depend on any particular spatial point, only on the intervals between points.

Finally, what are the constraints imposed on the 𝑁-point functions that come from isotropy alone? Clearly, no dependence on the directions defined by the points, 𝑥𝑖𝑥𝑗, can arise in the final expression for the 𝑁-point functions in real space, so from (12) we see that the 𝑁-point function in Fourier space should depend only on the moduli of the wavenumbers—up to some momentum-conservation 𝛿-functions, which naturally carry vector degrees of freedom.

In this paper, we will mostly be concerned with tests of isotropy given homogeneity (but not necessarily Gaussianity), so in our case we will usually assume that the 𝑁-point function in Fourier space assumes the form given in (13).

2.3. Statistics in Harmonic Space

In the previous Section, we characterized the statistics of our field in Fourier space, which in most cases is most easily related to fundamental models such as inflation. Now we will change to harmonic representation, because that is what is most directly related to the observations of the CMB, Θ(̂𝑛), which are taken over the unit sphere 𝑆2. We will discuss mostly the two-point function here, and we defer a fuller discussion of 𝑁-point functions in harmonic space to Section 3.

From (7), we can start by taking the two-point function in harmonic space, and computing it in terms of the two-point function in Fourier space as follows 𝑎𝑚𝑎𝑚=𝑑3𝑘𝑑3𝑘(2𝜋)6(4𝜋)2𝑖(𝑖)𝑗(𝑘𝑅)𝑗𝑘𝑅𝑌𝑚̂𝑘×𝑌𝑚̂𝑘Θ𝑘Θ𝑘.(17) Under the hypothesis of homogeneity, this expression simplifies considerably, leading to 𝑎𝑚𝑎𝑚H=𝑑3𝑘2𝜋𝑖(𝑖)𝑗(𝑘𝑅)𝑗(𝑘𝑅)𝑌𝑚̂𝑘𝑌𝑚̂𝑘×𝑁2𝑘.(18) If, in addition to homogeneity, we also assume isotropy, then 𝑁2𝑃(𝑘), and the integration over angles factors out, leading to the orthogonality condition for spherical harmonics as follows 𝑑2̂𝑘𝑌𝑚̂𝑘𝑌𝑚̂𝑘=𝛿𝛿𝑚𝑚,(19) and as a result the covariance of the 𝑎𝑚s becomes diagonal as follows𝑎𝑚𝑎𝑚H,I=𝛿𝛿𝑚𝑚𝑑𝑘𝑘𝑗2(2𝑘𝑅)𝜋𝑘3𝑃(𝑘)=4𝜋𝛿𝛿𝑚𝑚𝑑log𝑘𝑗2(𝑘𝑅)Δ2𝑇(𝑘)𝐶𝛿𝛿𝑚𝑚,(20) where we have defined the usual temperature power spectrum Δ𝑇(𝑘)=𝑘3𝑃(𝑘)/2𝜋2 in the middle line, and the angular power spectrum 𝐶 in the last line of (20). As a pedagogical note, let us recall that the power spectrum basically expresses how much power the two-point correlation function has per unit log𝑘 as follows: ΘΘ𝑥𝑥H,𝐼=𝑘||𝑑log𝑘sin𝑥𝑥||𝑘||𝑥𝑥||Δ2𝑇(𝑘).(21)

In an analogous manner to what was done above, we can also construct the angular two-point correlation function in harmonic space as follows:Θ(̂𝑛)Θ̂𝑛=𝑚𝑚𝑎𝑚𝑎𝑚𝑌𝑚(̂𝑛)𝑌𝑚̂𝑛.(22) The hypothesis of homogeneity by itself does not lead to significant simplifications, but isotropy leads to a very intuitive expression for the angular two-point function as follows:Θ(̂𝑛)Θ̂𝑛H,I=𝑚0𝑥0200𝑑𝑚𝐶𝛿𝛿𝑚𝑚𝑌𝑚(̂𝑛)𝑌𝑚̂𝑛=𝐶2+1𝑃4𝜋̂𝑛̂𝑛.(23) Clearly, not only is this expression the analogous in 𝑆2 of (21), but in fact the Fourier power spectrum Δ2𝑇(𝑘) and the angular power spectrum 𝐶 are defined in terms of each other as indicated in (23) as follows: 𝐶=4𝜋𝑑log𝑘𝑗2(𝑘𝑅)Δ2𝑇(𝑘).(24) Now, using the facts that the spherical Bessel function of order peaks when its argument is approximately given by , and that 𝑑log𝑧𝑗2(𝑧)=1/(2(+1)), we obtain the following (this is one type of what has become known in the literature as Limber's approximations )𝐶2𝜋Δ(+1)2𝑇𝑘=𝑅.(25) Incidentally, from this expression it is clear why it is customary to define 𝒞(+1)𝐶2𝜋Δ2𝑇𝑘=𝑅.(26)

Using (12), we can easily generalize the results of this subsection to 𝑁-point functions in 𝑆2 and in harmonic space, however, the assumption of isotropy alone does very little to simplify our life. The hypothesis of homogeneity, on the other hand, greatly simplifies the angular 𝑁-point functions, and most of the work in statistical anisotropy of the CMB that goes beyond the two-point function assumes that homogeneity holds. Notice that the issue of gaussianity is, as always, confined to the question of whether or not the two-point function holds all information about the distribution of the relevant variables and is therefore completely separated from questions about homogeneity and/or isotropy.

Also notice that the separable nature of the definition (22) implies here as well, like in Fourier space, a reciprocity relation for the correlation function 𝐶̂𝑛1,̂𝑛2=𝐶̂𝑛2,̂𝑛1.(27) This symmetry must hold regardless of underlying models and is important in order to analyze the symmetries of the correlation function, as we will see later.

Before we move on, it is perhaps important to mention that the decomposition (22) is not unique. In fact, instead of the angular momenta of the parts, (1,𝑚1;2,𝑚2), we could equally well have used the basis of total angular momentum (𝐿,𝑀;1,2) and decomposed that expression as 𝐶̂𝑛1,̂𝑛2=𝐿,𝑀1,2𝒜𝐿𝑀12𝒴𝐿𝑀12̂𝑛1,̂𝑛2,(28) where 𝒴𝐿𝑀12 are known as the bipolar spherical harmonics, defined by [41]𝒴𝐿𝑀12̂𝑛1,̂𝑛2=𝑌1̂𝑛1𝑌2̂𝑛2𝐿𝑀,(29) where 𝐿 and 𝑀=𝑚1+𝑚2 are the eigenvalues of the total and azimuthal angular momentum operators, respectively. This decomposition is completely equivalent to (22), and we can exchange from one decomposition to another by using the relation𝒜𝐿𝑀12=𝑚1𝑚2𝑎1𝑚1𝑎2𝑚2(1)𝑀+122𝐿+112𝐿𝑚1𝑚2,𝑀(30) where the 3×2 matrices above are the well-known 3-j coefficients. At this point, it is only a matter of mathematical convenience whether we choose to decompose the correlation function as in (22) or as in (28). Although the bipolar harmonics behave similarly to the usual spherical harmonics in many aspects, the modulations of the correlation function as described in this basis have a peculiar interpretation. We will not go further into detail about this decomposition here, as it is discussed at length in another review article in this volume.

2.4. Estimators and Cosmic Variance

Returning to the covariance matrix (20), we see that, if we assume gaussianity of the 𝑎𝑚s, then the angular power spectrum suffices to describe statistically how much the temperature fluctuates in any given angular scale; all we have to do is to calculate the average (20). This can be a problem, though, since we have only one Universe to measure, and therefore only one set of 𝑎𝑚s. In other words, the average in (20) is poorly determined.

At this point, the hypothesis that our Universe is spatially homogeneous and isotropic at cosmological scales comes not only as simplifying assumption about the spacetime symmetries, but also as a remedy to this unavoidable smallness of the working cosmologist's sample space. If isotropy holds, different cosmological scales are statistically independent, which means that we can take advantage of the ergodic hypothesis and trade averaging over an ensemble for averaging over space. In other words, for a given we can consider each of the 2+1 real numbers in 𝑎𝑚 as statistically independent Gaussian random variables, and define a statistical estimator for their variances as the average𝐶12+1𝑚=||𝑎𝑚||2.(31) The smaller the angular scales ( bigger), the larger the number of independent patches that the CMB sky can be divided into. Therefore, in this limit we should havelim𝐶=𝐶.(32) On the other hand, for large angular scales (small 's), the number of independent patches of our Universe becomes smaller, and (31) becomes a weak estimation of the 𝐶s. This means that any statistical analysis of the Universe on large scales will be plagued by this intrinsic cosmic sample variance. Notice that this is an unavoidable limit as long as we have only one observable Universe.

Finally, it is important to keep in mind the clear distinction between the angular power spectrum 𝐶 and its estimator (31). The former is a theoretical variable which can be calculated from first principles, as we have shown in Section 2.1. The latter, being a function of the data, is itself a random variable. In fact, if the 𝑎𝑚s are Gaussian, then we can rewrite expression (31) as(2+1)𝐶𝐶=𝑋,𝑋=𝑚=||𝑎𝑚||2𝐶,(33) where 𝑋 is a chi-square random variable with 2+1 degrees of freedom. According to the central limit theorem, when , 𝑋 approaches a standard normal variable (A standard normal variable is a Gaussian variable 𝑋 with zero mean and unit variance. Any other Gaussian variable 𝑌 with mean 𝜇 and variance 𝜎 can be obtained from 𝑋 through 𝑌=𝜎𝑋+𝜇.) which implies that 𝐶 will itself follow a Gaussian distribution. Its mean can be easily calculated using (20) and (31) and is of course given by𝐶=𝐶,(34) which shows that the 𝐶s are unbiased estimators of the 𝐶s. It is also straightforward to calculate its variance (valid for any )𝐶𝐶𝐶𝐶=2𝐶2+12𝛿.(35) Because this estimator does not couple different cosmological scales, it has the minimum cosmic variance we can expect from an estimator due to the finiteness of our sample—so it is optimal in that sense. 𝐶 is therefore the best estimator we can build to measure the statistical properties of the multipolar coefficients 𝑎𝑚 when both statistical isotropy and gaussianity hold.

In later Sections, we will explore angular or harmonic 𝑁-point functions for which the assumption of isotropy does not hold. However, it is important to remember at all times that we have only one map, which means one set of 𝑎𝑚s. The estimator for the angular power spectrum, 𝐶, takes into account all the 𝑎𝑚s by dividing them into the different s and summing over all 𝑚(,). Clearly, it will inherit a sample variance for small s, when the 𝑎𝑚s can only be divided into a few “independent parts”. As we try to estimate higher-order objects such as the 𝑁-point functions, we will have to subdivide the 𝑎𝑚's into smaller and smaller subsamples, which are not necessarily independent (in the statistical sense) of each other. So, the price to pay for aiming at higher-order statistics is a worsening of the cosmic sample variance.

2.5. Correlation and Statistical Independence

The covariance given in (20) has two distinct, important properties. First, note that its diagonal entries, the 𝐶's, are 𝑚-independent coefficients; this is crucial for having statistical isotropy, as we will show latter. Second, statistical isotropy at the Gaussian level implies that different cosmological “scales” (understood here as meaning the modes with total angular momentum and azimuthal momentum 𝑚) should be statistically independent of each other—and this is represented by the Kronecker deltas in (20).

In fact, statistical independence of cosmological scales is a particular property of Gaussian and statistically isotropic random fields and is not guaranteed to hold when gaussianity is relaxed. We will see in the next Section that the rotationally invariant 3-point correlation function (and in general any 𝑁>2 correlation function) couples to the three scales involved. In particular, if it happens that the Gaussian contribution of the temperature field is given by (20), but at least one of its non-Gaussian moments are nonzero, then the fact that a particular correlation is zero, like for example 𝑎2𝑚1𝑎3𝑚2, does not imply that the scales =2 and =3 are (statistically) independent. This is just a restatement of the fact that, while statistical independence implies null correlation, the opposite is not necessarily true. This can be illustrated by the following example: consider a random variable 𝛼 distributed as[],𝑃(𝛼)=1,𝛼0,10,otherwise.(36) Let us now define two other variables 𝑥=cos(2𝜋𝛼) and 𝑦=sin(2𝜋𝛼). From these definitions, it follows that 𝑥 and 𝑦 are statistically dependent variables, since knowledge of the mean/variance of 𝑥 automatically gives the mean/variance of 𝑦. However, these variables are clearly uncorrelated1𝑥𝑦=2𝜋02𝜋cos𝜂sin𝜂𝑑𝜂=0.(37) Although correlations are among cosmologist's most popular tools when analyzing CMB properties, statistical independence may turn out to be an important property as well, specially at large angular scales, where cosmic variance is more of a critical issue.

3. Beyond the Standard Statistical Model

Until now we have been analyzing the properties of Gaussian and statistically isotropic random temperature fluctuations. This gives us a fairly good statistical description of the Universe in its linear regime, as confirmed by the astonishing success of the ΛCDM model. This picture is incomplete though, and we have good reasons to search for deviations of either gaussianity and/or statistical isotropy. For example, the observed clustering of matter in galactic environments certainly goes beyond the linear regime where the central limit theorem can be applied, therefore leading to large deviations of gaussianity in the matter power spectrum statistics. Besides, deviations of the cosmological principle may leave an imprint in the statistical moments of cosmological observables, which can be tested by searching for spatial inhomogeneities [42] or directionalities [43].

But how do we plan to go beyond the standard model, given that there is only one Gaussian and statistically isotropic description of the Universe, but infinite possibilities otherwise? This is in fact an ambitious endeavor, which may strongly depend on observational and theoretical hints on the type of signatures we are looking for. In the absence of extra input, it is important to classify these signatures in a general scheme, differentiating those which are non-Gaussian from those which are anisotropic. Furthermore, given that the signatures of non-gaussianity may in principle be quite different from that of statistical anisotropy, such a classification is crucial for data analysis, which requires sophisticated tools capable of separating these two issues.(Although gaussianity and homogeneity/isotropy are mathematically distinct properties, it is possible for a Gaussian but inhomogeneous/anisotropic model to look like an isotropic and homogeneous non-gaussian model. See, e.g., [44].)

We therefore start Section 3.1 by analyzing deviations of gaussianity when statistical isotropy holds. In Section 3.2, we keep the hypothesis of gaussianity and analyze the consequences of breaking statistical rotational invariance.

3.1. Non-Gaussian and SI Models
3.1.1. Rotational Invariance of 𝑁-Point Correlation Functions

We turn now to the question of non-Gaussian but statistically isotropic probabilities distributions. We will keep working with the 𝑁-point correlation function defined in harmonic space,𝑎1𝑚1𝑎2𝑚2𝑎𝑁𝑚𝑁,(38) since knowledge of these functions enables one to fully reconstruct the CMB temperature probability distribution. Specifically, we would like to know the form of any 𝑁-point correlation function which is invariant under arbitrary 3-dimensional spatial rotations. When rotated to a new (primed) coordinate system, the 𝑁-point correlation function transforms as𝑎1𝑚1𝑎2𝑚2𝑎𝑁𝑚𝑁=all𝑚𝑎1𝑚1𝑎2𝑚2𝑎𝑁𝑚𝑁𝐷1𝑚1𝑚1𝐷2𝑚2𝑚2𝐷𝑛𝑚𝑁𝑚𝑁,(39) where the 𝐷𝑚𝑖𝑚i(𝛼,𝛽,𝛾)s are the coefficients of the Wigner rotation-matrix, which depend on the three Euler-angles 𝛼, 𝛽, and 𝛾 characterizing the rotation. Notice that in this notation the primed (rotated) system is indicated by the primed 𝑚s. For the 2-point correlation function, we have already seen that the well-known expression𝑎1𝑚1𝑎2𝑚2=(1)𝑚2𝐶1𝛿12𝛿𝑚1,𝑚2(40) does the job 𝑎1𝑚1𝑎2𝑚2=𝐶1𝑚1(1)𝑚1𝐷1𝑚1𝑚1𝐷1𝑚2𝑚1𝛿12=(1)𝑚2𝐶1𝛿12𝛿𝑚1,𝑚2.(41) Note the importance of the angular spectrum, 𝐶, being an 𝑚-independent function.

What about the 3-point function? In this case, the invariant combination is found to be𝑎1𝑚1𝑎2𝑚2𝑎3𝑚3=𝐵123123𝑚1𝑚2𝑚3(42) which can be verified by straightforward calculations. Again, the nontrivial physical content of this statistical moment is contained in an arbitrary but otherwise 𝑚-independent function: the bispectrum 𝐵123 [4547]. As we anticipated in Section 2, rotational invariance of the 3-point correlation is not enough to guarantee statistical independence of the three cosmological scales involved in the bispectrum, although in principle a particular model could be formulated to ensure that 𝐵123𝛿12𝛿23, at least for some subset of a general geometric configuration of the 3-point correlation function.

These general properties hold for all the 𝑁-point correlation function. For the 4-point correlation function, for example, Hu [48] has found the following rotationally invariant combination𝑎1𝑚1𝑎4𝑚4=𝐿𝑀𝑄1234(𝐿)(1)𝑀12𝐿𝑚1𝑚2𝑀34𝐿𝑚3𝑚4𝑀,(43) where the 𝑄1234(𝐿) function is known as the trispectrum, and 𝐿 is an internal angular momentum needed to ensure parity invariance. In a likewise manner, it can be verified that the following expression 𝑎1𝑚1𝑎5𝑚5=𝐿𝑀0𝑥0200𝑑𝐿𝑀𝑃12345𝐿,𝐿(1)𝑀+𝑀12𝐿𝑚1𝑚2×𝑀34𝐿𝑚3𝑚4𝑀5𝐿𝐿𝑚5𝑀𝑀(44) gives the rotationally invariant quadrispectrum 𝑃12345(𝐿,𝐿).

The examples above should be enough to show how the general structure of these functions emerges under SI; apart from an 𝑚-independent function, every pair of momenta 𝑖 in these functions are connected by a triangle, which in turn connects itself to other triangles through internal momenta when more than 3 scales are present. In Figure 2, we show some diagrams representing the functions above.

Although we have always shown 𝑁-point functions which are rotationally invariant, the procedure used for obtaining them was rather intuitive, and therefore does not offer a recipe for constructing general invariant correlation functions. Furthermore, it does not guarantee that this procedure can be extended for arbitrary 𝑁s. Here we will present a recipe for doing that, which also guarantees the uniqueness of the solution.

The general recipe for obtaining the rotationally invariant 𝑁-point function is as follows: from the expression (39) above, we start by contracting every pairs of Wigner functions, where by “contracting” we mean using the identity 𝐷1𝑚1𝑚1(𝜔)𝐷2𝑚2𝑚2(𝜔)=𝐿,𝑀,𝑀0𝑥0𝑒𝑓1112𝐿𝑚1𝑚2𝑀12𝐿𝑚1𝑚2𝑀×(2𝐿+1)(1)𝑀+𝑀𝐷𝐿𝑀𝑀(𝜔),(45) and where 𝜔={𝛼,𝛽,𝛾} is a shortcut notation for the three Euler angles. Once this contraction is done, there will remain 𝑁/2𝐷-functions, which can again be contracted in pairs. This procedure should be repeated until there is only one Wigner function left, in which case we will have an expression of the following form:𝑎1𝑚1𝑎𝑁𝑚𝑁=all𝑚𝑎1𝑚1𝑎𝑁𝑚𝑁×geometricalfactors×𝐷𝐿𝑀𝑀(𝜔).(46) Now, we see that the only way for this combination to be rotationally invariant is when the remaining 𝐷𝐿𝑀𝑀 function above does not depend on 𝜔, that is, 𝐷𝐿𝑀𝑀(𝜔)=𝛿𝐿0𝛿𝑀0𝛿𝑀0. Once this identity is applied to the geometrical factors, we are done, and the remaining terms inside the primed 𝑚-summation will give the rotationally invariant (𝑁1)-spectrum.

As an illustration of this algorithm, let us construct the rotationally invariant spectrum and bispectrum. For the 2-point function, there is only one contraction to be done, and after we simplify the last Wigner function, we arrive at𝑎1𝑚1𝑎2𝑚2=𝑚1|||𝑎1𝑚1|||221+1(1)𝑚2𝛿12𝛿𝑚1,𝑚2,(47) where, of course𝐶12+1𝑚||𝑎𝑚||2(48) is the well-known definition of the temperature angular spectrum. For the 3-point function, there are two contractions, and the simplification of the last Wigner function gives𝑎1𝑚1𝑎2𝑚2𝑎3𝑚3=𝑚1,𝑚2,𝑚3𝑎1𝑚1𝑎2𝑚2𝑎3𝑚3123𝑚1𝑚2𝑚3123𝑚1𝑚2𝑚3.(49) From this expression and the ortoghonality of the 3-js symbols (see the Appendix), we can immediately identify the definition of the bispectrum as follows:𝐵123𝑚1,𝑚2,𝑚3𝑎1𝑚1𝑎2𝑚2𝑎3𝑚3123𝑚1𝑚2𝑚3.(50)

It should be mentioned that this recipe not only enables us to establish the rotational invariance of any 𝑁-point correlation function, but it also furnishes a straightforward definition of unbiased estimators for the 𝑁-point functions. All we have to do is to drop the ensemble average of the primed 𝑎𝑚's. So, for example, for the 2- and 3-point functions above, the unbiased estimators are given, respectively, by 𝐶=12+1𝑚𝑎𝑚𝑎𝑚,𝐵123=𝑚1,𝑚2,𝑚3𝑎1𝑚1𝑎2𝑚2𝑎3𝑚3123𝑚1𝑚2𝑚3.(51)

Notice that isotropy plays the same role, in 𝑆2, that homogeneity plays in 3. What enforces homogeneity in 3 is the Fourier-space 𝛿-functions, as in the discussion around Figure 1. However, in 𝑆2, the equivalents of the Fourier modes are the harmonic modes, for which there is only a discrete notion of orthogonality—and no Dirac 𝛿-function. What we found above is that the Wigner 3-j symbols play the same role as the Fourier space 𝛿-functions; they are the enforcers of isotropy (rotational invariance) for the 𝑁-point angular correlation function. Hence, the diagrammatic representations of the constituents of the 𝑁-point functions in Fourier (Figure 1) and in harmonic space (Figure 2) really do convey the same physical idea—one in 3, the other in 𝑆2.

3.2. Gaussian and Statistically Anisotropic Models

In the last Section, we have developed an algorithm which enables one to establish the rotational invariance of any 𝑁-point correlation function. As we have shown, this is also an algorithm for building unbiased estimators of non-Gaussian correlations. In this Section, we will change the perspective and analyze the case of Gaussian but statistically anisotropic models of the Universe.

There are many ways in which statistical anisotropy may be manifested in CMB. From a fundamental perspective, a short phase of inflation which produces just enough e-folds to solve the standard Big Bang problems may leave imprints on the largest scales of the Universe, provided that the spacetime is sufficiently anisotropic at the onset of inflation [39]. Another source of anisotropy may result from our inability to efficiently clean foreground contaminations from temperature maps. Usually, the cleaning procedure involves the application of a mask function in order to eliminate contaminations of the galactic plane from raw data. As a consequence, this procedure may either induce, as well as hide some anomalies in CMB maps [28].

It is important to mention that these two examples can be perfectly treated as Gaussian: in the first case, the anisotropy of the spacetime can be established in the linear regime of perturbation theory and therefore will not destroy gaussianity of the quantum modes, provided that they are initially Gaussian. In the second case, the mask acts linearly over the temperature maps, therefore preserving its probability distribution [49].

3.2.1. Primordial Anisotropy

Recently, there have been many attempts to test the isotropy of the primordial Universe through the signatures of an anisotropic inflationary phase [3840, 50, 51]. A generic prediction of such models is the linear coupling of the scalar, vector, and tensor modes through the spatial shear, which is in turn induced by anisotropy of the spacetime [38]. Whenever that happens, the matter power spectrum, defined in a similar way as in (13), will acquire a directionality dependence due to this type of see-saw mechanism. This dependence can be accommodated in a harmonic expansion of the form𝑃𝑘=,𝑚𝑟𝑚(𝑘)𝑌𝑚̂𝑘,(52) where the reality of 𝑃(𝑘) requires that 𝑟𝑚(𝑘)=(1)𝑚𝑟,𝑚(𝑘). Given that temperature perturbations Θ(𝑥) are real, their Fourier components must satisfy the relation Θ(𝑘)=Θ(𝑘). This property taken together with the definition (13) implies that𝑃𝑘𝑘=𝑃,(53) which in turn restricts the s in (52) to even values. Also, note that by relaxing the assumption of spatial isotropy, we are only breaking a continuous spacetime symmetry, but discrete symmetries such as parity should still be present in this class of models. Indeed, by imposing invariance of the spectrum under the transformation 𝑧𝑧, we find that (1)𝑚=1. Similarly, invariance under the transformations 𝑥𝑥 and 𝑦𝑦 implies the conditions 𝑟𝑚=(1)𝑚𝑟,𝑚 and 𝑟𝑚=𝑟,𝑚, respectively. Gathering all these constraints with the parity of , we conclude that𝑟𝑚,,𝑚2.(54) That is, from the initial 2+1 degrees of freedom, only /2+1 of them contribute to the anisotropic spectrum [39].

3.2.2. Signatures of Statistical Anisotropies

The selection rules (54) are the most generic predictions we can expect from models with global break of anisotropy. We will now work out the consequences of these rules to the temperature power spectrum and check whether they can say something about the CMB large-scale anomalies.

From the expressions (17) and (52), we can immediately calculate the most general anisotropic covariance matrix [37] as follows𝑎1𝑚1𝑎2𝑚2=3,𝑚3𝒢123𝑚1𝑚2𝑚3𝐻3𝑚312,(55) where𝒢123𝑚1𝑚2𝑚3=(1)𝑚121+122+123+1×4𝜋123000123𝑚1𝑚2𝑚3(56) are the Gaunt coefficients resulting from the integral of three spherical harmonics (see the Appendix). These coefficients are zero unless the following conditions are met:1+2+3𝑚2,1+𝑚2+𝑚3||=0,𝑖𝑗||𝑘𝑖+𝑗,𝑖,𝑗,𝑘{1,2,3}.(57) The remaining coefficients in (55) are given by𝐻3𝑚312=4𝜋𝑖120𝑑log𝑘𝑟3𝑚3(𝑘)𝑗1(𝑘𝑅)𝑗2(𝑘𝑅)(58) and correspond to the anisotropic generalization of temperature power spectrum (20).

The selection rules (57), taken together with (54), lead to important signatures in the CMB. In particular, since 3 is even, the quantity 1±2 must also be even, that is, multipoles with different parity do not couple to each other in this class of models𝑎1𝑚1𝑎2𝑚2=0,1+2=odd.(59) This result is in fact expected on theoretical grounds, because by breaking a continuous symmetry (isotropy) we cannot expect to generate a discrete signature (parity) in CMB. However, notice that the absence of correlations between, say, the quadrupole and the octupole, does not imply that there will be no alignment between them. One example of this would be a covariance matrix of the form 𝐶𝑚𝛿𝛿𝑚𝑚. If the 𝐶,0 happens to be zero, for example, then all multipoles will present a preferred direction (in this case, the 𝑧-axis).

3.2.3. Isotropic Signature of Statistical Anisotropy

We have just shown that a generic consequence of an early anisotropic phase of the Universe is the generation of even-parity signatures in CMB maps. Interestingly, these signatures may be present even in the isotropic angular spectrum, since the 𝐶s acquire some additional modulations in the presence of statistical anisotropies. In principle, these modulations could be constrained by measuring an effective angular spectrum of the form𝑎𝑚𝑎𝑚=𝐶+𝜖>0𝒢𝑚,𝑚,0𝐻0,(60) where we have introduced a small 𝜖 parameter to quantify the amount of primordial anisotropy.

In order to constrain these modulations, we have to build a statistical estimator for 𝑎𝑚𝑎𝑚. Since we are looking for the diagonal entries of the matrix (55), a first guess would be𝐶e=12+1𝑚=𝑎𝑚𝑎𝑚.(61) To check whether this is an unbiased estimator of the effective angular spectrum, we apply it to (55) and take its average𝐶e=𝐶+𝜖>0,𝑚𝒢𝑚,𝑚,0𝐻0.(62) Using the definition (56) of the Gaunt coefficients, the 𝑚-summation in the expression above becomes [41]𝑚(1)𝑚=𝑚𝑚02+1𝛿,0=0,(63) where the last equality follows because >0. Consequently, we conclude that𝐶e=𝐶𝑎𝑚𝑎𝑚.(64) At first sight, this result may seem innocuous, showing only that this is not an appropriate estimator for (60). Note, however, that (61) is in fact the estimator of the angular spectrum usually applied to CMB data under the assumption of statistical isotropy. In other words, by means of the usual procedure we may be neglecting important information about statistical anisotropy. Moreover, the cosmic variance induced by the application of this estimator on anisotropic CMB maps is small, because, as it can easily be checked,𝐶e𝐶𝐶e𝐶=2𝐶2+12𝛿𝜖+𝒪2.(65)

This result shows that the construction of statistical estimators strongly depends on our prejudices about what non-gaussianity and statistical anisotropy should look like. Consequently, an estimator built to measure one particular property of the CMB may equally well hide other important signatures. One possible solution to this problem is to let the construction of our estimators be based on what the observations seem to tell us, as we will do in the next section.

Part II: Cosmological Tests

So much for mathematical formalism. We will now turn to the question of how the hypotheses of gaussianity and statistical isotropy of the Universe can be tested. Though we are primarily interested in applying these tests to CMB temperature maps, most of the tools we will be dealing with can be applied to polarization (𝐸- and 𝐵-mode maps) and to catalogs of large-scale structures as well.

Testing the gaussianity and SI of our Universe is a difficult task. Specially because, as we have seen, there is only one Gaussian and SI Universe, but infinitely many universes which are neither Gaussian nor isotropic. So what type of non-gaussianity and statistical anisotropy should we test for? In order to attack this problem, we can follow two different routes. In the bottom-up approach, models for the cosmic evolution are formulated in such a way as to account for some specific deviations from gaussianity and SI. These physical principles range from nontrivial cosmic topologies [5254], primordial magnetic fields [5559], and local [43, 6062] and global [3840, 50] manifestation of anisotropy, to nonminimal inflationary models [51, 6367]. The main advantage of the bottom-up approach is that we know exactly what feature of the CMB is being measured. One of its drawbacks is the plethora of different models and mechanisms that can be tested.

The second possibility is the top-down, or model-independent approach. Here, we are not concerned with the mechanisms responsible for deviations of gaussianity or SI, but rather with the qualitative features of any such deviation. Once these features are understood, we can use them as a guide for model building. Examples here include constructs of a posteriori statistics [29, 35, 36] and functional modifications of the two-point correlation function [28, 37, 6871].

In the next Section, we will explore two different model-independent tests: one based on functional modifications of the two-point correlation function, and another one based on the so-called Maxwell's multipole vectors.

4. Multipole Vectors

Multipole vectors were first introduced in cosmology by Copi et al. [23] as a new mathematical representation for the primordial temperature field, where each of its multipoles are represented by unit real vectors. Later, it was realized that this idea is in fact much older [72], being proposed originally by J. C. Maxwell in his Treatise on Electricity and Magnetism.

The power of this approach is that the multipole vectors can be entirely calculated in terms of a temperature map, without any reference to external reference frames. This makes them ideal tools to test the morphology of CMB maps, like the quadrupole-octupole alignment.

The purpose of the following presentation is only comprehensiveness. A mathematically rigorous introduction to the subject can be found in references [7274], as well as other papers in this review volume.

4.1. Maxwell's Representation of Harmonic Functions

We start our presentation of the multipole vectors by recalling some terminology. A harmonic function in three dimensions is any (twice differentiable) function that satisfies Laplace's equation2=0,(66) where 2 is the Laplace operator. In spherical coordinates, the formal solution to Laplace's equation which is regular at the origin (𝑟=0) is==0(𝑟,𝜃,𝜑),=𝑚=𝑎𝑚𝑟𝑌𝑚(𝜃,𝜑).(67) The functions 𝑟𝑌𝑚 are known as the solid spherical harmonics [74]. Since they agree with the usual spherical harmonics on the unit sphere, it is sometimes stated in the literature that the latter form a set of harmonic functions. This is an abuse of nomenclature though, and the reader should be careful.

Given the scalar nature of Laplace's operator, it is possible to find solutions to (66) in terms of Cartesian coordinates. Such solutions can be constructed by combining homogeneous polynomials (i.e., sums of monomials of the same order) of order :==0(𝑥,𝑦,𝑧),=𝑎𝑏𝑐𝜆𝑎𝑏𝑐𝑥𝑎𝑦𝑏𝑧𝑐,(𝑎+𝑏+𝑐=).(68)

In three dimensions, the most general homogeneous polynomial of order contains (+2)!/(2!!) independent coefficients. However, since each polynomial must independently satisfy (66), precisely !/(2!(2)!) of these coefficients will depend on each other. This constraint leaves us with (+1)(+2)/2(1)/2=2+1 independent degrees of freedom in each multipole —which is, of course, the same number of independent degrees of freedom appearing in (67).

Based on this analysis, Maxwell introduced his own representation of harmonic functions. He noticed that by successively applying directional derivatives of the form 𝑣𝑣 over the monopole potential 1/𝑟, where 𝑟=𝑥2+𝑦2+𝑧2 and 𝑣 is a unit vector, he could construct solutions of the form (68). That is,𝑓(𝑥,𝑦,𝑧)=𝜆𝑣𝑣2𝑣11𝑟|||𝑟=1,(69) where 𝜆 are real constants. We are now going to show that this construction does indeed lead to solutions of the form (68). First, note that there is a pattern which emerges from successive application of directional derivatives over the monopole function𝑓0=1𝑟,𝑓1=𝑣1𝑓0=𝑣1𝑟𝑟3,𝑓2=𝑣2𝑓1=3𝑣2𝑣𝑟1𝑟𝑟2𝑣1𝑣2𝑟3,(70) and so on. By induction, one can show that the general expression will be given by𝑓=(1)(21)!!𝑖=1𝑣𝑖𝑟+𝑟2𝑄2𝑟2+1,(71) where 𝑄2 is a homogeneous polynomial of order 2 which only involves combinations of the vectors 𝑣𝑖 and 𝑟.

The numerator of the function 𝑓 given by (71) is clearly a homogeneous polynomial of order (as one can easily check for some s and also prove by mathematical induction). 𝐴 not so obvious result is that this polynomial is also harmonic. To prove that, let us define𝑔=(1)(21)!!𝑖=1𝑣𝑖𝑟+𝑟2𝑄2(72) and consider the application of the operator 2 over the combination 𝑟𝛼𝑔. For the 𝑥-component we get𝜕2𝑥𝑟𝛼𝑔=𝑟𝛼𝜕2𝑥𝑔+2𝛼𝑟𝛼2𝑥𝜕𝑥𝑔+𝛼𝑟𝛼2+𝛼(𝛼2)𝑥2𝑟𝛼4𝑔.(73) Repeating this process for the 𝑦- and 𝑧-components and then adding the results, we find 2𝑟𝛼𝑔=𝑟𝛼2𝑔+2𝛼𝑟𝛼2𝑥𝜕𝑥𝑔+𝑦𝜕𝑦𝑔+𝑧𝜕𝑧𝑔+𝛼(𝛼+1)𝑟𝛼2𝑔=𝑟𝛼2𝑔𝑟+𝛼(𝛼+2+1)𝛼2𝑔,(74) where in the last step we have used Euler's theorem on homogeneous functions, that is, 𝑟𝑔=𝑔. If we now choose 𝛼=(2+1), we find immediately that2𝑔𝑟2+1=2𝑔𝑟2+1.(75)

By construction, the left-hand side of the above expression is equal to 2𝑓. But according to the definition (69), this quantity is also zero, since Laplace's operator commutes with directional derivatives and 2(1/𝑟)=0 for 𝑟>0. Therefore, 2𝑔=0 and 𝑔 is harmonic, which completes our proof.

In conclusion, Maxwell's construction of harmonic functions, (69), is completely equivalent to the standard representation in terms of spherical harmonics. More importantly, this gives a one-to-one relationship between temperature maps (given by the 𝑎𝑚's) and unit vectors 𝑣𝑖. This means that the multipole vectors can be directly calculated from a CMB map, without any reference to external reference frames or additional geometrical constructs. The reader interested in algorithms to construct the vectors from CMB maps may check references [23, 72], as well as the other papers in this volume that review this approach.

4.2. Multipole Vectors Statistics

It should be clear from the discussion above that the multipole vectors give an intuitive way to discover, interpret, and visualize phase correlations between different multipoles in the CMB maps. But they also can reveal intramultipole features, such as planarity (when a given multipole presents a preferred plane). Some of the most conspicuous hints of statistical anisotropy in the CMB, like the quadrupole-octupole alignment, have indeed been first found with less mathematically elegant methods [22, 27], but are best described in terms of the multipole vector formalism [23, 24, 26, 72]. However, this case should also sound an alarm, because some feature of a (presumably) random realization was found, then further scrutinized with a certain test which was, to some extent (intentionally or not), tailored to single out that very feature.

The pitfalls of aprioristic approaches are sometimes unavoidable, and all we can do is to take a second look at our sample with a more generic set of tools, to try to assess how significant our result really is in the context of a larger set of statistical tests. Multipole vectors are, in fact, ideally suited to this, since they can be found for all multipoles, and it is relatively easy to construct scalar combinations of these vectors with the usual methods of linear algebra. Also convenient is the fact that simulating these vectors from maps, or even directly, is also relatively easy, so the standard model of a Gaussian random field can be easily translated into the pdfs for the tests constructed with the multipole vectors. An important drawback of the multipole vectors is that, because they are computed in terms of equations which are nonlinear in the temperature fields, the distribution functions of statistical tests involving these vectors are highly non-Gaussian [75]. This means that these statistics usually have to be estimated by means of simulations (usually assuming that the underlying temperature field is itself Gaussian). Another delicate issue is how stable the multipole vectors are to instrumental noise and sky cut—and here, again, we must rely on numerical simulations to compare the observations with theoretical models.

In order to see how one should go about constructing a general set of tests, it should be noted first that the multipole vectors define directions, but they have no sense (they are “headless vectors”). This follows from the fact that the sign of the constants 𝜆 is degenerated with the sign of the vectors 𝑣,𝑝 (𝑝=1,,). Hence, the first requirement is that our tests should be independent of this sign ambiguity. Notice that this ambiguity extends to ancillary constructs such as the normal vectors, defined as the vector product: 𝑤,𝑝𝑝𝑣,𝑝𝑣,𝑝.(76) Mathematically, the sign ambiguity implies that the multipole vectors belong to the quotient space 𝑆2/2 (also known as 𝑃2) while the normal vectors belong to 3/2.

In principle, any positive-definite scalar constructed through the multipole or the normal vectors is “fair game”, but there are some guidelines; for example, one should avoid double-counting the same degrees of freedom. Below we review some of these tests, based on the work of [35].

4.2.1. The 𝑅 Statistic

The first example of a test involving the multipole vectors would be a scalar product, such as 𝑣𝑣. The most natural test would involve asking whether the multipole vectors of the given multipole are especially aligned or not. This means computing𝑅=2(1)𝑝,𝑝>𝑝||𝑣,𝑝𝑣,𝑝||,(77) where the normalization was introduced to make 0𝑅1.

This idea could be generalized to test alignments between multipole vectors at different multipoles: 𝑅=1𝑝,𝑝||𝑣,𝑝𝑣,𝑝||.(78) In fact, the quadrupole-octupole alignment can already be seen with this simple test: for essentially all CMB maps the significance of the alignment as measured by 𝑅23 is of the order of 90%–95% CL [35].

4.2.2. The 𝑆 Statistic

The second most natural test does not involve directly the multipole vectors themselves, but the normal vectors that can be produced by taking the vector product between the multipole vectors. So, we take 𝑤,𝑝𝑝=𝑣,𝑝𝑣,𝑝.(79) Notice that the number of normal vectors for a given multipole is 𝑙=(1)/2—so the number of normal vectors grows rapidly for larger multipoles, making it harder to use and meaning that the same degrees of freedom may be overcounted, at least for >3.

Again, the best strategy is simplicity: we can ask whether the normal vectors are aligned, within and between multipoles. This means computing 𝑆=2𝑙(𝑙1)𝑙𝑝,𝑝>𝑝||𝑤,𝑝𝑤,𝑝||,(80) where the normalization was introduced again to make 0𝑆1.

And yet again this idea is easily generalized to test alignments between normal vectors at different multipoles: 𝑆=1𝑙𝑙𝑝,𝑝||𝑤,𝑝𝑤,𝑝||.(81) With this test, the quadrupole-octupole alignment is much more significant: for essentially all CMB maps the test 𝑆23 deviates from the expected range with 98% CL [35].

4.2.3. Other Tests with Multipole Vectors

One can go on and expand the types of tests using both multipole and normal vectors. One idea would be, for example, to disregard the moduli of the normal vectors: 𝐷=1𝑙𝑙𝑝,𝑝||𝑤,𝑝𝑤,𝑝||.(82) This test is therefore insensitive to the relative angle between the multipole vectors that produce any given normal vectors. This idea is similar to the planar modulations that will be discussed in the next section. With this test, the quadrupole-octupole alignment is significant to about 95%–98% CL This test can also be generalized to a self-alignment test (=), with just an adjustment to the normalization.

Another possibility would be to measure the alignment between multipole vectors and normal vectors: 𝐵=1𝑙𝑝,𝑝||𝑣,𝑝𝑤,𝑝||.(83) Of course, this test cannot be easily generalized to a self-alignment. With the 𝐵23 test, the quadrupole-octupole alignment is significant to about 95%–98% CL

We could go on here, but it should be clear that all the information (the 2+1 real degrees of freedom for each multipole ) has already been exhausted in the tests above.

5. Temperature Correlation Function

Despite their strong cosmological appeal, the multipole vectors have some limitations. Not only their directions in the CMB sky are sometimes difficult to interpret physically [26], but they also have the additional drawback of mixing, in a nontrivial manner, information on both gaussianity and SI of the map being analyzed [26, 75].

Another way of quantifying deviations in the standard statistical framework of cosmology is through functional modifications of the two-point correlation function [37, 68]. Although this approach does not offer an optimal separation between gaussianity and SI (which is, by the way, an open problem in this field), working with the two-point correlation function makes it easier to test Gaussian models of statistical anisotropy [28, 71].

The most general 2-point correlation function (2pcf) of two independent unit vectors is a function 𝐶 of the form𝐶𝑆2×𝑆2.(84) We have seen in Section 2.3 that if we choose spherical coordinates (𝜃𝑖,𝜑𝑖) to describe each vector ̂𝑛𝑖, the function above can be decomposed either in terms of two spherical harmonics 𝑌𝑖𝑚𝑖 or in terms of the bipolar spherical harmonic 𝒴𝐿𝑀12. In any case, the 2pcf will have the following functional dependence:𝜃𝐶=𝐶1,𝜑1,𝜃2,𝜑2.(85)

This function is absolutely general. If the Universe has any cosmological deviation of isotropy, whatever it is, it can be described by the function above (see also Figure 5).

Unfortunately, this function will be of limited theoretical interest unless we have some hints on how to select its relevant degrees of freedom. This difficulty is in fact a general characteristic of model-independent tools, which at some stages forces us to rely on our theoretical prejudices about the statistical nature of the Universe in order to construct estimators of non-Gaussianity and/or statistical anisotropy. Nonetheless, it is still possible to construct statistical estimators of anisotropy based on (85). For example, Hajian and Souradeep [68] have constructed an unbiased estimator 𝜅 for this function based solely on the requirement that this estimator should be rotationally invariant. Although it is true that any statistically significant 𝜅>0 will point towards anisotropy, it is not clear what type of anisotropy is being detected by this estimator.

We can still use (85) to search for deviations of SI if we restrict its domain to a smaller and nontrivial subdomain. For example, we can take the vectors ̂𝑛1 and ̂𝑛2 to be the same and expand a function of the form𝜃𝐶=𝐶1,𝜑1,(86) which is equivalent to 𝐶 : 𝑆2. This form of the 2pcf makes it ideal for searching for power multipole moments in CMB, once a suitable estimator is defined [37]. Unfortunately, when we take ̂𝑛1=̂𝑛2, we are in fact considering a one-point correlation function, which by construction does not allow us to measure correlations between different points in the sky.

It seems in principle that the functions (86) and (85) are the only possibilities besides the isotropic 2pcf (24). If not, what other combinations of the vectors ̂𝑛1 and ̂𝑛2 can we consider? As a matter of fact, these two vectors are geometrical quantities intuitively bound to our notion of two-point correlation functions on the sphere. From this perspective they are not fundamental quantities. In fact, we can equally well represent the 2pcf by a disc living inside the unit sphere, as shown in Figure 4.

In this representation, 𝜃 is the angle between the vectors ̂𝑛1 and ̂𝑛2, as usual. The normal to the plane, ̂𝑛, is represented by two spherical angles (Θ,Φ). Finally, there is an overall orientation 𝜑 of the disc around its unit vector which completes the four degrees of freedom contained in (85). We have found therefore another valid geometrical representation of the most general 2pcf:𝐶=𝐶(Θ,Φ,𝜃,𝜑).(87)

The main advantage of the above representation when compared to (85) is its straightforward geometrical interpretation. First, note that the angular separation 𝜃 of the isotropic 2pcf is trivially included in this definition and does not need to be obtained as a consequence of rotational invariance (see the discussion in Section 3.2.1.) Second, by characterizing the correlation function in terms of the geometrical components of a disc, we know exactly what are the degrees of freedom involved. This makes it easier to construct estimators of statistical anisotropy, alleviating the drawbacks of model-independent approaches mentioned above.

5.1. Anisotropy through Planarity

An immediate application of the representation (87) is its use in the search for planar deviations of isotropy in CMB [28, 71]. Planar modulations of astrophysical origin may play an important role to the CMB morphology. One example is the role played by the galactic and ecliptic plane in the quadrupole-octupole/north-south anomalies [26]. Also, it is well known that our galactic plane is sensible source of foreground contamination in the construction of cleaned CMB maps. These hints indicate that CMB modulations induced by the disc in Figure 4 are not only a mathematical possibility but perhaps also a symmetry of cosmological relevance.

Since we are primarily interested in measuring planar modulations of CMB, but including the usual angular modulation as the isotropic limit of the 2pcf, we can consider only the azimuthal average of (87):1𝐶2𝜋02𝜋𝐶(Θ,Φ,𝜃,𝜑)𝑑𝜑.(88) The resulting function can be easily expand in terms of simple special functions as𝐶(Θ,Φ,𝜃)=𝑙,𝑚2+1𝒞4𝜋𝑙𝑚𝑃(cos𝜃)𝑌𝑙𝑚(Θ,Φ),𝑙2,(89) where the restriction on the 𝑙-mode results from the symmetry ̂𝑛1̂𝑛2.(See [71] for more details. Incidentally, the functional dependence of (89) differs from that originaly shown in [71], which does not correctly take the planar dependence into account. We thank Yuri Shtanov for pointing this out.) The multipolar coefficients 𝒞𝑙𝑚 correspond to a generalization of the usual angular power spectrum 𝐶's. In fact, they can be seen as the coefficients of a spherical harmonic decomposition of the function 𝐶(̂𝑛), provided that this function suffers modulations as we sweep planes on the sphere.

5.1.1. Angular-Planar Power Spectrum

Since we are restricting our analysis to the Gaussian framework, the set of coefficients 𝒞𝑙𝑚 is all we need to characterize the two-point correlation function. However, the final product of CMB observations is temperature maps, and not correlation maps. What we need then is an algebraic relation between the multipolar coefficients 𝒞𝑙𝑚 and the temperature coefficients 𝑎𝑚 defined in (3). At first sight, this relation could be obtained by equating expression (89) to its standard definition in (22) and then using the orthogonality of the special functions to isolate the 𝒞𝑙𝑚's in terms of the 𝑎𝑚's. But for that to work we need the relation between the set of angles (Θ,Φ,𝜃) and (𝜃1,𝜑1,𝜃2,𝜑2) which, depending on the reference frame we choose, is extremely complicated. Fortunately, all we need is the following relation:̂𝑛1̂𝑛2=cos𝜃=cos𝜃1cos𝜃2+sin𝜃1sin𝜃2𝜑cos1𝜑2,(90) together with a suitable choice of our coordinate system. For example, we can use the invariance of the scalar product ̂𝑛1̂𝑛2 and choose our coordinate system such that the discs of Figure 4 lies in the 𝑥𝑦 plane. With this choice we will have𝜑(Θ,Φ)=(0,0),cos𝜃=cos1𝜑2,(91) and the integration over 𝜃 becomes simple. Once this is done, we make a passive rotation of the coordinate system and then we integrate over the remaining angles Θ and Φ, which will then be given precisely by the Euler angles used in the rotation. The details are rather technical and can be found in the Appendix. The final expression is(1)𝑚𝒞𝑙𝑚2𝑙+1=2𝜋1,𝑚12,𝑚2𝑎1𝑚2𝑎2𝑚2𝑙12𝑚𝑚1𝑚2𝐼𝑙,12,(92) where𝐼𝑙,12𝑚(1)𝑚𝜆1𝑚𝜆2𝑚𝑙120𝑚𝑚𝜋0𝑑(cos𝜃)𝑃(cos𝜃)𝑒𝑖𝑚𝜃,(93) and where the 𝜆𝑖𝑚's form a set of coefficients resulting from the 𝜃 integration, which are zero unless 𝑖+𝑚= even (see the Appendix for more details).

Expression (92) is what we were looking for. With this relation, the angular-planar power spectrum 𝒞𝑙𝑚 can be calculated from first principles for any model predicting a specific covariance matrix. Moreover, since the angular-planar function (89) is, after all, a correlation function, it should be possible to relate the angular-planar power spectrum 𝒞𝑙𝑚 to the bipolar power spectrum 𝒜𝐿𝑀12 of Hajian and Souradeep. In fact, by inverting expression (30) and plugging the result in (92), we find a linear relation between these two set of coefficients:𝒞𝑙𝑚=2𝜋1,2𝐴𝑙𝑚12(1)12(2𝑙+1)𝐼𝑙,12.(94) Here, the set of geometrical coefficients 𝐼𝑙,12 plays a similar role to the 3-j symbols in expression (30). Note also that the angular-planar power spectrum has only three free indices while the bipolar power spectrum has four. This is a consequence of the azimuthal average we took in (88), which further constrains the degrees of freedom of the correlation function.

5.1.2. Statistical Estimators and 𝜒2 Analysis

We have shown that the angular-planar power spectrum (92) is given in terms of an ensemble average of temperature maps. Evidently, we cannot calculate it directly from data, for we have only one CMB map (the one taken from our own Universe.) The best we can do is to estimate the statistical properties of (92), like its mean and variance, and see whether these quantities agree, in the statistical sense, with what we would expect to obtain from a particular model of the Universe.

The reader should note that this procedure is not new—its limitation is due to the same cosmic variance which lead us to construct an estimator for the angular power spectrum 𝐶 (see the discussion of Section 2.4). For the same reason, we will need to construct an estimator for the angular-planar power spectrum. An obvious choice is𝒞𝑙𝑚2𝜋2𝑙+11,𝑚12,𝑚2𝑎1𝑚1𝑎2𝑚2𝑙12𝑚𝑚1𝑚2𝐼𝑙,12,(95) for in this case we have an unbiased estimator:𝒞𝑙𝑚=𝒞𝑙𝑚,(96)

regardless of the underlying model.

If we now have a model predicting the angular-planar power spectrum, we can ask how good this model fits the observational data once it is calculated using (95). All we need is a simple chi-square (𝜒2) goodness-of-fit test, which in our case can be written in the following generalized form:𝜒2𝜈𝑙12𝑙+1𝑙𝑚=𝑙|||𝒞𝑙𝑚𝒞𝑙𝑚|||2𝜎𝑙𝑚2,(97) in which and 𝑙 are the angular and planar degrees of freedom, respectively, and where 𝜎𝑙𝑚 is just the standard deviation of the estimator 𝒞𝑙𝑚. The (2𝑙+1)1 factor accounts for the 2𝑙+1 planar degrees of freedom and was introduced for latter convenience.

In Section 7 we will apply this test to the 5-year WMAP temperature maps in order to check the robustness of the ΛCDM model against the hypothesis of SI. Before that, we will stop and digress a little about how observational uncertainties should be included in our analysis.

6. Theory v. Observations: Cosmic and Statistical Variances

Until now we have been concerned with the formal aspects of non-Gaussian and statistically anisotropic universes, and how model-independent tests might be designed to detect deviations of either gaussianity or statistical isotropy. We will now discuss how such tests can be carried out and interpreted once we possess cosmological data.

In a great variety of tests, statistical tools are designed to detect particular deviations of gaussianity or SI from cosmological data like, for example, the CMB. The final outcome of these tests is usually a probability (a pure number), which should be interpreted as the chance that a Universe like ours might result from an ensemble of “equally prepared” Gaussian and SI universes. An anomaly in CMB is usually understood as a measure of how unlikely a particular feature of our Universe is according to this specific test. The multipole vector statistics, for example, when applied to a large number of simulated (Gaussian and SI) CMB maps, show that only 0.01% of these maps have a quadrupole-octupole alignment as strong as WMAP maps [23, 26].

There are two points to keep in mind when carrying this type of analysis. The first is that a particular “detection” may always turn out to be a statistical fluctuation revealed by one specific tool. The robustness of an anomaly then depends on the number of independent tests pointing to the same result. The second is that the implementation of statistical tools is sensitive to the way we extract information from data, requiring an accurate separation between cosmological signal and astrophysical/instrumental noise.

In this Section we present a critical review of the standard procedure used to implement cosmological tests. We show that it does not account for the intrinsic uncertainties of cosmological observations, which may possibly lead to an under/overestimation of anomalies. We then present a generalization of this process which naturally accounts for these uncertainties.

6.1. Standard Calculations

Suppose 𝑥 is a random variable predicted by a particular model and that cosmological observations of this quantity returned the value 𝑥0. (Rigorously, 𝑥 is only one realization of a random variable 𝑋, which is a real-valued function defined on a sample space. By the same reason we should not call the 𝑎𝑚's in (3) random variables, though we will stick to this nomenclature throughout this text.)We would like to calculate the probability, according to this model, that in a random Universe we would have 𝑥𝑥0. Assuming that 𝒫th is the (normalized) probability density function (pdf) of 𝑥, this probability is commonly defined to be𝑃𝑥0𝒫th(𝑥)𝑑𝑥.(98) The probability of having 𝑥>𝑥0 is then simply given by 𝑃>=1𝑃. See Figure 5.

If 𝑃 is found to be too small or, equivalently, too high, we might be tempted to interpret 𝑥0 as “anomalous” according to this model. However, this definition of probability assumes that 𝑥0 was measured with infinite precision, and so it says nothing about an important question we must deal with: typically, the measurement of 𝑥0 has an uncertainty which needs to be folded into the final probabilities that the observations match the theoretical expectations.

Moreover, since no two equally prepared experiments will ever return the same value 𝑥0, our measurements should also be regarded as random events. In a more rigorous approach, we would have to consider 𝑥0 itself as one realization of a random variable, conditioned to the distribution of the signal. In the case of CMB, however, this would be only part of the whole picture, since the randomness of the measurements of 𝑥0 should also be related to the way this data is reduced to its final form. This happens because different map cleaning procedures will lead to slightly different values for 𝑥0. This difference induces a variance in the data which reflects the remaining foreground contamination of the temperature map. We will elaborate more on this point through a concrete example, after we show how (98) may be changed in order to include the indeterminacy of cosmological measurements.

6.2. Convolving Probabilities

The question we want to answer is how to calculate the probability of 𝑥 being smaller than our measurements when the latter are also random events. Let us suppose that our measurement is described by the random variable 𝑦 and that 𝑥0 is its most probable value. The probability of having 𝑥𝑦 is simply the probability of having𝑧𝑥𝑦0(99) for some particular realization of the variable 𝑦. It should be clear by now that if we know the pdf of 𝑧, the probability we are looking for is simply the area under this distribution for 𝑧0. The probability 𝑃 of 𝑧 being smaller or equal to zero can be calculated as𝑃𝑃{(𝑥,𝑦)𝑥𝑦0}=𝑥𝑦0𝒫(𝑥,𝑦)𝑑𝑥𝑑𝑦.(100) Now, under the hypothesis of independence of 𝑥 and 𝑦 we have 𝒫(𝑥,𝑦)=𝒫th(𝑥)𝒫obs(𝑦), where 𝒫obs is the (normalized) probability distribution function of the variable 𝑦. We can therefore rewrite the last expression as𝑃=𝑦𝒫th𝒫(𝑥)𝑑𝑥obs(𝑦)𝑑𝑦.(101) If we now hold 𝑦 fixed and do the change of variables 𝑥=𝑢+𝑦, we get𝑃=0𝒫th(𝑢+𝑦)𝒫obs(𝑦)𝑑𝑦𝑑𝑢,(102) where we have changed the position of the integrals. The reader will now notice that term inside brackets is precisely the pdf we were looking for. Since there is nothing special about the variable 𝑦, we can equally well hold 𝑥 fixed and repeat the calculus, obtaining the symmetric version of this result. In fact, the final pdf for 𝑧 is nothing else than the convolution of the pdfs of each variable [76]:𝒫𝒫(𝑧)obs𝒫th(𝑧)=𝒫obs(𝑦)𝒫th=(𝑧±𝑦)𝑑𝑦𝒫obs(𝑥𝑧)𝒫th(𝑥)𝑑𝑥,(103) where the plus (minus) sign refers to the difference (sum) of 𝑥 and 𝑦. Integrating this pdf from (,0] we get our answer𝑃0𝒫(𝑧)𝑑𝑧.(104) As a consistency check, notice that in the limit where observations are made with infinite precision, 𝒫obs(𝑦) becomes a delta function and we have𝑃=0+𝛿𝑦𝑥0𝒫th(𝑧±𝑦)𝑑𝑦𝑑𝑧=±𝑥0𝒫th(𝑥)𝑑𝑥(105) which agrees with our previous definition. The reader must be careful, though, not to think of (98) as some lower bound to (104). Since none of the pdfs appearing in (103) are necessarily symmetric, a large distance from 𝑥0 to the most probable (not the mean!) value 𝑥 would not, by itself, constitute sufficient grounds to claim that the measured value of this observable is “unusual” in any sense, simply because a large overlap between the two pdfs can render the result usual according to (104).

To illustrate this point, let us calculate 𝑃 using (98) and (104) for the pdf's which appear in Figure 6. For pedagogical reasons, we have chosen 𝒫th and 𝒫obs as positively (Maxwell distribution) and negatively (Gumbel distribution) skewed pdf's, respectively. The convolved distribution appears as the solid (black) line. For these pdf's, (98) gives 99.2%, while (104) gives 93.6% of chance of 𝑥 being smaller than the observed 𝑥0; all pdf's were normalized to one.

7. A 𝜒2 Test of Statistical Isotropy

Although the last example was constructed to emphasize an important feature of the formalism developed in Section 6, cosmological observables designed to measure deviations of either gaussianity or statistical isotropy will often follow asymmetric distributions. The intrinsic uncertainties of cosmological measurements, specially the ones originating from map cleaning procedures, may be crucial when searching for any map's anomalies. We will now make a concrete application of this formalism using the angular-planar chi-square test developed in Section 5.1.

7.1. ΛCDM Model

In the simplest realization of the ΛCDM model, the covariance matrix of temperature maps is determined by 𝑎1𝑚1𝑎2𝑚2=(1)𝑚2𝐶1𝛿12𝛿𝑚1,𝑚2. Using this expression in (92), we find that𝒞(𝑙𝑚SI)=𝐶𝛿𝑙0𝛿𝑚0.(106) On the other hand, if the only nonzero 𝒞𝑙𝑚's are given by 𝑙=𝑚=0, then 𝒞00=𝐶. Therefore, statistical isotropy is achieved if and only if the angular-planar power spectrum is of the form (106). Since we are only interested in nontrivial planar modulations, we will restrict our analysis to the cases where 𝑙0, that is,𝒞𝑙𝑚=0,(𝑙2),(107) where, we remind the reader, the parity of comes from the symmetry (27).

For this particular model, we can also calculate the covariance matrix of the estimator (95) explicitly. Using the null hypothesis above, we find after some algebra that𝒞𝑙𝑚𝒞𝑙𝑚=8𝜋21,2𝐶1𝐶2𝐼𝑙,122𝛿𝑙𝑙𝛿𝑚𝑚.(108) This matrix has some interesting properties. First, note that the planar degrees of freedom are independent in this case (which justifies the (2𝑙+1)1) factor introduced in (97)). Second, its diagonal elements are given by the variance (𝜎𝑙𝑚)2𝒞=(𝑙𝑚)𝒞𝑙𝑚, which now becomes 𝑚-independent:𝜎𝑙𝑚2𝜎𝑙2=8𝜋21,2𝐶1𝐶2𝐼𝑙,122.(109)

Therefore, for the particular case of the ΛCDM model, the chi-square test (97) gets even simpler. Using (107) and (109), we find𝜒2𝜈𝑙=12𝑙+1𝑙𝑚=𝑙|||𝒞𝑙𝑚|||2𝜎𝑙2.(110) It is now clear that if the data under analysis are really described by this model, then it must be true that𝜒2𝜈𝑙=12𝑙+1𝑙𝑚=𝑙𝒞𝑙𝑚𝒞𝑙𝑚𝜎𝑙2=1,(111) where we have used (109). This shows that any large deviation of this test from unity will be an indication of planar modulation in temperature maps, up, of course, to error bars. For convenience, let us define a new quantity as𝜒𝑙𝜒2𝜈𝑙1(112) which will quantify anisotropies whenever 𝜒𝑙 is significantly positive or negative.

This generalized chi-square test furnishes a complete prescription when searching for planar modulations of temperature in CMB maps. We emphasize, though, that for a given CMB map, the chi-square analysis must be done entirely in terms of that map's data. Since we are performing a model-independent test, we are not allowed to introduce fiducial biases in the analysis (e.g., by calculating 𝜎𝑙 using 𝐶model), which would only include our a priori prejudices about what the map's anisotropies should look like. Since the 𝐶's are, by construction, a measure of statistical isotropy, consequently, an “anomalous” detection of 𝐶's is by no means a measure of statistical anisotropy, and it is this particular value that should be used in (112) if we want to find deviations of isotropy, regardless of how high/low it is.

7.2. Searching for Planar Signatures in WMAP

In order to apply the test (112) to the 5-year WMAP data [11, 14], we will define two new variables:𝑥𝜒2𝑙(th),𝑦𝜒2𝑙(obs),(113) which will be jointly analyzed using the formalism of Section 6. Still, there remains the question of how to obtain their pdfs. These functions can be obtained numerically provided that the number of realizations of each variable is large enough, since in this case their histograms can be considered as piecewise constant functions which approximate the real pdfs. For the case of the (theoretical) variable 𝑥 defined above we have run 2×104 Monte Carlo simulations of Gaussian and statistically isotropic CMB maps using the ΛCDM best-fit 𝐶's provided by the WMAP team [77]. With these maps we have then constructed 2×104 realizations of the variable 𝑥.

The simulation of the (observational) variable 𝑦 is more difficult and depends on the way we estimate contamination from residual foregrounds in CMB maps. As is well known, not only instrumental noise, but also systematic errors (e.g., in the map-making process), the inhomogeneous scanning of the sky (i.e., the exposure function of the probe), or unremoved foreground emissions (even after applying a cut-sky mask) could corrupt—at distinct levels—the CMB data.

Foreground contamination, on the other hand, may have several different sources, many of which are far beyond our present scopes. However, since different teams apply distinct procedures on the raw data in order to produce a final map, we will make the hypothesis that maps cleaned by different teams represent—to a good extent—“independent” CMB maps. Therefore, we can estimate the residual foreground contaminations by comparing these different foreground-cleaned maps.

In fact, the WMAP science team has made substantial efforts to improve the data products by minimizing the contaminating effects caused by diffuse galactic foregrounds, astrophysical point sources, artifacts from the instruments and measurement process, and systematic errors [83, 84]. As a result, multifrequency foreground-cleaned full-sky CMB maps were produced, named Internal Linear Combination maps, corresponding to three- and five-year WMAP data [14, 79]. To account for the mentioned randomness, systematic, and contaminating effects of the CMB data, we will use in our analyses several full-sky foreground-cleaned CMB maps, listed in Table 1, which were produced using both the three- and five-year WMAP data.

The prescription we adopt to determine the distribution of the observational variable 𝑦 is as follows: we simulate Gaussian random 𝑎𝑚's in such a way that their central values are given by the five-year ILC5 data [14, 79], and with a variance which is estimated from the sample standard deviation of all the maps listed in Table 1. So, for example, suppose we have 𝑛 different full-sky temperature maps at hand and we want to estimate the randomness inherent in the determination of, let's say, 𝑎32; therefore, we take𝒩𝑎ILC532,𝜎32𝑎32,(114) with𝜎32=1𝑛1𝑛𝑖=1𝑎𝑖32𝑎322,𝑎32=1𝑛𝑛𝑖=1𝑎𝑖32,(115) where 𝒩(𝜇,𝜎) represents a Gaussian distribution with mean 𝜇 and standard deviation 𝜎. Note that if the residual contamination is indeed weak, then the sample variance above will be small, and our procedure will reduce to the standard way of calculating probabilities. As for the use of a Gaussian in (114), this choice was dictated not only by simplicity, but rather by the fact that the propagation of uncertainties in physical experiments is usually assumed to follow a normal distribution. However; note that there are some instrumental uncertainties, such as beam or gain uncertainties, which will not in general follow normal distributions. In fact, some of them may even fail to be additive, meaning that our convolution formula will be inapplicable in these cases. In our analysis, we have focused only on foreground residuals, where the normality hypothesis is reasonable. (We thank the referee for pointing out this important aspect of the formalism.)

7.3. Full-Sky Maps

Following this procedure, we have used the full-sky maps shown in Table 1 to construct 104 Gaussian random 𝑎𝑚's, which were then used to calculate 104 realizations of 𝑦=(𝜒2)𝑙(obs). With those variables we constructed histograms which, together with the histograms for the (full-sky) variable 𝑥, were used to calculate the final probability (104). We have restricted our analysis to the range of values (,𝑙)[2,10], since the low multipolar sector (i.e., large angular scales) is where most of the anomalies were reported. The resulting histograms and pdf's are shown in Figure 7, and the final probabilities we obtained are shown in Table 2.

Overall, our results show no significant planar deviations of anisotropy in WMAP data. The most unlikely individual values in Table 2 are in the sectors (𝑙,) given by (2,5), (10,5), (4,7), and (6,8) and are all above a relative chance of 5% of either being too negative [(2,5), (10,5)] or too positive [(4,7), (6,8)]. However, it is perhaps worth mentioning that not only the individual values of (𝜒2)𝑙 are relevant, but also their coherence over a range of angular or planar momenta carries interesting information. So, for example, a set of (𝜒2)𝑙's which are all individually within the cosmic variance bounds, but which are all positive (or negative), can be an indication of an excess (or lack) of planar modulation. This type of coherent behavior appears in the following cases: (𝜒2)𝑙2, (𝜒2)𝑙3 and, to a lesser extent, (𝜒2)4 (see Table 2). The angular quadrupole =2, as well as the angular octupole =3, have all positive planar spectra (for all values of 𝑙 which we were able to compute), indicated by probabilities larger than 50%. The planar hexadecupole 𝑙=4 also has 8 out of 9 angular spectra assuming positive values (only =5 is negative).

The data analyzed in this Section relates to the full-sky maps, which are certainly still affected by residual galactic foregrounds. The reader interested in the complete analysis, including data from masked CMB maps, can check reference [28].

8. Conclusions

We know our Universe is not perfectly Gaussian, homogeneous, or isotropic. The deviations from an idealized picture (or the lack thereof), whether predictably small or surprisingly large, can tell us a great deal about the Universe we live in. Since the types of physical mechanisms behind deviations from perfect gaussianity, homogeneity, or isotropy are typically very different, we should try to measure these individual features separately—whenever possible or practical.

Recently, it has been suggested that some of the most discussed anomalies in the CMB can be explained away [85], or that the evidence for them is statistically weak [86]. But even if it turns out that our Universe is a plain vanilla kind of place, where everything goes according to the inflationary theorist's dreams, we would still need to analyze it with tools that allow us to check the standard picture against the data. In addition, local physics (related to the solar system, or our galaxy), as well as instrumental quirks, tend to leave imprints on the CMB which are clearly anisotropic, but have a certain coherence which can be detected, and possibly corrected for, with the help of these checks.

However, in an era where at least the large-scale maps of the CMB are likely to remain basically unchanged, we should be careful not to over analize the data with the benefit of an ever greater hindsight (put another way, a posteriori conundrums only get worse with time.) This can only be achieved if we find natural and generally agreed-upon classifications of the types of deviations that may occur, without too much guidance from what the data is telling us. We believe that focusing on the possible underlying symmetries, with perhaps some guidance from group-theoretic arguments, is one way to settle these issues. We have presented a few methods along these lines, one using multipole vectors, the other using a natural generalization of the two-point correlation function (also other methods have been presented in this paper.

Perhaps the best indication that we are on the right track is the fact that most of these methods are applicable in other areas of physics and astronomy, and that in some cases we have adapted tests of anisotropy from other areas, such as scattering theory and the theory of angular momentum in quantum mechanics. So, even if these anomalies eventually perish, they will be survived by the powerful methods that have been devised to test them.


A. Geometrical Identities and Derivations

A.1. Wigner 𝐷-Functions

From the unitarity of the rotation operators 𝐷(𝑅), we have𝑚𝐷𝑚𝑚𝜔1𝐷𝑚𝑚𝜔2=𝐷𝑚𝑚𝜔1𝜔2.(A.1) If 𝜔2=𝜔11, then using the identity 𝐷𝑚𝑚(𝜔1)=𝐷𝑚𝑚(𝜔), we find𝑚𝐷𝑚𝑚𝜔1𝐷𝑚𝑚𝜔1=𝛿𝑚𝑚.(A.2)

A.2. Gaunt Integral

The definition of the Gaunt integral used in this paper is𝒢123𝑚1𝑚2𝑚3=(1)𝑚1𝑑2̂𝑛𝑌1,𝑚1(̂𝑛)𝑌2𝑚2(̂𝑛)𝑌2𝑚2(̂𝑛).(A.3)

A.3. 3-j Symbols

We present here some useful identities related to the 3-j symbols (i) Isotropic limit𝑙1𝑙20𝑚1𝑚20=(1)𝑙2𝑚12𝑙1𝛿+1𝑙1𝑙2𝛿𝑚1,𝑚2.(A.4)(ii) Parity and permutations 𝑙1𝑙2𝑙𝑚1𝑚2𝑚=𝑙𝑙1𝑙2𝑚𝑚1𝑚2=(1)𝑙1+𝑙2+𝑙𝑙2𝑙1𝑙𝑚2𝑚1𝑚=(1)𝑙1+𝑙2+𝑙𝑙1𝑙2𝑙𝑚1𝑚2.𝑚(A.5)(iii) Orthogonality 𝑙1𝑚1=𝑙1𝑙2𝑚2=𝑙2𝑙1𝑙2𝑙3𝑚1𝑚2𝑚3𝑙1𝑙2𝑙3𝑚1𝑚2𝑚3=𝛿𝑙3𝑙3𝛿𝑚3𝑚32𝑙3,+1𝑙2+𝑙3𝑙1=||𝑙2𝑙3||𝑙1𝑚1=𝑙1𝑙(2𝑙+1)1𝑙2𝑙3𝑚1𝑚2𝑚3𝑙1𝑙2𝑙4𝑚1𝑚2𝑚3=𝛿𝑚2𝑚2𝛿𝑚3𝑚3,𝑙𝑚=𝑙(1)𝑙𝑚=𝑙𝑙𝑚𝑚02𝑙+1𝛿,0.(A.6)

The last expression is particularly useful in the derivation of (106).

A.4. Derivation of (92)

We start by equating expressions (89) and(22)0𝑥0200𝑑𝑙,𝑚2+1𝒞4𝜋𝑙𝑚𝑃(cos𝜗)𝑌𝑙𝑚=(̂𝑛)1,𝑚12,𝑚2𝑎1𝑚1𝑎2𝑚2𝑌1𝑚1̂𝑛1𝑌2𝑚2̂𝑛2.(A.7) As mentioned in the main text, the inversion of 𝒞𝑙𝑚 as a function of the 𝑎𝑚's is not a trivial task, since the angles (Θ,Φ,𝜃) depend nonlinearly on the angles (𝜃1,𝜑1,𝜃2,𝜑2). The easiest way to achieve this goal is to pick up a coordinate system where only the 𝜃 dependence is present. After integrating it out, we rotate our coordinate system using three Euler angles to recover back the (Θ,Φ) dependence, which can then be integrated with the help of some Wigner matrices identities. We start by positioning the vectors ̂𝑛1 and ̂𝑛2 in the 𝑥𝑦 plane, that is; we chose ̂𝑛1=(𝜋/2,𝜙1), ̂𝑛2=(𝜋/2,𝜙2). By (90), we then have cos𝜗=cos(𝜙1𝜙2). Using the relation [41]𝑌𝑚𝜋2,𝜙=𝜆𝑚𝑒𝑖𝑚𝜙,𝜆𝑚=(1)(+𝑚)/22+14𝜋𝔄!!(+𝑚)!!𝔅!!(𝑚)!!,if+𝑚2,0,otherwise(A.8) where 𝔄 denotes (+𝑚1) and 𝔅 denotes (𝑚1), we can integrate the 𝜃 dependence on both sides of (A.7). This gives us1𝜋𝑙,𝑚𝒞𝑙𝑚𝑌𝑙𝑚(0,0)=1,𝑚10𝑥0200𝑑2,𝑚2𝑎1𝑚1𝑎2𝑚2𝐼1𝑚12𝑚2,(A.9) where we have introduced the following definition:𝐼1𝑚12𝑚2𝜆1𝑚1𝜆2𝑚2×𝜋0𝑃𝜑cos1𝜑2𝑒𝑖(𝑚1𝜑1𝑚2𝜑2)𝑑𝜑cos1𝜑2.(A.10)

We need now to integrate out the Θ and Φ dependence in the right-hand side of (A.9) which was hidden due to our choice of a particular coordinate system. In order to do that, we keep the vectors ̂𝑛1 and ̂𝑛2 fixed and make a rotation of our coordinate system using three Euler angles 𝜔={𝛼,𝛽,𝛾}. This rotation changes the coefficients 𝒞𝑙𝑚's and