Abstract

The most direct probe of non-Gaussian initial conditions has come from bispectrum measurements of temperature fluctuations in the Cosmic Microwave Background and of the matter and galaxy distribution at large scales. Such bispectrum estimators are expected to continue to provide the best constraints on the non-Gaussian parameters in future observations. We review and compare the theoretical and observational problems, current results, and future prospects for the detection of a nonvanishing primordial component in the bispectrum of the Cosmic Microwave Background and large-scale structure, and the relation to specific predictions from different inflationary models.

1. Introduction

The standard inflationary paradigm predicts a flat Universe perturbed by nearly-Gaussian and scale-invariant primordial perturbations. These predictions have been verified to a high degree of accuracy by Cosmic Microwave Background (CMB) and Large-Scale Structure (LSS) measurements, such as those provided by the Wilkinson Microwave Anisotropy Probe (WMAP) [1], the 2dF Galaxy Redshift Survey (2dFGRS) [2], and the Sloan Digital Sky Survey (SDSS) [3]. Despite this success, it has proved to be difficult to discriminate between the vast array of inflationary scenarios that have been proposed by high-energy theoretical investigations or even to rule-out alternatives to inflation. Since most of the present constraints on the Lagrangian of the inflaton field have been obtained from measurements of the two-point function, or power spectrum, of the primordial fluctuations, a natural step to extend the available information is to look at non-Gaussian signatures in higher-order correlators.

The lowest-order additional correlator to take into account is the three-point function or its counterpart in Fourier space, the bispectrum. Most models of inflation are characterized by specific predictions for the bispectrum of the primordial perturbations in the gravitational potential . The bispectrum of these perturbations is defined as where we have introduced the notation so that the Dirac delta function here is . Together with the assumption of statistical homogeneity and isotropy for the primordial perturbations, this implies that the bispectrum is a function of the triplet defined by the magnitude of the wavenumbers , and forming a closed triangular configuration. The current constraints that we are able to derive on the bispectrum provide additional information about the early Universe; the possible detection of a non-vanishing primordial bispectrum in future observations would represent a major discovery, especially as it is predicted to be negligible by standard inflation.

The cosmological observable most directly related to the initial curvature bispectrum is given by the bispectrum of the CMB temperature fluctuations, which provide a map of the density perturbations at the time of decoupling, the earliest information we have about the Universe. Current measurements of individual triangular configurations of the CMB bispectrum are, however, consistent with zero. Studies of the primordial bispectrum, therefore, are usually characterized by constraints on a single-amplitude parameter, denoted by once a specific model for is assumed. Since most models predict a curvature bispectrum obeying the hierarchical scaling , with being the curvature power spectrum, the non-Gaussian parameter roughly quantifies the ratio defining the “strength” of the primordial non-Gaussian signal. In addition, we can write where encodes the functional dependence of the primordial bispectrum on the specific triangle configurations. For brevity, the characteristic shape-dependence of a given bispectrum is often referred to simply as the bispectrum shape (a precise definition of the bispectrum shape function will be given in Section 2.1). Inflationary predictions for both the amplitude and the shape of are strongly model dependent. Notice that the subscript “” stands for “nonlinear”, since a common phenomenological model for the non-Gaussianity of the initial conditions can be written as a simple nonlinear transformation of a Gaussian field. Generically, of course, non-Gaussianity is associated with nonlinearities, such as nontrivial dynamics during inflation, resonant behaviour at the end of inflation (“preheating”), or nonlinear postinflationary evolution. At the very least, future CMB and LSS observations are expected to be able to eventually detect the last of the three effects mentioned above.

Perturbations in the CMB provide a particularly convenient test of the primordial density field because CMB temperature and polarization anisotropies are small enough to be studied in the linear regime of cosmological perturbations. Once the effects of foregrounds are properly taken into account, a non-vanishing CMB bispectrum at large scales would be a direct consequence of a non-vanishing primordial bispectrum. As we will see, while other CMB probes of primordial non-Gaussianity are available, such as tests of the topological properties of the temperature map based on Minkowski Functionals or measurements of the CMB trispectrum, the estimator for the non-Gaussian parameter has been shown to be optimal. We will focus mostly on this bispectrum estimator in the section of this paper dedicated to the CMB.

In the standard cosmological model, the large-scale structure of the Universe, that is, the distribution of matter and galaxies on large scales, is the result of the nonlinear evolution due to gravitational instability of the same initial density perturbations responsible for the CMB anisotropies. This is, perhaps, the most important prediction of the inflationary framework which provides a common origin for the CMB and large-scale structure perturbations as the result of tiny quantum fluctuations stretched over cosmological scales during a phase of accelerated expansion. The large-scale structure we observe at low redshift, however, is characterized by large voids and small regions with very large-matter density, and it is therefore a much less direct probe of the initial conditions. The distribution of matter becomes a highly non-Gaussian field precisely as a result of the nonlinear growth of structures, even for Gaussian initial conditions. This non-Gaussianity is expressed, in particular, by a non-vanishing matter bispectrum at any measurable scale, including the largest scales probed by current or future redshift surveys. In this context, the effect of primordial non-Gaussianity, that is, of an initial component in the curvature bispectrum, will constitute a correction to the galaxy bispectrum. It follows that the possibility of constraining or detecting this initial component is strictly related to our ability to distinguish it from other primary sources of non-Gaussianity, that is, the nonlinear gravitational evolution, and, in the case of galaxy surveys, nonlinear bias.

The study of non-Gaussian initial conditions for large-scale structure has a relatively long history, with important contributions going back to the mid eighties. The standard picture that has been developed over the years assumed that, at large scales, the effect of primordial non-Gaussianity on the galaxy distribution is simply given in terms of an additional component to the galaxy bispectrum. This is obtained, in perturbation theory, as the linearly evolved and linearly biased initial matter bispectrum, related to the curvature bispectrum by the Poisson equation. Such component becomes subdominant as the gravity-induced non-Gaussian contribution grows in time. In this framework, as one can expect, high-redshift and large-volume galaxy surveys would constitute the best probes of the initial conditions. It has been shown, in fact, that proposed and planned redshift surveys, such as those of Euclid [4], should be able to provide constraints on the primordial non-Gaussian parameters comparable to, if not better than, those expected from CMB missions such as those of Planck. What is more important, in the event of a detection by Planck, is that confirmation by large-scale structure observations will be required.

Recent results from N-body simulations with non-Gaussian initial conditions, however, have revealed a more complex picture. The effect of primordial non-Gaussianity at large scales is not limited to an additional contribution to the galaxy bispectrum, but it quite dramatically affects the galaxy bias relation itself, that is, the relation between the matter and galaxy distributions. A surprising consequence is that it induces a large correction even for the galaxy power spectrum. Such an effect has attracted considerable recent attention and, remarkably, has placed constraints on the non-Gaussian parameter from current LSS datasets which already appear to marginally improve on CMB limits. However, from a theoretical point of view, a proper understanding of the phenomenon is not fully developed yet. For example, reliable predictions for the galaxy bispectrum are not yet available. Most importantly, as for general cosmological parameter estimation, a complete likelihood analysis aimed at constraining, or detecting, primordial non-Gaussianity in large-volume redshift surveys should involve joint measurements of the galaxy power spectrum and bispectrum, as well as possibly higher-order correlation functions. While we are still far from a proper assessment of what such analysis would be able to achieve, current results in this direction are very encouraging.

This review is divided into four parts. In Section 2 we will first discuss initial conditions as defined in terms of the primordial curvature bispectrum and its phenomenology. We will then review the observational consequences of primordial non-Gaussianity on the CMB bispectrum, Section 3, and on the large-scale structure bispectrum as measured in redshift surveys, Section 4. In both cases we will discuss theoretical models for the observed bispectra and technical problems related to the estimation of the non-Gaussian parameters, with the differences that naturally characterize such distinct observables. We also give an example of joint analysis using both CMB and large-scale structure when we consider the possibility of constraining a strongly scale-dependent non-Gaussian parameter , emerging in some recently proposed inflationary models.

2. Initial Conditions and the Primordial Bispectrum

In this section we will briefly overview the main predictions of inflationary models regarding the non-Gaussianity (NG) of the primordial curvature perturbation field. The link between NG of primordial density fluctuations and NG of CMB and LSS will be shown in following sections. In order to provide a full description of an NG random field, all correlators beyond the 2-point function are in principle necessary. However in this review we will focus on the primordial bispectrum (i.e., three-point function in Fourier space). This is not only justified by the fact that the bispectrum is the first and simplest higher-order correlator to look at, but also by the fact that most models of inflation predict vanishingly small correlators beyond the bispectrum. In Section 2.1 we will introduce the relevant quantities, their mathematical definitions, and we will provide a general overview and classification of the bispectra predicted in different inflationary scenarios (only from a purely mathematical point of view, without linking them to the Physics originating them at this stage). Finally, a useful eigenmode expansion technique for bispectra will be introduced in Section 2.2 and applied to the calculation of correlations between different bispectra in Section 3.4. In the same section we will also show which kinds of bispectra are predicted by different models of inflation.

2.1. The Primordial Bispectrum and Shape Function

The starting point for this discussion is the primordial gravitational potential perturbation which was seeded by quantum fluctuations during inflation or by some other mechanism in the very early Universe (. When characterizing the fluctuations we usually work in Fourier space with the (flat space) transform defined through The primordial power spectrum of these potential fluctuations is found using an ensemble average: where we have assumed that physical processes creating the fluctuations are statistically isotropic so that only the dependence on the wavenumber remains . Recall that, for nearly scale-invariant perturbations, the fluctuation variance on the horizon scale is almost constant , implying that .

The primordial bispectrum is found from the Fourier transform of the three-point correlator as Here, the delta function enforces the triangle condition, that is, the constraint that the wavevectors in Fourier space must close to form a triangle: . Examples of such triangles are shown in Figure 1, illustrating the basic squeezed, equilateral, and flattened triangles to which we will refer later. Note that a specific triangle can be completely described by the three lengths of its sides and so, in the isotropic case, we are able to describe the bispectrum using only the wavenumbers . The triangle condition restricts the allowed wavenumber configurations to the interior of the tetrahedron illustrated in Figure 2.

The most studied primordial bispectrum is the local model in which contributions from “squeezed” triangles are dominant, that is, with, for example, (as illustrated in Figure 1(a)). This is well motivated physically as it encompasses “superhorizon” effects during inflation when a large-scale mode (say) which has exited the Hubble radius exerts a nonlinear influence on the subsequent evolution of smaller-scale modes . Although this effect is small in single-field slow-roll inflation, it can be much larger for multifield models. In a weakly coupled regime, the potential can be split into two components: the linear term , representing a Gaussian field, giving the usual perturbation results, plus a small local non-Gaussian term [5]: where is called the nonlinearity parameter. In Fourier space, the nonlinear term is then given by the convolution From this we can infer, using (4), that the only non-vanishing contributions to the bispectrum (5) take the form In the scale-invariant case the power spectrum of the primordial potential takes the form , where defines the amplitude of primordial fluctuations at the end of inflation. Accounting for permutations, the local bispectrum then becomes Although this is a rather pathological function which diverges along the edges of the tetrahedron (i.e., when any ), we can infer from it some basic properties of the bispectrum for any model which is nearly scale invariant. For example, we can observe that the bispectrum at equal has the characteristic scaling If we remove this overall scaling by multiplying (9) by the factor , then we note that on transverse slices through the tetrahedron defined by (see Figure 2 ) the bispectrum only depends on the ratios of the wavenumbers, say, and . Indeed, it can prove convenient to characterize the bispectrum in terms of the following transverse parameters [6, 7]: with the domains and . The volume element on the regular tetrahedron of allowed wavenumbers then becomes .

These considerations lead naturally to the definition of the primordial shape function [8] where is a normalization factor which is often chosen such that is unity for the equal case; that is, (we will discuss alternatives to this rather arbitrary convention later). For example, the canonical “local” model (9) has the shape Thus it is usual to describe the primordial bispectrum in terms of an overall amplitude and a transverse two-dimensional shape , which incorporates any distinctive momentum dependence. Of course, if there is a nontrivial scale dependence, then the full three-dimensional dependence of on must be retained.

There are other physically well-motivated shapes in the literature which have also been extensively studied. The simplest shape is the constant model which, like the local model, has a large-angle analytic solution for the CMB bispectrum [9]. The local model tends to be the benchmark against which all other models are compared and normalized, but for practical purposes the constant model is much more useful, given its regularity at both late and early times. The equilateral shape is another important case with [8] While being not derived directly from a physical model, it has been chosen phenomenologically as a separable ansatz for higher-derivative models [10] and DBI inflation [11]. The equilateral shape is contrasted with the local model in Figure 3.

Another important early result was the primordial bispectrum shape for single-field slow-roll inflation derived by Acquaviva et al. [12] and Maldacena [13]: where are the usual slow-roll parameters. In the second line, we have noted that this shape can be accurately represented as the superposition of local and equilateral shapes. The coefficients in (16), which include the scalar spectral index , confirm that and so standard single slow-roll inflation cannot produce an observationally significant signal. Nevertheless, it is interesting to determine which shape is dominant in (16) and to what extent other primordial shapes are independent from one another.

Whether two different primordial shapes can be distinguished observationally can be determined from the correlation between the corresponding two CMB bispectra weighted for the anticipated signal-to-noise-ratio, as in the estimator (see next section) and the Fisher matrix analysis (see Section 3.8). However, direct calculations of the CMB bispectrum can be very computationally demanding. A much simpler approach is to determine the independence of the two shape functions and from the correlation integral (see [9], and also the study by Babich et al. in [8]) where we choose the weight function to be reflecting the primary scaling of the CMB correlator. The shape correlator is then defined by Here, the integral is over the tetrahedral region shown in Figure 2 taken out to a maximum wavenumber corresponding to the experimental range for which forecasts are sought (with with being the present-day conformal time). The weight function appropriate for mimicking the large-scale structure bispectrum estimator (see Section 4.3.2) would be different with varying scaling laws introduced by the transfer functions for wavenumbers above and below , the inverse comoving horizon at equal matter-radiation. Nevertheless, the weight given in (18) provides a compromise between these scalings, and so shape correlation results should offer a useful first approximation.

Below we will survey primordial models in the literature, showing how close the shape correlator comes to a full Fisher matrix analysis. However, here we note that the local shape (13) and the equilateral shape (54) have only a modest 46% correlation. For the natural values of the slow-roll parameters we find the somewhat surprising result that is 99.7% correlated with (and it cannot be easily tuned otherwise because is not consistent with deviations from scale invariance favored observationally ). Such strong correspondences are important in defining families of related primordial shapes, thus reducing the number of different cases for which separate observational constraints must be sought.

2.2. General Primordial Bispectra and Separable Mode Expansions

The three shape functions (13), (14), and (54) quoted above share the important property of separability; that is, they can be written in the form or as the sum of just a few such terms. As we will see, if a shape is separable, then the computational cost of evaluating the corresponding CMB bispectrum is dramatically reduced. In fact, without this property, the task of estimating whether a nonseparable bispectrum is consistent with observation appears to be intractable (for large ). Of course, the number of models which can be expressed directly in the form of (20) is very limited, despite the usefulness of approximate ansätze such as the equilateral shape (54). Indeed, approximating nonseparable shapes by educated guesses for the separable functions is neither systematic nor computationally efficient (because arbitrary nonscaling functions create numerical difficulties, as we will explain later).

Instead, we will present a separable mode expansion approach for efficient calculations with any nonseparable bispectrum, as described in detail by Fergusson et al. in [14] (and originally proposed in [6]). Our aim will be to express any shape function as an expansion in mode functions where, here, for convenience, we have represented the symmetrized products of the separable basis functions as with a one-to-one mapping ordering the products as . The important point is that must be an independent set of well-behaved basis functions which can be used to construct complete and orthogonal three-dimensional eigenfunctions on the tetrahedral region defined by (see Figure 2) The introduction of the cutoff at is motivated by both separability and the correspondence with the observational domain In the shape correlator (19), we have already seen what is essentially an inner product between two shapes on this tetrahedral region, which we can define for two functions as

Satisfactory convergence for known bispectra can be found by using simple polynomials in the expansion (21), that is, using analogues of the Legendre polynomials on the domain (23). With unit weight, the polynomials satisfying can be found by generating functions with the first three given by [14] The first few polynomials are plotted in Figure 4, where they are contrasted with the Legendre polynomials.

The three-dimensional separable basis functions in (2) reflect the six symmetries of the bispectrum through the permuted sum of the product terms. They could have been constructed directly from simpler polynomials, such as however, the polynomials have two distinct advantages. First, the 's confer partial orthogonality on the and, secondly, these remain well behaved when convolved with transfer functions.

In order to rapidly decompose an arbitrary shape function into the coefficients , it is more convenient to work in a nonseparable orthonormal basis (. These can be derived directly from through Gram-Schmidt orthogonalization, so that with being a lower triangular matrix (see [14]). Thus we can find the unique shape function decomposition

In the orthonormal frame, Parseval's theorem ensures that the autocorrelator is simply . Hence, with a simple and efficient prescription we can construct separable and complete basis functions on the tetrahedral domain (23) providing rapidly convergent expansions for any well-behaved shape function . These eigenmode expansions will prove to be of great utility in subsequent sections. Examples of this bispectral decomposition and its rapid convergence for the equilateral and DBI models are shown in Figure 5.

2.3. Families of Primordial Models and Their Correlations

We will now briefly survey the main categories of primordial models in the literature and their relative independence, closely following the discussion by Fergusson and Shellard in [9].

2.3.1. The Constant Model

The constant model (14) is the simplest possible primordial shape with triangles of every configuration contributing equally to the bispectrum ; it is the equipartition model. The constant model was motivated initially by its simplicity [9] leading to an analytic solution for the large-angle CMB bispectrum, as well as due to its close correlation with equilateral models. However, the shape does have a more explicit physical motivation in at least one context [15], during multifield inflation for a slowly turning trajectory (denoted as quasi-single-field inflation). For multifield inflation, it is well known that the conversion of isocurvature fluctuations into curvature fluctuations during “corner-turning” can source significant non-Gaussianity (see, e.g., [7, 16]). In the quasi-single-field case with mass isocurvature modes, a detailed investigation of the ongoing conversion into the curvature mode demonstrated that novel shapes could be generated [15], amongst them are the shapes which were very nearly constant. Generically, these model-dependent shapes belonged to a one-parameter family which interpolated nontrivially between equilateral (54) and local (13) shapes (see also [17, 18]). This is an important caveat for the present discussion, because non-Gaussian searches could uncover shapes intermediate between the categories we will discuss below.

2.3.2. Equilateral Triangles—Centre-Weighted Models

Bispectra dominated by contributions from nearly equilateral triangle configurations, can be fairly easily characterized analytically and are the most amenable to CMB searches. However, equilateral non-Gaussianity requires that the amplification of nonlinear effects around the time modes exit the horizon, which is not possible in a slow-roll single-field inflation. Instead, the kinetic terms in the effective action must be modified as in the Dirac-Born-Infeld (DBI) model [11] or by explicitly adding higher-derivative terms, such as in K-inflation (see, e.g., [19]). The resulting corrections modify the sound speed and inflation is able to take place in steep potentials. For DBI inflation, this leads to non-Gaussianity being produced with a shape function of the form [10, 11] Another example of a model with nonstandard kinetic terms is ghost inflation [20] with a derivatively coupled field driving inflation and a trilinear term in the Lagrangian creating a nonzero equilateral-type shape tending towards constant.

General non-Gaussian shapes arising from modifications to single-field inflation have been extensively reviewed in [19]. Using a Lagrangian that was an arbitrary function of the field and its first derivative, they were able to identify six distinct shapes describing the possible non-Gaussian contributions. Half of these had negligible amplitude being of the order of slow-roll parameters (with two already given in (16)). Of the remaining three shapes (see [19], and also [21]), one was believed to be subdominant and the second recovered the DBI shape (27), leaving a third distinct single-field shape which is the inverse of the local shape (13): . Finally, we recall the original equilateral shape (54), noting that it was introduced not because of a fundamental physical motivation, but as a separable approximation to the DBI shape (27) [8].

Despite the apparent visual differences between these shapes (see [9]), particularly near the edges of the tetrahedral domain, the shape correlator (19) reveals at least a 95% or greater correlation of the DBI, ghost, and single shapes to the equilateral shape (54) (consistent with results in [8, 22]). Comparative results between the shape correlator are given in Table 1 (together with the corresponding CMB correlation results brought forward and showing the efficacy of these estimates). These particular centre-weighted shapes must be regarded as a single class which would be extremely differentiate observationally, without a bispectrum detection of very high significance.

Finally, we comment on the “orthogonal” shape proposed by Smith et al. in [18], together with for characterizing single-field inflation models with an approximate shift symmetry (see also [19]). This shape is approximately , which means that it is very similar to an earlier study of flattened shapes [23] which proposed an “enfolded” shape with . From the eigenmode decomposition (26) of the equilateral model shown in Figure 5, it is clear how the degree of correlation can be altered by subtracting out the important constant term. With the specific choice of constant term used in the analysis by Smith et al. [18], one gets a correlation of about 30% between both local and equilateral shapes and the orthogonal ansatz.

2.3.3. Squeezed Triangles—Corner-Weighted Models

The local shape covers a wide range of models where the non-Gaussianity is produced by local interactions. These models have their peak signal in “squeezed” states where one is much smaller than the other two due to non-Gaussianity typically being produced on superhorizon scales. We have already observed that single-field slow-roll inflation (16) is dominated by the local shape [24], though is tiny [12, 13, 24]. The production of non-Gaussianity during multiple-field inflation [7, 16, 21, 2530] shows much greater promise through conversion of isocurvature into adiabatic perturbations (see, e.g., recent works in [15, 17, 31, 32] and references therein). The magnitude of the non-Gaussianity generated is normally around , which is at the limit for Planck detection, but models can be tuned to create larger signals. Significant can be produced in curvaton models with [3335]. Large can also be generated at the end of inflation from massless preheating or other reheating mechanisms [3638].

We note that local non-Gaussianity can also be created in more exotic scenarios. Models based on nonlocal field theory, such as -adic inflation, can have inflation in very steep potentials. Like single-field slow-roll inflation, the predicted “nonlocal” shape function is a combination of a dominant local shape (13) and an equilateral shape (54) (see, e.g., [3942]). The ekpyrotic model can also generate significant [4347]. Here the density perturbations are generated by a scalar field rolling in a negative exponential potential, so nonlinear interactions are important with .

In using the shape correlator for the local model, we must introduce a small-wavenumber cutoff, taken to be ; otherwise the shape correlator becomes infinite. This logarithmic divergence does not afflict the CMB bispectrum because we do not consider contributions below the quadrupole (a threshold which is approximated by the primordial cutoff). The local shape is modestly correlated at the 40%–55% level with the equilateral shapes, mainly through the constant term in the expansion (26). As can be seen in Table 2, this somewhat underestimates the CMB correlator. Nevertheless, a NG signal of only modest significance should be able to distinguish between these independent models.

Finally, warm inflation scenarios, that is, models in which dissipative effects play a dynamical role, are also predicted to produce significant non-Gaussianity [48, 49]. Contributions are again dominated by squeezed configurations but with a different more complex shape possessing a sign flip as the corner is approached (see Figure 6). This makes the warm and local shapes essentially orthogonal with only a 33% correlation. Again, in using the shape correlator, we need to introduce the same phenomenological cutoff as for the local model, but we also note the more serious concern which is the apparent breakdown of the approximations used to calculate the warm inflation shape near the corners and edges.

2.3.4. Flattened Triangles—Edge-Weighted Models

It is possible to consider inflationary vacuum states which are more general than the Bunch-Davies vacuum, such as an excited Gaussian (and Hadamard) state (see [50], and also discussions in [19, 23]). Observations of non-Gaussianity in this case might provide insight into trans-Planckian physics. The proposed shape for the bispectrum is The bispectrum contribution from early times is dominated by flattened triangles, with, for example, and for a small sound speed can be large. Unfortunately, as the divergent analytic approximation breaks down at the boundary of the allowed tetrahedron, some form of cutoff must be imposed, as shown for the smoothed shape in Figure 6 where an edge truncation has been imposed together with a Gaussian filter. The lack of compelling physical motivation and ill-defined asymptotics make predictions for this model uncertain.

2.3.5. Features—Scale-Dependent Models

There are also models in which the inflation potential has a feature, providing a break from scale invariance. This can take the form of either a step [51] or a small oscillation superimposed onto the potential [52]. Analytic forms for both by these three-point functions have been presented by Chen et al. in [53] with one approximation taking the form where is the associated scale of the feature in question and is a phase factor. Results for the shape correlator for a particular feature model (with and ) are given in Table 2, showing that it is essentially independent of all of the other shapes. Clearly, scale-dependent feature models form a distinct fifth category beyond equilateral, local, warm, and flat shapes.

3. Cosmic Microwave Background

Non-Gaussian initial conditions at the end of inflation (or produced by alternative models for the generation of primordial perturbations) can produce observable signatures in both the CMB and LSS. It is clear that an eventual detection of such signatures would be of great scientific interest. Cosmological measurements of primordial NG would indeed allow to constrain and discriminate between the different candidate scenarios of primordial inflation that have been briefly reviewed in the previous section in terms of their bispectrum prediction. This section is devoted to study the CMB bispectrum produced by a non-Gaussian primordial curvature perturbation field. We will start in Section 3.1 by determining how the primordial curvature bispectrum propagates to the observed bispectra of CMB temperature and polarization anisotropies. At the end of this section, we will obtain a formula expressing the CMB bispectrum as a convolution of the primordial one with suitable radiation transfer functions. The latter encode all of the radiative and gravitational effects producing the observed pattern of CMB anisotropies starting from a given primordial potential. This result will not come as a surprise to the reader familiar with CMB theory, since it is completely analogous to the formula relating the power spectrum of primordial curvature perturbations to the CMB angular power spectrum. We will be simply recasting the same formalism in terms of higher-order correlator. Armed with this useful relation, we will revisit the various models described in the previous section and calculate their corresponding CMB bispectra. This will be done in Sections 3.2 and 3.3 where we will employ the useful distinction between separable and nonseparable bispectra already introduced before and discuss its various implications. We will then compare the bispectra predicted in different scenarios in order to find out whether different models produce observationally distinguishable shapes (and thus whether primordial NG is a viable tool to discriminate between different theories of the Early Universe). This will be done in Section 3.4 by means of a suitably defined shape correlator that will tell us “how similar” two CMB bispectrum shapes are. After this preliminary work of definition and classification of various bispectra, in Section 3.5 we will finally deal with the problem of extracting the CMB bispectrum from the data in order to produce statistical estimates of the level of primordial non-Gaussianity and compare the NG measurements to theoretical predictions.

3.1. The CMB Bispectrum

In this section we will study the connection between the primordial bispectrum at the end of inflation and the observed bispectrum of CMB anisotropies . Our work will be primarily concerned with the analysis of the three-point function induced by a NG primordial gravitational potential in the CMB temperature fluctuation field. Temperature anisotropies are represented using the coefficients of a spherical harmonic decomposition of the cosmic microwave sky: Analogous expansions are performed for the -mode polarization field in order to produce polarization multipoles . For simplicity and clarity, throughout most of this review we will focus on the temperature multipoles and omit the superscript for convenience of notation. However we stress here that all of the considerations we make in the following can be readily applied to polarization multipoles and related bispectra. More discussion about this subject can be found in Section 3.8.2.

The primordial potential is imprinted on the CMB multipoles by a convolution with transfer functions representing the linear perturbation evolution, through the integral The radiation transfer functions encode all of the typical effects observed in the CMB power spectrum at linear order, that is, the Sachs-Wolfe effect, Integrated Sachs-Wolfe effect, acoustic peaks, and silk damping (see, e.g., [54, 55]). An equation identical to (31) produces the E-mode polarization CMB multipoles starting from the primordial temperature fluctuation field, provided that polarization transfer functions replace temperature transfer functions in the convolution above. It is sometimes useful to rewrite (31) in position, rather than Fourier, space. In this case it is straightforward to show that (31) becomes where, starting from the primordial potential , we transform from Cartesian into polar coordinates and defined In this expression is the spherical Bessel function of order . The CMB bispectrum is the three-point correlator of , so by substituting, we obtain where in the last line we have integrated over the angular parts of the three , having inserted the exponential integral form for the delta function in the bispectrum definition (5). The last integral over the angular part of is known as the Gaunt integral, which can be expressed in terms of Wigner- symbols as (for more details on these functions and their properties, see, e.g., [56] and references therein)

Given that most theories we will consider are assumed to be isotropic, the -dependence can be factorized out of the physically relevant part of the bispectrum [57]. It is then usual to work with the angle-averaged bispectrum, Or the even more convenient reduced bispectrum which removes the geometric factors associated with the Gaunt integral From the previous two formulae we also derive the following useful relations between the full, averaged, and reduced bispectra:

The reduced bispectrum from (34) then takes the much simpler form This is the key equation in this section, since it explicitly relates the primordial bispectrum, predicted by inflationary theories, to the reduced bispectrum observed in the cosmic microwave sky. This formula is entirely analogous to the well-known relation linking the primordial curvature power spectrum and the CMB angular power spectrum that is,

Finally, it is important to note that the Gaunt integral in (41) encodes several constraints on the angle-averaged bispectrum which are no longer transparent in the reduced bispectrum . These are as follows. (1)The sum of the three multipoles must be even (to ensure parity invariance). (2)The 's satisfy the triangle condition (to enforce rotational invariance).

Analogous to the wavenumber constraint (23), the second condition tells us that the only multipole configurations giving nonzero contributions to the bispectrum are those that form a closed triangle in harmonic (-)space. For wavenumbers, the triangle condition is enforced through the -integral over the three spherical Bessel functions which evaluates to zero if the 's cannot form a triangle, whereas in multipole space it is enforced by the angular integration over the spherical harmonics in (40).

3.2. Separable Primordial Shapes and CMB Bispectrum Solutions

In terms of the shape function (12), the reduced bispectrum (44) can be rewritten as The expression above can be simplified, and simple analytic solutions can sometimes be obtained for the very important class of separable shapes obeying the ansatz , as in (20). Substituting (20) into (46), we find that where we have defined the quantities

Instead of the three-dimensional integral of (46), we now have to deal with a much more tractable product of three one-dimensional integrals. Moreover, if we work at large angular scales in the Sachs-Wolfe approximation, the transfer functions become , where and represents respectively, the present-day conformal time and the conformal time at decoupling. The presence of a product of spherical Bessel functions in the integrals above can lead in some cases to simple analytic solutions.

Let us demonstrate this for the separable primordial shapes considered in Section 2. The simplest possible shape, the constant model (14) with , has a large-angle analytic solution for the reduced bispectrum [9]:

The large-angle solution (49) is an important benchmark with which to compare the shape of late-time CMB bispectra from other models (note the scaling). The more general constant solution does not have an analytic solution because the transfer functions cannot be expressed in a simple form, but it can be evaluated numerically from the expression

The numerical solution is shown in Figure 7, exhibiting a regular pattern of acoustic peaks introduced by the oscillating transfer functions.

For the local shape (13), the Sachs-Wolfe approximation also yields a large-angle analytic solution where the divergences for the squeezed triangles () in the primordial shape (13) are also reflected in It is straightforward, in principle, to calculate the full bispectrum from the separable expressions arising from (13): where the separated integrals analogous to (50) become However, we note that these highly oscillatory integrals must be evaluated numerically with considerable care.

For the equilateral shape (54) we first make its separability explicit by expanding the expression in the form: While there is no simple large-angle analytic solution known for the equilateral model, it can be evaluated from the simplified expression where   are given in (53) and  , in the scale-invariant case, are defined by (compare with the local case)

Before concluding this section we would like to note how all of the solutions presented here present a characteristic scaling. This is just a direct consequence of the scaling of the primordial bispectrum in the scale-invariant case, and it is a model-independent result. It is analogous to the typical scaling displayed by the angular power spectrum in correspondence to a scale-invariant primordial power spectrum .

3.3. Nonseparable Bispectra Revisited

Recall the mode expansion (21) of a general nonseparable primordial shape. If we substitute this into the expression for the reduced bispectrum (46), then the separability of the expansion leads to the same efficient calculation route discussed in the previous section through [14]: where simply result from convolving the basis functions with the transfer functions The computationally costly 3D integrals have again reduced to a sum over products of 1D integrals; we note that this economy arises because the triangle condition is enforced in (56) through the product of Bessel functions, resulting in a manifestly separable form in which we can interchange orders of integration. With this mode expansion, all nonseparable theoretical CMB bispectra become efficiently calculable provided that there is a convergent expansion for the shape function.

In the same way that we decomposed an arbitrary primordial shape in Section 2.2, it is possible to construct analogous late-time separable basis functions and orthonormal modes with which to describe the CMB bispectrum [6, 9]. The tetrahedral domain defined by the triangle condition for multipole configurations is essentially identical to that for wavenumbers (23), except that only even cases contribute . However, the appropriate weight function now incorporates Wigner- symbols arising from bispectrum products: where in the second expression we have exploited the freedom to divide by a separable function and use a weight which makes the bispectrum functions more scale invariant (eliminating an factor—see below). The inner product between two functions and is altered from the primordial wavenumber integral (59) into a sum over multipoles on the tetrahedral domain; that is, But for the change in the weight (which only affects configurations near the edges of the tetrahedron), the 1D polynomials and the 3D separable product basis functions (), as well as the resulting orthonormal modes , are nearly identical to their primordial counterparts , and defined in Section 2.2.

We can now expand an arbitrary CMB bispectrum in both the separable and orthonormal mode expansions, which is achieved in the following form: where the variance term reflects the signal-to-noise weighting expected in the CMB estimator (see Section 3.5). Again, the coefficients in the expansions are determined, first, from the orthonormal inner products and, secondly, the separable are found with the transformation matrix analogous to (26). Examples of the convergence of these mode expansions for equilateral, DBI, and cosmic string CMB bispectra are given in Figure 5.

3.4. CMB Bispectrum Calculations and Correlations

Prior to the systematic mode expansion approach (56) being implemented, robust hierarchical schemes were developed to calculate any nonseparable CMB bispectrum (46) directly [6, 9]. These use the transverse coordinate system given in (11) and employ adaptive methods on a triangular grid to accurately determine the oscillatory 2D -integrations, with important efficiencies also coming from the flat sky approximation, binning, and interpolation schemes. Precision to greater than 1% across the full Planck domain was established by direct comparison with analytic solutions such as (49) and (51). Examples of nonseparable (and separable) CMB bispectra found using these hierarchical coarse-graining methods are shown in Figures 7 and 8. While the CMB bispectra retain the qualitative features of the primordial shape functions , they are overlaid with the oscillatory transfer functions which give rise to a coherent pattern of acoustic peaks. These direct bispectrum calculations revealed that typical primordial models could be described by eigenmode or other expansions using only a limited number of terms.

Motivated by the form of the CMB estimator, we can define the following correlator to determine whether or not two competing theoretical bispectra can be distinguished by an ideal experiment: where the normalization is defined as The emergence of the inner product (59) in the expression (61) means that substitution of the mode expansions (60) for the theoretical bispectra reduces the correlator to While the late-time correlator (61) is the best measure of whether two CMB bispectra are truly independent, it can be demonstrated that for the majority of models the shape correlator (19) introduced earlier is sufficient to determine independence.

On the basis of the direct calculation of the bispectrum results and the CMB correlator, we can now quantitatively check the forecasting accuracy of the primordial shape correlator proposed previously (again closely following the discussion in [9]).

3.4.1. Nearly Scale-Invariant Models

For nearly scale-invariant models, the centre values for the bispectrum all have roughly the same profile but with different normalisations. As we see from Figure 8, the oscillatory properties of the transfer functions for the CMB power spectrum create a series of acoustic peaks for any combinations involving the following multipole values: . Of course, to observe the key differences between the scale-invariant models we must study the bispectrum in the plane orthogonal to the -direction, that is, the directions reflecting changes in the primordial shape functions. To plot the bispectrum (see Figures 7 and 8), we consistently divide by the large-angle CMB bispectrum solution for the constant model (14). This is analogous to multiplying the power spectrum 's by , because it serves to remove the overall scaling of the bispectrum, flattening while preserving the transverse momentum-dependence primordial shape, and the effects of the oscillating transfer functions.

The starting point is the constant model (14) which, despite its apparent simplicity, has a CMB bispectrum revealing a nontrivial and coherent pattern of acoustic peaks that we have already noted (see Figure 7). Given that the constant model has no momentum dependence, we stress that the resulting bispectrum is the three-dimensional analogue of the angular power spectrum for a scale-invariant model. The largest (primary) peak, for example, is located where all three (corresponding to the large blue region near the origin). We can interpret Figure 7, therefore, as the pure window function or beam effect of convolving any model with the radiation transfer functions while transforming from Fourier to harmonic space.

The CMB bispectrum for the equilateral model is plotted in Figure 8, showing how the centre weighting from the primordial shape is well preserved despite the convolution with the oscillating transfer functions. For the full CMB correlator (61), the DBI, ghost, and single shapes are generally even more closely correlated with the equilateral model, presumably because distinctive features are “washed out” by the transformation from Fourier to harmonic space. Comparative results between the shape correlator and the Fisher matrix analysis are given in Table 1, establishing that these models are highly correlated and difficult to set apart observationally.

The CMB bispectrum for the local model is also shown in Figure 8, demonstrating a marked contrast with the equilateral model which reflects their different primordial shapes shown in Figure 3. The dominance of the signal in the squeezed limit creates strong parallel ridges of acoustic peaks which connect up and emanate along the corner edges of the tetrahedron (see [58] for further details). The 51% CMB correlation between the local and equilateral models is underestimated by the shape correlator at 41%, presumably because of effective smoothing due to the harmonic analysis. Reflecting their distinctive primordial properties, the CMB bispectra for the flat and warm models are poorly correlated with most of the other models, though the flat shape could be susceptible to confusion with the local CMB bispectrum with which it has a larger correlation (see Table 2). It is clear that the local, equilateral, warm, and flat shapes form four distinguishable categories among the scale-invariant models.

3.4.2. Scale-Dependent Models, Cosmic Strings, and Other Late-Time Phenomena

Models which have a nontrivial scaling, such as the feature models, can have starkly contrasting bispectra as illustrated in Figure 8. For example, instead of having the same pattern of acoustic peaks which characterise the scale-invariant models, the feature model can become entirely anticorrelated so that the primary peak has the opposite sign. Later, for this particular choice of in (29), for increasing the phase of the oscillations becomes positively correlated by the second and third peaks. This can lead to small correlation with the other primordial shapes, all below 45% as shown in Table 2 for this and . Clearly, these nonseparable feature models form a distinct fifth category beyond the four scale-invariant shapes noted above and, of course, there are many possible model dependencies which can lead to further subdivision.

By way of further illustration of the breadth of other possible nonseparable CMB bispectra, we present the late-time CMB bispectrum predicted analytically for cosmic strings [59] as where , , , and Here, is a model-dependent amplitude with measuring the string tension relative to the Planck scale. The cutoffs around in (51) are associated with the string correlation length at decoupling (perturbations with can only be causally seeded after last scattering). Here, the nonseparable nature and very different scaling of the string CMB bispectrum are clear from a comparison with (51). Moreover, given the late-time origin of this signal from string metric perturbations, the modulating effect of acoustic peaks from the transfer functions is absent, as is clear from Figure 8. This is just one example of late-time phenomena such as gravitational lensing, secondary anisotropies, and contaminants which are accessible to analysis using the more general CMB mode expansions (60).

3.5. The Estimation of from CMB Bispectra

In light of the previous discussion, it is evident how measurements of the bispectrum from CMB experimental datasets are able to provide information about the primordial three-point function of the cosmological curvature perturbation field at the end of inflation. This in turn allows us to put significant constraints on inflationary models or on alternative models for the generation of cosmological perturbations. We will now start dealing with the problem of bispectrum estimation in the CMB as a test of primordial non-Gaussianity.

Let us assume that we have measured the three-point function of a given CMB dataset. There are now two general ways to exploit this information.(1)Tests of the Gaussian Hypothesis. By comparing the measured three-point function to its expected distribution obtained from Gaussian simulation we can detect whether some configurations present a significant deviation from Gaussian expectations. The issue with this approach is that it is sensitive not only to primordial non-Gaussianity, but also to any other possible source of NG, including those of noncosmological origin. Original bispectrum tests of this kind on COBE maps [60] revealed significant deviations from Gaussianity in the data. This NG signature in the three-point function seemed to be localized in harmonic space around multipoles and was object of much scrutiny (see, e.g., [6165]). It was then finally ascertained that the detected signal was not cosmological in origin, but due to a systematic artifact [66]. Moreover, the overall statistical significance of the result disappeared in a later analysis involving the measurement of all of the bispectrum modes available in the map [67] (only a subset of all the configurations had been studied before). General tests of Gaussianity are very useful to identify unexpected effects in the data, and to monitor systematics. However, as long as we are interested in a primordial NG signal, it is better to follow the approach of making an ansatz for the bispectrum we expect from the theory under study and obtain a quantitative constraint on a given model. This approach is outlined in the point that follows.(2) Estimation. In this case we choose the primordial model that we want to test, characterizing it through its bispectrum shape. We then estimate the corresponding amplitude from the data. If the final estimate is consistent wih , then we conclude that no significant detection of the given shape is produced by the data, but we still determine important constraints on the allowed range of . Note that ideally we would like to do more than just constrain the overall amplitude, and reconstruct the entire shape from the data by measuring single configurations of the bispectrum. However, the expected primordial signal is too small to allow the signal from a single-bispectrum triangle to emerge over the noise. For this reason we study the cumulative signal from all of the configurations that are sensitive to .

Since in this review we are concerned with the study of the primordial bispectrum, we will take the latter approach and deal with the problem of estimation from measurements of the bispectrum in CMB maps. We will first present a cubic estimator that optimally extracts the information from the data contained in the bispectrum (Section 3.5.1). We will then address the issue of understanding whether this optimal cubic statistic extracts all of the possible information available on in the data or whether there is enough additional information beyond the three-point function to allow more precise measurements using non-bispectrum-based estimators of (Section 3.5.2). We will then discuss concrete numerical implementations of bispectrum estimators (Section 3.6) and review the experimental constraints on obtained from bispectrum analysis of WMAP data (Section 3.7). Using a standard Fisher matrix analysis, forecasts on the error bars are achievable for future CMB surveys (Section 3.8). Following, we will study the NG signals in the map that could contaminate the primordial NG measurement and how they are dealt with when analyzing the data. Finally we will describe algorithms for the simulation of primordial NG CMB maps that are useful for testing and validation of estimators before applying them to real data.

In the following, we assume that the reader is familiar with essential concepts in statistical estimation theory, such as the definition of a statistical estimator, the role played by maximum-likelihood estimators in statistics, the definitions of unbiasedness and optimality, and the definition and main applications of the Fisher information matrix. The reader unfamiliar with these concepts can consult the appendix of this review and references therein.

3.5.1. Bispectrum Estimator of

In this section we are concerned with the statistical inference of from measurements of the bispectrum of the CMB anisotropies. We recall that we defined earlier as the amplitude of the bispectrum of the primordial potential. In principle, we can include both temperature and polarization multipoles in the analysis, in order to maximize the available data. However, for clarity we will consider only temperature multipoles in the following and omit the superscript in , for simplicity of notation. The extension to polarization is conceptually straightforward and will be discussed in a following paragraph. We will start by considering a simple cubic (The estimator is dubbed cubic due to the fact that it contains the third power of the random variable ) statistic written in the form In the previous equation represents the statistical estimate of from the data, are the multipoles of the observed CMB temperature fluctuations, are some weight functions, and is a normalization factor that has to be chosen to make the estimator unbiased, that is, to ensure that

We now want to find the weights that provide the best estimator (i.e., the minimum error bar estimator) within the class of cubic statistics written in the form of (66). It is a well-known result (see Appendix) that the best unbiased estimator of a parameter from a given dataset is the maximum-likelihood estimator. In order to answer our question we then have to write the bispectrum likelihood as a function of the parameter and maximize with respect to .

In the assumption that the bispectrum configurations are characterized by a Gaussian distribution (This is not strictly true, but it is a good approximation. The same approach applies to most cosmological observables.), maximizing the likelihood is equivalent to minimizing the following : where is the observed angular averaged bispectrum, that is, by definition and is the bispectrum variance, that is, the six-point function We will now make the assumption that we are working in the weak non-Gaussian limit; that is, is small and the distribution of can be approximated as Gaussian in the calculation of the variance. The implications of this approximation will be discussed in greater detail in the following sections; for the moment it will suffice to point out that the weak non-Gaussian approximation is generally a good one since most inflationary models predict to be small, and because the level of primordial non-Gaussianity is already constrained to be small by WMAP measurements [1, 68]. After restricting indices so that and , the six-point function above can be calculated using Wick's theorem, yielding [57] In the last formula is a permutation factor that takes the value of when all 's are different, when two 's are equal, and when all 's are equal. We can now substitute (71) into (69) and differentiate with respect to to get an explicit expression for the optimal cubic statistic we were looking for: where is the reduced bispectrum and is the Gaunt integral defined by equation (40); is the normalization factor mentioned at the beginning of the paragraph that guarantees the unbiasedness of the estimator.

Note that the noise and window function of the experiment are included in the and that appear in the formula above, with the following replacements: where is the window function (not to be confused with the weights ) and is the noise power spectrum (constant for uncorrelated white noise). The noise is assumed to be Gaussian, thus characterized by a vanishing three-point function. Comparing our result (72) to the initial ansatz (66), we then see that the optimal weights are In other words we are weighting the observed bispectrum by its expected signal-to-noise ratio.

We have now constructed a statistic that optimally extracts the information about from the bispectrum of the map. The question now is the following: is there additional information about in the map that is not contained in the bispectrum? This issue will be investigated in the following sections. For the impatient reader we anticipate that the answer is no: the bispectrum statistic built here is actually the minimum error bar estimator of from CMB data.

3.5.2. Optimality of the Cubic Estimator

In this section we address the issue of whether the cubic statistic (72) optimally extracts all of the information contained in or whether other statistical estimators (e.g., four-point function, or pixel space statistics such as the Minkowski functionals, or again wavelet estimators, just to mention a few among many possible examples) are able to produce smaller error bars and are thus more efficient than the bispectrum.

In a non-Gaussian primordial CMB map, the likelihood depends on the NG parameter . We will indicate it with , where indicates a vector including all of the 's (we will assume that all other cosmological parameters are fixed and concentrate on ). It is a well-known result in parameter estimation theory that there is a lower limit on the error bars that can be assigned to a given parameter (in our case ). Such lower limit, also known as the Rao-Cramer bound, is defined in terms of the Fisher matrix as (We again refer the reader unfamiliar with these concepts to the brief summary provided in appendix.) We remind the reader that the Fisher matrix is defined as If we can show that the bispectrum estimator of the previous section saturates the Rao-Cramer bound for the Fisher matrix above, then we conclude that it provides the best (i.e., minimum variance) estimate of from the data, rather than just the best estimate from the bispectrum of the data. In other words, no more information about could be extracted from than the information contained in the bispectrum. The aim of this section is to show that this is actually the case.

The issue of the optimality of bispectrum estimators of was addressed in great detail by Babich in [69]. In this section we will basically review the main results of that study, referring the reader to the original paper for their complete derivation.

As we mention in appendix, there is a sufficient and necessary condition for an estimator to saturate the Rao-Cramer bound, expressed by formula (A.5). This condition, applied to our case, reads Our aim is to show that the bispectrum statistic (72) satisfies this condition. We then need to start from a computation of the full likelihood for a general primordial non-Gaussian model. Following Babich [69], we will start by limiting ourselves to the particular case of the local model.

We recall from a previous section that local NG is the only case for which an explicit expression for the primordial potential is provided. In real space, Starting from this formula, it is possible to obtain a likelihood function for , dependent on the parameter . This is done by means of an expansion in terms of the order parameter . The full expression for the Probability Density Function (PDF) (see [69]) can be expanded around its Gaussian expectation for and schematically written as where is the covariance matrix of the Gaussian part of the potential that is,

Formula (80) is then telling us that the logarithm of the full likelihood can be decomposed into the sum of a Gaussian likelihood , plus a NG term that depends linearly on and that this decomposition is accurate up to terms of order ; that is, we are assuming that NG is weak, as we did in the previous section.

After computing , one has to account for 2D projection and radiative transfer in order to obtain the required likelihood . As shown by Babich in [69], this can be achieved by expanding the PDF (80) in spherical harmonics and performing the functional integration: where is the Dirac delta function of dimension and due to the 2D projection (As noted by Babich in [69], the additional degrees of freedom do not affect the CMB anisotropies and can therefore be integrated out.) The previous formula can be derived by recalling (32) together with the well-known formula in probability theory as follows: where is again the Dirac delta function, and are random variables linked by the functional relation , and , are the PDFs of and respectively. Solving the functional integral (82) yields [69]

In the previous formula we can recognize the standard PDF, valid in the standard Gaussian case, in the first term on the r.h.s. Added to this, we find a first-order -correction proportional to the CMB angular bispectrum. Higher-order correlators are not present at order . For reasons that will become clear shortly, although we have not computed it, we have explicitly denoted the term in the expansion with . Note that, besides assuming weak NG in this formula, we are also assuming rotational invariance (this is evident from the fact that the covariance matrix appearing in the Gaussian piece of (84) is diagonal and equal to ). Rotational invariance is a general property of the CMB sky, but it is broken when we deal with real CMB measurement characterized by inhomogeneous noise patterns and sky cuts. We will investigate these effects in the following section. For the moment we consider the purely ideal case described by (84). Armed with the PDF expression for local NG and recalling the necessary and sufficient condition (78), we can finally determine whether the estimator (72) is optimal or not. First of all we see that

We then see from combining (85) and (72) that We now see that, in order for the necessary and sufficient condition for optimality (78) to be verified, we need the term to be exactly equal to . The second-order quantity should then be calculated explicitly in the expansion (84) in order to complete the calculation and verify whether, or under which conditions, this is true. However this turns out not to be necessary if we consider the following “regularity condition for a PDF”. (Condition (87) can be easily derived remembering that, for a given random variable with probability density , we have by definition , and substituting in the previous expression, one then finds that the regularity condition (87) holds, provided the order of integration and differentiation can be exchanged (hence the “regularity condition” qualification). ) For a general PDF of a random variable depending on a parameter we have

Since this regularity condition must be valid for each value of the parameter ( in our case), it is clear that it must hold term-by-term, that is, at each order, in the expansion (84). By taking the average value of equation (86), keeping in mind that the estimator is unbiased and imposing (87), we then find that the average value of must be exactly equal to . If we could then replace in (86) with its average value, then we would exactly obtain the condition for optimality and conclude that the cubic estimator (72) saturates the Rao-Cramer bound. For present CMB experiments, the terms in the expansion (84) are evaluated summing over a large number of -modes ( for WMAP, for Planck in the signal-dominated regime), or, equivalently in pixel space, averaging over a large number of pixels (106 and for WMAP and Planck, resp.). For this reason we expect that the error made by replacing the -order term in (86) with its average value will be very small. In [69], an estimate of this error has been done in the approximation of neglecting radiative transfer and projection effects (i.e., working in 3D with the primordial potential, rather than with a CMB map). The conclusion was that for a number of observations the approximation above works very well. Moreover the variance of the -order term scales like . In the full radiative transfer case we expect the scaling to be unchanged, although the coefficients in front of it that led to the estimate might change. However, as noted above, the number of pixels in present-day experiments is many orders of magnitude larger than . That leads us to conclude that the approximation of replacing the average of the first-order-term in equation (86) is a very good one. We then reach the following important conclusion.

For a rotational invariant CMB sky, in the limit of weak NG, the cubic estimator defined by formula (72) is the best unbiased CMB estimator of

Let us now move to problem of generalizing the last conclusion to shapes different from local. In this case a full expression of the primordial potential is not available. The steps that lead to the conclusion that the local estimator is optimal can thus not be reproduced. However it was pointed out by Babich in [69] that, in the limit of weak NG, the full CMB NG likelihood can still be expressed in terms of its power spectrum and bispectrum by means of an Edgeworth expansion, regardless of its full expression. The Edgeworth expansion is basically a way to express a NG PDF as a series expansion around its Gaussian part [7072]. For CMB anisotropies one finds, at the end of the calculation, that

It is easy to see that takes the same form as in (84). For this reason all the previous derivation applies also to the present case and the following conclusion holds.

In the weak non-Gaussian limit and assuming rotational invariance of the CMB sky, the cubic estimator (72) is the best unbiased CMB estimator of for any non-Gaussian shape.

Before concluding this section, we would like to stress that, despite the technical complications arising in the detailed probe of the bispectrum estimator's optimality, the physical reason behind this result is quite clear. We can always expand the PDF in series of its momenta. The order parameter of this expansion is . This parameter is the natural measure of the amplitude of primordial NG, and it is actually predicted by inflation to be very small. For this reason higher-order momenta in the primordial non-Gaussian PDF are suppressed with respect to the bispectrum. Basically, the information on in a CMB map is entirely contained in the three-point function.

3.5.3. Breaking Rotational Invariance

So far, the assumption of rotational invariance of the CMB sky has been made when probing the optimality of the cubic estimator (72). In an ideal situation, the CMB sky is clearly rotationally invariant. However, two elements break rotational invariance in a CMB map derived from a real experiment: an anisotropic distribution of noise in pixel space and a galactic mask. Anisotropic noise comes from the fact that the CMB sky is generally scanned in a nonuniform way: regions that are less contaminated by astrophysical foreground emission are generally observed more times and are thus characterized by a lower noise level (see Figure 9, e.g.). A sky cut has also to be introduced in order to remove the regions on the galactic plane that are most contaminated by foregrounds. When rotational invariance is broken, the considerations of the previous two sections do not strictly apply anymore and the estimator (72) becomes suboptimal. However, the same Edgeworth expansion approach that was adopted in the previous section can still be applied, but this time keeping rotation-invariance breaking terms in the calculation, in order to find the new more general form of the optimal estimator. The general estimator turns out to be the sum of two terms: the first term is cubic in and is analogous to the one appearing in the rotationally invariant case, while the second term is linear in and accounts for breaking of rotational invariance. The explicit expression of this general optimal estimator is [73]

In the rotationally invariant case the covariance matrix is diagonal and equal to , while the linear term is proportional to a monopole. We then recover the form of the cubic estimator (72) as expected. Note that in the signal-dominated regime of the experiment under study (e.g., for WMAP and for Planck), and if the mask is not too large, then the simple cubic estimator (72) is still basically optimal, since we are in a nearly rotationally invariant case. For small masks it has been shown by Komatsu and Spergel in [74] that the bispectrum and power spectrum of the map are, to a good approximation, just rescaled by a factor representing the fraction of the sky left free by the mask; that is, In this case one can then assume the covariance matrix to be diagonal and account for the effects of the mask by correctly rescaling the normalization term in order to keep the estimator unbiased. This nearly rotationally invariant estimator then takes the form

3.5.4. Large Regime

The approximation of weak non-Gaussianity is the basis for all of the results derived so far. One can then ask at which point (i.e., for which values of ) this approximation breaks down. As we observed earlier, the Edgeworth expansion (88) shows that the likelihood of a generic primordial NG distribution can be expanded in series of its momenta, with order parameter We know that while WMAP observations already constrain . That means that the order parameter of the PDF expansion is 10−3 and thus the weak NG approximation seems to be a very good one in the entire range of allowed and predicted . However a subtle effect has been pointed out by Creminelli et al. [75], which changes the previous conclusions in certain cases. Let us quickly summarize their main results. We already saw that, for the angular averaged bispectrum of a Gaussian temperature field, We then included this expression for the variance in the weights of the optimal estimator (72) and in the normalization factor . It is easy to see that in this approximation the variance of the estimator can be predicted as

However the approximation of taking in the calculation of the estimator variance is not always a good one if is dominated by squeezed configurations (We recall that by squeezed configurations we mean triangles in which one of the sides is much smaller than the other two; that is, .) or more in general by configurations in which one of the 's is small. It turns out that, in these cases, the -dependent corrections to the Gaussian expectation of the bispectrum variance become important when gets large enough. This effect increases the variance of the estimator with respect to the expectation for . There is a clear physical interpretation for this. One can see, for example, by calculating (94) in the simple Sachs-Wolfe case (see also Section 3.8) that the variance of the estimator scales roughly like , or equivalently like in pixel space, with being the number of pixel in the observed map. This increase of the signal-to-noise ratio with the number of pixels is due to the fact that more and more bispectrum configurations are included into the sum over modes to estimate . However, if the signal is completely dominated by low- modes, as in the local case, then there is an intrinsic large cosmic variance, due to the small number of low- configurations. Clearly, cosmic variance cannot be beaten by increasing the resolution of the map. Creminelli et al. [75] then found that, for getting large, the of the estimator for local NG grows asymptotically as , that is, much slower than the expected , which one would obtain by neglecting -dependent corrections in the calculation of the estimator variance. They carried out a calculation of the estimator variance in the flat-sky approximation, and neglected the transfer functions, to find the following expression: where is the bispectrum amplitude. We clearly see from this formula what we were stating above; that is, when gets large, the variance starts scaling like . The same formula also shows the technical point behind this behaviour: in the correction term, the order parameter is actually not anymore but rather . This enhancement by a factor can make the first-order corrections nonnegligible anymore. The natural question is now how large an we need to make the correction term important in (95). Following Creminelli et al. [75], let us call the standard deviation of the estimator computed for . Let us say that for an experiment at a given angular resolution (defined by in harmonic space or by in pixel space) a value of is measured, equal to . Substituting this value into formula (95) behavior and calling the real estimator variance, one finds the following relative correction to : For an experiment like WMAP, the r.h.s. term becomes 1 when is about away from the origin. For an experiment like that of Planck, has to be about . The definition of a high- regime is thus dependent on the experiment under study, as a consequence of the fact that the enhancement of first-order terms in the variance expression depends on . In other words we can conclude the following.

If will be detected at several (in terms of the Gaussian expectation for the standard deviation), then -dependent correction terms in the estimator variance will have to be taken into account, and the simple expansion (84) of the CMB likelihood does not constitute a valid approximation anymore.

One caveat in all of this discussion is that formula (95) was obtained in flat sky, neglecting the transfer functions, and it should be checked how dependent the final results are on these approximations. Since the scaling argument is based on a very general physical reason, that is, the weight of squeezed configurations in the local estimator discussed earlier in this section, one expects that the general scaling with obtained in (95) does not depend on the details of radiative transfer and 2D projection. Liguori et al. [76] actually checked the results of this section numerically, by applying an implementation of the optimal cubic estimator (72) to full-sky simulations of CMB local NG maps with different and , including the full radiative transfer (For details about the numerical implementation of the optimal cubic estimator, and about the generation of NG CMB maps, see Sections 3.6 and 3.7. ). Although, as expected, the coefficients in formula (95) change with respect to the simple flat sky no radiative transfer approximation, the scaling of the error bars with follows very well the expectations, going from at low to when is detected at high significance. Since in the large regime the variance starts to scale very slowly, like , one is led to wonder whether the estimator discussed in the previous sections becomes suboptimal at this point and whether a better one can be found. The answer to this question is not immediate. In order to check for the optimality of an estimator, as we have seen, one has to see whether it saturates the Rao-Cramer bound. However, also the local likelihood and Rao-Cramer bound estimated in the previous sections have to be recomputed, since they were obtained neglecting higher-order terms in . In order to account for the -enhanced terms, it is necessary to produce an exact expression of the full likelihood. This can be extremely challenging in the full radiative transfer case, but it is feasible in the flat sky no radiative transfer approach that we are considering (and that we showed earlier to be a good approximation as long as scaling arguments are involved). Creminelli et al. [75] proceed to calculate the full likelihood in this approximation and conclude that the optimal cubic estimator of weak local NG indeed does not saturate the Rao-Cramer bound in the high- regime. The estimator (72) is thus no longer optimal in this case. They then proceed (always in the flat sky no transfer function case) to derive a cubic estimator that saturates the Rao-Cramer bound also for large . We will not enter into the details of this derivation here, referring the reader to the study by Creminelli et al. in [75] for a complete discussion. The main aim of this section was to show under which conditions the optimality of the cubic estimator that we discussed in previous sections is valid. Since current bispectrum analysis of WMAP data [1, 68] finds that  2, the weak NG approximation applies, and the cubic estimator we derived earlier is indeed an optimal estimator in this case. However, if future Planck measurement will produce a detection of at high significance, then the estimator will have to be modified in order to account for the enhanced variance in the high- regime. This is not necessarily a remote possibility if one considers that the present central value of from WMAP estimates would produce a e899 8 detection with Planck. Before concluding this section we would like to remark once again that the variance enhancement discussed here applies only to non-Gaussianity of the local type, whose bispectrum is dominated by squeezed configurations, affected by a large cosmic variance. For the other shapes the cubic estimator (72) is optimal in both the small- and high- regimes.

3.6. Numerical Implementation of the Bispectrum Estimator

In this section we turn to the problem of finding an efficient numerical implementation of the optimal bispectrum estimator (89). Let us repeat its expression here for convenience:

We remind the reader that this is the full expression, valid for the general nonrotationally invariant case. For a rotationally invariant CMB sky the linear term in the formula above vanishes, and the covariance matrix is diagonal and reduces to , giving the simplified expression (72) that we reproduce again here for convenience:

In a schematic way, the full estimator can be written as where the “cubic” term is the one containing the product , while the linear term is the one dependent on a single and vanishing in the rotationally invariant case, where it is proportional to a monopole. It was shown before in a formal way that a pure cubic estimator becomes suboptimal when rotational invariance is broken, and adding the linear term is necessary to restore optimality. It is useful to try to understand the reason of this effect qualitatively and in a more intuitive way. Let us assume that we have a map characterized by nonstationary noise, and we are observing a region of the sky that was sampled many times so that the noise level in this area is low. That implies that the level of small-scale power in this large region is lower than average. Now, for a specific realization of the CMB sky, this modulation of small-scale power on a large region can look like a non-Gaussian signal sourcing a squeezed configuration of the bispectrum. On average, this effect must cancel if the underlying noise model is Gaussian. However, this “confusion” between signal and noise increases the variance of any estimator of a primordial NG signal that is peaked on squeezed configurations. We know that this happens for the local model. This heuristic argument thus shows that, even though in principle a linear term must always be included when rotational invariance is broken, for a realistic noise model only local non-Gaussian estimates will be affected.

3.6.1. Primary Cubic Term for

Let us focus for the moment on the rotationally invariant case, where the linear term vanishes, and the covariance matrix is simply . We immediately see that a brute force implementation of equation (100), consisting in computing and summing over all of the bispectrum configurations, would take operations, where , the maximum multipole in the calculation, depends on the resolution of the experiment. As mentioned earlier, for WMAP and for Planck in the signal-dominated regime. At these resolutions, a brute force approach would be absolutely unfeasible for a general shape. If however we assume that the primordial shape under study is separable, then the dimensionality of the problem can be reduced and the overall number of operations scaled down significantly, making the computational cost affordable. Let us illustrate this point more in detail. Substituting (47) into the estimator expression (72) and remembering the identity (40), it is possible to rewrite (100) as follows (or more in general as a linear combination of terms of the following kind): From an inspection of previous formula we see how, as a direct consequence of separability, the initial sum over the indices has been factorized in the product of three sums, each running over two indices . This greatly reduces the computational cost from to operations. If we define the new quantities then we can recast the estimator expression above in the following form: where it is evident that we are now calculating our statistic in position space rather than in pixel space. Note how the filtered maps can be efficiently calculated using a fast harmonic transform algorithms such as those included in the HEALPix package. This fast position space algorithm was initially introduced by Komatsu et al. in [77] in the context of local estimation and applied to the estimation of WMAP 1-year data by the WMAP team by Komatsu et al. in [78]. It was then applied to equilateral estimation for the first time in study of Creminelli et al. in [73]. An alternative numerical implementation with respect to the one used by the aforementioned authors was introduced by Smith and Zaldarriaga in [22]. Although different under many technical aspects, this second algorithm is still based on the calculation of the position space statistic (106); we refer the reader to the original work for additional details. This second implementation has been used to produce alternative estimates of , and from WMAP data, and to estimate the amplitude of the orthogonal shape, recently introduced by Smith et al. in [18].

Let us now discuss the possible limitations of this numerical approach. As noted in Section 2, the separability condition is in principle quite restrictive: the only separable shape arising directly from primordial models of inflation is the local one. On the other hand, it is still possible to study nonseparable models by finding separable shapes that are highly correlated to the primordial one. As observed in studies by Creminelli et al. in [73], Fergusson and Shellard in [9], and Smith and Zaldarriaga in [22], the limits obtained from a highly correlated separable shape in this way will be very close to those that would have been obtained using the original nonseparable model (see again Sections 2, 3.2, and 2.2 for a detailed discussion of this issue). We know from earlier sections that the other two shapes mentioned so far in this section besides local, namely, the equilateral and orthogonal shapes, have actually been derived as separable approximations of theoretical inflationary shapes. These approximations were obtained in an heuristic way; that is, an educated guess of a good separable approximation of the shape under study was made, and the correlation was checked a posteriori. There is obviously no a priori guarantee that this approach would be easily repeatable for all of the shapes of interest. The eigenmode expansion method introduced in [14] and summarized by equation (26), however, provides a general and rigorous method to find separable approximations of any shape, thus enabling the estimation of any possible primordial model. In this case, recall that we expand our (nonseparable) primordial shape function in terms of the separable basis functions (see (26)), constructed from symmetric polynomial products , as where the second expression for the reduced bispectrum (56) expands in convolved basis functions (57) in harmonic space with In the mode expansion approach, then, the estimator for a specific model generalises to the following: where the filtered maps or shells are defined by Defining the integral , the estimator collapses into the compact form where it is possible to show a precise relationship between the theoretical bispectrum expansion coefficients and expectations for the observed coefficients .

It was also pointed out by Fergusson et al. in [14] (see also [6]) and summarized in formula (60) that the separation can be performed directly in harmonic space on the reduced bispectrum , rather than on the primordial shape . This provides an alternative, but equivalent, late-time -estimation pipeline with respect to the primordial shape separation approach given above as (111). In fact, since orthonormality is more direct on the harmonic domain without the intervention of transfer functions, the approach is considerably more straightforward conceptually. In this case, expectations for the observational expansion coefficients in the orthonormal frame (with , see (60)) become simply , that is, for an ensemble of maps possessing the theoretical bispectrum described by the coefficients. This means that for a NG bispectrum signal of sufficient significance we can consider directly and efficiently reconstruction of the bispectrum from the observed coefficients using (60). We also note that the harmonic space separation scheme also allows for the estimation of noninflationary late-time bispectra, such as the bispectrum of cosmic strings, as well as other secondary anisotropies.

We can then conclude, in light of these developments, that the fast cubic statistic (106) can be applied in complete generality to any model of primordial NG, as well as to any other potential source of CMB NG. We also point out that alternative approaches have been considered for harmonic space analysis using wavelets and binning. For example, Bucher et al. [58] recently proposed using a suitable binning scheme in which the full expression for is calculated in a subset of all of the triples , small enough to make the calculation feasible while maintaining calculation accuracy. Approaches based on a harmonic space separation scheme, of course, require the full calculation of the reduced bispectrum in order to determine the correlation between the theoretical prediction and the final expanded or binned bispectrum. The calculation of implies the necessity of numerically solving the radiative transfer integral (31) for all of the configurations which appears to be intractable in the nonseparable case, since the dimensionality of the problem cannot be reduced. However, this can be achieved efficiently in the general case using either the separable mode expansion integral (56) or else the hierarchical adaptive approach of Fergusson and Shellard [6] discussed in Section 3.4.

3.6.2. Linear Correction Term for

Let us now consider the realistic situation in which inhomogeneous noise and a sky-cut break rotational invariance (see Figure 9). In this case two complications arise as (1)A linear term in has to be added as follows.(2)The covariance matrix is now no longer diagonal. The inverse covariance weighting that appears in expression (98) is hard to compute numerically for high angular resolution experiment, since its size makes a brute force numerical inversion impossible.

A first approach, introduced by Creminelli et al. in [73], is to simplify the problem by assuming that the covariance matrix is diagonal in the cubic term of the estimator, and then finding the linear term that minimizes the variance under this assumption. In other words, we keep the cubic term in the form of (100) and compute the variance of this term, relaxing the assumption of isotropy at this point (This means that, when we apply Wick's theorem to the six-point function in the calculation of the cubic term variance, we take instead of . ).

It turns out that the variance is minimized (while leaving the estimator unbiased) for the following choice of the linear term: where is the normalization term. Despite being suboptimal with respect to a full implementation of (98), this choice of linear term has been shown to significantly improve the error bars with respect to the simple cubic statistic (100) in presence of anisotropic noise. At the same time, the simplicity of this implementation in comparison to the full optimal statistic (98) is manifest, since no terms appear in (112). Let us consider again a separable primordial bispectrum shape that can be written as Applying the same procedure as we did for the cubic term, the linear term can be recast in the form where we explicitly wrote as . This last formula can be rewritten as Like for the cubic part of the estimator, we have rewritten the linear term as a fast position space integral. The ensemble averages appearing in the last formula can be computed as Monte Carlo averages over a large number of Gaussian realizations of the CMB sky, characterized by the same beam, mask, and noise properties as in the experiment under study. This pseudooptimal, but relatively straightforward, implementation of the linear term has been adopted by a number of groups in order to estimate from WMAP data [1, 73, 75, 79]. The full optimal estimator (89) was implemented only quite recently by Smith et al. in [68], where the authors developed an efficient conjugate gradient inversion (see e.g., [80]) algorithm based on earlier results from the study of Smith et al. in [81], in order to compute the prefiltering in reasonable CPU time. Note that after the inverse covariance matrix prefiltering is calculated, the numerical implementation of the estimator is very similar to the one outlined above for the pseudooptimal case. The new position space statistic is obtained from formulae (106), (114), by making the following replacements, wherever the corresponding quantities appear: with analogous substitutions to be made for the terms appearing in the same equations. The improvement in error bars from the pure cubic suboptimal estimator to the pseudooptimal and optimal statistics is shown in Figure 10.

3.7. Experimental Constraints on

In order to obtain an estimate of from a given dataset, one has first to generate sets of Gaussian CMB maps and obtain the MC averages that appear in the linear term expression (114), after an inverse covariance prefiltering of the full-optimal estimator is implemented. The normalization term can be precomputed using formula (47) to evaluate numerically the theoretical bispectrum shape for the model we want to estimate. The statistic (98) can then be computed for the experimental data to get our result: The error bars are then obtained by running the estimator on simulated Gaussian maps (The error bars can be obtained from Gaussian simulations as long as the weak NG approximation applies. As we saw earlier, this works at any for any shape, except for the local shape when a large makes the error bars dependent. In this case the error bars would need to be calculated from NG simulations of . So far, no high-significance detection of has been reported, so working with G maps is at this stage sufficient to get accurate error bars. ): where indicates the MC average and a vector of simulated multipoles (obviously including mask, beam, and noise features of the experiment). For an accurate step-by-step description of an analysis of WMAP data, including details about channel coadding, noise model, beams, and pixel weighting schemes, we refer the reader to the explanations contained in the study of Komatsu et al. in [1]. The most stringent limits so far have been obtained by applying the bispectrum estimator to the WMAP datasets. Constraints have been put on the local, equilateral, and orthogonal shapes. The best constraints come from the full implementation of the optimal estimator done in the study of Smith et al. in [68] and in that of Senatore et al. [18], and applied to the WMAP 7-year data release as in the study of Komatsu et al. in [83]. They are, at 95% C.L., Since the first release of WMAP data, different groups have used the cubic statistic described in the previous paragraph, either in its pure cubic form (106) or in the improved version including the pseudooptimal linear term implementation (114). The results of different analyses of the WMAP 1-year, 3-year, 5-year, and 7-year datasets are summarized and commented in Table 3, where just the local and equilateral shapes have been included since the only two constraints on the orthogonal shape have been produced to date. The most recent orthogonal constraint has been already mentioned in (118). The other was obtained by Smith et al. [18] on WMAP 5-year data and it is .

3.8. Fisher Matrix Forecasts

The fisher matrix, defined as the curvature of the likelihood function calculated in its peak reassessment (see equation (A.3) in the appendix), plays a very important and well-known role in parameter estimation theory, not only because it defines the optimality of estimators through the Rao-Cramer bound, but also because it allows us to estimate a priori what the smallest error bars attainable will be for a given parameter (see again appendix). In other words, using the Fisher matrix we can forecast how well a parameter will be measured by a given experiment. This is very useful in order to optimize the experimental design to the detection of the parameters of interest. In our specific case, a Fisher matrix analysis will help us to understand what the smallest detectable in principle using different CMB datasets is, and which experimental features can be improved in order to increase the sensitivity to .

3.8.1. A General Derivation

Formula (A.12) from appendix, when applied to our case, yields where is the angular averaged bispectrum (i.e., the measured quantity). This can be rewritten in terms of the reduced bispectrum as where we have defined (see also in (58)) Note how the features of the experiment enter the Fisher matrix through the parameter , defining the angular resolution, and in the angular power spectrum expression in the denominator, which contains the angular beam and experimental noise: where is the theoretical power spectrum for a given set of cosmological parameters, is the beam of the experiment, and is the experimental noise. is a constant for uncorrelated noise. Likewise, the theoretical bispectrum will be convolved by the experimental beam Note that, since the noise is generally Gaussian, its three-point function vanishes. The experimental noise thus only enters in the denominator of the Fisher matrix expression. The effects of partial sky coverage can be easily accounted for. From (91) it follows that if only a fraction of the full sky is covered then the Fisher matrix takes an factor in front, which produces a degradation of the error bars of .

We saw previously that for separable shapes the reduced bispectrum can be calculated either analytically, under some simplifying assumptions on the transfer functions (e.g., the Sachs-Wolfe approximation), or numerically through formula (44). It is then possible to evaluate numerically the Fisher matrix and the corresponding error . In the context of estimation, the first calculation of this kind was done for by Komatsu and Spergel in [74], where it was found that WMAP could reach a sensitivity (note how this bound is actually saturated by the optimal estimator results presented in Table 3), while Planck [85] could go down to (Note that all of the errors quoted in this section are at . ) What allows Planck to improve on WMAP is that it has a much better angular resolution and that it is cosmic variance dominated in a very large range of scales; that is, the power spectrum signal is larger than the noise up to . Angular resolution and sensitivity are the two factors that increase the ability of a CMB experiment to constrain . This information is provided by the Fisher matrix expression (122). Looking at such expression, we notice how the signal-to-noise ratio is obtained by adding over all of the bispectrum configurations up to , weighted by their variance. Thus, the higher is, the more configurations are included in the sum and the larger is the final sensitivity to . On the other hand, we see that, if the power spectrum of the instrumental noise appearing in the variance term in the denominator dominates from a certain , then the signal contribution is suppressed above that threshold by the noise power spectra appearing in the denominator of (122). So what determines the sensitivity of a CMB experiment to is the range of over which the instrumental noise is low, so the experiment is cosmic variance dominated. This range is for Planck and for WMAP, hence Planck can obtain tighter constraints than WMAP. This is shown in Figure 11, where the Fisher matrix forecasts of are plotted for different CMB experiments: the predicted error bars decrease with up to the angular scale at which the measurements start to be noise dominated, after which the signal-to-noise ratio saturates. A simple calculation done by Babich and Zaldarriaga in [86] taking the Sachs-Wolfe approximation, and working in flat sky, showed that, before noise dominates, the signal-to-noise ratio for the local shape grows as where the () is dictated by the coupling between large and small scales introduced by squeezed configurations, from which most of the local signal comes.

Note also how, in absence of experimental noise, the beams in the numerator and in the denominator of (122) cancel each other out. An ideal noiseless CMB experiment would then have a signal-to-noise ratio indefinitely growing. However, this would not imply infinite sensitivity to , because, above a certain secondary anisotropies would start to dominate. The Fisher matrix analysis of the equilateral shape (see [22, 87], e.g.) showed that the minimum achievable error bars in this case are  100 and  60, for WMAP and Planck, respectively. (Note how the larger error bars in this case with respect to the local constraints do not reflect a higher sensitivity of CMB measurement to , but only the conventional choice of the normalization of the bispectrum amplitude in the definition of . The normalizations are in fact chosen in such a way that the bispectra have the same value for equilateral configurations , where the local bispectrum is suppressed and the equilateral bispectrum is peaked.) Additional shapes are studied by Smith and Zaldarriaga in [22].

3.8.2. Polarization

Babich and Zaldarriaga [86] showed with a Fisher matrix analysis that the CMB E-mode polarization measurements can be used to improve the sensitivity to . Although we have dealt so far only with temperature bispectra and related estimators, including polarization is fairly straightforward. As usual, the calculation starts from formula (31) linking the multipoles of CMB anisotropies to the primordial potential , but this time including the polarization radiative transfer in the convolution integral: The bispectrum is then defined in the usual way, but this time more configurations can be built by correlating temperature and polarization multipoles: The point to emphasize is that the polarization signal is generated on scales where the temperature signal is suppressed by Silk damping. The reason behind this can be briefly illustrated as follows: both the temperature and the polarization patterns that we observe in the CMB are produced by Thomson scattering of photons by electrons in the primordial plasma. Due to the Physics of Thomson scattering, in order for polarization to be generated it is necessary for the incident radiation field to be anisotropic (see, e.g., [88]). More precisely, the angular distribution of the intensity of the incoming radiation must present a non-vanishing quadrupole. However the Thomson scattering itself tends to isotropize the incoming radiation field. For this reason, on scales where the Thomson scattering is efficient (i.e., in the tight coupling regime of the photon-baryon fluid), polarization is not produced. In order to seed a significant quadrupole in the incoming photon distribution it is necessary to go below the free streaming scale of the photons in the primordial plasma, where the weak coupling between photons and electrons makes the isotropization process inefficient. However at these scales the free diffusion of photons with different temperatures damps the temperature fluctuations (a phenomenon known as Silk damping). For this reason we conclude that, on scales where temperature anisotropies are generated, polarization anisotropies are basically absent (because in the tight coupling regime the radiation intensity distribution does not present a quadrupole), and vice versa on scales where polarization is produced, temperature anisotropies are damped by diffusion processes. (This is strictly true only if we consider anisotropies generated at recombination. If we include a period of Early Reionization, then this picture is slightly changed, since after reionization polarization is generated on very large scales, where temperature anisotropies are present as well. For more details see, for example, [54] and references therein.)

The polarization bispectra thus open a window over a new -range in the projection and increase the overall information available. In other words, since the new configurations , , and so forth, including polarization, are partially independent of the pure temperature () bispectrum, adding those additional configurations to the Fisher matrix (and to the actual estimation from data) increases the total signal available. The Fisher matrix expression now becomes where and run over the and superscripts. We still work in the assumption that all of the quantities involved are Gaussian, but now the different bispectra of temperature and polarization are correlated for a given configuration , thus defining a multivariate Gaussian distribution. The full covariance matrix between bispectra (indicated by in the formula above) has then to be evaluated. A numerical evaluation of (129) shows [86] that, for an ideal (i.e., noiseless) experiment, adding the polarization signal produces an improvement of a factor  2 on constraints. For WMAP, adding polarization bispectra produces very little improvement, since polarization data are mostly noise dominated. For Planck, however, including polarization does generate a significant improvement, bringing the forecasted error bars from   to . Some error bar forecasts from temperature and polarization bispectra as a function of for different experimental designs including WMAP and Planck are shown in Figure 11. Motivated by this analysis, Yadav et al. [82, 89] have implemented a bispectrum estimator of including both temperature and polarization bispectra. All of the general considerations about optimality and the numerical implementation techniques described in previous sections apply in an analogous way to the temperature polarization case, although the presence of additional bispectra with a nontrivial covariance matrix introduces a few additional technical complications. We refer the reader to the study by Yadav et al. in [82, 89] for further discussion.

3.9. Non-Gaussian Contaminants

So far, we have considered only primordial non-Gaussianity as a source of the three-point function of the CMB. However many other astrophysical and cosmological effects can produce an observable angular bispectrum. Among these, diffuse astrophysical foreground emission (see, e.g., [91, 92] and references therein) unresolved point sources (see, e.g., [78]) and secondary anisotropies are probably the most important NG sources. Since the main focus in this review is on the primordial bispectrum, we will not describe these NG sources in great detail. We will however outline in this section their main effects in order to understand whether, and how, they could contaminate an estimate of primordial NG. Let us consider a number of sources of a CMB bispectrum signal and call the bispectrum produced by the th source. Let us also indicate with the primordial component of the bispectrum calculated for . For our purposes, is the signal that we want to measure, while the other bispectra are contaminants that we would like to eliminate. The total bispectrum of the map in presence of these contaminants is then where is the amplitude of the th bispectrum. If we have a precise prediction of the bispectra generated by the contaminants, we can then think of extending our estimator to a joint estimator of all of the amplitude parameters. The optimal cubic estimator defined in (72) would then be generalized to the multiparameter case by minimizing the following : The new errors on in this case can be forecasted as usual by means of a Fisher matrix analysis. The Fisher matrix described in the previous paragraph can be generalized straightforwardly to the multiparameter case. In this case, becomes an array whose entries are defined as The optimal errors on a given amplitude (including ) then become, according to the multidimensional generalization of the Rao-Cramer bound, where the crucial point to notice is that we now first invert the Fisher matrix and then we take the square root of the diagonal elements to find the errors. This is the error that is obtained when the full joint-parameter likelihood is calculated and then the 1-dimensional likelihood for a given parameter is obtained by integrating out all of the other degrees of freedom: a process defined in statistics as marginalization. One can see that the inverse of the Fisher matrix defines the covariance matrix of the parameters under study. If the various parameters are completely uncorrelated, then the Fisher matrix is diagonal and we would have , showing that the parameters can obviously be estimated independently and the marginalization process does not change the error bars on a given parameter of interest (in our case ). If the different parameters are correlated, however, then off-diagonal terms appear in the Fisher matrix, and the error bars after marginalization (i.e., the “real” error bars to quote in the results) are larger than those that would have been obtained by naively neglecting contaminants. An obvious but useful observation is that two bispectral amplitudes will be strongly correlated when the respective shapes are similar. To make a practical example, the bispectrum generated by correlating weak lensing of CMB anisotropies with the Integrated Sachs-Wolfe (ISW) effect can be shown to be peaked on squeezed configurations. For this reason the presence of this effect can be a significant contaminant for estimates of local non-Gaussianity.

So far, in this section we have described the degradation effects on the error bars if a hypothetical joint estimator of all of the CMB bispectrum amplitudes was built, and the amplitudes of contaminants were marginalized over to estimate . However a joint estimation might be difficult, due to factors like the presence of theoretical uncertainties on the shapes of contaminant bispectra or possible practical difficulties in finding an efficient implementation of this full bispectrum-likelihood estimator (e.g., if the additional secondary bispectra are nonseparable). As a result, the practical approach so far has been to estimate only using the techniques described in previous sections and neglect possible nonprimordial contaminants. In this case the possible effect of contaminants would not show up as a degradation of the error bars but in an even worse way, by introducing a bias in the measurements. Let us see this by assuming that the CMB three-point function takes contributions both from a primordial NG component and from a contaminant bispectrum with amplitude . Let us also assume that we can produce a set of NG Monte Carlo simulations of CMB maps including both bispectra. We assign a given in input to the primordial component of our simulated maps. Finally we estimate the average obtained from the simulations by applying the usual optimal cubic statistic described so far. The result of our MC average will be where is the averaged bispectrum extracted from the map. The term on the r.h.s. of the second line comes from the fact that the normalization is chosen in such a way as to obtain an unbiased estimator of the primordial component. However a second term is present, which accounts for the fact that a contaminant bispectrum, is in the map; this term clearly biases the estimator. (Let us remember with that by definition an estimator of a parameter (in our case ) is unbiased if , being the estimate from data and being the true parameter of the underlying model.) The magnitude of the bias will depend again on how similar the shape of the contaminant bispectrum is to the primordial one. If, for example, the contaminant bispectrum is strongly peaked on equilateral configurations and suppressed on squeezed ones, a local estimator of NG will then not be significantly biased by it, since the second term in equation (134) will cancel out. However, an estimate of will in this case be significantly biased.

In general we can define the correlation coefficients between two bispectra, labeled and , as The definition of “correlation coefficient” becomes completely transparent if we rewrite the previous formula in terms of the Fisher matrix and keep in mind that defines the covariance matrix of the bispectrum amplitudes: The correlation coefficient varies by definition from , for totally uncorrelated shapes, to , for identical shapes, or for totally anticorrelated shapes. The more a given contaminant bispectrum is correlated to the primordial bispectrum that we want to measure, the larger will be the induced bias. At this point we distinguish between three possibilities. The first is that the contaminant bispectrum shape and amplitude are perfectly known. In that case we can compute the expected bias from formula (175) and subtract from our estimate. The second possibility is that the shape of the contaminant bispectrum is known, but its amplitude is defined with a given uncertainty. In this case we can propagate this uncertainty by quoting it in addition to the statistical error bars on obtained in the usual way. The third and worst possibility is that we are unaware of the presence of some contaminant effect, or we know nothing about its bispectrum. In this case we might obtain a biased estimate of without knowing it and thus eventually misinterpret a spurious NG effect as primordial NG. Contaminants are then very dangerous, because, if not properly taken into account, they can lead to spurious claim of detection of primordial NG. For this reason, if a positive detection of were to be made at some point for a certain model, all possible tests for the presence of contaminant effects should be performed. Moreover, since we cannot be absolutely sure that we are considering all possible sources of NG contamination, cross-validation of the result using other non-bispectrum-based estimators will be very important. These other estimators (Minkowski Functionals, wavelets, needlets, higher-order correlators are just some examples among those considered in the literature) are by construction suboptimal estimators of the primordial component. However, in principle they are expected to produce a totally different response to NG contaminants than the primordial bispectrum. A cross-detection of with many different statistics would then be much less likely due to some unknown spurious effect. Another way to test the primordial origin of an observed NG signal, recently proposed by Munshi and Heavens in [93], is to modify the optimal bispectrum estimator in order to evaluate a function of rather than a single amplitude . The point is that, if a clear detection of is achieved at several , then the signal is large enough to allow a less radical data compression. Munshi and Heavens [93] have then recently proposed to estimate the “bispectrum-related power spectrum” defined as Like in the usual estimator, the optimal weighting is included, and the observed bispectrum from the map is correlated to the theoretical shape. However in this case we do not measure the overall amplitude, but rather the amplitude for each -bin. Note that By construction of , the usual estimator is then retrieved by summing the bispectrum-related power spectrum over all . The general idea is now that the functional dependence of this skew power spectrum on will show significant variation between different sources of NG, allowing a clearer test of the hypothesis that the origin of the observed signal is primordial. A number of investigations of WMAP data have already been performed using this statistic in order to look for primordial and secondary signals [94, 95], and related pseudo- statistics have been developed by Munshi et al. in [96].

In any case, as long as bispectrum estimators are considered, independently of the specific statistic or implementation, the best way to deal with NG contaminants is to make sure to list all of them and study their bispectra, or at least find ways to assess their potential impact on the final results. In the following paragraphs we will then turn our attention to a classification of the most important potential sources of spurious NG and see how they are treated in the primordial bispectrum analysis. Finally, we will consider some effects that interact with the measurement not necessarily by directly producing a secondary bispectrum, but rather by changing the normalization of the estimator or by increasing the error bars without producing any bias.

3.9.1. Diffuse Foreground Emission

There are three main astrophysical effects producing a galactic microwave emission from our galaxy in the typical frequency range of a CMB experiment [54, 91]: free-free emission from electron-ion scattering, synchrotron emission from acceleration of cosmic ray electrons in magnetic fields, and thermal dust emission.

Since these sources produce signals with a peculiar spectral and spatial distribution, multifrequency observations allow the separation of them from the primordial component of the CMB signal by suitable component separation algorithms. In the resulting “cleaned” map the foreground contribution to is minimized, although obviously it can never be completely eliminated. The remaining foreground contamination after cleaning is called the foreground residual. Note that the emission from the galactic plane of the CMB map is so strong that a clean separation of the primordial CMB component from the foregrounds is impossible. The galactic regions that are too contaminated to produce a clean component separation have to be masked out in the analysis. The size of the galactic mask will depend on the choice of the foreground flux level above which the pixel is considered too contaminated to be included in the analysis. The choice of the cutoff will depend on the specific analysis that one wants to perform on the data. Since the primordial NG signal is much smaller than the Gaussian component, more conservative masks (i.e., larger) need generally to be used for estimates than those applied to estimation. Direct information about the spatial distribution of foreground emission in the sky (i.e., free-free, synchrotron, or dust) is provided in the form of templates, obtained either from the most foreground contaminated channels of the CMB experiment itself, or from external astrophysical surveys (e.g., observations of radioemission, maps of H emission). Templates are affected by several sources of uncertainties and errors (see, e.g., [91]), and using them in assessing the possible impact on of foreground emission or residuals has both advantages and disadvantages. The safest approach is probably to combine internal consistency tests on the data with analysis involving the use of templates.

The first extensive tests of possible foreground contamination in measurements were performed by Yadav and Wandelt in [79], where a detection of a primordial local signal at above level on WMAP 3-year data was claimed. As explained earlier (see caption of Table 3), further analysis on more recent datasets and/or using more optimal estimators have led to an updated estimate that is about 2- away from the origin, that is, just a “hint” of a possible local signal, rather than a detection. However, as long as a detection was claimed by Yadav and Wandelt in [79], tests to exclude a possible contamination from diffuse foregrounds had to be carried out. In this case the authors relied mostly on the “internal consistency test” approach. Their analysis included the following. (1) Expanding the original galactic mask in order to see whether the estimated value of is stable for different choice of the mask. A significantly lower value of for a larger mask might mean that some unmasked noise contribution is affecting the measurement with the original mask. (2) Comparing estimates from foreground-reduced maps to estimates from “raw” maps that include a galactic mask, but have not gone through a component separation process. If foregrounds have a significant impact on then one expects the measurements from raw and reduced maps to differ significantly. (3)Comparing different frequency channels. If foregrounds significantly contaminate measurements at given frequencies, then different channels should produce different results.

Analyses involving some kind of prior information about foreground emission were carried on by both Yadav and Wandelt [79] and Smith et al. [68]. The two approaches adopted in this case were the following. (1) Producing simulations including both a Gaussian primordial CMB signal and the foreground emission. The latter has in this case to be generated according to a model that allows for a good reconstruction of the observed templates. The estimator can then be applied to these simulations in order to check whether the measured is consistent with 0 (as it should be, in absence of significant foreground contamination, since the primordial input is Gaussian). (2) For an optimal estimator including full prefiltering [68], adding the foreground templates to the noise covariance, by assigning infinite variance to each template . In this way the estimate is “blind” to the template amplitudes. This produces a loss of information that in turn determines an increase of the variance. The larger the contamination from foreground is, the more the variance increases. For negligible contamination, the variance stays the same. In any case, the effect of foregrounds is entirely included in the error bars, provided that the assumed templates are accurate enough. This method of analysis, called template marginalization, is adopted by Smith et al. in [68]. A complete mathematical derivation of this method is provided by Rybicki and Press in [97].

In addition to the methods outlined above, there is also the possibility of using the foreground templates for a joint estimation of and of the templates amplitudes (see equation (131)). This approach has been recently used by Cabella et al. in [98] for a needlet estimator. It could be obviously reapplied in the same form to a bispectrum estimator.

In conclusion, all of the tests above have been applied to WMAP 3-year and 5-year data releases. No evidence for the presence of a significant contamination of the local measurement from diffuse foreground was produced. Other shapes of were not considered since the only type of non-Gaussianity that has produced a marginal detection is so far the local one. Although diffuse foregrounds and foreground residuals do not seem to contaminate primordial NG measurements in WMAP, this is not guaranteed to hold true for Planck, due to its much higher sensitivity.

3.9.2. Unresolved Point Sources

Extragalactic point sources are the most important foreground at small angular scales (see [99]). Sources are identified by searching the maps for bright spots that fit the beam profile and then masked out. However not all of the sources can be resolved and eliminated in this way. Unresolved point sources contaminate the map and are a source of a NG signal that can potentially interfere with primordial NG measurements. Unclustered extragalactic point sources have a Poisson distribution and their bispectrum is then simply a constant: with an amplitude that has to be estimated from the data and depends on the level of contamination from unresolved sources. We can now use (136) to estimate the correlation between primordial shapes and the point source bispectrum. For a given choice of the amplitude we can also estimate the expected bias on the estimator. Simulations of NG maps including the bispectrum from point sources can also be produced and the primordial estimator for different shapes applied to them in order to estimate the bias. Finally, since is manifestly separable, an estimator of can be built. All of these analyses were performed by Komatsu et al. in [1, 78] on local and equilateral shapes to conclude that point sources do not contaminate significantly the estimate of . On the other hand, they have a larger impact on : their induced bias from MC simulations is , to be compared to the statistical error bar . Additional tests were performed by Smith et al. in [68] to account for the possible presence of clustered unresolved point sources. No significant contamination on was found in this case. As for the diffuse foreground case, the enhanced sensitivity that Planck can achieve with respect to WMAP might increase the impact of these effects.

3.9.3. Secondary Anisotropies

One big advantage of using CMB anisotropies to test primordial NG is that they are small and can then be treated in the linear regime. The CMB temperature fluctuation field is thus linked to the primordial potential through a linear convolution with radiation transfer functions, as we saw earlier. At this level, the Gaussianity of the primordial potential is conserved in the CMB temperature fluctuation field. If, however, we work at second-order in perturbation theory, the initial conditions are propagated nonlinearly into the observed CMB anisotropies, and the resulting CMB fluctuations are mildly non-Gaussian even starting from a Gaussian primordial curvature field. Second order effects are clearly very small. However they may well be of the same order of magnitude as primordial NG, since the NG component of the primordial potential is . In conclusion, secondary anisotropies are a potential source of CMB NG, at a level that could in principle contaminate estimates of primordial non-Gaussianity. To fully account for these effects, it is necessary to obtain a relation analogous to equation (31), but to second-order in perturbation theory. Radiation transfer functions are obtained at first order by solving the linearized system of Boltzmann-Einstein equations (see, e.g., [54, 55]). The same equations will then have to be expanded and numerically integrated at second order in this case. Having obtained second order transfer functions, the full angular bispectrum of secondary anisotropies can be calculated and correlated to the primordial one in order to check for the presence of contaminant effects. The full system of second order Einstein-Boltzmann equations has been derived in [100103] and partially integrated numerically in [104] including only the source terms that can be written as product of first-order perturbations. These terms have been shown to produce a totally negligible NG contamination. Very recently, the study in [105] has integrated numerically the full Einstein-Boltzmann system of equations at second order, including “genuine” second-order terms, but neglecting late-time effects, that is, contribution to the temperature anisotropies coming from the evolution of the gravitational potentials due to the late-time acceleration of the Universe or to reionization effects. According to this calculation, second-order early-time effects (i.e., Sachs-Wolfe and acoustic oscillations at second order) are able to bias the estimator at level of at the angular resolution achieved by the Planck satellite (), while a negligible contamination is expected for WMAP. Although a formal solution of the full system of equations has not been obtained yet for the “late-time” source, many late-time secondary effects are known and have been modeled for some time. Among these there are, for example, weak lensing, Sunyaev-Zeldovich (SZ) effect, Rees-Sciama (RS) effect, and so on. Therefore, a natural approach that was adopted in the literature consisted in studying the bispectra arising from these well-known effects and from their correlations taken one-by-one (e.g., ISW-lensing correlation, SZ-lensing correlation, and so on). It goes beyond the purpose of this review to discuss in detail these results and their implications. Let us just mention them briefly. A fisher matrix analysis in the study of Serra and Cooray in [106] showed that the combination of bispectra arising from ISW-lensing, SZ-lensing, and unresolved point sources produced a negligible contamination at the angular resolution and sensitivity of WMAP, but a significant one for an experiment with the characteristics of Planck. It was in particular shown that estimates of local NG would be biased, especially by ISW-lensing correlation, with for local NG. A similar result on ISW-lensing was obtained in another Fisher matrix analysis by Smith and Zaldarriaga [22], and a similar level of contamination was found by Mangilli and Verde in [107] by adding to the ISW-lensing signal also the analogous RS-lensing bispectrum. A bispectrum estimator of local and equilateral NG was applied to simulated lensed primordial NG CMB maps by Hanson et al. [108], and three main effects were studied: a possible bias induced by neglecting the lensing of primordial bispectrum in the normalization and weights of the estimator, an increase of the variance due to lensing-produced higher-order correlators, and ISW-lensing bias. The only significant effect turned out to be the ISW-lensing bias on , at a level confirming the Fisher matrix predictions. Note that this bias, being well known and expected, can be simply calculated and subtracted from future Planck estimates, as well as the early-time bias discussed above. The reason why the coupling between lensing and ISW tends to bias the local estimate can be understood physically: large-scale potential fluctuations source the ISW effect and produce a lensing signal on small scales, generating a NG signal on squeezed triangles. Although both the primordial local bispectrum and the ISW-lensing bispectrum are peaked on squeezed triangles, the presence of acoustic oscillations in the primordial configurations reduces the overall correlation between the two shapes, thus making the final bias significant, but not too large. Another recently studied effect is that of inhomogeneous recombination caused by perturbation in the electron number density as in the studies of Khatri and Wandelt in [109], and in Senatore et al. [110]. Also in this case the contaminant shape is close to local. Although smaller than the ISW-lensing induced bias, also this effect seems to be able to affect the primordial estimate at a level marginally detectable by Planck. In order to conclude our brief survey of studies of secondary bispectra, let us finally mention the work done by Babich and Pierpaoli in [111], where point source density modulation bispectra induced by lensing magnification and selection effects, as well as SZ modulation from lensing magnification, were studied. The conclusion was once again that these effects are negligible for WMAP but close to the sensitivity level of Planck for local NG. Despite the great attention received so far in the literature, more has yet to be done in the area of assessing NG contamination from secondary sources. Note in particular that all of the predictions in this section concern temperature anisotropies, and estimates on secondary polarization bispectra are yet unavailable. It is clear that a complete and accurate description of secondary bispectra will be crucial for analysis of the future Planck dataset. A summary of the contribution of the various effects described in this paragraph, as well as of the other sources of contamination considered in this section, is presented in Table 4 and relative caption.

3.9.4. Non-Gaussian Noise

Systematics are another potential cause of contamination beyond astrophysical and cosmological sources. The noise in the experiment is generally well described as Gaussian. However possible non-Gaussian properties have to be tested in our context. This was done by Yadav and Wandelt in [79] by taking differences of yearly WMAP data in order to create jackknife realizations of WMAP noise maps for different detectors, including instrument systematics. The estimator can then be applied to these realizations in order to check that a negligible is measured. This was the result obtained on the WMAP 3-year dataset.

3.9.5. Other Effects

In this section we quickly summarize other effects that could interfere with estimates of primordial non-Gaussianity, but did not fit the classification above in the sense that they do not correspond to NG effects contaminating the CMB sky or the instrument noise.

One of these effects is noise, expected to affect especially the low-frequency channels of Planck. The noise component is generally removed from the map using “destriping” algorithms (see, e.g., [114, 115]). The unsubtracted “destriping residuals” form a Gaussian-correlated random field in pixel space. Their nontrivial covariance matrix should in principle be included in the inverse covariance prefiltering of the optimal estimator. If not included in the prefiltering, this effect could in principle enhance the estimator error bars (although it cannot generate any bias, since it is Gaussian). Unfortunately, a full numerical evaluation of this covariance matrix is quite challenging. Donzelli et al. [113] applied the estimator in its pseudooptimal implementation to maps of Gaussian CMB signal noise, accounting only for anisotropic noise in the linear term, but including destriping residuals in the noise model adopted for the simulations. The final result shows that the error bars do not increase when noise effects are included in the simulations, even though they are neglected in the covariance matrix appearing in the estimator.

Another effect to take into account for Planck is that of an asymmetric beam. The beam in the estimator normalization term is approximated as a circular beam. However Planck optical simulations (see, e.g., [116]) show that in reality we have to deal with elliptic beams, characterized by a nontrivial azimuthal dependence. If the circular beam approximation in the normalization of the estimate is not accurate enough, a bias could be introduced. Moreover the anisotropy of the beam could cause an increase of the variance if neglected in the inverse covariance prefiltering. Again, these effects were found to be negligible in tests on realistic simulations performed by Donzelli et al. [113].

Finally, the estimate of is done assuming a given cosmological model, that is, by fixing all of the other cosmological parameters to their best-fit value obtained from a likelihood analysis of the angular power spectrum. Since they are themselves the product of a statistical estimation process, these values obviously present uncertainties that should be propagated into the final -error bars. (In particular, since we are not doing a joint-likelihood estimation of all of the parameters and marginalizing to get (that would be the optimal but time consuming approach), the effect of uncertainties in the parameters propagate onto the measure as a bias. This bias has to be evaluated and quoted in addition to the usual statistical -error bar.). This calculation was done by Liguori and Riotto in [112], where it was found that the propagated error is dependent and it can become important only if a large will be detected in the data at some point.

3.10. Generation of Simulated Non-Gaussian CMB Maps

In this section we will describe algorithms for the generation of non-Gaussian CMB maps with a given bispectrum. There are three main reasons why primordial NG simulations of the CMB are useful in the context of bispectrum estimation of as follows. (1) To test the unbiasedness of the bispectrum estimator (by checking that the Monte Carlo average of the recovered reproduces the set in input). (2) To study how the expected primordial NG signal imprinted in the CMB is modified by the presence of other effects, like those considered in Section 3.1. For example, weak lensing of primordial NG might in principle change the observed bispectrum and affect the estimates. This can be studied again by testing the estimator on NG-lensed simulations, as it was done by Hanson et al. in [108].(3) For local NG, to obtain the error bars of the estimator if a large is detected at several (see Section 3.5.4). We have previously seen that for a several-sigma detection of local NG the bispectrum variance is dependent. The Monte Carlo average (117) thus has to be evaluated on NG simulations with the measured in input.

Unless we are in the situation described at point of the list above, all we need to produce is then maps with given power spectrum and bispectrum, since higher-order correlators can be neglected. In the large local case higher-order correlators are instead important and have to be included. Fortunately the local case is the only one for which we have a full expression of the primordial potential that allows us to produce exact simulations.

We will divide this section into two parts. In the first we will describe exact simulation algorithms of local NG, while in the second we will describe methods to generate maps with given power spectrum and bispectrum, starting from an arbitrary primordial shape.

3.10.1. Algorithms for Local Non-Gaussianity

First of all, let us recall that the CMB multipoles are related to the primordial gravitational potential through the well-known formula where are the radiation transfer functions and the potential is written in Fourier space. We already met this formula when we calculated the relation between the primordial and CMB bispectrums in Section 3.1. We also recall that the local non-Gaussian primordial potential takes a very simple expression in real space, where In the previous expression is a Gaussian random field, characterized by a primordial power spectrum ; in the following we will refer to as the Gaussian part of the primordial potential. The remaining non-Gaussian part of the potential is simply the square of the Gaussian part point-by-point (modulo a constant term, necessary to enforce the condition ; however it is clear that this term only affects the CMB monopole). It is then convenient to work directly in real space and recast formula (140) in the following form: where , also used in (32), is the real-space counterpart of the radiation transfer functions , is a spherical Bessel function, and is a look-back conformal distance. This formula suggests to structure an algorithm for the generation of local CMB NG maps in the following steps. (1)Generate the Gaussian part of the potential in a box whose side is the present cosmic horizon. (2)Square the Gaussian part point-by-point to get the non-Gaussian part. (3)Expand in spherical harmonics the Gaussian and non-Gaussian parts of the potential for different values of the radial coordinate in the simulation box. (4)Convolve the spherical harmonic expansions of and with the radiation transfer function in order to obtain the Gaussian and non-Gaussian parts of the multipoles of the final NG CMB simulation. For a given choice of the non-Gaussian parameter a CMB map is then obtained simply through the linear combination (the superscripts and NL are always indicating Gaussian and non-Gaussian, resp.).

The most difficult and time-consuming part in this process is actually the generation of the Gaussian part of the potential . One possibility is to generate the Gaussian part of the potential in a cubic box in Fourier space, where different modes are uncorrelated and have variance given by the primordial power spectrum , then apply a Fast Fourier Transform (FFT) algorithm to go to real space. Cartesian coordinates are then transformed into spherical coordinates by means of an interpolation algorithm in order to transform into . Finally, the Gaussian potential in spherical coordinates is squared point-by-point to get the NG part, and the spherical harmonic expansion and radiation transfer function convolution at point 4 of the list above are performed in order to obtain the multipoles of the final CMB map. The aforementioned algorithm was implemented by Komatsu et al. in [78] to generate NG local CMB maps at the resolution of the Planck satellite.

The difficulty with this approach arises from the fact that we are working in a box of the size of the present cosmic horizon (about Gpc in conformal time), but at the same time a cell in this box must have a side no bigger than  Mpc in order to resolve the last scattering surface, where most of the CMB signal is generated. A more convenient and accurate way to produce the local NG was found in [117, 118]: the idea is to work directly in spherical coordinates, use a nonuniform discretization of the simulation box (since no sample points are needed in a large region of the box where photons are just free streaming, while many sample points are needed at last scattering, as we just pointed out above), and generate the multipoles of the expansion of through the following two-step approach.

Generate uncorrelated radial multipoles , Gaussianly distributed and characterized by the following spectrum: where is the Dirac delta function.Filter the multipoles with suitable functions in order to produce a Gaussian random field with the properties of the multipole expansion of the primordial Gaussian potential It can be shown that the expression of the filter functions is where is the primordial curvature power spectrum, and the filtering operation takes the form In the last expression are the desired quantities, that is, the multipoles of the expansion of the Gaussian part of the primordial potential for a given .

This algorithm, recently improved by Elsner and Wandelt in [119], was used to produce NG local maps at the resolution of WMAP and Planck in temperature and polarization. An example of its results is shown in Figure 12(a). Figure 13 shows 1-point PDFs of temperature anisotropies for different values of , extracted from these simulations.

3.10.2. Algorithms for Arbitrary Bispectra

In the limit of weak non-Gaussianity, an algorithm to produce non-Gaussian CMB simulations with a given power spectrum and bispectrum for separable primordial shapes was described by Smith and Zaldarriaga in [22]. In this algorithm the non-Gaussian components of the CMB multipoles are obtained using the following formula: where is the Gaussian part of the CMB multipoles, generated using the angular power spectrum , while is the given bispectrum of the theoretical model for which simulations are required. Note that alternative algorithms to generate CMB maps with given bispectrum have been proposed in the literature [120, 121], but they are less general than the one introduced by (146). Although (146) is completely general, as before its numerical evaluation is only computationally affordable for bispectra that can be written in separable form. We have emphasized already that separability results in a reduction of the computational cost of the estimator (100) from to operations; the same argument applies here and allows to rewrite (146) into an equivalent form in pixel space. Starting from formula (47), and substituting it in (146), we find that As already discussed in the -estimator section, the limitation dictated by separability is clearly overcome by using the eigenfunction representations for the bispectrums (26) and (60) introduced by Fergusson et al. in [14]. As usual, the basic idea is to start by expanding an arbitrary bispectrum shape (either primordial or in the CMB) using a separable polynomial decomposition until a good level of convergence is achieved and then to substitute the mode decomposition into (146) to get a linear combination of numerically tractable terms written in the form of (147). Using the separable mode coefficients for the reduced bispectrum (56) and the filtered map expressions (110) as the starting point, we find that the expression (147) generalises to where the are found by summing using a set of Gaussian 's convolved with the 's (refer to (108)): Here, the accuracy of convergence with is parametrized in terms of the correlation between the original nonseparable shape and the eigenmode expansion, as defined previously (19). Note that this convergence can also be checked more accurately using the full Fisher matrix correlation on the CMB bispectra , described in Sections 3.8 and 3.9.

In addition to the bispectrum separability requirement, there is an important further caveat which can prevent the straightforward implementation of the algorithm (146). By construction, terms and higher are not explicitly controlled. Following the discussion in [108], we can write the connected N-point functions as Thus the condition that the map has the power spectrum specified in input will only be satisfied if the power spectrum of the non-Gaussian component in (150) remains small. Since this method does not control terms, one has to ascertain that spuriously large contributions do not affect the overall power spectrum significantly. It turns out that this effect plagues current map simulations if the standard separable expressions for the local and equilateral bispectra are directly substituted into (146). However a slight modification of (146), described by Hanson et al. in [108] and Fergusson et al. in [14], allows us to overcome this problem at no computational cost. Moreover, it was shown by Fergusson et al. [14] that maps obtained from the eigenmode expansions (26) and (60) are stable independently of the shape under study, thus making this map-making generating algorithm robust and fully general. Examples of DBI NG maps produced by combining the eigenmode expansion method with the map-making algorithm described in this section are shown in Figure 12(b).

4. Large-Scale Structure

In the standard scenario, early perturbations produced during inflation are responsible for the common origin of the CMB temperature fluctuations and the large-scale matter and galaxy distributions in the Universe, that is, the large-scale structure. The Cosmic Microwave Background provides a remarkable example of a Gaussian random field in nature. Information on cosmological parameters is in fact derived from measurements of its power spectrum, the 's, while bispectrum measurements from WMAP data remain consistent with zero. The distribution of matter, as we can infer today from shear or galaxy observations, unlike the CMB, can be described as a highly non-Gaussian random field, even for Gaussian initial conditions.

The matter overdensity is defined in terms of the matter density and its mean value by with zero mean by construction. Such quantity assumes, at late times, the limiting value in voids, accounting for a large fraction of the volume of the Universe, while it achieves values in collapsed objects such as dark-matter halos. Its probability distribution function is therefore expected, at low redshift, to depart strongly from a Gaussian distribution centred at , even though it could be well approximated by it at decoupling, when perturbations around were of the order of . Such non-Gaussianity is the result of the nonlinear evolution of structures subject to gravitational instability.

In addition, nonlinearities in the bias relation between the galaxy and matter distributions constitute a second source of non-Gaussianity in the large-scale structure mapped out by redshift surveys. Non-Gaussian initial conditions would therefore provide a third component in the non-Gaussianity of the galaxy distribution. The question regarding the detection of effects due to primordial non-Gaussianity, is therefore strictly related to our ability to distinguish between these different contributions and, ultimately, it will depend on the robustness of our theoretical predictions in the linear and mildly nonlinear regimes. From this respect, cosmological Perturbation Theory (PT), and its more recent developments, is very important for providing the tools to study the evolution of non-Gaussianities and how to differentiate their origin.

Considering only the matter distribution, the leading-order prediction in standard PT for the matter bispectrum at large scales is given by the sum of a primordial component and a component due to gravitational instability, which is present also for Gaussian initial conditions. Until fairly recently it was assumed that this picture could be easily extended to the galaxy distribution, with the galaxy bispectrum receiving an additional contribution due to nonlinear bias. Following the historical development of the subject, in Section 4.1 we will discuss early work on higher-order moments of the matter and galaxy distribution, starting with the skewness. We will then consider in Section 4.2 the matter bispectrum and its description in the Eulerian perturbation theory, with specific attention given to effects at large scales due to a primordial component, as well as at small-scale, nonlinear corrections in presence of non-Gaussian initial conditions. Here, most of the theoretical results on higher-order correlation functions are developed. In Section 4.3, we will deal with the galaxy bispectrum. We will first introduce the simple model based on local bias and discuss problems related to bispectrum measurements in redshift surveys with specific attention given to the detection of primordial non-Gaussianity. We will see how early results indicated that the galaxy bispectrum could be used as a tool to constrain non-Gaussian initial conditions which is, in principle, competitive with the CMB, illustrating this with actual results from current datasets. We will then consider the outcome of recent N-body simulations with non-Gaussian initial conditions showing that the simple prediction for the galaxy bispectrum assumed in most of the previous literature on the subject fails to describe not only the measured halo bispectrum, but even the halo power spectrum, even at large scales! We now know that correlators of biased populations such as galaxies and dark-matter halos receive large corrections, at large scales, from local primordial non-Gaussianity. These results opened up new and promising opportunities for detection in future large-scale structure observations. Although, in our view, a proper understanding of these effects remains to be adequately developed at the time of writing, particularly with respect to higher-order galaxy correlation functions, we will describe the different descriptions proposed so far in the literature and the prospects for detection of primordial non-Gaussianity in measurements of the galaxy bispectrum.

From a historical perspective, non-Gaussian initial conditions have been studied for quite a long time. For instance, early works on the clustering of density peaks and rare objects can be found in the study of Grinstein and Wise in [122], Lucchin and Matarrese in [123], Matarrese et al. in [124], while early N-body simulations with non-Gaussian initial conditions go back to the early eighties [125129]. In the early days, a large variety of non-Gaussian models, often defined in terms of a nonlinear transformation of a Gaussian field, were considered. In some cases, large non-Gaussian components were studied because, on one hand, they could be used to falsify some models and, on the other, as a way to reconcile contradictory observational results with theoretical frameworks. In this review, however, we will consider only models predicting small departures from Gaussian initial conditions which are consistent with CMB observations.

While we focus in this review on direct bispectrum measurements, it should be stressed that the effects of primordial non-Gaussianity on large-scale structure are not limited to corrections to its higher-order correlation functions. Aside from the recent results on the galaxy power spectrum mentioned above, significant departures from Gaussian initial conditions are expected to have important effects on the halo mass function and therefore on the observed cluster number density. See Section in the study of Sefusatti et al. in [130] for a brief overview of previous work and the studies of Afshordi and Tolley in [131], Dalal et al. in [132], Desjacques et al. in [133], Fedeli et al. in [134], Grossi et al. in [135], Lam and Sheth in [136], Lo Verde et al. in [137], Maggiore and Riotto in [138], Oguri in [139], Pillepich et al. in [140], and Valageas in [141], for recent theoretical and N-body results. In addition, the corresponding effect on the abundance of voids has been studied by Kamionkowski et al. [142], while the possibility of constraining primordial non-Gaussianity from measurements of Minkowski Functionals in large-scale structure has been explored by Hikage et al. [143, 144]. Further effects on the intergalactic medium and reionization [145, 146] or on future 21 cm observations [147, 148] have also been investigated. We refer the reader to other papers in this issue for a more complete discussion of these alternative approaches.

Finally, we should warn the reader that this section will not discuss analytical tools for the estimation of the non-Gaussian parameters corresponding or comparable to those described in the previous section for the CMB bispectrum, with the simple reason been that such tools have not being developed yet! In the first place, the physics of the CMB is simpler in the sense that the bispectrum of the temperature fluctuations at large scales is expected to provide the direct measurement of the initial bispectrum of the curvature perturbations, while the large-scale galaxy distribution is characterized, as mentioned above, by additional sources of non-Gaussianity for which we do not even have, at the moment, a proper model. In the second place, the optimal estimator for the parameter presented in Section 3.5 has been developed over several years to tackle the data provided by the WMAP satellite, which represented, so far, the best test of the Gaussianity of the initial conditions. The analysis of the galaxy bispectrum, on the other hand, did not have such timely and compelling motivations as it was understood, up until a couple of years ago, that large-scale structure observations will be able to provide results comparable to the CMB only in future, large-volume redshift surveys. We hope, nevertheless, that this review might provide a starting point for the development of a proper estimator from large-scale structure bispectrum measurements, possibly taking advantage of the techniques already introduced in the context of CMB observations.

4.1. The Skewness

This section serves as a brief historical overview. Since the first large-scale observations did not allow an accurate determination of individual bispectrum or trispectrum configurations, most of the attention in the early literature focused on the moments of the galaxy distribution, and, in the first place, on the third- and fourth-order moments, that is, the skewness and kurtosis, respectively. The “normalized” moment of order can be defined in terms of the smoothed density field as where the subscript “c” indicates the connected correlations. For Gaussian initial conditions, a perturbative treatment of the equations of gravitational instability predicts at leading order [149] with , computed in linear theory. Notice that we are neglecting here, for simplicity, additional and relevant contributions due to the smoothing of the density field (see [150, 151]). When non-Gaussian initial conditions are present, one expects an extra contribution to the skewness, typically with a different relation with , whose value depends on the non-Gaussian model. Comparisons between the second- and third-order moments, and , (as well as higher-order moments such as the kurtosis) measured in redshift surveys have been early recognized as a tool to test the Gaussianity of primordial perturbations [125, 152158]. These works recognized as well the importance of reliable predictions in the nonlinear regime and of a proper modeling of the effects of galaxy bias. In this respect, Fry and Scherrer [154] proposed a more quantitative prediction for the contribution to the galaxy skewness due to galaxy bias-based perturbation theory and on the local bias expansion of Fry and [159]. They derived, for the skewness of the galaxy distribution, an expression of the form where we assumed non-Gaussian initial conditions described by a non-vanishing initial skewness (but vanishing higher-order moments) and where and represent constant bias parameters typical of the galaxy population (which we will discuss explicitly in Section 4.3). This relatively simple expression describes the skewness measured in galaxy surveys, as the sum of three components corresponding to three sources of non-Gaussianity for the galaxy distribution: one primordial, one due to gravitational instability, and the last due to nonlinear bias. Further studies in perturbation theory can be found in [160, 161] while an alternative derivation of the smoothed moments of the density field based on the spherical collapse model has been studied in [162]. The skewness predicted by texture models has been studied in simulations as a function of the smoothing scale by Gaztanaga and Mähönen [163] and compared to measurements of the same quantities in the APM Galaxy Survey as in the study by Gaztañaga [164]; see Figure 14. The differences between the in the non-Gaussian texture model with respect to the Gaussian case provide a qualitative example of the typical effects that we expect for non-Gaussian initial conditions as a function of the smoothing scale and redshift. On the other hand, it should be kept in mind that early works focused on models of primordial non-Gaussianity characterized by a scaling of higher-order correlation functions quite different from the one induced by the parametrization.

The measured skewness, as higher-order moments, corresponds to a single number. Despite the possibility to study its peculiar dependence on the smoothing scale , it is nevertheless difficult to separate the different components, particularly with respect to bias effects. However, this possibility is offered in principle by direct measurements of the galaxy bispectrum, relying on its dependence on the shape of triangular configurations. In the next sections we will discuss in details first the bispectrum of the matter distribution then the bispectrum of the galaxy distribution, a direct observable in redshift surveys.

4.2. The Matter Bispectrum

In this review we will focus on the predictions for correlation functions in the Fourier space from the Eulerian Perturbation Theory (PT). This approach solves perturbatively the equations for the matter density and velocity field evolution governed by gravitational instability. These are the continuity equation, the Euler equation, and the Poisson equation relating the matter density and the gravitational potential. In the PT framework, the relation between the results and the initial conditions, given in terms of the initial correlators of the density field, is particularly transparent. Moreover, recent works have significantly extended, as we will discuss later, the predicting power of this specific tool. Different approaches are also available: see, for instance, the study by Scoccimarro in [165] for a comparison between bispectrum measurements in N-body simulations and predictions in the Lagrangian Perturbation Theory. We refer the reader to the study by Bernardeau et al. in [70] for a comprehensive review of cosmological perturbation theory of the large-scale structure.

4.2.1. Leading-Order Results in Perturbation Theory

As mentioned before, we consider specifically models where non-Gaussian initial conditions are completely given in terms of the correlators of the curvature perturbations at early times, and the mechanism responsible for the extra non-Gaussian properties of the density field is not active during the subsequent evolution of matter perturbations, governed only by gravitational instability. In PT, the solution for the evolved matter density contrast is expressed as a series of corrections to the linear solution [166]: where each term can be written formally as (From now on, we will adopt a different convention for the Fourier transform with respect to the one used for the formulae in previous section. The present convention is more common in the large-scale structure literature and conforms with the one adopted in the classical paper by Bernardeau et al. in [70]. ) with representing the symmetrized -order kernel in PT. The initial conditions in the Gaussian case are completely specified by the linear power spectrum , with , where we adopt the notation . Non-Gaussian initial conditions are described, in the first place, by a nonzero expression for the three-point function of the linear solution, that is, . In turn, the initial matter correlators, that is, the correlators of the linear solution , are given in terms of the correlators of the curvature perturbations as where we introduce the function with being the matter transfer function and the growth factor, expressing Poisson's equation in the Fourier space as Notice that we denote with the primordial curvature perturbations, that is, evaluated during the matter-dominated era, not their value linearly extrapolated at present time. (This choice, not unique in the literature, is particularly convenient since curvature perturbations are constant during matter domination. Also, it conforms to the definition of in terms of assumed in the CMB literature on observational constraints and specifically in the study by Komatsu and Spergel in [74]. ) The linear, that is, initial, power spectrum is given by while the initial bispectrum and trispectrum are Notice that, given these simple relations between curvature and primordial matter correlators, issues such as the property of separability discussed in Section 2.2 for the CMB bispectrum are not present in the case of three-dimensional, large-scale structure observables.

The nonlinear power spectrum is obtained perturbatively from the expansion where the term corresponds to the linear solution, , while the other terms represent, in analogy with perturbation theory in quantum field theory, one- and higher-loop corrections as they involve integrations over internal momenta. In particular, the term vanishes for Gaussian initial conditions as it depends on the initial bispectrum (see [167] for an analysis of nonlinear corrections to the matter power spectrum due to primordial non-Gaussianity).

In a similar fashion, nonlinear corrections in (157) provide a perturbative expansion for the matter bispectrum: In this case, the leading-order contributions are given by the tree-level terms and , with the first being the initial component and the second corresponding to a contribution to the matter bispectrum due to gravity alone, of the form Notice that this contribution is present even for Gaussian initial conditions as it depends only on the initial power spectrum and describes the emergence of non-Gaussianity due to gravitational instability. The leading-order, tree-level expression of the matter bispectrum with non-Gaussian initial conditions is therefore given in terms of the sum This expression corresponds to the first two terms on the r.h.s. of (156) for the skewness, which can be obtained from (168) by integration. For instance, the contribution to the skewness induced by gravity, in the unsmoothed case, is obtained from as and, integrating over the angles, corresponding to the result of (155).

The possibility of distinguishing the primordial component from the gravity-induced one relies on their specific and distinct dependence on scale, on the triangular configuration, shape and on redshift. For a primordial non-Gaussianity described by a curvature bispectrum obeying the hierarchical scaling , typical of weakly non-Gaussian models such as the local and equilateral ones, the different redshift and scale dependence of the two contributions are evident in their ratio for equilateral triangles (), given by (The first equality is in fact identical for local, equilateral, and orthogonal non-Gaussianity, simply by definition of the equilateral bispectrum, (54), introduced in [8] and of the orthogonal bispectrum introduced in [18], where and are precisely the amplitudes that provide the same value for the curvature bispectrum as the local model for equilateral configurations. ) We therefore expect, for a wide range of non-Gaussian models, the initial contribution to be larger at large scales and at high redshift. Figure 15(a) shows the two contributions and their sum for equilateral configurations as a function of . Moreover Figure 15(b), 15(c), and 15(d) show the effect of the primordial component for different non-Gaussian models, for values of the respective parameters corresponding to the current 95% C.L. limits [18, 68] and with the shaded area indicating the allowed region.

In addition, presents a specific dependence on triangle shapes, determined by gravitational instability and described by (167) at tree level. The shape dependence of , determined by the specific non-Gaussian model under consideration, is generically different. Such differences can be explicitly shown in plots of the reduced bispectrum, defined as which removes the redshift and scale dependencies of the gravity contribution. Figure 16 shows the reduced bispectrum at tree-level in perturbation theory, at for , as a function of the angle between and . In all panels, the continuous line represents the gravity-induced term which assumes larger values for nearly collapsed triangles, that is, for or . This indicates that the probability of finding larger values for the matter density in triplets of points forming a squeezed or folded triangle is larger than that for nearly equilateral triangles. This prediction is confirmed by the typical filamentary nature of the large-scale structure, evident from snapshots of N-body simulations or images of redshift surveys, since along these filaments it is easier to form collapsed triangles than equilateral ones. It should be stressed that the bispectrum is, in fact, the lowest-order statistic sensitive to the three dimensionality of structures and that these features are not captured by the information contained in the power spectrum alone. The effects of the primordial component on the matter bispectrum are shown by the dashed lines which correspond, as in Figure 15, to the 2- limits from CMB observations, in the case of the local (a), equilateral (b) and orthogonal (c) models while they correspond to the values in the folded case, for which no experimental bounds are available. Although the large scales and and the relatively high redshift have been chosen to enhance the effect of the non-Gaussian component, these triangles are not completely out of reach for future, large-volume surveys. Primordial non-Gaussianity modifies, in very specific ways, the shape dependence of the matter bispectrum produced by gravitational instability.

While the dependence of the matter bispectrum on scale and redshift is responsible for the specific behavior of the skewness of the matter density field on the smoothing scale and redshift, the sensitivity to the triangle shape is completely lost in analysis of the density higher-order moments. Instead, accurate measurements of the bispectrum, when achievable, offer in principle the possibility to disentangle the different contributions when triangles of different size and shape are included in the analysis.

The matter bispectrum is not, unfortunately, a direct observable. While we will discuss later how the statistical properties of the matter distribution can be inferred from galaxy redshift surveys, we should mention that the shear field in weak lensing surveys is another observable directly related to the matter distribution. The observational consequences on the weak lensing bispectrum, of a primordial non-Gaussian component (of the local type) such as the one in (168), have been explored by Takada and Jain in [168]. The authors find that the primordial component alone (i.e., without contamination from the gravitational one) could be detected if , assuming and a tomography over four redshift bins for a galaxy number density of arcmin. The large cosmic variance for low 's makes difficult the detection of the primordial component, prominent instead at larger scales. As we will see in the next section, primordial non-Gaussianity has some effect on small scales as well, due to the nonlinear evolution of structures.

4.2.2. Second-Order Corrections

The simple prediction of (168) for the matter bispectrum is expected to be valid at the largest observable scales and at high redshift, where nonlinear evolution is subdominant. Despite the fact that such conditions correspond as well to the regime where a detection of the initial component is favored, the effects of non-Gaussian initial conditions can be significant even at smaller scales and at low redshift. Since these effects are the result of nonlinear gravitational evolution and non-Gaussian initial conditions, it is no longer possible to identify distinct contributions resulting from distinct sources of non-Gaussianity, as it is the case for the tree-level expression of (168). Nevertheless, it is possible to distinguish individual corrections in PT to the matter bispectrum depending exclusively on the initial power spectrum , and therefore present as well for Gaussian initial conditions, and corrections depending instead on higher-order initial correlators, such as the initial bispectrum and trispectrum , which can be interpreted as small-scale effects due to non-Gaussian initial conditions. One-loop corrections in PT for Gaussian initial conditions have been studied by Scoccimarro in [169] and Scoccimarro et al. in [170], while the extension of these results to non-Gaussian initial conditions is studied by Sefusatti in [171].

A comparison of these results with measurements of the matter bispectrum in N-body simulations [133] with non-Gaussian initial conditions of the local kind can be found in the study by Sefusatti et al. in [172]. Figure 17 shows the equilateral configurations of the matter bispectrum measured in N-body simulations together with predictions from perturbation theory at tree level (dashed line) and one loop (continuous line). In particular, Figure 17(a) considers for Gaussian initial conditions while Figure 17(b) shows the same quantity divided by the tree-level prediction in PT to highlight the small-scales nonlinear behavior. Figures 17(c) and 17(d) show, respectively, the ratio and the difference between the matter bispectrum with an initial local component corresponding to and the Gaussian case. The agreement between one-loop predictions and the simulations results is quite remarkable, while we notice that the tree-level prediction fails to accurately describe the effect of primordial non-Gaussianity already at relatively large scales.

The significance of these relatively small corrections to individual configurations is to be considered in relation to the much larger number of configurations that can be measured as we include smaller and smaller scales, and they could lead to a measurable effect when considered in terms of the cumulative signal-to-noise ratio. On the other hand, these effects loose in part the shape dependence of the original initial bispectrum and require an accurate model (perhaps beyond standard perturbation theory) and strong priors on the underlying cosmological parameters to be distinguished from the nonlinear, “Gaussian” component. A step in the direction of improved predictions is offered by the promising results of the Renormalized Perturbation Theory [173175] and of the Renormalization Group approach [176, 177]. The extension of the latter to the case of non-Gaussian initial conditions has been recently considered by Bartolo et al. in [178], which studies specific predictions for the matter power spectrum and bispectrum.

4.3. The Galaxy Bispectrum

From the discussion above, we could expect that future, large-volume and high-redshift galaxy surveys will be able to directly detect a possible, large primordial component to the matter bispectrum by measurements of the galaxy bispectrum, or at least provide constraints on the non-Gaussian parameters comparable to the constraints from measurements of the CMB bispectrum. Such an expectation is motivated by the simple observation that the number of Fourier modes available in a three-dimensional, ideal, all-sky galaxy survey is in principle much larger than the number of modes available in two-dimensional CMB maps.

The galaxy distribution is, however, a less direct probe of the early Universe than the CMB temperature fluctuations. On top of the nonlinear evolution of structures and its contribution to higher-order correlation functions, one has to take into account the nonlinear nature of galaxy bias, being itself responsible for additional non-Gaussianity. An analysis of the galaxy bispectrum should therefore be able to detect a small primordial component by separating it from these primary contributions.

In this respect, an even more complex picture, due to additional and somehow unexpected effects of primordial non-Gaussianity on galaxy bias, has been emerging in the last couple of years, following the results of Dalal et al. [132]. N-body simulations have shown, in fact, that nonlinear bias and an initial bispectrum are not two distinct sources of non-Gaussianity for the galaxy bispectrum, not even at large scales! Instead a local initial component can significantly affect the bias relation precisely at large scales, adding extra corrections. In the spirit of a review and since we do not have, at the time of writing, a satisfactory model of the galaxy bispectrum in presence of non-Gaussian initial conditions, in this section we will summarize earlier results, while in Section 4.3.5 we will present the recent developments that radically changed our understanding of the effects of local non-Gaussianity on the large-scale structure and finally comment, in Section 4.3.6, on some consequences for galaxy bispectrum measurements as far as current research provides.

4.3.1. The Galaxy Bispectrum and Local Bias

Until recently, it was commonly assumed, even for non-Gaussian initial conditions, that the galaxy overdensity , defined in terms of the galaxy density and its mean as can be expressed, at large-scales, as a local function of the matter density contrast, (Properly speaking we should consider here the smoothed matter density contrast, that is, with being a top-hat filter function. For simplicity, we implicitly assume a smooth density field, so that, for large enough filtering scale, for example, , matter perturbations are small, .) that is, Such a reasonable expectation is based on the fact that the physics of galaxy formation operates on much smaller scales, below the typical halo size than those we are interested in. At large scales, where fluctuations are small, , we can consider the Taylor expansion [159] describing the bias relation between galaxy and matter in terms of a series of constant bias parameters, . This expansion allows for a consistent extension of the perturbative expressions for the matter correlators to the galaxy ones. In fact, from (175) we can derive the galaxy three-point function in position space and the tree-level expression for the galaxy bispectrum given by where the second term on the r.h.s., proportional to the quadratic bias parameter , is of the same order of the gravity-induced contribution to the matter bispectrum (167). Relying on this simple result, measurements of the galaxy bispectrum have been considered in the first place, in the context of Gaussian initial conditions, as a way to determine the bias parameters and break the degeneracy between linear bias () and the amplitude of matter fluctuations (e.g., ), otherwise affecting power spectrum measurements [165, 179185]. In this respect, the corresponding reduced galaxy bispectrum is where is the reduced matter bispectrum (including a possible initial contribution) and the effect of nonlinear bias is simply given by an additive constant term. As already mentioned, measurements of triangular configurations different in shape and size allow to disentangle the different sources of non-Gaussianity and determine independently and , provided that accurate predictions for the matter bispectrum, from PT or N-body simulations, are available [186, 187] and the effects of redshift distortions and the survey geometry are properly taken into account [165, 188].

In particular, if we allow the possibility of non-Gaussian initial conditions, then the matter bispectrum includes an initial contribution, so that we can rewrite (178) at tree level explicitly as and we can extend the analysis to obtain simultaneous constraints on the bias parameters and on the parameter determining the amplitude of the primordial bispectrum, that is, A first conservative estimate of the possibilities offered by this method in measurements of the galaxy bispectrum in the 2dF Galaxy Redshift Survey [189] and in the Sloan Digital Sky Survey (SDSS), [190] is given by Verde et al. in [191] as a simple extension of previous results for the bias alone [181] suggesting that a primordial component could be detected for values of a local of the order of -. As we will see in the next sections, a complete analysis of the galaxy bispectrum, including all measurable configurations, can improve this estimate by more than an order of magnitude: Scoccimarro et al. [192] forecast in fact for the SDSS limits of the order of .

Among the various observational issues in analyses of galaxy correlators, for example, finite volume effects or completeness of the galaxy samples, we stress that particularly relevance has the problem of redshift distortions. Redshift distortions have in fact a significant impact on the shape dependence of the galaxy bispectrum, particularly at small scales [165, 183, 193]. A recent treatment of redshift distortions in bispectrum predictions (with Gaussian initial conditions) can be found in the study by Smith et al. in [188].

4.3.2. A Bispectrum Estimator

In this section, we define a simple estimator for the measurement of the galaxy bispectrum in N-body simulations as well as actual data. This allows us to derive an expression for the bispectrum variance and define a Fisher matrix for an analysis of the galaxy bispectrum in terms of the non-Gaussian (and bias) parameters. In the next section we will consider a proper likelihood analysis and the effects of the bispectrum covariance. Since what follows can be applied in general to bispectrum measurements, we will consider, to simplify the notation, the case of the matter density field in Fourier space, described by the density contrast . We will point out relevant differences in the application to the galaxy distribution.

For a cubic box of volume a bispectrum estimator can be defined as [170] where is the volume of the fundamental cell and where each integration is defined over the bin centered at and of size equal to a multiple of the fundamental frequency . The Dirac delta function ensures that the wavenumbers , and indeed form a closed triangle, as imposed by translational invariance, while the normalization factor , given by represents the number of fundamental triangular configurations (given by the triplet , and ) that belong to the triangular configuration bin defined by the triangle sizes , and with uncertainty .

The leading contribution to the bispectrum variance following from this estimator, in analogy with the power spectrum case [194], is given by Scoccimarro et al. in [170] as (This expression (see [192]) corrects a typo in equation (A.16) of Scoccimarro et al. in [170]. ) with the factor respectively, for equilateral, isosceles, and general triangles and where with the particle (or galaxy) number density accounting for the shot noise contribution. In the case of a galaxy distribution, the matter power spectrum on the r.h.s. should be replaced with the galaxy power spectrum, expressed, at large scales, by , under the local bias assumption of (175). Equation (192) constitutes the Gaussian limit to the bispectrum variance, as it neglects higher-order corrections dependent on the three-, four-, and six-point, connected, correlation functions.

We will not discuss here the theory of the Fourier-space correlation functions estimation in redshift surveys as this would take us quite far away from our topic. For the power spectrum estimation we refer the reader to [195, 196] for a pedagogical and historical introduction to a rather extensive literature. It should be noted that only a limited fraction of these results relative to the power spectrum has been extended to the bispectrum [165, 181]. No optimal estimator for , given a specific model, has been studied so far.

4.3.3. The Fisher Matrix Forecasts

In this section we consider simple forecasts for the constraints on the non-Gaussian parameters from measurements of the galaxy bispectrum in future redshift surveys. Specifically, we will consider a Fisher matrix for reduced galaxy bispectrum in terms of the non-Gaussian parameter and the linear and quadratic bias parameters and . These three parameters characterize the relative weight of the different non-Gaussian contributions to the galaxy bispectrum. Since the possibility to detect a primordial component relies on our ability to separate the three contributions, a robust result should, at least, involve a marginalization over bias. On the other hand, we will assume all cosmological parameters as known. This is in part justified by the weak dependence of the matter-reduced bispectrum on cosmology discussed in the previous section. In this respect, it can be shown that the reduced bispectrum has the same signal-to-noise ratio as the bispectrum. For given triangular configurations, in fact, since the variance of is dominated by the variance of (see, for instance, [192]).

The Fisher matrix can be written as where the indices and run over the parameters of interest , and , while the reduced bispectrum variance, as mentioned above, can be expressed in first approximation as with given by (192). Notice that depends on the linear bias parameter . The sum over the triangles configurations can be explicitly defined in terms of three sums over the wavenumbers , and in steps of : with to ensure that a close triangle can be formed and with representing the minimal physical scale included in the analysis. Clearly, larger values of correspond to a much larger number of available configurations. For this reason, in fact, the cumulative signal-to-noise ratio for the bispectrum, that is, the sum of the signal-to-noise ratio over all measurable configurations, grows more rapidly with than it does for the power spectrum. On the other hand, we expect the primordial component to decrease significantly at small scales (high-). In practice, however, can be defined as the smallest scale at which we can trust our model for the galaxy bispectrum, in our case, the tree-level expression in (179).

In Figure 18, the forecasted errors on bias parameters and non-Gaussian parameters as a function of for an ideal geometry galaxy survey of volume and a galaxy number density of at redshift (dashed, red lines) and (continuous, blue lines) are shown. The negligible difference between the results for the non-Gaussian parameters at different redshift is a consequence of the fact that the signal-to-noise ratio of the primordial component to the matter and galaxy bispectrum for a single triangular configuration, , is, in our approximation, constant, both as a function of redshift and scale. This is not the case for the contributions due to gravitational instability and bias. It is clear that the choice of significantly affects the final result. For instance, Sefusatti and Komatsu [197] define , for a given survey, as the inverse of the scale given by the condition to ensure that the tree-level predictions is applied within the mildly nonlinear range. Notice that the choice of depends on redshift, since at larger redshift we can expect a larger range of validity of perturbation theory predictions, both for matter and galaxy bispectrums. The dependence on the survey volume is simply given by .

Scoccimarro et al. [192], from a Fisher matrix analysis as the one described above, have shown that the 2dF and SDSS surveys should be able to probe values of , assuming that . They suggested as well that an all-sky survey with a galaxy number density of up to redshift can probe values of of order unity.

Sefusatti and Komatsu [197] provided more specific predictions for a choice of planned and proposed high-redshift galaxy surveys, based on a similar Fisher approach, for the errors on non-Gaussian parameters both for the local and equilateral models. It is found that, for equilateral non-Gaussianity, the degeneracy between the non-Gaussian parameter and the bias parameters is severe. This is due to the fact that the specific shape dependence of the initial contribution, being somehow complementary to the shape dependence of the bispectrum induced by gravity which is low for nearly equilateral triangles, reduces to a certain extent the overall shape dependence of the total matter bispectrum. In this case, it is then more difficult to distinguish the total matter bispectrum from the component due to nonlinear bias, which at tree-level approximation is a simple constant in the expression for the reduced bispectrum , (179). On the other hand, such degeneracy extends to unphysical regions of the - plane and it can be significantly reduced by introducing a correlation between linear and quadratic biases as the one predicted by the halo model. The marginalization over bias can be then replaced by a marginalization over the parameters of the Halo Occupation Distribution describing the galaxy population. Sefusatti and Komatsu [197] find that future large-volume surveys ( at , ), designed to accurately measure acoustic oscillations in the galaxy correlation function and thus map the late-time expansion of the Universe, should be able to probe and , that is, values comparable to those expected from future CMB missions. At that time they constituted the best forecasts for constraints on from large-scale structure measurements. These results implied, in particular, that, if Planck will indeed detect primordial non-Gaussianity, a confirmation by large-scale structure observations will be required to firmly establish such an important discovery.

4.3.4. Effects of Covariance and Current Results

The simple Fisher matrix analysis described in the previous section makes several approximations, starting with the assumption of an ideal geometry for the survey under consideration, and the Gaussian variance for the galaxy bispectrum configurations. In fact we can expect a proper treatment of the survey selection function and of the bispectrum covariance to have a significant impact on the estimation of the non-Gaussian (and bias) parameters. Triangular bispectrum configurations at the largest scales probed by a realistic redshift survey (where the initial component should provide the largest corrections) are indeed highly correlated, because of the limited number of measurable Fourier modes.

The issue of bispectrum covariance has been studied in [165, 184, 185, 192]. For instance, Scoccimarro et al. [192] compare the Fisher matrix results for an ideal survey with a volume and galaxy number density similar to those of the main sample of the SDSS, with the predictions resulting from a likelihood analysis of the same survey, including the effects of survey geometry and covariance. Such analysis involves all measurable triangular configurations defined by wavenumbers , , , with , resulting in a total number of triangle bins, . The estimation of the corresponding, , bispectrum covariance matrix clearly represents a challenging computational problem as it cannot be determined from a relatively small number of N-body simulations. This work uses instead a code [165] implementing particle displacements as predicted by second-order Lagrangian perturbation theory (2LPT, see, for instance, [70] and references therein) to produce realizations of the density field. Such a large number of realizations are in fact necessary for an accurate determination of the covariance matrix. In addition, the 2LPT results, including particle velocities, allow for an exact redshift mapping. Each mock catalog, in redshift space, is then weighted according to the Feldman-Kaiser-Peacock (FKP) procedure [165, 181, 194] to take into account the SDSS selection function. The same covariance matrix is compared to analytic expressions in the study by Sefusatti et al. in [184].

Given a proper estimate of the covariance matrix, a likelihood function for the reduced bispectrum can be defined in terms of the normalized bispectrum eigenmodes that diagonalize it [165]. These can be expressed as where , and their signal-to-noise ratio is given by where represents the eigenvalue for , with . The eigenmodes presenting the largest signal-to-noise ratio can be easily interpreted by considering how they weight different bispectrum configurations. In fact, the largest signal-to-noise ratio corresponds to an eigenmode defined by a nearly equal weighting of all triangles, and it therefore represents the overall bispectrum amplitude. The next eigenmode weights instead with opposite sign triangles close to the equilateral shape and nearly collinear triangle. Each eigenmode represents in fact a fraction of the information contained in the bispectrum configurations, and a crucial role in this respect is played by the shape and scale dependence. To illustrate this point, Figure 19 (from [192]) shows the 95% C.L. limits on from the likelihood analysis of the IRAS PSC catalog [198] as a function of the number of eigenmodes included.

Although the diagonalization of the covariance matrix does not ensure the exact independence of the eigenmodes, which can still present non-vanishing higher-order correlations, it has been shown that this is nevertheless a reasonable assumption in practice [165]. This allows us to write a likelihood function for the non-Gaussian and bias parameters, denoted generically as , in terms of the product of the probability distribution functions for each individual eigenmode; that is, The probability distributions , which can be determined from the mock catalogs, are not expected in general to be Gaussian, although this can be in fact a good first-order approximation in the case of the SDSS main sample [192].

A direct implementation of this kind of analysis, taking into account all measurable bispectrum configurations and their covariance, has been performed by Scoccimarro et al. [199] for different IRAS catalogs [114, 200, 201] and by Feldman et al. [202] for the IRAS PSC catalog Saunders et al. [198] considering that the case of the model of primordial non-Gaussianity [192, 203] derives the limit at C.L. for the bispectrum measured in the PSCz catalog.

Along these lines, Scoccimarro et al. [192] also studied the constraints on for local non-Gaussianity that could be obtained from measurements of the galaxy bispectrum in the SDSS main sample, including the effects of the survey geometry and bispectrum covariance, forecasting the 1- error , after marginalization over the bias parameters. This work compared this more realistic estimate of the predicted errors on from the likelihood analysis of the SDSS bispectrum to the Fisher matrix forecast for an ideal geometry of nearly the same volume and galaxy density finding a worsening of a factor of 4-5. They point out, however, that the realistic errors, which are an estimate from the north part of SDSS alone, should be taken as a an upper bound to the results actually achievable because of the FKP weighting scheme, not optimal at the largest scales where the primordial component is the largest and because of the fact that extra signal can be found as well in open configurations, not considered there, due to the broken translation invariance. We might add, based on the results of Section 4.2.2, that nonlinear corrections present for non-Gaussian initial conditions might increase the overall signal due to a nonzero , particularly on small scales where a large number of triangular configurations can be measured.

At this point we should remind the reader that all of the results discussed so far on the galaxy bispectrum and its significance for constraining primordial non-Gaussianity assume the expression (179) to be a reliable prediction. As we will see in the remainder of this section, this is not the case, as additional effects of non-Gaussian initial conditions have to be taken into account. Nevertheless, the primordial component, whose direct detection has been the main target of the earlier works discussed above, is still expected to provide a contribution to the galaxy bispectrum, and there are good reasons to believe that these results can be still interpreted as a “conservative estimate” of the possibilities offered by bispectrum measurements in the large-scale structure to test the Gaussianity of the initial conditions.

4.3.5. Primordial Non-Gaussianity and Nonlocal Galaxy Bias

The constraints and forecasts discussed so far in this section are based on the tree-level expression for the galaxy bispectrum, (179), derived under the assumption of local bias, (175). As anticipated, our understanding of galaxy bias in presence of primordial non-Gaussianity radically changed in the last two years, after Dalal et al. [132] presented measurements of the halo power spectrum in simulations with non-Gaussian initial conditions of the local kind showing the presence of large corrections at large scales, not captured by the local bias prescription! Figure 20 shows the matter-halo cross-power spectrum for different values of from these simulations, where the unexpected effect of non-Gaussianity at large scales is evident.

Local bias, (175), in fact, implies a leading contribution to the galaxy (or halo) power spectrum of the simple form with no dependence on , while the simulations results of Dalal et al. [132], later confirmed by Desjacques et al. [133], Grossi et al. [135], and Pillepich et al. [140], are consistent with a scale-dependent correction to the linear bias of the form with where is the linear critical density for spherical collapse, extrapolated at . Such correction therefore increases with the scale, with redshift via the growth factor and with the non-Gaussian parameter , and vanishes for unbiased populations ().

A theoretical interpretation, based on the peak-background split [204], has been assumed by Afshordi and Tolley [131], Dalal et al. [132], Giannantonio and Porciani [205], Slosar et al. [206], and, with a somehow different derivation, by McDonald [207]. According to these works, the local relation between the galaxy density and the matter density in (175) is modified, in presence of local primordial non-Gaussianity, to include an explicit dependence on the primordial curvature perturbation, ; that is, where the factors and are proportional to and depend, in turn, on the linear and quadratic bias parameters and . See the study by Giannantonio and Porciani in [205] for a detailed derivation of this expression in the context of the peak-background split. The galaxy two-point function will be given by the following perturbative expansion: where the second term can be rewritten as the scale dependence of the linear bias parameter of (193), with the behavior resulting from the relation between and given by . Giannantonio and Porciani [205] describe, in fact, the galaxy distribution as multivariate distribution, although the matter density and the curvature are not two independent random fields, but they are related by the Poisson, (161). It should be noted that no derivation of a similar effect (in the context of the peak-background split) due to a different kind of primordial non-Gaussianity (if feasible) has been, so far, proposed. The derivations presented in the works cited above, in fact, all rely on the relatively simple expression defining the local model (6) while their generalization to a model defined by a generic initial bispectrum is a quite challenging problem.

Following the results of Dalal et al. [132], moreover, an apparently different explanation, resulting in fact in a very similar but distinct effect on the galaxy power spectrum, has been proposed by Matarrese and Verde [208] and Taruya et al. [167]. Taruya et al. [167], starting from the local bias prescription of (175), point out that the next-to-leading-order correction to the galaxy two-point function in presence of local primordial non-Gaussianity represents, in fact, a large correction, identical up to a constant factor, in the large-scale limit, to the bias correction of (193). The perturbative expression for the galaxy power spectrum is given by where the second term, proportional to the quadratic bias parameter and dependent on the matter bispectrum , corresponds to the lowest order, one-loop correction. Remarkably, for local non-Gaussianity, in the limit , such correction presents the same scale and redshift dependence, and, for massive halos or highly biased populations (), even the same amplitude, as the one resulting from (193). The expression, however, can be applied to any model of primordial non-Gaussianity, given the appropriate initial matter bispectrum (see, for instance, [209]). In the case of equilateral non-Gaussianity, the correction is almost negligible, while local non-Gaussianity appears to be a limiting case leading to a particularly significant effect. The same correction has been considered already by Scoccimarro [203] in the context of initial conditions, where it leads to a redefinition of the bias parameters, with no additional scale-dependence.

Matarrese and Verde [208] presents a different derivation of an expression similar to the one of (196), based on earlier works on the density peak correlation function [122, 124]. In this case, a specific prediction for the bias parameters, valid however only in the high density threshold limit, is included. It is interesting to notice that the possibility of large-scale effects on the correlations of biased distributions has been explicitly pointed-out by Grinstein and Wise [122], although without further study.

The two distinct corrections to the galaxy power spectrum, one corresponding to the modified bias relation of (194), the other to the perturbative correction due to nonlinear bias of (196), have been studied in a comprehensive framework recently by Giannantonio and Porciani [205], where the authors suggest that the effect measured in N-body simulations is mainly due to the multivariate nature of the galaxy distribution with local primordial non-Gaussianity, rather than the effect of nonlinear bias (196). In addition, Desjacques et al. [133] pointed-out that even the galaxy bias parameters , related in the framework of the halo model to the halo bias parameters for halo populations of mass , present a dependence of due to the effects of non-Gaussianity on the halo mass function. The picture that has been emerging in the last years is therefore quite complex and it should be stressed that a wide consensus in the community on a well defined model, even for the galaxy power spectrum, is still lacking. For instance, a discrepancy of the order of a between predictions and simulations results, did not find yet a unique interpretation (see discussions in [133, 135, 138, 140, 205]).

This rather surprising effect of local non-Gaussianity on the bias relation leads, remarkably, to the possibility of placing limits on from current large-scale structure observations, already comparable to limits from the CMB! [131, 206]. Specifically, Slosar et al. [206] derived from measurements of the cross-correlation of several large-scale structure datasets and the CMB [210] the 2- constraints leading to a marginal improvement of the WMAP results. Encouraging predictions for the constraints that can be derived in future spectroscopic as well as photometric redshift surveys can be found in the study of Carbone et al. in [211]. A fair comparison between these forecasts and those derived for the galaxy bispectrum in the study of Sefusatti and Komatsu in [197] is clearly not possible as the latter do not include the effect on the bias relation discussed above. Two observations, however, are in order. In the first place, these effects on the galaxy power spectrum are specific of the local model of non-Gaussianity, while the galaxy bispectrum is in principle sensitive to any initial component . In the second place, robust results can be obtained from galaxy power spectrum measurements at large scales in photometric surveys. The degradation of the information that can be extracted from bispectrum measurements in photometric surveys with respect to spectroscopic ones is still to be properly studied. The impact of photometric errors on the accurate determination of the bispectrum dependence on the triangle shape can in fact be significant.

4.3.6. The Galaxy Bispectrum after the Paper of Dalal et al. in [132]

First steps in the direction of an extension of the results discussed above to the galaxy bispectrum have been taken by Jeong and Komatsu [212] and Sefusatti [171]. Specifically, Jeong and Komatsu [212] considered an expression for the high-peak three-point function derived by Matarrese et al. in [124], analogous to the one for the two-point function studied by Matarrese and Verde in [208], and applied it to the case of local non-Gaussianity. Sefusatti [171] considered instead the perturbative approach of Taruya et al. [167] based on the local bias expansion of (175), and applied it to local and equilateral non-Gaussianity.

These works show that the galaxy bispectrum is expected to be sensitive to both the initial matter bispectrum as well as to the initial matter trispectrum , by means of a contribution analogous to (196) and given by which represents a large correction at large scales, with an asymptotic behavior characterized by an extra factor with respect to the primordial matter bispectrum component, and a dependence on . In addition, Sefusatti [171] points out that, unlike the power spectrum, large-scale corrections due to nonlinear bias are present as well for equilateral non-Gaussianity (and virtually for any nonpathological form of the primordial bispectrum and trispectrum). Figure 21 shows the one-loop corrections to the galaxy bispectrum due to nonlinear bias and primordial non-Gaussianity under the assumption of local bias [171]. Figure 21(a) assumes local non-Gaussianity including a nonzero initial bispectrum and trispectrum, while Figure 21(b) assumes a nonzero initial bispectrum of the equilateral type. Thin lines correspond to Gaussian initial conditions. The black continuous line represents the matter bispectrum and therefore the first term on the r.h.s. of (198), while the blue dashed lines correspond to the second term. Notice that, at next-to-leading order in PT, the matter bispectrum depends on the initial trispectrum as well as the initial bispectrum , so that an effect is present also for equilateral non-Gaussianity where the figure assumes that .

It should be noted, however, that these results ignore, at least for local non-Gaussianity, the modified bias relation of (194) (see, in this respect, some comments in [205]) and do not provide reliable predictions for the constant bias parameters. Furthermore, they have not been properly tested against measurements of the halo bispectrum in numerical simulations. The only work, at the time of writing, in this direction is that of Nishimichi et al. [213] which shows, however, that the dependence of the halo bispectrum on is roughly consistent with the functional form resulting from the prediction of (198). The authors attempt as well, using a simple fit to their measurements, a preliminary forecast analysis for a future large-volume (), high-redshift survey, finding a detectable value of , using a very limited number of configurations. Figure 22 from the paper by Nishimichi et al. [213] shows measurements of a set of triangular configurations of the halo bispectrum in simulations with local non-Gaussian initial conditions, as a function of the non-Gaussian parameter , where the dependence of the halo bispectrum on is evident.

A simple but reasonable expectation would be that the inclusion of the effects of primordial non-Gaussianity on galaxy bias will improve the results of Sefusatti and Komatsu [197], which are based on the detectability of the primordial component alone. Our understanding of these phenomena is, however, evolving rapidly in these days, and these notes on recent developments are likely to become outdated relatively soon.

4.4. Running Non-Gaussianity
4.4.1. The Case of a Scale-Dependent

DBI models of inflation predict, as we have seen, a primordial curvature bispectrum very close to the equilateral model in its shape dependence. An additional but quite generic feature of these models is given by a significant departure from the hierarchical scaling [15, 17, 19, 214216]. More recently, this possibility has been explored as well in models of local non-Gaussianity [31, 217220].

Under a phenomenological point of view, this extra scale dependence can be described by a running , or, more properly, in terms of an amplitude parameter and a running parameter , defined by where is a properly chosen pivot scale, while defines an overall scale characteristic of the triangular configuration on which depends. In other terms, the defined above replaces the constant in the definitions of the local and equilateral bispectra effectively introducing an extra dependence on scale.

Observational consequences of a running have been explored by Lo Verde et al. in [137] and Sefusatti et al. [87], while in the study by Taruya et al. in [167] this effect is included in the prediction for one-loop corrections to the matter and galaxy power spectrum.

Lo Verde et al. [137] provided an analysis of the possibility of constraining the running parameter by combining current limits from the CMB on the amplitude parameter at the pivot scale  Mpc with future measurements of cluster abundance. The effect of an significantly different from , can result in a much larger (or smaller) amount of non-Gaussianity on the smaller scales relevant for the cluster mass function. Figure 23(a), from [137] illustrates the difference in the range of scales probed by different observables. Focusing in particular on the equilateral model for the curvature bispectrum, this work assumes the amplitude of to be constrained by the CMB bispectrum at the pivot point scale and derives the expected constraints on its running by considering the effective amplitude of at the smaller scales () probed by cluster surveys. For an all-sky cluster survey up to redshift they find the 1- constraints, marginalized over , and , assuming the fiducial values to be and , with a Planck prior . Their analysis, however, does not include the simultaneous limits that measurement of the CMB bispectrum alone is expected to provide on both the amplitude and running .

4.4.2. Running Non-Gaussianity and Bispectrum Measurements

Sefusatti et al. [87] perform a Fisher matrix analysis of the CMB bispectrum to obtain the sensitivity of this observable to the running of . The results in the case of local non-Gaussianity, assuming the same pivot and marginalizing over the amplitude , are the 1- uncertainties of for WMAP and for Planck, where stands for the fiducial value of the amplitude parameter. In the case of equilateral non-Gaussianity, we have for WMAP and for Planck. Since it is always possible, given the observable of interest (e.g., the CMB bispectrum for a specific experiment) and the non-Gaussian model, to choose the pivot point in such a way to remove any degeneracy between the amplitude and the running parameters, a measurement of the running parameter comes at no cost with respect to the determination of the . Notice, however, that, for reasons related to the numerical implementation of the CMB estimator, Sefusatti et al. [87] assume, for the overall scale representative of a given triangular configuration, the geometric mean of the three wavenumbers; that is, While the difference with the more physically motivated definition in terms of the arithmetic mean is very small for equilateral non-Gaussianity, in the local model this is not the case.

Sefusatti et al. [171] consider as well the Fisher matrix from large-scale structure information, and specifically the galaxy power spectrum (including the effect on halo bias) for the local model and the galaxy bispectrum, but in terms of the simple description of (179), therefore excluding halo bias effects. This different choice of observables with respect to the model of primordial non-Gaussianity assumes a negligible effect of equilateral non-Gaussianity on the galaxy power spectrum (still to be confirmed by N-body simulations). It is shown, in particular, that future galaxy redshift surveys can significantly improve CMB results. Figure 23(b) shows the contours plots for the 1- uncertainties resulting for a joint Fisher matrix analysis of CMB and large-scale structure information. The expected limits are plotted against the predictions for the relation between the amplitude and the running from DBI inflationary models. It is interesting to notice how these models predict a stronger running for smaller values of the amplitude parameter. In this respect, constraining the value of can place additional limits on the parameters of the inflaton Lagrangian.

5. Conclusions

Weakly non-Gaussian initial conditions are defined, in most of the relevant inflationary models, by a non-vanishing bispectrum for the primordial curvature perturbations. The most direct observables of this primordial density correlator are, naturally, the bispectrum of the temperature fluctuations in the CMB and the bispectrum of the mass distribution at large scales as probed by galaxy surveys. In this review we presented an overview of the problems, results, and expectations connected with the detection of (or constraints on) primordial non-Gaussianity specifically in bispectrum measurements of the CMB and LSS.

The CMB is an ideal observable for tests of primordial NG because temperature and polarization anisotropies can be described in the linear regime of cosmological perturbations. The statistical properties of the primordial curvature field are thus directly reflected in the pattern of CMB fluctuations. As we have seen, tests of primordial NG are formulated in terms of the estimation of the bispectrum amplitude for each of the shapes predicted by different inflationary models. It was originally shown in the literature that a maximum-likelihood estimator of the bispectrum optimally extracts all of the information from a CMB map. Extracting the primordial nonlinear parameter from the bispectrum has subsequently become the standard way to test primordial NG in the CMB. The best measurements to date come from analysis of the WMAP datasets and roughly constrain the primordial bispectrum amplitude to be for the local, equilateral, and orthogonal shapes. Despite already being very stringent (the NG part of the CMB temperature anisotropies is constrained at the level of of the total fluctuation), these bounds are still far from the typical order of magnitude of primordial NG predicted by most inflationary models. As we have seen, the Fisher matrix forecasts show that future results from the Planck satellite (whose release date is predicted to be in 2012) will improve previous WMAP constraints by roughly one order of magnitude, thus impacting the range of some theoretical predictions. This significant improvement is due to the better sensitivity of Planck to many more bispectrum configurations in the analysis, and the possibilty of exploiting both temperature and polarization datasets. Another important limitation on current constraints is that inflationary predictions encompass more shapes than those that have been constrained so far. The reason why many shapes remain to be constrained is that they cannot be written as a separable product of one-dimensional functions of a single wavenumber. Separability, as we have seen, is a crucial property since it makes the actual analysis computationally affordable in terms of CPU time. We have reviewed recent work showing that this limitation can also be overcome in future analysis by means of a fully general, and mathematically well-defined, eigenmode expansion of the bispectrum shape. Thanks to this, and in light of the significant improvement in sensitivity provided by Planck, better and more general CMB constraints on primordial NG models will be available in the near future. One caveat is that the high precision of the forthcoming CMB datasets makes them much more sensitive to other spurious (i.e., nonprimordial) sources of NG, which could bias the estimate. Achieving an accurate control on these contaminants is clearly a crucial goal for future analysis. As we have seen, much work is being done in order to predict, detect, and isolate nonprimordial NG effects, but some issues still have to be addressed. In particular a complete prediction of the total bispectrum generated by second-order cosmological perturbations is not yet available, although a number of effects have been studied in detail. Accurate characterization of NG from diffuse foreground residuals is another important issue that will require further investigation.

For large-scale structure, many aspects of the general CMB scenario outlined above change, as should be evident from a comparison of the discussions in Sections 3 and 4. In the first place, we cannot rely on a direct relation between the observed galaxy bispectrum and the primordial curvature bispectrum predicted by inflationary models. As we have seen, a small departure from Gaussian initial conditions should result in a correction to the galaxy bispectrum induced by gravitational instability and nonlinear bias, constituting the dominant contributions. The nature of this correction is a complex problem in its own right, since it is due to the linearly evolved initial matter bispectrum as well as to the effects of primordial non-Gaussianity on the galaxy bias relation. Such effects are still under investigations and we do not have, to date, an accurate theoretical model. On the other hand, early results from galaxy power spectrum measurements are very encouraging, albeit restricted at present to the local non-Gaussian model. Current datasets already appear to be able to confirm and improve CMB results. In this respect, it is evident that the ultimate goal is the implementation of a complete large-scale structure analysis in terms of all measurable correlators, including power spectrum, bispectrum, and beyond, that is, an analysis that fully reflects the non-Gaussian nature of the mass and galaxy distributions even on large scales.

There are several issues which remain to be resolved, for which we can identify three main categories. First, we need to develop a robust model for the galaxy correlators accurately accounting for small-scale nonlinearities for both the matter and galaxy density fields, as well as in the presence of non-Gaussian initial conditions; this also must account to describe nonlocalities in the bias relation. In this review we have briefly summarized the state of the art, noting that our understanding of these phenomena is evolving rapidly. Secondly, once a reliable model is available, it will be necessary to develop the machinery that will allow us, in the event of a future detection, to properly identify the effects of different models and their bispectrum shapes. In this respect, the CMB results, presented in Section 3, provide an important benchmark. Finally, observational problems connected with redshift surveys, such as the effects of redshift distortions and/or photometric errors, survey selection function, completeness, and so forth, will have to be addressed. We have not discussed these issues here as they are generic to all large-scale structure experiments, but they clearly represent a major challenge for the exploitation of future datasets. Both the first and the last points are crucial for virtually all of the science goals of future ground-based or satellite surveys, particularly dark energy studies. Although only partial results have been obtained so far, there is every indication that characterising non-Gaussianity in future galaxy surveys will result in a significant test of the initial conditions of the Universe.

To summarize, sufficient experimental sensitivity has been reached recently in CMB experiments (namely, WMAP) to allow for meaningful constraints on the nonlinear parameter for several different families of models. These results are already arguably the most stringent quantitative test of the predictions of standard inflation. However, much tighter constraints on a broader range of models are expected from the future Planck data release. Thus a dramatic confrontation is set to continue between the de facto standard model of inflation and observational datasets from both the CMB and large-scale structure. Tests of primordial non-Gaussianity are rapidly becoming one of the most effective and promising approaches for gleaning important information about the physical processes that generated the primordial cosmological perturbations.

Appendix

Basics of Estimation Theory

If a random variable is characterized by a Probability Density Function (PDF) dependent on a parameter then an estimator for is a function used to infer the value of the parameter. If a given dataset is drawn from the distribution , then is the estimate of the parameter from the given observations. Since is a function of a random variable, it is itself a random variable. In the literature a random variable obtained as a function of another set of random variables is often referred to as a statistic.

A general property usually required when building an estimator is its unbiasedness. An estimator for a parameter is unbiased if its average value is equal to the true value of the parameter: The standard deviation is generally used to determine the error bars on that is, where denotes statistical average and is the variance of the inferred parameter. When we measure a parameter from a set of observations drawn from the PDF , we clearly would like our estimate not only to be unbiased, but also to have as small error bars as possible. In other words, among all of the possible unbiased estimators of that can be built, we look for the one that minimizes defined in (A.2). If such an estimator exists, then it is called an optimal estimator.

In this context a crucial role is played by the Fisher information matrix, defined as (Note that for simplicity we work here using a single parameter. The generalization to the multiparameter case is however straightforward. ) The Fisher matrix appears in an important theorem, known as the Cramer-Rao inequality, stating that, for any unbiased estimator of This theorem is then placing a lower bound on the error bars that can be attained when estimating a given parameter from a given set of observations. No matter which estimator is used, the smallest attainable error bars will be given by the square root of the inverse of the Fisher matrix. For a demonstration of this crucial result see, for example, the paper by Kendall and Stuart in [221] or, in relation to the CMB bispectrum, that of Babich in [69]. It is then clear that the best estimator of a parameter is an unbiased estimator saturating the Rao-Cramer bound. If such an estimator is found, then it is impossible to obtain a better estimate using any other statistic. The question then becomes whether, for a given PDF , an estimator saturating the Rao-Cramer bound exists.

It can be shown that a necessary and sufficient condition for an estimator of a parameter to be optimal is the following: where is the Fisher information matrix just introduced above.

Another crucial quantity in estimation theory is the so-called maximum-likelihood estimator. In a maximum-likelihood (ML) approach we take the observed dataset as fixed and we estimate as the parameter that maximizes the probability (likelihood) to observe the given data. In formulae, the ML estimate of is the value that satisfies In this context the PDF is often denoted as the likelihood function and indicated as . Two powerful theorems involving the likelihood have been proven as follows. (1)If there is an optimal unbiased estimator (i.e., an unbiased estimator saturating the Rao-Cramer bound), then it is the maximum-likelihood estimator or a function of it. (2) The maximum-likelihood estimator is asymptotically optimal; that is, it saturates the Rao-Cramer bound when , with being the number of repeated observations in our dataset .

These two theorems answer our initial question about the best estimator choice. The first theorem basically states that, if a best method exists, then the ML estimator is that method. Note that this result follows naturally from the optimality condition (A.5) introduced above. The second theorem says that for very large datasets the ML-estimator is the best method, that is, the one saturating the Rao-Cramer bound. In other words, when dealing with the practical problem of estimating a parameter from a given dataset, we should in theory always choose an ML approach. However in practice this is not always possible: for example, the PDF might be too difficult to calculate or sample numerically, or the ML condition (A.6) (generally a complicated nonlinear equation) too difficult to solve. In this case other approaches and different estimators have to be chosen.

An important role is played by the likelihood of Gaussian random variables. If a given observed variable is characterized by Gaussianly distributed errors, then it is easy to see that its likelihood is where the statistic is defined as where are the measured values of our observable. In the previous equation we made dependent on a vector of parameters that we want to fit. Our observable could be, for example, the CMB angular power spectrum , the primordial power spectrum , or, like in our case, the angular bispectrum , and we might be interested in knowing the sensitivity of our observation to any cosmological parameter. Our statistical estimate of will be obtained by minimizing . That is clearly equivalent to maximize the likelihood. Let us now for simplicity work in the one-dimensional case (i.e., our observable depends on a single parameter) and expand about its minimum, that is about the best fit value of the parameter : The linear term vanishes here since we are in the minimum. The quadratic term represents the curvature and defines the error on . If moves very quickly away from its minimum, then our determination of will be more precise, while the error on will be much larger otherwise. If we define then we can estimate the minimum possible error on as . It is easy to see that the curvature of the likelihood in the Gaussian case matches exactly the definition of the Fisher matrix given above. The lower limit on the error bar then coincides, as it should, with the Rao-Cramer bound. This at the same time validates the choice of as the error on the parameter and also shows a simple way to interpret the Rao-Cramer bound. Since the Fisher matrix represents the curvature of the of the likelihood around its maximum, it also provides an intrinsic minimum error on the measurement of the parameter. A likelihood strongly peaked around its maximum for a given parameter will provide stronger constraints on that parameter and vice versa. We have however to keep in mind that the curvature constructed above is the curvature of the likelihood only if the distribution of our observable is Gaussian. This, strictly speaking, is in general not true, but it is a reasonably good approximation in most cases. (A clarifying example is provided by the CMB angular power spectrum. We know that is distributed like a with degrees of freedom, which rapidly gets close to a Gaussian as grows.) The Fisher matrix for any observable is then defined as the second derivative of the statistic (A.8). If we compute it explicitly, then we get The second term in the sum above is generally neglected. The idea, as explained by Dodelson in [54] or by Press et al. in [222] is that the observed will oscillate around their real value, making the difference oscillate around zero, resulting in cancellations. We are then left with the expression generally used in the literature:

In this paper, we have applied the basic concepts described in this Appendix to the estimation of the non-Gaussian parameter from the bispectrum of CMB and LSS datasets. We would like to stress again that we have just very quickly sketched some essential concepts in estimation theory here. For excellent and much more comprehensive reviews of ideas and applications of estimation theory to cosmology, we refer the reader to the papers by Dodelson in [54], Martinez et al. in [223], and Tegmark et al. in [224]. The brief review provided here was actually largely inspired by those works. A detailed and complete book about statistical methods and estimation theory is, for example, that by Kendall and Stuart in [221].

Acknowledgments

The first, third, and fourth authors were supported by STFC Grant no. ST/F002998/1 and the Centre for Theoretical Cosmology. The second acknowledges support by the French Agence National de la Recherche under Grant no. BLAN07-1-212615 and by the European Union under the Marie Curie Inter European Fellowship.