Advances in High Energy Physics
Volume 2018, Article ID 2657325, 6 pages
Research Article

Constraints on Gravitation from Causality and Quantum Consistency

Institute of Cosmology & Department of Physics and Astronomy, Tufts University, Medford, MA 02155, USA

Correspondence should be addressed to Mark P. Hertzberg; mark.hertzberg@tufts.edu

Received 9 October 2018; Accepted 11 November 2018; Published 18 November 2018

Academic Editor: Diego Saez-Chillon Gomez

Copyright © 2018 Mark P. Hertzberg. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The publication of this article was funded by SCOAP3.


We examine the role of consistency with causality and quantum mechanics in determining the properties of gravitation. We begin by examining two different classes of interacting theories of massless spin 2 particles (gravitons). One involves coupling the graviton with the lowest number of derivatives to matter; the other involves coupling the graviton with higher derivatives to matter, making use of the linearized Riemann tensor. The first class requires an infinite tower of terms for consistency, which is known to lead uniquely to general relativity. The second class only requires a finite number of terms for consistency, and appears as another class of theories of massless spin 2. We recap the causal consistency of general relativity and show how this fails in the second class for the special case of coupling to photons, exploiting related calculations in the literature. In a companion paper (Hertzberg and Sandora, 2017), this result is generalized to a much broader set of theories. Then, as a causal modification of general relativity, we add light scalar particles and recap the generic violation of universal free-fall they introduce and its quantum resolution. This leads to a discussion of a special type of scalar-tensor theory: the $f(R)$ models. We show that, unlike general relativity, these models do not possess the requisite counterterms to be consistent quantum effective field theories. Together this helps to remove some of the central assumptions made in deriving general relativity.

1. Introduction

General relativity is consistent with observations over a vast range of length scales. The force law has been tested down to fractions of a millimeter, while precision tests of the relativistic theory have been carried out on solar system scales, with binary pulsars, and even with the recent gravitational wave observations of merging binary black holes. On galactic and cosmological scales there is also agreement, though it does require the introduction of some as yet undiscovered form of dark matter and a small but nonzero amount of dark energy [1].

The latter has provided some of the central motivations for considering alternatives to general relativity. It is difficult to understand why the vacuum energy is so small, despite there being known large contributions from massive particles running in loops, such as top quarks. Furthermore, the coincidence problem (why there is a comparable amount of matter and dark energy today), as well as the cosmological horizon, homogeneity, and flatness problems, are also sometimes invoked as motivations. Also, there is a suite of difficulties in understanding general relativity as a quantum theory, including nonrenormalizability, trans-Planckian unitarity violation, the black hole information paradox, and global issues associated with de Sitter space and eternal inflation.

This range of primarily conceptual challenges leads one to enquire just how inevitable general relativity is, and whether theoretically consistent alternatives exist. It is sometimes thought that general relativity follows inevitably as the unique consistent theory of massless spin 2 particles at low energies, following uniquely if one only assumes the Lorentz symmetry applied to the spin 2 degrees of freedom. On this view, to deviate from general relativity requires either a violation of Lorentz symmetry or the introduction of additional degrees of freedom.

In this letter we clarify some aspects of this basic idea. Firstly, we point out that in fact several classes of theories of massless spin 2 particles exist that are Lorentz invariant and do not propagate new degrees of freedom. The most basic one involves the least number of derivatives, so-called minimal coupling, while the others involve higher derivatives (the latter can be organized to not propagate any additional degrees of freedom). While the first leads to general relativity, the second appears as another class of theories of spin 2, and was earlier introduced in Ref. [2]. We examine this second class in the special case of coupling to photons, showing that photon propagation is superluminal (by exploiting related results in the literature), which forbids a possible UV completion. In a companion paper [3] this idea is developed further and generalized to a much larger set of theories, including couplings to fermions and scalars, using a more systematic analysis.

Secondly, we study conventional ways to modify general relativity by the addition of new light scalars. We emphasize that in the Standard Model of particle physics only the Lorentz symmetry is postulated and most couplings compatible with it are observed. Similarly, new scalars should generically come with many parameters, leading to violation of the observed universality of free-fall. However, quantum effects typically remove the problem by making the scalars heavy. This leads to an examination of the so-called $f(R)$ models, which do have the property of immediately ensuring the universality of free-fall. We show that such theories, unlike general relativity, fail to have the appropriate set of counterterms in the domain of application of these theories, and so they fail to be consistent quantum effective field theories.

2. Massless Spin 2 Particles

We are interested in constructing theories of massless spin 2 particles from the ground up. As is well known, the massless spin 2 unitary representation of the Lorentz group involves two helicities in 3+1 dimensions. It is useful to embed these two degrees of freedom into a symmetric tensor field in order to build a local theory. We shall denote this $h_{\mu\nu}$, and we will shortly discuss to what extent it may be interpreted as a metric field. Since $h_{\mu\nu}$ is a 10 component object, we need to remove 8 of the 10 degrees of freedom; 4 are removed by the introduction of constraints on $h_{\mu\nu}$, while another 4 can be removed by the introduction of an identification $h_{\mu\nu} \sim h_{\mu\nu} + \partial_\mu\alpha_\nu + \partial_\nu\alpha_\mu$, where $\alpha_\mu$, the gauge function, parameterizes a family of physically equivalent representations of the same state.
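The counting in this paragraph can be sketched as a short calculation. The function name and the extension to a general dimension `d` (assuming `d` constraints and `d` gauge functions, mirroring the 4 + 4 of the 3+1 dimensional case) are illustrative assumptions rather than part of the original argument.

```python
def graviton_polarizations(d: int) -> int:
    """Count propagating modes of a massless spin 2 field in d spacetime dimensions.

    Assumption: d components are removed by constraints and d by the gauge
    identification, generalizing the 4 + 4 counting of the 3+1 dimensional case.
    """
    components = d * (d + 1) // 2  # independent entries of a symmetric d x d tensor
    constraints = d                # non-dynamical components fixed by constraints
    gauge = d                      # components removed by the gauge function
    return components - constraints - gauge


for d in (4, 5, 10):
    print(d, graviton_polarizations(d))
```

For `d = 4` this reproduces the two helicities quoted in the text (10 - 4 - 4 = 2).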

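If one fixes the gauge, then under a Lorentz transformation $\Lambda$, the field $h_{\mu\nu}$ transforms as
$$h_{\mu\nu} \to \Lambda_\mu{}^{\rho}\,\Lambda_\nu{}^{\sigma}\,h_{\rho\sigma} + \partial_\mu\alpha_\nu + \partial_\nu\alpha_\mu,$$
where $\alpha_\mu$ depends on the gauge choice and $\Lambda$. Since $h_{\mu\nu}$ is evidently not a Lorentz tensor, it is generally very difficult to construct a Lorentz invariant interacting theory when we couple to matter. There does exist, however, a manifestly gauge invariant and indeed Lorentz covariant 4-tensor we can construct out of derivatives of $h_{\mu\nu}$, the linearized Riemann tensor
$$R^{(L)}_{\mu\nu\rho\sigma} \equiv \frac{1}{2}\left(\partial_\rho\partial_\nu h_{\mu\sigma} + \partial_\sigma\partial_\mu h_{\nu\rho} - \partial_\sigma\partial_\nu h_{\mu\rho} - \partial_\rho\partial_\mu h_{\nu\sigma}\right), \tag{2}$$
which we will explicitly make use of in the upcoming “Type II” theories.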
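The gauge invariance of the linearized Riemann tensor can be checked with a few lines of exact integer arithmetic. This is a minimal sketch, assuming the standard form $R^{(L)}_{\mu\nu\rho\sigma} = \frac{1}{2}(\partial_\rho\partial_\nu h_{\mu\sigma} + \partial_\sigma\partial_\mu h_{\nu\rho} - \partial_\sigma\partial_\nu h_{\mu\rho} - \partial_\rho\partial_\mu h_{\nu\sigma})$ evaluated on a plane wave, so that each derivative becomes a factor of the wavevector `k` (overall constants are dropped). The sample values of `k`, `a`, and `eps` are arbitrary illustrative choices.

```python
import itertools


def riemann_momentum_space(k, eps):
    """Plane-wave linearized Riemann tensor, with the overall constant dropped."""
    n = len(k)
    return {
        (m, v, r, s): k[r] * k[v] * eps[m][s] + k[s] * k[m] * eps[v][r]
        - k[s] * k[v] * eps[m][r] - k[r] * k[m] * eps[v][s]
        for m, v, r, s in itertools.product(range(n), repeat=4)
    }


def gauge_shift(k, eps, a):
    """Apply eps_{mu nu} -> eps_{mu nu} + k_mu a_nu + k_nu a_mu."""
    n = len(k)
    return [[eps[m][v] + k[m] * a[v] + k[v] * a[m] for v in range(n)]
            for m in range(n)]


# Arbitrary integer test data (hypothetical values, chosen only for illustration).
k = [3, 1, -2, 5]
a = [2, -1, 4, 7]
eps = [[1, 2, 0, -1],
       [2, 3, 4, 0],
       [0, 4, -2, 5],
       [-1, 0, 5, 6]]

before = riemann_momentum_space(k, eps)
after = riemann_momentum_space(k, gauge_shift(k, eps, a))
print("gauge invariant:", before == after)
```

Because the check is purely algebraic, it does not depend on the metric signature or on whether the wave is on-shell.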

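The free theory of these spin 2 particles is associated with terms of the form $\sim(\partial h)^2$. By demanding Lorentz/gauge invariance, only a unique set of terms is allowed, up to boundary terms, which is
$$\mathcal{L}_{\rm kin} = \frac{1}{2}\,\partial^\rho h^{\mu\nu}\,\partial_\rho h_{\mu\nu} - \partial_\mu h^{\mu\nu}\,\partial^\rho h_{\rho\nu} + \partial_\mu h^{\mu\nu}\,\partial_\nu h - \frac{1}{2}\,\partial^\mu h\,\partial_\mu h,$$
where $h \equiv h^\mu{}_\mu$ and we have scaled the coefficient of the first term to $1/2$ without loss of generality.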

3. Type I: Lowest Number of Derivatives

The interaction that involves the least number of derivatives, and hence would be most relevant at large distances, is obtained by attempting to couple $h_{\mu\nu}$ directly to matter as follows: $\mathcal{L}_{\rm int} \sim h_{\mu\nu}\,\mathcal{T}^{\mu\nu}$, where $\mathcal{T}^{\mu\nu}$ is some symmetric tensor built out of the matter fields, whose properties we shall shortly identify. Evidently this term is not gauge invariant for a generic $\mathcal{T}^{\mu\nu}$, which means the theory is not unitary as it would propagate the wrong number of degrees of freedom. Alternatively, one could try to define this term in a particular gauge to avoid additional degrees of freedom, but then one would find the term is not Lorentz invariant as $h_{\mu\nu}$ is not a proper Lorentz tensor.

It is easy to see that under a gauge transformation and an integration by parts, the problem is fixed by taking $\mathcal{T}^{\mu\nu}$ to be conserved, $\partial_\mu\mathcal{T}^{\mu\nu} = 0$. So $\mathcal{T}^{\mu\nu}$ should be proportional to the matter energy-momentum tensor $T^{\mu\nu}$ as follows: $\mathcal{T}^{\mu\nu} = \kappa\,T^{\mu\nu}$, where $\kappa$ is a coupling. We are assured that this is conserved due to translation invariance (at least to leading order; more on this shortly). This immediately implies that all matter particles must couple universally to $h_{\mu\nu}$, with strength $\kappa$, as only the total energy-momentum tensor is conserved for interacting particles. This implies the (weak) equivalence principle.
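The role of conservation can be made explicit in one worked step. This is a sketch at leading order, assuming the interaction is proportional to $h_{\mu\nu}\mathcal{T}^{\mu\nu}$ with $\mathcal{T}^{\mu\nu}$ a symmetric tensor built from matter fields, that the matter fields themselves are not transformed, and that the gauge shift is $\delta h_{\mu\nu} = \partial_\mu\alpha_\nu + \partial_\nu\alpha_\mu$:

```latex
\begin{align}
\delta\!\left(h_{\mu\nu}\,\mathcal{T}^{\mu\nu}\right)
  &= \left(\partial_\mu\alpha_\nu + \partial_\nu\alpha_\mu\right)\mathcal{T}^{\mu\nu}
   = 2\,\partial_\mu\alpha_\nu\,\mathcal{T}^{\mu\nu} \\
  &= \partial_\mu\!\left(2\,\alpha_\nu\,\mathcal{T}^{\mu\nu}\right)
     - 2\,\alpha_\nu\,\partial_\mu\mathcal{T}^{\mu\nu} .
\end{align}
```

The first term is a total derivative and drops out of the action, so the variation vanishes for arbitrary $\alpha_\nu$ precisely when $\partial_\mu\mathcal{T}^{\mu\nu} = 0$.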

This ensures the theory is gauge invariant on-shell. To also be gauge invariant off-shell one must endow the matter fields with a gauge transformation rule, such as $\phi \to \phi + \kappa\,\alpha^\mu\partial_\mu\phi$ for scalars [4]. However, the gauge invariance is only ensured to order $\kappa$, since the presence of this interaction means that energy and momentum will in general be exchanged between matter and gravitons, so $T^{\mu\nu}$ is no longer exactly conserved. This requires several fixes: (i) one must include the coupling of $h_{\mu\nu}$ to itself as gravitons carry energy and momentum, (ii) at higher order in $\kappa$ one must modify the gauge transformation rule for $h_{\mu\nu}$ to involve higher order corrections, (iii) the gauge transformation rule for matter must also involve higher order corrections, and (iv) an infinite tower of interaction terms, in powers of $\kappa$, must be included, with the schematic form $\kappa^n h^n (\partial h)^2$ and $\kappa^n h^n\,T$, where every term is determined uniquely in terms of $\kappa$, up to boundary terms and field redefinitions. Amazingly, this infinite series can be resummed for any matter Lagrangian [5], giving the Einstein-Hilbert action
$$S = \int d^4x\,\sqrt{-g}\left[\frac{R}{16\pi G} + \mathcal{L}_M\right], \tag{6}$$
where $g_{\mu\nu} = \eta_{\mu\nu} + \kappa\,h_{\mu\nu}$, $\kappa \equiv \sqrt{32\pi G}$, and the matter Lagrangian involves the lift to “minimal coupling” $\partial_\mu \to \nabla_\mu$ and $\eta_{\mu\nu} \to g_{\mu\nu}$. The gauge invariance is lifted to the full diffeomorphism invariance and $R$ is the fully nonlinear Ricci scalar. Now $h_{\mu\nu}$ inevitably has a geometric interpretation.

Hence Type I coupling leads uniquely to general relativity and all of its successes. This theory can even be quantized, in the low energy regime, with concrete quantum gravity predictions such as the leading quantum correction to the Newtonian potential and the quantum gravitational force between polarizable objects [6, 7]; we shall return to this issue later. Furthermore, by including some small, but nonzero, vacuum energy in the matter Lagrangian, this can even account for cosmic acceleration. This theory does lead to conceptual puzzles, such as the cosmological constant problem and the black hole information paradox, as mentioned in the introduction, but it is in great agreement with observations.

4. Type II: Higher Number of Derivatives

Here we would like to describe a much bigger class of theories of massless spin 2 particles; this was introduced earlier in Ref. [2], and some other work includes Ref. [8]. By exploiting the gauge invariant object $R^{(L)}_{\mu\nu\rho\sigma}$ defined in eq. (2), we can immediately write down a manifestly gauge/Lorentz invariant class of theories by coupling it to essentially any four index object $\mathcal{T}^{\mu\nu\rho\sigma}$, giving
$$\mathcal{L} = \mathcal{L}_{\rm kin} + \mathcal{L}_M + R^{(L)}_{\mu\nu\rho\sigma}\,\mathcal{T}^{\mu\nu\rho\sigma}.$$
Note that this object involves only a finite number of terms in powers of $h_{\mu\nu}$, unlike the Type I theory that involves an infinite tower of terms. (Both types are of course nonrenormalizable in 3+1 dimensions, so higher order terms will be generated by quantum mechanics, which we shall return to later.) Also note that for any matter Lagrangian $\mathcal{L}_M$, we are essentially free to choose any new independent object $\mathcal{T}^{\mu\nu\rho\sigma}$, which may be built out of many new parameters, unlike in Type I where the interaction term comes uniquely specified by the matter Lagrangian through the minimal coupling procedure. We further note that in Type II we can have many different flavors of massless spin 2 particles, without any contradiction, while in Type I there can only be a single flavor.

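As a concrete example of the interaction, if the matter involves vector fields with field strengths $F_s^{\mu\nu}$, labelled by species $s$, we could choose $\mathcal{T}^{\mu\nu\rho\sigma}$ to be
$$\mathcal{T}^{\mu\nu\rho\sigma} \sim \sum_s \lambda_s\,F_s^{\mu\nu}F_s^{\rho\sigma} + \ldots, \tag{8}$$
where $\lambda_s$ are couplings and the dots denote other index contractions of the same bilinears. In general some constraints may be placed on the relative sizes of the couplings to avoid higher time derivatives and ghosts; however, there is no requirement for the couplings to be universal. So different couplings for different species are permitted by the Lorentz symmetry. Hence such a theory does not imply the (weak) equivalence principle.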

5. Principles in Physics

If we considered the equivalence principle to be another fundamental postulate, then this would suffice to reject this entire Type II class in favor of the very special Type I. However, in this work we only take the Lorentz symmetry as a fundamental postulate, and the equivalence principle is to be derived rather than assumed.

In fact it is useful to put this point of view in a broader perspective. It is often suggested that modern particle physics is built out of various additional postulates, such as the “gauge principle” or “principle of minimal coupling”. However, if we examine the structure of the Standard Model, in particular its symmetries, a different picture emerges. (i) Exact symmetries: CPT derives from locality and unitarity, while gauge $SU(3)\times SU(2)\times U(1)$ is derived as an identification to remove the unphysical components of fields associated with twelve spin 1 particles. (ii) Approximate symmetries: baryon number $B$ and lepton number $L$ are derived as accidental. (iii) Asymmetries: $C$, $P$, $T$, chiral, scale, etc., are not derivable from Lorentz symmetry and are not realized in nature. So in the Standard Model of particle physics, only that which follows from the Lorentz symmetry (applied to unitary representations) is realized, and no additional postulates appear to be required.

Furthermore, the global $U(1)$, associated with electric charge, derives from considering the analogous Type I theory of massless photons coupled to charged matter with the lowest number of derivatives as $\mathcal{L}_{\rm int} \sim A_\mu J^\mu$, which requires coupling to a very special conserved $J^\mu$ (just as for $\mathcal{T}^{\mu\nu}$ above). On the other hand, we can also consider an analogous Type II theory of massless photons coupled to neutral matter with a higher number of derivatives as $\mathcal{L}_{\rm int} \sim F_{\mu\nu}\,\mathcal{J}^{\mu\nu}$, which does not require anything special about $\mathcal{J}^{\mu\nu}$ (just as for $\mathcal{T}^{\mu\nu\rho\sigma}$ above). In fact this describes the low energy effective theory of photons coupled to neutral matter such as neutrinos. So at the level of the effective theory, both Type I and Type II theories are realized in nature for photons coupled to matter. This raises a question for gravitation: why has nature chosen Type I and not the much larger class Type II of gravitons coupled to matter?

There is reason to think that causality provides a possible answer to this question. We examine this in greater detail and for a much broader class of models in a companion paper [3]. For now we illustrate the idea in the special case of coupling to photons.

6. Causality in Type I

It is well known that in general relativity with standard matter sources there is no problem with causality [9, 10]. This can be seen as follows. Consider photons minimally coupled to gravity in Type I. In the geometric optics limit, photons follow null geodesics, $g_{\mu\nu}k^\mu k^\nu = 0$, where $k^\mu$ is the photon’s 4-momentum. The leading deflection from null propagation on the Minkowski cone comes from expanding in powers of the gravitational coupling as $g_{\mu\nu} = \eta_{\mu\nu} + \kappa\,h_{\mu\nu}$, giving $\eta_{\mu\nu}k^\mu k^\nu = -\kappa\,h_{\mu\nu}k^\mu k^\nu$. Then using the linearized Einstein equations in the Lorenz gauge and solving for a particular solution gives [9, 10] $h_{\mu\nu}k^\mu k^\nu$ as an integral of $T_{\mu\nu}k^\mu k^\nu$ over the source, weighted by the retarded Green's function, which clearly has a definite sign for any matter that satisfies the null-energy condition $T_{\mu\nu}k^\mu k^\nu \geq 0$. Hence light stays inside the Minkowski cone and slows down in accord with the Shapiro time delay.
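To give a sense of scale for the Shapiro time delay mentioned here, the following sketch evaluates the standard leading logarithmic approximation $\Delta t \approx (2GM/c^3)\,\ln(4 r_{\rm emit} r_{\rm recv}/b^2)$ for a signal grazing the Sun. The formula is the usual weak-field textbook approximation (valid for impact parameter $b$ much smaller than both distances) and is an assumption of this example, not taken from the text; the Earth to Mars geometry and the function name are illustrative.

```python
import math

GM_SUN = 1.32712440018e20  # solar gravitational parameter, m^3 / s^2
C = 299_792_458.0          # speed of light, m / s
AU = 1.495978707e11        # astronomical unit, m
R_SUN = 6.957e8            # nominal solar radius, m


def shapiro_delay_one_way(r_emit, r_recv, b, gm=GM_SUN):
    """Leading-log extra light travel time in seconds (assumes b << r_emit, r_recv)."""
    return (2.0 * gm / C**3) * math.log(4.0 * r_emit * r_recv / b**2)


# Radar echo Earth -> Mars -> Earth at superior conjunction, grazing the solar limb.
round_trip = 2.0 * shapiro_delay_one_way(1.0 * AU, 1.524 * AU, R_SUN)
print(f"round-trip delay: {round_trip * 1e6:.0f} microseconds")
```

The delay is positive: the signal arrives later than it would along the Minkowski light cone, consistent with light staying inside the cone.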

7. Causality in Type II

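Let us focus on the case of the four index object given in eq. (8) and on a single species, the photon. In a region of space-time that is Ricci flat, the modified Maxwell equation takes the form $\partial_\mu F^{\mu\nu} \propto \partial_\mu\big(R^{(L)\,\mu\nu\rho\sigma}F_{\rho\sigma}\big)$, with the constant of proportionality set by the coupling. Note that this equation is exact, and the derivatives in Type II are just ordinary, not covariant, derivatives.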

In the geometric optics limit, the leading deflection from null propagation on the Minkowski cone is [11]
$$k^2 \propto R^{(L)}_{\mu\nu\rho\sigma}\,k^\mu k^\rho\,\epsilon^\nu\epsilon^\sigma,$$
with the constant of proportionality set by the coupling, where all terms on the right hand side are evaluated for free null propagation. Here $\epsilon^\mu$ is the photon’s polarization unit vector, defined for each of the two modes. For an appropriate choice of polarization and direction of propagation, relative to the gravitational field, one can arrange for the photon to travel outside the Minkowski cone. Since these theories are manifestly Lorentz invariant, this leads to problems with causality. This idea is greatly generalized in a companion paper [3].
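The polarization dependence can be illustrated numerically. The sketch below assumes a weak static field with $h_{00} = -2\Phi$ and $h_{ij} = -2\Phi\,\delta_{ij}$ (mostly-plus signature), a point-mass potential $\Phi = -GM/r$ in units with $GM = 1$, and the standard linearized Riemann tensor; it evaluates the contraction $R^{(L)}_{\mu\nu\rho\sigma}k^\mu\epsilon^\nu k^\rho\epsilon^\sigma$ for a photon passing tangentially at unit distance. The overall coupling that converts this contraction into a shift of $k^2$ is left out, and all function names are hypothetical.

```python
import itertools


def hessian_point_mass(x, gm=1.0):
    """Second derivatives d_i d_j Phi of Phi = -gm / r at the spatial point x."""
    r2 = sum(c * c for c in x)
    r = r2 ** 0.5
    return [[gm * ((1.0 if i == j else 0.0) - 3.0 * x[i] * x[j] / r2) / r**3
             for j in range(3)] for i in range(3)]


def make_ddh(hess):
    """Return ddh(a, b, mu, nu) = d_a d_b h_{mu nu} for the static weak-field metric."""
    def ddh(a, b, mu, nu):
        if a == 0 or b == 0 or mu != nu:  # static field, diagonal perturbation
            return 0.0
        return -2.0 * hess[a - 1][b - 1]
    return ddh


def cone_shift(ddh, k, eps):
    """Contract the linearized Riemann tensor as R_{mu nu rho sigma} k^mu eps^nu k^rho eps^sigma."""
    def riemann(m, v, r, s):
        return 0.5 * (ddh(r, v, m, s) + ddh(s, m, v, r)
                      - ddh(s, v, m, r) - ddh(r, m, v, s))
    return sum(riemann(m, v, r, s) * k[m] * eps[v] * k[r] * eps[s]
               for m, v, r, s in itertools.product(range(4), repeat=4))


ddh = make_ddh(hessian_point_mass([1.0, 0.0, 0.0]))  # photon at unit distance on the x axis
k = [1.0, 0.0, 1.0, 0.0]                             # null wavevector along y (tangential)
shift_radial = cone_shift(ddh, k, [0.0, 1.0, 0.0, 0.0])      # polarization along x
shift_transverse = cone_shift(ddh, k, [0.0, 0.0, 0.0, 1.0])  # polarization along z
print(shift_radial, shift_transverse)
```

In this vacuum (Ricci flat) example the two polarizations give equal and opposite contractions, so whichever sign the coupling takes, one of the two modes is pushed outside the Minkowski cone.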

In the related context of QED minimally coupled to gravity, it is known that one can integrate out the electron and generate terms of the form (8) (with the nonlinear Riemann tensor) with coefficients of order $\alpha/m_e^2$ [12]. However, this does not produce superluminality, since the leading Shapiro time delay dominates over this effect in its domain of applicability. Related ideas appear in the context of string theory [13]. However, in the context of this new class of spin 2 theories, where this is the leading interaction, superluminality appears unavoidable.

8. Additional Fields

There does exist a manifestly causal way to modify gravitation. This involves the introduction of additional degrees of freedom. Since fermions do not mediate long range forces, and vectors (with minimal coupling) have sources that tend to neutralize, we focus on the remaining case of adding light scalars.

Usually in the literature a scalar $\phi$ is added that is taken to couple universally to the trace of the matter energy-momentum tensor at leading order as
$$\mathcal{L}_{\rm int} \sim c\,\phi\,T^\mu{}_\mu, \tag{13}$$
where $c$ is a coupling. Here the universal coupling is inserted to be compatible with the (weak) equivalence principle. But as described earlier in this letter, the Standard Model does not give any reason to utilize any principle beyond that of just the Lorentz symmetry. Since $\phi$ is a gauge singlet scalar, we could take any gauge invariant term $\mathcal{O}_i$ in the Standard Model and multiply it by $c_i\,\phi$,
$$\mathcal{L}_{\rm int} \sim \sum_i c_i\,\phi\,\mathcal{O}_i, \tag{14}$$
where $c_i$ are arbitrary couplings, and obtain a Lorentz invariant theory.

So, to assume the form (13) is to tune the theory to be compatible with tests of the universality of free-fall, which have constrained violations to roughly the $10^{-13}$ level [14]. One may appeal to technical naturalness to justify such universal couplings [15], or to a link to the dilaton of a spontaneously broken scale symmetry [16], or to special fields associated with extra dimensions. However, generic scalars beyond the Standard Model do not have this feature of universal coupling. Instead, Lorentz symmetry suggests the nonuniversal form (14) is much more generic.

This poses a challenge to deriving the (weak) equivalence principle. However, if we take quantum effects into account, then a generic scalar will pick up a mass from Standard Model particles running in a loop, with a size set by the couplings and by a hard UV cutoff $\Lambda$ on the loop integral. For couplings of gravitational strength, and unless the cutoff is extremely low, the scalar will typically be heavy and unable to mediate long range forces, so general relativity is recovered at large distances.
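A rough numerical illustration of this argument follows. It is only a sketch: the parametric estimate $\delta m \sim c\,\Lambda^2/(4\pi)$ (from an assumed quartically divergent loop, $\delta m^2 \sim c^2\Lambda^4/(16\pi^2)$), the choice of a gravitational-strength coupling $c = 1/M_{\rm Pl}$, and the sample cutoffs are all assumptions introduced here for illustration, not formulas taken from the text.

```python
import math

M_PL_EV = 2.435e27            # reduced Planck mass in eV
HBAR_C_EV_M = 1.973269804e-7  # hbar * c in eV * m


def scalar_mass_ev(cutoff_ev, coupling_inv_ev=1.0 / M_PL_EV):
    """ASSUMED estimate: delta m ~ c * Lambda^2 / (4 pi), with coupling c in 1/eV."""
    return coupling_inv_ev * cutoff_ev**2 / (4.0 * math.pi)


def force_range_m(mass_ev):
    """Range of the Yukawa force, the reduced Compton wavelength hbar c / m, in metres."""
    return HBAR_C_EV_M / mass_ev


for cutoff_ev in (1e12, 1e15):  # 1 TeV and 1000 TeV, illustrative cutoffs
    m = scalar_mass_ev(cutoff_ev)
    print(f"cutoff {cutoff_ev:.0e} eV: mass {m:.2e} eV, range {force_range_m(m):.2e} m")
```

Under these assumptions even a TeV-scale cutoff limits the scalar force to laboratory distances, far below solar system or cosmological scales, and the range shrinks as the square of the cutoff.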

9. $f(R)$ Gravity: Classical Treatment

A popular framework that is both causal and enforces the universality of free-fall is provided by the so-called $f(R)$ models. Here the Einstein-Hilbert action eq. (6) is modified as $R \to f(R)$. Note that inside any nonlinear function $f$ are higher derivatives. This can be seen by expanding around a Minkowski background, obtaining $R \sim \partial^2 h + \ldots$. The $\partial^2 h$ terms here only lead to a total derivative in the action if $f$ is linear in $R$, but for nonlinear $f$ these higher derivatives have consequences. At the classical level, these consequences can be captured by the introduction of a scalar $\phi$ with an action of scalar-tensor form, characterized by functions $V(\phi)$ and $F(\phi)$ that depend on the choice of $f$. Note that $\phi$ couples to matter in a universal way through the single function $F(\phi)$, satisfying the (weak) equivalence principle.
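The classical equivalence with a scalar-tensor theory can be sketched with the standard auxiliary-field argument (see, e.g., the review [20]). The auxiliary field $\chi$ and the names $F$ and $U$ below are notation introduced for this sketch, and overall normalizations and matter terms are dropped:

```latex
\begin{align}
S &= \int d^4x\,\sqrt{-g}\;f(R)
  \quad\longleftrightarrow\quad
  S' = \int d^4x\,\sqrt{-g}\,\big[f(\chi) + f'(\chi)\,(R-\chi)\big], \\
\frac{\delta S'}{\delta\chi} = 0
  &\;\Rightarrow\; f''(\chi)\,(R-\chi) = 0
  \;\Rightarrow\; \chi = R \quad \text{if } f''(\chi)\neq 0, \\
S' &= \int d^4x\,\sqrt{-g}\,\big[F\,R - U(F)\big],
  \qquad F \equiv f'(\chi), \quad U \equiv \chi\,f'(\chi) - f(\chi).
\end{align}
```

For linear $f$ one has $f'' = 0$, so $\chi$ drops out and no scalar appears, matching the remark that the higher derivatives only matter for nonlinear $f$. A conformal rescaling of the metric then brings $S'$ to a form with a standard Einstein-Hilbert term and a canonically normalized scalar.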

A popular example is $f(R) = R + \alpha R^2$ with $\alpha > 0$, which is a model of inflation [17]. Here the classically equivalent scalar $\phi$ plays the role of the inflaton. Its potential turns out to be of the form $V(\phi) \propto \big(1 - e^{-\sqrt{2/3}\,\phi/M_{\rm Pl}}\big)^2$, with $M_{\rm Pl}$ the reduced Planck mass. For large field values the potential is exponentially flat and inflation takes place. One computes correlation functions of the scalar mode of the form $\langle 0|\,\delta\phi\,\delta\phi\,|0\rangle$ (where $|0\rangle$ is the Bunch-Davies vacuum) to obtain an approximately scale invariant spectrum of density perturbations with a small red tilt and a small tensor-to-scalar ratio $r$. These predictions are compatible with recent data [18, 19]. Similarly, there exist many popular models of dark energy associated with various choices of $f$ [20].
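The size of the tilt and of the tensor-to-scalar ratio can be estimated with a short slow-roll calculation. This sketch assumes the standard Einstein-frame potential $V = V_0\,(1 - e^{-\sqrt{2/3}\,\phi})^2$ in units with $M_{\rm Pl} = 1$, the usual leading-order slow-roll formulas $n_s = 1 - 6\epsilon + 2\eta$ and $r = 16\epsilon$, and 60 e-folds of inflation; the number of e-folds and all function names are illustrative choices.

```python
import math

A2 = 2.0 / 3.0  # square of the exponent sqrt(2/3), in units with M_Pl = 1


def slow_roll(x):
    """Slow-roll parameters for V = V0 (1 - x)^2, where x = exp(-sqrt(2/3) phi)."""
    epsilon = 0.5 * (2.0 * math.sqrt(A2) * x / (1.0 - x)) ** 2  # (1/2) (V'/V)^2
    eta = 2.0 * A2 * x * (2.0 * x - 1.0) / (1.0 - x) ** 2       # V''/V
    return epsilon, eta


def efolds(x, x_end):
    """Closed form of N = integral of V/V' dphi from the end of inflation back to x."""
    return (1.0 / (2.0 * A2)) * (1.0 / x - 1.0 / x_end + math.log(x / x_end))


def predictions(n_target):
    """Return (n_s, r) evaluated n_target e-folds before the end of inflation."""
    x_end = 1.0 / (1.0 + math.sqrt(2.0 * A2))  # where epsilon = 1
    lo, hi = 1e-12, x_end                      # N decreases monotonically as x grows
    for _ in range(200):
        mid = math.sqrt(lo * hi)               # geometric bisection over many decades
        if efolds(mid, x_end) > n_target:
            lo = mid
        else:
            hi = mid
    epsilon, eta = slow_roll(math.sqrt(lo * hi))
    return 1.0 - 6.0 * epsilon + 2.0 * eta, 16.0 * epsilon


n_s, r = predictions(60.0)
print(f"n_s = {n_s:.4f}, r = {r:.4f}")
```

Under these assumptions the spectrum has a red tilt of a few percent and a tensor-to-scalar ratio at the level of a few times $10^{-3}$.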

10. $f(R)$ Gravity: Quantum Treatment

Here we would like to examine $f(R)$ gravity as an effective field theory. To begin, let us return to the Einstein-Hilbert action eq. (6) and try to study it as a quantum theory. The quantum partition function is
$$Z = \int \mathcal{D}h_{\pm}\,\mathcal{D}\psi\;e^{iS[h,\psi]},$$
where the first measure of the path integral is over the two modes of the graviton, labelled $h_\pm$, and the second measure is over the matter fields $\psi$. In practice there are various complications associated with gauge fixing and constraints, but this is the formal structure. In principle this allows one to compute various correlation functions of the graviton and matter fields. If we perform the path integral partially by integrating down to some scale $\Lambda$, we will generate a new Wilsonian effective action
$$S_{\rm eff} = \int d^4x\,\sqrt{-g}\left[\frac{R}{16\pi G} + c_1 R^2 + c_2 R_{\mu\nu}R^{\mu\nu} + \ldots + \mathcal{L}_M\right], \tag{18}$$
including corrections such as $R^2$ and $R_{\mu\nu}R^{\mu\nu}$ with cutoff-dependent coefficients $c_1$, $c_2$ (plus some nonlocal terms, etc.). These additional terms are required as counterterms to cancel divergences associated with graviton loops. As long as we focus on only the physical degrees of freedom, such as the two modes of the graviton, we can use this effective theory to compute quantum effects. (In fact quantization of the leading Einstein-Hilbert term already gives rise to long range corrections to gravitation; see Refs. [6, 7].)

Note that the effective Lagrangian in eq. (18) formally involves higher derivatives due to the presence of the terms $R^2$, etc. Furthermore, the presence of these types of terms might, at first sight, seem to justify the kind of $f(R)$ actions that we wrote above. However, it is essential to not use these higher derivative terms incorrectly; the original path integral is only defined with a measure for the two modes of the graviton and the matter fields. The measure does not include integration over additional degrees of freedom, such as a scalar $\phi$. Instead, the path integral forces these to be spurious additional degrees of freedom; they can never be external and on-shell.

By contrast, the $f(R)$ models, as applied to inflation and dark energy, etc., explicitly make use of the additional scalar $\phi$. This scalar is given its own dynamics, its own phase space, and its own independent set of fluctuations that are used as the source of density perturbations. This means the $f(R)$ models are disconnected from a rigorous quantum treatment of the Einstein-Hilbert action.

One might attempt to quantize $f(R)$ using the path integral. However, the path integral requires one to integrate over all the physical fields in the theory. So we would need to explicitly use the scalar $\phi$ and form
$$Z = \int \mathcal{D}h_{\pm}\,\mathcal{D}\phi\,\mathcal{D}\psi\;e^{iS[h,\phi,\psi]}.$$
Again, by integrating over high energy modes, we form a new type of Wilsonian effective action (again generating $R^2$ terms, etc.). We emphasize the presence of new counterterms involving $\phi$. These additional counterterms are required to cancel new divergences that arise from the ability to put $\phi$ external and on-shell. It is very important to note that such additional terms cannot be put in the $f(R)$ form and, more generally, cannot be put in the form of some function only of the metric $g_{\mu\nu}$ when generic matter is included. Hence the $f(R)$ models, which exploit the dynamics of the additional degree of freedom, contain cut-off dependence in the quantum theory without the required counterterms to be a consistent quantum effective theory.

We note that in other formulations of $f(R)$ gravity, such as Palatini, similar conclusions hold. Namely, in constructions in which there are no new degrees of freedom, the theory can always be recast into the general relativity form with a collection of appropriate counterterms, while in constructions in which there is a new degree of freedom, it can only be self-consistently quantized by reorganization into the scalar-tensor form.

11. Outlook

The above arguments help toward deriving general relativity as the only consistent theory involving massless spin 2 at low energies. (i) Type II theories that utilize higher derivative couplings (but can avoid extra degrees of freedom) can lead to problems with causality; see [3] for an extended analysis. (ii) Additional scalars, which would generically lead to nonuniversal free-fall, are typically expected to be heavy due to quantum effects. (iii) $f(R)$ models are not consistent quantum effective field theories.

However, important puzzles remain, including understanding dark energy. The smallness of the vacuum energy within the framework of general relativity does have a candidate explanation by introducing many (heavy) scalars, leading to a potential with an exponentially large number of vacua, though it is unclear how to formulate probabilities in this context. Meanwhile, the behavior of gravitation at the Planck scale requires further new physics.

Data Availability

The paper does not refer to any specific data set.


Disclosure

An earlier version of this manuscript was presented at the 13th International Symposium on Cosmology and Particle Astrophysics (CosPA 2016).

Conflicts of Interest

The author declares that they have no conflicts of interest.


Acknowledgments

I would like to thank Raphael Flauger, Jaume Garriga, Alan Guth, David Kaiser, Juan Maldacena, McCullen Sandora, and Mark Trodden for helpful conversations. I would like to thank the Tufts Institute of Cosmology for support.


References

  1. A. G. Riess, A. V. Filippenko, and P. Challis, “Observational evidence from supernovae for an accelerating universe and a cosmological constant,” The Astronomical Journal, vol. 116, no. 3, p. 1009, 1998.
  2. R. M. Wald, “Spin-two fields and general covariance,” Physical Review D, vol. 33, 1986.
  3. M. P. Hertzberg and M. Sandora, “General relativity from causality,” Journal of High Energy Physics, vol. 119, 2017.
  4. M. D. Schwartz, Quantum Field Theory and the Standard Model, Cambridge University Press, Cambridge, 2014.
  5. S. Deser, “Self-interaction and gauge invariance,” General Relativity and Gravitation, vol. 1, no. 1, pp. 9–18, 1970.
  6. J. F. Donoghue, “Leading quantum correction to the Newtonian potential,” Physical Review Letters, vol. 72, 1994.
  7. L. H. Ford, M. P. Hertzberg, and J. Karouby, “Quantum gravitational force between polarizable objects,” Physical Review Letters, vol. 116, no. 15, Article ID 151301, 2016.
  8. D. Bai and Y. H. Xing, “Higher derivative theories for interacting massless gravitons in Minkowski spacetime,” Nuclear Physics B, vol. 932, 2018.
  9. M. Visser, B. Bassett, and S. Liberati, “Superluminal censorship,” Nuclear Physics B. Proceedings Supplement, vol. 88, pp. 267–270, 2000.
  10. A. Adams, N. Arkani-Hamed, S. Dubovsky, A. Nicolis, and R. Rattazzi, “Causality, analyticity and an IR obstruction to UV completion,” Journal of High Energy Physics, vol. 2006, 2006.
  11. G. M. Shore, “Quantum gravitational optics,” Contemporary Physics, vol. 44, 2003.
  12. I. T. Drummond and S. J. Hathrell, “QED vacuum polarization in a background gravitational field and its effect on the velocity of photons,” Physical Review D: Particles, Fields, Gravitation and Cosmology, vol. 22, no. 2, pp. 343–355, 1980.
  13. X. O. Camanho, J. D. Edelstein, J. Maldacena, and A. Zhiboedov, “Causality constraints on corrections to the graviton three-point coupling,” Journal of High Energy Physics, vol. 2016, no. 2, 2016.
  14. S. Schlamminger, K.-Y. Choi, T. A. Wagner, J. H. Gundlach, and E. G. Adelberger, “Test of the equivalence principle using a rotating torsion balance,” Physical Review Letters, vol. 100, no. 4, Article ID 041101, 2008.
  15. L. Hui and A. Nicolis, “Equivalence principle for scalar forces,” Physical Review Letters, vol. 105, no. 23, 2010.
  16. C. Armendariz-Picon and R. Penco, “Quantum equivalence principle violations in scalar-tensor theories,” Physical Review D: Particles, Fields, Gravitation and Cosmology, vol. 85, no. 4, 2012.
  17. A. A. Starobinsky, “A new type of isotropic cosmological models without singularity,” Physics Letters B, vol. 91, no. 1, pp. 99–102, 1980.
  18. G. Hinshaw et al. (WMAP Collaboration), arXiv:1212.5226.
  19. P. A. R. Ade et al. (Planck Collaboration), arXiv:1303.5082.
  20. T. P. Sotiriou and V. Faraoni, “f(R) theories of gravity,” Reviews of Modern Physics, vol. 82, no. 1, article 451, 2010.