Is Energy Conserved in General Relativity?

original by Michael Weiss and John Baez

In special cases, yes. In general, it depends on what you mean by "energy", and what you mean by "conserved".

In flat spacetime (the backdrop for special relativity) you can phrase energy conservation in two ways: as a differential equation, or as an equation involving integrals (gory details below). The two formulations are mathematically equivalent. But when you try to generalize this to curved spacetimes (the arena for general relativity) this equivalence breaks down. The differential form extends with nary a hiccup; not so the integral form. The differential form says, loosely speaking, that no energy is created in any infinitesimal piece of spacetime. The integral form says the same for a finite-sized piece. (This may remind you of the "divergence" and "flux" forms of Gauss's law in electrostatics, or the equation of continuity in fluid dynamics. Hold on to that thought!)

An infinitesimal piece of spacetime "looks flat", while the effects of curvature become evident in a finite piece. (The same holds for curved surfaces in space, of course.) GR relates curvature to gravity. Now, even in Newtonian physics, you must include gravitational potential energy to get energy conservation. And GR introduces the new phenomenon of gravitational waves; perhaps these carry energy as well? Perhaps we need to include gravitational energy in some fashion, to arrive at a law of energy conservation for finite pieces of spacetime?

Casting about for a mathematical expression of these ideas, physicists came up with something called an energy pseudo-tensor. (In fact, several of 'em!) Now, GR takes pride in treating all coordinate systems equally. Mathematicians invented tensors precisely to meet this sort of demand--- if a tensor equation holds in one coordinate system, it holds in all. Pseudo-tensors are not tensors (surprise!), and this alone raises eyebrows in some circles. In GR, one must always guard against mistaking artifacts of a particular coordinate system for real physical effects. (See the FAQ entry on black holes for some examples.)

These pseudo-tensors have some rather strange properties. If you choose the "wrong" coordinates, they are non-zero even in flat empty spacetime. By another choice of coordinates, they can be made zero at any chosen point, even in a spacetime full of gravitational radiation. For these reasons, most physicists who work in general relativity do not believe the pseudo-tensors give a good local definition of energy density, although their integrals are sometimes useful as a measure of total energy.
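A small worked example may make the "wrong coordinates" remark less mysterious. The sketch below uses the flat plane in polar coordinates rather than a full spacetime, and it assumes (as is true of the usual Einstein and Landau-Lifshitz constructions) that the pseudo-tensor is built from products of connection coefficients Gamma, that is, from first derivatives of the metric.

```latex
% Flat plane in polar coordinates: curvature vanishes, the connection does not.
\begin{align}
  ds^2 &= dr^2 + r^2\, d\varphi^2, \\
  \Gamma^{r}{}_{\varphi\varphi} &= -r, \qquad
  \Gamma^{\varphi}{}_{r\varphi} = \Gamma^{\varphi}{}_{\varphi r} = \frac{1}{r}, \\
  R^{\rho}{}_{\sigma\mu\nu} &= 0 .
\end{align}
```

Anything quadratic in the Gammas can therefore be non-zero here even though the space is flat. Conversely, in any spacetime one can pick locally inertial coordinates in which all the Gammas vanish at one chosen point, and such a quantity vanishes there too, whatever the curvature.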

One other complaint about the pseudo-tensors deserves mention. Einstein argued that all energy has mass, and all mass acts gravitationally. Does "gravitational energy" itself act as a source of gravity? Now, the Einstein field equations are

            G_{mu nu} = 8 pi T_{mu nu}

Here G_{mu nu} is the Einstein curvature tensor, which encodes information about the curvature of spacetime, and T_{mu nu} is the so-called stress-energy tensor, which we will meet again below. (The equation is written in units where G = c = 1.) T_{mu nu} represents the energy due to matter and electromagnetic fields, but includes NO contribution from "gravitational energy". So one can argue that "gravitational energy" does NOT act as a source of gravity. On the other hand, the Einstein field equations are non-linear; this implies that gravitational waves interact with each other (unlike light waves in Maxwell's (linear) theory). So one can argue that "gravitational energy" IS a source of gravity.

In certain special cases, energy conservation works out with fewer caveats. The two main examples are static spacetimes and asymptotically flat spacetimes.

Let's look at four examples before plunging deeper into the math. Three of them involve redshift; the fourth, gravitational radiation.

Very fast objects emitting light

According to special relativity, you will see light coming from a receding object as redshifted. So if you, and someone moving with the source, both measure the light's energy, you'll get different answers. Note that this has nothing to do with energy conservation per se. Even in Newtonian physics, kinetic energy (mv^2/2) depends on the choice of reference frame. However, relativity serves up a new twist. In Newtonian physics, energy conservation and momentum conservation are two separate laws. Special relativity welds them into one law, the conservation of the energy-momentum 4-vector. To learn the whole scoop on 4-vectors, read a text on SR, for example Taylor and Wheeler (see refs.). For our purposes, it's enough to remark that 4-vectors are vectors in spacetime, which most people privately picture just like ordinary vectors (unless they have very active imaginations).
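To make the frame dependence concrete, here is a minimal Python sketch of a single photon's energy-momentum 4-vector seen from two frames. The setup is an assumption made for illustration: units with c = 1, a source receding along +x at beta = 0.6, and a photon sent straight back toward the observer. The function names are made up for this example.

```python
import math

def boost_x(p, beta):
    """Components of the 4-vector p = (E, px, py, pz) in a frame moving at
    velocity beta along +x relative to the original frame (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    E, px, py, pz = p
    return (gamma * (E - beta * px), gamma * (px - beta * E), py, pz)

def minkowski_norm_sq(p):
    """Frame-independent combination E^2 - |p|^2 (zero for a photon)."""
    E, px, py, pz = p
    return E * E - (px * px + py * py + pz * pz)

# Hypothetical setup: in the observer's frame the source recedes along +x.
beta = 0.6
# Photon 4-momentum in the source frame, travelling in the -x direction.
p_source = (1.0, -1.0, 0.0, 0.0)
# The observer moves at -beta relative to the source.
p_observer = boost_x(p_source, -beta)

energy_ratio = p_observer[0] / p_source[0]           # observed / emitted energy
doppler = math.sqrt((1.0 - beta) / (1.0 + beta))     # textbook Doppler factor

print(energy_ratio, doppler, minkowski_norm_sq(p_observer))
```

The two observers disagree about the energy (here by a factor of two), yet they are looking at the same 4-vector: its components change from frame to frame while E^2 - |p|^2 does not.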

Very massive objects emitting light

Light from the Sun appears redshifted to an Earthbound astronomer. In quasi-Newtonian terms, we might say that light loses kinetic energy as it climbs out of the gravitational well of the Sun, but gains potential energy. General relativity looks at it differently. In GR, gravity is described not by a "potential" but by the "metric" of spacetime. But "no problem", as the saying goes. The Schwarzschild metric describes spacetime around a massive object, if the object is spherically symmetrical, uncharged, and "alone in the universe". The Schwarzschild metric is both static and asymptotically flat, and energy conservation holds without major pitfalls. For further details, consult MTW, chapter 25.
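As a rough numerical illustration, the sketch below evaluates the standard Schwarzschild redshift between a static emitter at radius r and a static observer far away. The solar constants are approximate values supplied here, not taken from the text, and the example ignores the Earth's own potential and orbital motion.

```python
import math

GM_SUN = 1.32712e20   # m^3/s^2, approximate solar gravitational parameter
R_SUN = 6.957e8       # m, approximate solar radius
C = 2.99792458e8      # m/s, speed of light

def static_redshift_to_infinity(gm, r, c=C):
    """Redshift z of light emitted by a static source at Schwarzschild radial
    coordinate r and received by a static observer very far away."""
    return 1.0 / math.sqrt(1.0 - 2.0 * gm / (r * c * c)) - 1.0

z_sun = static_redshift_to_infinity(GM_SUN, R_SUN)
weak_field = GM_SUN / (R_SUN * C * C)   # quasi-Newtonian estimate GM/(R c^2)

print(z_sun, weak_field)
```

Both numbers come out at about two parts in a million, so for the Sun the quasi-Newtonian bookkeeping and the metric description agree to high accuracy.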

Gravitational waves

A binary pulsar emits gravitational waves, according to GR, and one expects (innocent word!) that these waves will carry away energy. So its orbital period should change. Einstein derived a formula for the rate of change (known as the quadrupole formula), and in the centenary of Einstein's birth, Russell Hulse and Joseph Taylor reported that the binary pulsar PSR1913+16 bore out Einstein's predictions within a few percent. Hulse and Taylor were awarded the Nobel prize in 1993.

Despite this success, Einstein's formula remained controversial for many years, partly because of the subtleties surrounding energy conservation in GR. The need to understand this situation better has kept GR theoreticians busy over the last few years. Einstein's formula now seems well-established, both theoretically and observationally.
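For the curious, here is a hedged sketch of what the quadrupole formula predicts for an eccentric binary. It uses the orbit-averaged expression usually credited to Peters and Mathews, which is not spelled out in this article. The pulsar parameters are rounded published values supplied as assumptions, and the function name is made up for this example.

```python
import math

T_SUN = 4.925490947e-6   # G * M_sun / c^3 in seconds

def orbital_period_derivative(m1, m2, period_s, ecc):
    """Orbit-averaged dP/dt (dimensionless, s/s) from quadrupole radiation.
    Masses are in solar masses, the orbital period in seconds."""
    total = m1 + m2
    # Eccentricity enhancement factor of the orbit-averaged energy loss.
    enhancement = (1.0 + (73.0 / 24.0) * ecc**2 + (37.0 / 96.0) * ecc**4) \
        / (1.0 - ecc**2) ** 3.5
    return (-(192.0 * math.pi / 5.0)
            * (2.0 * math.pi * T_SUN / period_s) ** (5.0 / 3.0)
            * enhancement * m1 * m2 / total ** (1.0 / 3.0))

# Rounded values for PSR 1913+16 (assumed here, not taken from the text).
pdot = orbital_period_derivative(m1=1.44, m2=1.39, period_s=27907.0, ecc=0.617)
print(pdot)   # roughly -2.4e-12 seconds per second
```

A period shrinking by a couple of picoseconds per second sounds tiny, but it adds up to a measurable shift in the timing of periastron over a few years of observation.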

Expansion of the universe leading to cosmological redshift

The Cosmic Background Radiation (CBR) has redshifted over billions of years. Each photon gets redder and redder. What happens to this energy? Cosmologists model the expanding universe with Friedmann-Robertson-Walker (FRW) spacetimes. (The familiar "expanding balloon speckled with galaxies" belongs to this class of models.) The FRW spacetimes are neither static nor asymptotically flat. Those who harbor no qualms about pseudo-tensors will say that radiant energy becomes gravitational energy. Others will say that the energy is simply lost.
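The bookkeeping behind "what happens to this energy?" can be sketched in a few lines. The example assumes the standard FRW scaling rho proportional to a^(-3(1+w)) for a fluid with constant w = p/rho, which follows from the differential conservation law. It takes w = 1/3 for radiation and w = 0 for pressureless matter, and uses a round redshift of 1100 for the emission of the background radiation. None of these numbers come from the text.

```python
def energy_density(a, rho_today, w):
    """Energy density at scale factor a for constant w = p / rho,
    normalised so that energy_density(1, rho_today, w) == rho_today."""
    return rho_today * a ** (-3.0 * (1.0 + w))

def comoving_energy(a, rho_today, w):
    """Energy inside a region that expands with the universe:
    unit volume today, volume a^3 at scale factor a."""
    return energy_density(a, rho_today, w) * a ** 3

z_emit = 1100.0                  # assumed round number for the CBR
a_emit = 1.0 / (1.0 + z_emit)

# Ratio (energy today) / (energy at emission) in the same comoving region.
radiation_ratio = comoving_energy(1.0, 1.0, 1.0 / 3.0) / comoving_energy(a_emit, 1.0, 1.0 / 3.0)
dust_ratio = comoving_energy(1.0, 1.0, 0.0) / comoving_energy(a_emit, 1.0, 0.0)

print(radiation_ratio, dust_ratio)
```

The radiation in a comoving region has dropped to roughly a thousandth of its original energy while the matter energy is unchanged. The differential law holds throughout, and whether you call the missing radiant energy "gravitational" or "lost" is exactly the choice described above.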

It's time to look at mathematical fine points. There are many to choose from! The definition of asymptotically flat, for example, calls for some care (see Stewart); one worries about "boundary conditions at infinity". (In fact, both spatial infinity and "null infinity" clamor for attention--- leading to different kinds of total energy.) The static case has close connections with Noether's theorem (see Goldstein or Arnold). If the catch-phrase "time translation symmetry implies conservation of energy" rings a bell (perhaps from quantum mechanics), then you're on the right track. (Check out "Killing vector" in the index of MTW, Wald, or Sachs and Wu.)
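For readers who do look up "Killing vector", the key computation is short enough to sketch here. The notation is an assumption of this note: xi is a Killing vector field (for a static spacetime, the generator of time translations), indices are raised with the metric, and T is symmetric with vanishing covariant divergence.

```latex
% Contracting T with a Killing vector gives an ordinary conserved current.
\begin{align}
  J^{\mu} &\equiv T^{\mu\nu}\,\xi_{\nu}, \\
  \nabla_{\mu} J^{\mu}
    &= \left(\nabla_{\mu} T^{\mu\nu}\right)\xi_{\nu}
     + \tfrac{1}{2}\, T^{\mu\nu}\left(\nabla_{\mu}\xi_{\nu} + \nabla_{\nu}\xi_{\mu}\right)
     = 0 .
\end{align}
```

The first term vanishes by the differential conservation law, the second by Killing's equation. Because J is a single vector field rather than a vector-valued flux, the ordinary Gauss theorem applies to it, and an integral conservation law for "energy" follows.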

But two issues call for more discussion. Why does the equivalence between the two forms of energy conservation break down? How do the pseudo-tensors slide around this difficulty?

We've seen already that we should be talking about the energy-momentum 4-vector, not just its time-like component (the energy). Let's consider first the case of flat Minkowski spacetime. Recall that the notion of "inertial frame" corresponds to a special kind of coordinate system (Minkowskian coordinates).

Pick an inertial reference frame. Pick a volume V in this frame, and pick two times t=t_0 and t=t_1. One formulation of energy-momentum conservation says that the energy-momentum inside V changes only because of energy-momentum flowing across the boundary surface (call it S). It is "conceptually difficult, mathematically easy" to define a quantity T so that the captions on Equation 1 (below) are correct. (The quoted phrase comes from Sachs and Wu.)

  Equation 1:  (valid in flat Minkowski spacetime, when Minkowskian
                coordinates are used) 

                                               t=t_1
       /                  /                    /
       |                  |                    |
       | T dV     -       | T dV       =       | T dt dS
       /                  /                    /
      V,t=t_0           V,t=t_1               t=t_0

   p contained       p contained            p flowing out through
   in volume V    -  in volume V       =    boundary S of V
   at time t_0       at time t_1            during t=t_0 to t=t_1

   (Note: p = energy-momentum 4-vector)

T is called the stress-energy tensor. You don't need to know what that means, just that you can integrate T, as shown, to get 4-vectors. Equation 1 may remind you of Gauss's theorem, which deals with flux across a boundary. If you look at Equation 1 in the right 4-dimensional frame of mind, you'll discover it really says that the flux across the boundary of a certain 4-dimensional hypervolume is zero. (The hypervolume is swept out by V during the interval t=t_0 to t=t_1.) MTW, chapter 7, explains this with pictures galore. (See also Wheeler.)

A 4-dimensional analogue to Gauss's theorem shows that Equation 1 is equivalent to:

  Equation 2:  (valid in flat Minkowski spacetime, with Minkowskian
                coordinates)

       coord_div(T) = sum_mu (partial T/partial x_mu) = 0

We write "coord_div" for the divergence, for we will meet another divergence in a moment. Proof? Quite similar to Gauss's theorem: if the divergence is zero throughout the hypervolume, then the flux across the boundary must also be zero. On the other hand, the flux out of an infinitesimally small hypervolume turns out to be the divergence times the measure of the hypervolume.
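The equivalence can be checked numerically in a toy setting. The sketch below is an illustration under stated assumptions, not part of the original argument: one space dimension instead of three, a single conserved density rho with flux j instead of the full tensor T, and a made-up Gaussian pulse moving at speed 0.5. It satisfies the differential law d(rho)/dt + d(j)/dx = 0 by construction, and the code checks the integral law.

```python
import math

V_FLOW = 0.5   # hypothetical pulse speed (units with c = 1)

def rho(t, x):
    """Density of a rigidly moving Gaussian pulse."""
    return math.exp(-(x - V_FLOW * t) ** 2)

def flux(t, x):
    """Flux j = v * rho, so that d(rho)/dt + d(j)/dx = 0 everywhere."""
    return V_FLOW * rho(t, x)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

x_left, x_right = -1.0, 2.0    # the "volume" V
t0, t1 = 0.0, 1.0

# Left side: content of V at t0 minus content of V at t1.
lhs = (simpson(lambda x: rho(t0, x), x_left, x_right)
       - simpson(lambda x: rho(t1, x), x_left, x_right))
# Right side: net outflow through the two ends of V between t0 and t1.
rhs = simpson(lambda t: flux(t, x_right) - flux(t, x_left), t0, t1)

print(lhs, rhs)
```

The two sides agree to within the accuracy of the quadrature. Both are negative here because more of the pulse drifts into V than out of it during the interval.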

Pass now to the general case of any spacetime satisfying Einstein's field equation. It is easy to generalize the differential form of energy-momentum conservation, Equation 2:

  Equation 3:  (valid in any GR spacetime)

        covariant_div(T) = sum_mu nabla_mu(T) = 0    

                    (where nabla_mu = covariant derivative)

(Side comment: Equation 3 is also the correct generalization of Equation 2 for SR when non-Minkowskian coordinates are used.)

GR relies heavily on the covariant derivative, because the covariant derivative of a tensor is a tensor, and as we've seen, GR loves tensors. Equation 3 follows from Einstein's field equation (because something called Bianchi's identity says that covariant_div(G)=0). But Equation 3 is no longer equivalent to Equation 1!

Why not? Well, the familiar form of Gauss's theorem (from electrostatics) holds for any spacetime, because essentially you are summing fluxes over a partition of the volume into infinitesimally small pieces. The sum over the faces of one infinitesimal piece is a divergence. But the total contribution from an interior face is zero, since what flows out of one piece flows into its neighbor. So the integral of the divergence over the volume equals the flux through the boundary. "QED".

But for the equivalence of Equations 1 and 3, we would need an extension of Gauss's theorem. Now the flux through a face is not a scalar, but a vector (the flux of energy-momentum through the face). The argument just sketched involves adding these vectors, which are defined at different points in spacetime. Such "remote vector comparison" runs into trouble precisely for curved spacetimes.
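Written out in components, the obstruction is easy to spot. The following is a sketch in standard index notation, which this article otherwise avoids: Gamma denotes the connection coefficients, g the determinant of the metric, and indices on T are raised with the metric.

```latex
% For a vector current the covariant divergence is a pure coordinate divergence:
\begin{equation}
  \nabla_{\mu} J^{\mu}
    = \frac{1}{\sqrt{-g}}\,\partial_{\mu}\!\left(\sqrt{-g}\, J^{\mu}\right).
\end{equation}
% For the stress-energy tensor an extra term survives:
\begin{align}
  \nabla_{\mu} T^{\mu\nu}
    &= \partial_{\mu} T^{\mu\nu}
     + \Gamma^{\mu}{}_{\mu\lambda}\, T^{\lambda\nu}
     + \Gamma^{\nu}{}_{\mu\lambda}\, T^{\mu\lambda} \\
    &= \frac{1}{\sqrt{-g}}\,\partial_{\mu}\!\left(\sqrt{-g}\, T^{\mu\nu}\right)
     + \Gamma^{\nu}{}_{\mu\lambda}\, T^{\mu\lambda} .
\end{align}
```

The first piece integrates to a boundary flux just as in Gauss's theorem. The last term is not a divergence of anything, and it is the component-level trace of the "remote vector comparison" problem.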

The mathematician Levi-Civita invented the standard solution to this problem, and dubbed it "parallel transport". It's easy to picture parallel transport: just move the vector along a path, keeping its direction "as constant as possible". (Naturally, some non-trivial mathematics lurks behind the phrase in quotation marks. But even pop-science expositions of GR do a good job explaining parallel transport.) The parallel transport of a vector depends on the transportation path; for the canonical example, imagine parallel transporting a vector on a sphere. But parallel transportation over an "infinitesimal distance" suffers no such ambiguity. (It's not hard to see the connection with curvature.)
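The canonical sphere example is easy to try numerically. The sketch below approximates parallel transport on the unit sphere by taking many small steps along a great-circle arc and, at each step, projecting the vector back onto the tangent plane and restoring its length. That scheme converges to Levi-Civita transport for a surface embedded in ordinary space. The points, step count and function names are choices made for this illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def great_circle(p, q, n=1000):
    """n + 1 points along the great-circle arc from unit vector p to unit vector q."""
    omega = math.acos(max(-1.0, min(1.0, dot(p, q))))
    points = []
    for k in range(n + 1):
        t = k / n
        w0 = math.sin((1.0 - t) * omega) / math.sin(omega)
        w1 = math.sin(t * omega) / math.sin(omega)
        points.append(tuple(w0 * a + w1 * b for a, b in zip(p, q)))
    return points

def transport(v, path):
    """Carry tangent vector v along path by repeated tangent-plane projection."""
    length = math.sqrt(dot(v, v))
    for x in path[1:]:
        normal_part = dot(v, x)                      # component sticking out of the sphere
        v = tuple(a - normal_part * b for a, b in zip(v, x))
        norm = math.sqrt(dot(v, v))
        v = tuple(length * a / norm for a in v)      # keep the length fixed
    return v

A = (1.0, 0.0, 0.0)        # a point on the equator
B = (0.0, 1.0, 0.0)        # a quarter of the way around the equator
N = (0.0, 0.0, 1.0)        # the north pole
v_start = (0.0, 0.0, 1.0)  # tangent vector at A, pointing north

v_direct = transport(v_start, great_circle(A, N))
v_detour = transport(transport(v_start, great_circle(A, B)), great_circle(B, N))
angle = math.acos(max(-1.0, min(1.0, dot(v_direct, v_detour))))

print(v_direct, v_detour, math.degrees(angle))
```

The same vector, carried from the same starting point to the same end point, arrives pointing in two directions that differ by a right angle. That angle equals the area of the spherical triangle enclosed by the two routes (one eighth of the unit sphere), which is the connection with curvature hinted at above.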

To compute a divergence, we need to compare quantities (here vectors) on opposite faces. Using parallel transport for this leads to the covariant divergence. This is well-defined, because we're dealing with an infinitesimal hypervolume. But to add up fluxes all over a finite-sized hypervolume (as in the contemplated extension of Gauss's theorem) runs smack into the dependence on transportation path. So the flux integral is not well-defined, and we have no analogue for Gauss's theorem.

One way to get round this is to pick one coordinate system, and transport vectors so their components stay constant. Partial derivatives replace covariant derivatives, and Gauss's theorem is restored. The energy pseudo-tensors take this approach (at least some of them do). If you can mangle Equation 3 (covariant_div(T) = 0) into the form:

       coord_div(Theta) = 0

then you can get an "energy conservation law" in integral form. Einstein was the first to do this; Dirac, Landau and Lifshitz, and Weinberg all came up with variations on this theme. We've said enough already on the pros and cons of this approach.
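For concreteness, here is roughly what the mangled form looks like in the Landau and Lifshitz version. Treat this as a sketch: t_LL stands for their pseudo-tensor (a lengthy expression quadratic in first derivatives of the metric, not reproduced here), g is the determinant of the metric, and the integral assumes fields that fall off fast enough at large distances.

```latex
\begin{align}
  \Theta^{\mu\nu} &\equiv (-g)\left(T^{\mu\nu} + t_{\mathrm{LL}}^{\mu\nu}\right), \\
  \partial_{\mu}\Theta^{\mu\nu} &= 0
  \quad\Longrightarrow\quad
  P^{\nu} = \int \Theta^{0\nu}\, d^{3}x \ \ \text{is constant in time.}
\end{align}
```

Only partial derivatives appear, so the ordinary Gauss theorem applies component by component. The price is that Theta, and hence the split of P into "matter" and "gravitational" parts, depends on the coordinates used.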

We will not delve into definitions of energy in general relativity such as the Hamiltonian (amusingly, the energy of a closed universe always works out to zero according to this definition), various kinds of energy one hopes to obtain by "deparametrizing" Einstein's equations, or "quasilocal energy". There's quite a bit to say about this sort of thing! Indeed, the issue of energy in general relativity has a lot to do with the notorious "problem of time" in quantum gravity.... but that's another can of worms.

References (vaguely in order of difficulty):

Clifford Will, "The renaissance of general relativity", in "The New Physics" (ed. Paul Davies) gives a semi-technical discussion of the controversy over gravitational radiation.
Wheeler, "A Journey into Gravity and Spacetime". Wheeler's try at a "pop-science" treatment of GR. Chapters 6 and 7 are a tour-de-force: Wheeler tries for a non-technical explanation of Cartan's formulation of Einstein's field equation. (It might be easier just to read MTW!)
Taylor and Wheeler, "Spacetime Physics".
Goldstein, "Classical Mechanics".
Arnold, "Mathematical Methods in Classical Mechanics".
Misner, Thorne, and Wheeler (MTW), "Gravitation", chapters 7, 20, and 25.
Wald, "General Relativity", Appendix E. This has the Hamiltonian formalism and a bit about deparametrizing, and chapter 11 discusses energy in asymptotically flat spacetimes.
H. A. Buchdahl, "Seventeen Simple Lectures on General Relativity Theory". Lecture 15 derives the energy-loss formula for the binary star, and criticizes the derivation.
Sachs and Wu, "General Relativity for Mathematicians", chapter 3.
John Stewart, "Advanced General Relativity". Chapter 3 ("Asymptopia") shows just how careful one has to be in asymptotically flat spacetimes to recover energy conservation. Stewart also discusses the Bondi-Sachs mass, another contender for "energy".
Damour, in "300 Years of Gravitation" (ed. Hawking and Israel). Damour heads the "Paris group", which has been active in the theory of gravitational radiation.
Penrose and Rindler, "Spinors and Spacetime", vol II, chapter 9. The Bondi-Sachs mass generalized.
J. David Brown and James York Jr., "Quasilocal energy in general relativity", in "Mathematical Aspects of Classical Field Theory".