In flat spacetime (the backdrop for special relativity) you can phrase energy conservation in two ways: as a differential equation, or as an equation involving integrals (gory details below). The two formulations are mathematically equivalent. But when you try to generalize this to curved spacetimes (the arena for general relativity) this equivalence breaks down. The differential form extends with nary a hiccup; not so the integral form. The differential form says, loosely speaking, that no energy is created in any infinitesimal piece of spacetime. The integral form says the same for a finite-sized piece. (This may remind you of the "divergence" and "flux" forms of Gauss's law in electrostatics, or the equation of continuity in fluid dynamics. Hold on to that thought!)

An infinitesimal piece of spacetime "looks flat", while the effects of curvature become evident in a finite piece. (The same holds for curved surfaces in space, of course.) GR relates curvature to gravity. Now, even in Newtonian physics, you must include gravitational potential energy to get energy conservation. And GR introduces the new phenomenon of gravitational waves; perhaps these carry energy as well? Perhaps we need to include gravitational energy in some fashion, to arrive at a law of energy conservation for finite pieces of spacetime?

Casting about for a mathematical expression of these ideas, physicists came up with something called an energy pseudo-tensor. (In fact, several of 'em!) Now, GR takes pride in treating all coordinate systems equally. Mathematicians invented tensors precisely to meet this sort of demand--- if a tensor equation holds in one coordinate system, it holds in all. Pseudo-tensors are not tensors (surprise!), and this alone raises eyebrows in some circles. In GR, one must always guard against mistaking artifacts of a particular coordinate system for real physical effects. (See the FAQ entry on black holes for some examples.)

These pseudo-tensors have some rather strange properties. If you choose
the "wrong" coordinates, they are non-zero even in flat empty spacetime.
By another choice of coordinates, they can be made zero at any chosen point,
even in a spacetime full of gravitational radiation. For these reasons,
most physicists who work in general relativity do not believe the pseudo-tensors
give a good *local* definition of energy density, although their integrals
are sometimes useful as a measure of total energy.
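A toy illustration of the first point, and emphatically not a computation of any actual pseudo-tensor: the Einstein and Landau-Lifshitz pseudo-tensors are assembled from first derivatives of the metric (Christoffel symbols), and those are already sensitive to the choice of coordinates. The Python sketch below takes the flat plane, computes the Christoffel symbols by finite differences, and finds them zero in Cartesian coordinates but non-zero in polar coordinates. The function names (`cartesian_metric`, `polar_metric`, `christoffel`) and the sample point are made up for this illustration.

```python
def cartesian_metric(x):
    """Flat plane in Cartesian coordinates (x, y): g = diag(1, 1)."""
    return [[1.0, 0.0], [0.0, 1.0]]

def polar_metric(x):
    """The same flat plane in polar coordinates (r, theta): g = diag(1, r^2)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def christoffel(metric, x, h=1e-5):
    """Gamma[a][b][c] = 1/2 g^{ad} (d_b g_{dc} + d_c g_{db} - d_d g_{bc}),
    with the metric derivatives taken by central finite differences."""
    n = len(x)
    dg = []  # dg[k][i][j] approximates the derivative of g_{ij} along x^k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    ginv = inverse_2x2(metric(x))
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                gamma[a][b][c] = 0.5 * sum(
                    ginv[a][d] * (dg[b][d][c] + dg[c][d][b] - dg[d][b][c])
                    for d in range(n))
    return gamma

# Same flat space, two coordinate systems (sample point chosen arbitrarily).
gamma_cartesian = christoffel(cartesian_metric, [2.0, 0.3])
gamma_polar = christoffel(polar_metric, [2.0, 0.3])  # r = 2, theta = 0.3

# Analytically, Gamma^r_{theta theta} = -r and Gamma^theta_{r theta} = 1/r.
print("Cartesian:", gamma_cartesian)
print("polar Gamma^r_{theta theta}:", gamma_polar[0][1][1])
print("polar Gamma^theta_{r theta}:", gamma_polar[1][0][1])
```

Nothing physical distinguishes the two runs; only the coordinate labels differ. Any quantity built from these symbols inherits that ambiguity.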

One other complaint about the pseudo-tensors deserves mention. Einstein argued that all energy has mass, and all mass acts gravitationally. Does "gravitational energy" itself act as a source of gravity? Now, the Einstein field equations are

$$G_{\mu\nu} = 8\pi T_{\mu\nu}$$

Here $G_{\mu\nu}$ is the Einstein curvature tensor, which encodes information about the curvature of spacetime, and $T_{\mu\nu}$ is the so-called stress-energy tensor, which we will meet again below. $T_{\mu\nu}$ represents the energy due to matter and electromagnetic fields, but includes NO contribution from "gravitational energy". So one can argue that "gravitational energy" does NOT act as a source of gravity. On the other hand, the Einstein field equations are non-linear; this implies that gravitational waves interact with each other (unlike light waves in Maxwell's (linear) theory). So one can argue that "gravitational energy" IS a source of gravity.

In certain special cases, energy conservation works out with fewer caveats. The two main examples are static spacetimes and asymptotically flat spacetimes.

Let's look at four examples before plunging deeper into the math. Three examples involve redshift, the other, gravitational radiation.

Despite this success, Einstein's formula remained controversial for many years, partly because of the subtleties surrounding energy conservation in GR. The need to understand this situation better has kept GR theoreticians busy over the last few years. Einstein's formula now seems well-established, both theoretically and observationally.

It's time to look at mathematical fine points. There are many to choose from! The definition of asymptotically flat, for example, calls for some care (see Stewart); one worries about "boundary conditions at infinity". (In fact, both spatial infinity and "null infinity" clamor for attention--- leading to different kinds of total energy.) The static case has close connections with Noether's theorem (see Goldstein or Arnold). If the catch-phrase "time translation symmetry implies conservation of energy" rings a bell (perhaps from quantum mechanics), then you're on the right track. (Check out "Killing vector" in the index of MTW, Wald, or Sachs and Wu.)
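For readers who want to see why a symmetry rescues the integral law, here is a sketch of the standard argument. It assumes the covariant conservation law for the stress-energy tensor (Equation 3 later in this article), a symmetric $T^{\mu\nu}$, and a Killing vector field $\xi$; the symbols $J^\mu$, $\xi_\nu$, and $g$ (the metric determinant) are notation introduced here, with repeated indices summed.

```latex
% Killing's equation for the symmetry generator \xi:
\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu = 0

% Contract T with \xi to get an ordinary vector current:
J^\mu = T^{\mu\nu}\,\xi_\nu

% Its covariant divergence vanishes:
\nabla_\mu J^\mu
  = (\nabla_\mu T^{\mu\nu})\,\xi_\nu + T^{\mu\nu}\,\nabla_\mu \xi_\nu
  = 0 + \tfrac{1}{2}\,T^{\mu\nu}\left(\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu\right)
  = 0

% For a vector, the covariant divergence is a coordinate divergence in disguise:
\nabla_\mu J^\mu = \frac{1}{\sqrt{-g}}\;\partial_\mu\!\left(\sqrt{-g}\;J^\mu\right)
```

Because the divergence of $J^\mu$ is a scalar, the flux of $J^\mu$ through a boundary is a sum of numbers rather than of vectors sitting at different points, so the ordinary Gauss's theorem applies. When $\xi$ generates time translations, the conserved quantity is the energy.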

But two issues call for more discussion. Why does the equivalence between the two forms of energy conservation break down? How do the pseudo-tensors slide around this difficulty?

We've seen already that we should be talking about the energy-momentum 4-vector, not just its time-like component (the energy). Let's consider first the case of flat Minkowski spacetime. Recall that the notion of "inertial frame" corresponds to a special kind of coordinate system (Minkowskian coordinates).

Pick an inertial reference frame. Pick a volume V in this frame, and pick two times t=t_0 and t=t_1. One formulation of energy-momentum conservation says that the energy-momentum inside V changes only because of energy-momentum flowing across the boundary surface (call it S). It is "conceptually difficult, mathematically easy" to define a quantity T so that the captions on Equation 1 (below) are correct. (The quoted phrase comes from Sachs and Wu.)

Equation 1: (valid in flat Minkowski spacetime, when Minkowskian coordinates are used)

$$\underbrace{\int_{V,\,t=t_0} T\,dV}_{\text{p contained in volume V at time } t_0} \;-\; \underbrace{\int_{V,\,t=t_1} T\,dV}_{\text{p contained in volume V at time } t_1} \;=\; \underbrace{\int_{t=t_0}^{t=t_1} T\,dt\,dS}_{\text{p flowing out through boundary S of V during } t=t_0 \text{ to } t=t_1}$$

(Note: p = energy-momentum 4-vector)

T is called the stress-energy tensor. You don't need to know what that means, just that you can integrate T, as shown, to get 4-vectors. Equation 1 may remind you of Gauss's theorem, which deals with flux across a boundary. If you look at Equation 1 in the right 4-dimensional frame of mind, you'll discover it really says that the flux across the boundary of a certain 4-dimensional hypervolume is zero. (The hypervolume is swept out by V during the interval t=t_0 to t=t_1.) MTW, chapter 7, explains this with pictures galore. (See also Wheeler.)

A 4-dimensional analogue to Gauss's theorem shows that Equation 1 is equivalent to:

Equation 2: (valid in flat Minkowski spacetime, with Minkowskian coordinates)

$$\operatorname{coord\_div}(T) = \sum_\mu \frac{\partial T}{\partial x_\mu} = 0$$

We write "coord_div" for the divergence, for we will meet another divergence in a moment. Proof? Quite similar to Gauss's theorem: if the divergence is zero throughout the hypervolume, then the flux across the boundary must also be zero. On the other hand, the flux out of an infinitesimally small hypervolume turns out to be the divergence times the measure of the hypervolume.
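The equivalence of the integral and differential forms can be checked numerically in a toy setting. The sketch below is a stripped-down stand-in, not the full tensor statement: one space dimension, one component (an energy density and its flux), and a made-up Gaussian pulse moving right at speed `C`. All names (`profile`, `density`, `flux`, the interval endpoints) are hypothetical choices for this illustration.

```python
import math

C = 1.0  # pulse speed, in units where c = 1

def profile(u):
    """Hypothetical smooth pulse shape; any differentiable function would do."""
    return math.exp(-u * u)

def density(t, x):
    """Energy density rho(t, x) of a pulse moving right at speed C."""
    return profile(x - C * t)

def flux(t, x):
    """Energy flux j(t, x). With j = C * rho, d(rho)/dt + d(j)/dx = 0."""
    return C * density(t, x)

def divergence(t, x, h=1e-4):
    """Differential form: central-difference estimate of d(rho)/dt + d(j)/dx."""
    d_rho_dt = (density(t + h, x) - density(t - h, x)) / (2 * h)
    d_j_dx = (flux(t, x + h) - flux(t, x - h)) / (2 * h)
    return d_rho_dt + d_j_dx

def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

X_LEFT, X_RIGHT = -1.0, 2.0  # the "volume" V; its boundary S is the two endpoints
T0, T1 = 0.0, 1.5

# Integral form: (energy in V at T0) - (energy in V at T1) = net outflow through S.
energy_t0 = midpoint_integral(lambda x: density(T0, x), X_LEFT, X_RIGHT)
energy_t1 = midpoint_integral(lambda x: density(T1, x), X_LEFT, X_RIGHT)
outflow = midpoint_integral(lambda t: flux(t, X_RIGHT) - flux(t, X_LEFT), T0, T1)

print("divergence at a sample point:", divergence(0.7, 0.3))
print("energy lost from V:", energy_t0 - energy_t1)
print("outflow through S: ", outflow)
```

The last two printed numbers should agree to within the accuracy of the quadrature. The bookkeeping works because the quantities being added are plain numbers in a fixed Minkowskian coordinate system.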

Pass now to the general case of any spacetime satisfying Einstein's field equation. It is easy to generalize the differential form of energy-momentum conservation, Equation 2:

Equation 3: (valid in any GR spacetime)

$$\operatorname{covariant\_div}(T) = \sum_\mu \nabla_\mu(T) = 0$$

(where $\nabla_\mu$ = covariant derivative)

(Side comment: Equation 3 is the correct generalization of Equation 1 for SR when non-Minkowskian coordinates are used.)
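For readers who like components, here is a sketch of what the covariant divergence looks like when written out. It assumes T carries two upper indices, $T^{\mu\nu}$, and uses the Christoffel symbols $\Gamma$ and the metric determinant $g$ (notation not used elsewhere in this article), with repeated indices summed.

```latex
\nabla_\mu T^{\mu\nu}
  = \partial_\mu T^{\mu\nu}
  + \Gamma^{\mu}{}_{\mu\lambda}\, T^{\lambda\nu}
  + \Gamma^{\nu}{}_{\mu\lambda}\, T^{\mu\lambda}
  = \frac{1}{\sqrt{-g}}\;\partial_\mu\!\left(\sqrt{-g}\;T^{\mu\nu}\right)
  + \Gamma^{\nu}{}_{\mu\lambda}\, T^{\mu\lambda}
  = 0
```

The first piece is an ordinary coordinate divergence, the kind Gauss's theorem knows how to handle. The leftover $\Gamma$ term is not. In Minkowskian coordinates on flat spacetime every $\Gamma$ vanishes and $\sqrt{-g} = 1$, so the expression collapses to the coordinate divergence of Equation 2.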

GR relies heavily on the covariant derivative, because the covariant derivative of a tensor is a tensor, and as we've seen, GR loves tensors. Equation 3 follows from Einstein's field equation (because something called Bianchi's identity says that covariant_div(G)=0). But Equation 3 is no longer equivalent to Equation 1!

Why not? Well, the familiar form of Gauss's theorem (from electrostatics) holds for any spacetime, because essentially you are summing fluxes over a partition of the volume into infinitesimally small pieces. The sum over the faces of one infinitesimal piece is a divergence. But the total contribution from an interior face is zero, since what flows out of one piece flows into its neighbor. So the integral of the divergence over the volume equals the flux through the boundary. "QED".

But for the equivalence of Equations 1 and 3, we would need an extension of Gauss's theorem. Now the flux through a face is not a scalar, but a vector (the flux of energy-momentum through the face). The argument just sketched involves adding these vectors, which are defined at different points in spacetime. Such "remote vector comparison" runs into trouble precisely for curved spacetimes.

The mathematician Levi-Civita invented the standard solution to this problem, and dubbed it "parallel transport". It's easy to picture parallel transport: just move the vector along a path, keeping its direction "as constant as possible". (Naturally, some non-trivial mathematics lurks behind the phrase in quotation marks. But even pop-science expositions of GR do a good job explaining parallel transport.) The parallel transport of a vector depends on the transportation path; for the canonical example, imagine parallel transporting a vector on a sphere. But parallel transportation over an "infinitesimal distance" suffers no such ambiguity. (It's not hard to see the connection with curvature.)
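The sphere example can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: a unit sphere with coordinates (theta, phi), theta being the colatitude, and a vector carried once around a circle of constant theta by integrating the parallel-transport equations with a hand-rolled RK4 stepper. Coming back rotated after a closed loop is the same phenomenon as path dependence: the two halves of the loop are two different paths between the same pair of points. The function names are made up for this illustration.

```python
import math

def transport_rhs(theta, v):
    """Parallel-transport equations along a circle of constant colatitude theta,
    parametrized by phi, for coordinate components v = (v_theta, v_phi).
    Uses Gamma^theta_{phi phi} = -sin(theta) cos(theta) and
    Gamma^phi_{theta phi} = cot(theta) on the unit sphere."""
    v_theta, v_phi = v
    return (math.sin(theta) * math.cos(theta) * v_phi,
            -(math.cos(theta) / math.sin(theta)) * v_theta)

def add_scaled(v, k, s):
    return (v[0] + s * k[0], v[1] + s * k[1])

def transport_around_latitude(theta, v0, steps=4000):
    """Carry v0 once around the circle (phi from 0 to 2*pi) using RK4."""
    h = 2.0 * math.pi / steps
    v = v0
    for _ in range(steps):
        k1 = transport_rhs(theta, v)
        k2 = transport_rhs(theta, add_scaled(v, k1, 0.5 * h))
        k3 = transport_rhs(theta, add_scaled(v, k2, 0.5 * h))
        k4 = transport_rhs(theta, add_scaled(v, k3, h))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return v

def orthonormal(theta, v):
    """Components along unit vectors pointing south and east."""
    return (v[0], math.sin(theta) * v[1])

start = (1.0, 0.0)  # unit vector pointing due south

# Around the equator (a geodesic) the vector comes home unchanged.
after_equator = orthonormal(math.pi / 2,
                            transport_around_latitude(math.pi / 2, start))

# Around the circle at colatitude 60 degrees it comes home pointing due north.
after_60 = orthonormal(math.pi / 3,
                       transport_around_latitude(math.pi / 3, start))

print("equator:      ", after_equator)
print("60 deg circle:", after_60)
```

The rotation after one circuit works out to 2 pi cos(theta), which is why the 60-degree circle (cos = 1/2) turns the vector through half a revolution.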

To compute a divergence, we need to compare quantities (here vectors) on opposite faces. Using parallel transport for this leads to the covariant divergence. This is well-defined, because we're dealing with an infinitesimal hypervolume. But to add up fluxes all over a finite-sized hypervolume (as in the contemplated extension of Gauss's theorem) runs smack into the dependence on transportation path. So the flux integral is not well-defined, and we have no analogue for Gauss's theorem.

One way to get round this is to pick one coordinate system, and transport
vectors so their *components* stay constant. Partial derivatives replace
covariant derivatives, and Gauss's theorem is restored. The energy pseudo-tensors
take this approach (at least some of them do). If you can mangle Equation
3 (covariant_div(T) = 0) into the form:

$$\operatorname{coord\_div}(\Theta) = 0$$

then you can get an "energy conservation law" in integral form. Einstein was the first to do this; Dirac, Landau and Lifshitz, and Weinberg all came up with variations on this theme. We've said enough already on the pros and cons of this approach.
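As one concrete instance of such a law, here is the shape of the Landau-Lifshitz version, given as a sketch rather than a derivation. The symbols are notation introduced here: $g$ is the metric determinant, $t_{\mathrm{LL}}^{\mu\nu}$ is the Landau-Lifshitz pseudo-tensor (built from first derivatives of the metric, and not written out), $x^0 = t$, Latin indices run over space, and repeated indices are summed.

```latex
% The conserved combination and its coordinate-divergence law:
\Theta^{\mu\nu} = (-g)\left(T^{\mu\nu} + t_{\mathrm{LL}}^{\mu\nu}\right),
\qquad
\partial_\mu \Theta^{\mu\nu} = 0

% The ordinary Gauss's theorem, applied in the chosen coordinates, then gives:
\frac{d}{dt}\int_V \Theta^{0\nu}\, d^3x \;=\; -\oint_S \Theta^{i\nu}\, dS_i
```

The second line has the same form as Equation 1, but both sides depend on the coordinate system used to define the partial derivatives and the integrals.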

We will not delve into definitions of energy in general relativity such as the Hamiltonian (amusingly, the energy of a closed universe always works out to zero according to this definition), various kinds of energy one hopes to obtain by "deparametrizing" Einstein's equations, or "quasilocal energy". There's quite a bit to say about this sort of thing! Indeed, the issue of energy in general relativity has a lot to do with the notorious "problem of time" in quantum gravity.... but that's another can of worms.

References (vaguely in order of difficulty):

- Clifford Will, "The renaissance of general relativity", in "The New Physics" (ed. Paul Davies) gives a semi-technical discussion of the controversy over gravitational radiation.
- Wheeler, "A Journey into Gravity and Spacetime". Wheeler's try at a "pop-science" treatment of GR. Chapters 6 and 7 are a tour-de-force: Wheeler tries for a non-technical explanation of Cartan's formulation of Einstein's field equation. (It might be easier just to read MTW!)
- Taylor and Wheeler, "Spacetime Physics".
- Goldstein, "Classical Mechanics".
- Arnold, "Mathematical Methods in Classical Mechanics".
- Misner, Thorne, and Wheeler (MTW), "Gravitation", chapters 7, 20, and 25.
- Wald, "General Relativity", Appendix E. This has the Hamiltonian formalism and a bit about deparametrizing, and chapter 11 discusses energy in asymptotically flat spacetimes.
- H. A. Buchdahl, "Seventeen Simple Lectures on General Relativity Theory". Lecture 15 derives the energy-loss formula for the binary star, and criticizes the derivation.
- Sachs and Wu, "General Relativity for Mathematicians", chapter 3.
- John Stewart, "Advanced General Relativity". Chapter 3 ("Asymptopia") shows just how careful one has to be in asymptotically flat spacetimes to recover energy conservation. Stewart also discusses the Bondi-Sachs mass, another contender for "energy".
- Damour, in "300 Years of Gravitation" (ed. Hawking and Israel). Damour heads the "Paris group", which has been active in the theory of gravitational radiation.
- Penrose and Rindler, "Spinors and Spacetime", vol II, chapter 9. The Bondi-Sachs mass generalized.
- J. David Brown and James York Jr., "Quasilocal energy in general relativity", in "Mathematical Aspects of Classical Field Theory".