Introduction to General Relativity

General Relativity : the Einstein field equation

General Relativity is the physical theory that describes space-time as a curved geometrical space. It extends Special Relativity (which remains valid only as an approximate description of small regions of space-time) and explains gravitation as the effect of the curvature of space-time.

Popular accounts explain intuitively how gravitational effects can result from space-time curvature, and why the gravitational force cannot be distinguished from inertial forces. What makes General Relativity precise is its main formula, the Einstein field equation, which relates the curvature of space-time to the stress-energy tensor.

To express it fully and properly, we need the formalism of tensors.

Still it is possible to give the essential idea of the meaning and justification of this equation without using the tensor formalism, by taking the case of cosmology (the expansion of the universe).

Another remark: energy in GR can no longer be described as a precise quantity simply obtained by integrating some explicit field over the space-like 3D surface we consider. This does not mean that there is no conservation of energy, but it takes a more subtle, complex form. In particular we can roughly define the mass inside a sphere of size r as r times the integral of the intrinsic Riemannian curvature of space over this sphere. As this is computed from the surface only and not from the inside, it cannot be modified by purely local processes: it is necessary to do something that can affect the surface. Thus if you take a large sphere far away from the system you consider, you can only increase your local energy by bringing it from far away.

This site currently offers only an introduction to tensors, giving some basic definitions but not yet the needed developments, which you can find in any course on tensors elsewhere.

The explanations below therefore assume that the reader has already learned the tensor operations not explained here (until this gap is filled), and of course knows what the Riemann curvature tensor is.

Here again, as with other subjects in the foundations of maths and physics, it is a pity that almost a century after its discovery, the usual courses on General Relativity (at least the presentation on Wikipedia) still express this equation in its draft, messy form, without mentioning the cleaner way that more directly expresses its meaning.

Let us recall that expression.
There is of course the essential idea, the simple statement, saying that we have a relation between the stress-energy tensor, and the Riemann curvature of space-time, that we can write in a compact form as

G_ij + Λ g_ij = (8πG / c⁴) T_ij

where

T_ij represents the stress-energy tensor, but with lowered indices unlike its more canonical use as T^ij in relativistic mechanics
G_ij is the Einstein tensor, that is a function of the Riemann curvature tensor.
The constants G and c are the Newtonian gravitational constant and the speed of light
g_ij is the metric, and Λ is the cosmological constant, that may be taken as 0 to simplify the formula by changing the definition of T_ij (adding to it a constant multiple of g_ij), so that, finally, G_ij and T_ij become simple multiples of each other.
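
As a worked version of this last remark, here is a sketch of how Λ can be absorbed, starting from the field equation G_ij + Λ g_ij = (8πG / c⁴) T_ij. The symbols κ and T'_ij are notation introduced here for illustration, not taken from the text above.

```latex
\kappa = \frac{8\pi G}{c^4}, \qquad
G_{ij} + \Lambda g_{ij} = \kappa T_{ij}
\;\Longleftrightarrow\;
G_{ij} = \kappa T'_{ij},
\qquad
T'_{ij} = T_{ij} - \frac{\Lambda}{\kappa}\, g_{ij}
```

With this redefinition the Einstein tensor and the modified stress-energy tensor are proportional, with the constant factor κ.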

The problem, what is messy in usual texts, is how the Einstein tensor (with 2 symmetric indices) is expressed as a function of the Riemann curvature tensor (with 4 indices).
Indeed they first introduce the Ricci curvature tensor, that is a symmetric tensor with 2 indices just like the one we need, extracted from the Riemann tensor, and that has all the necessary information in it:

R_ij = R^k_ikj

But this is still not the tensor we need. So then we need another formula to get the Einstein tensor from it

G_ij = R_ij − (1/2) R g_ij

where R is the scalar curvature R = g^ij R_ij.
But this begs the question: what the f**k is this formula? Why is it this formula, made of 2 terms, rather than anything else (for example, why are G_ij and R_ij not just equal)?
The answer, also well-known, is that this is what we need for the conservation of the Einstein tensor to be deduced from the Bianchi identity.
Okay, the usual proof of this is not complicated, but here will be another way of writing it, which I think is more elegant.

Let us introduce the symmetrizer (using the Kronecker delta δⁱ_j)

S^ijk_lmn = δⁱ_l δ^j_m δ^k_n + δⁱ_m δ^j_n δ^k_l + δⁱ_n δ^j_l δ^k_m

It behaves as a total antisymmetrizer when applied to a tensor with 2 indices that is already antisymmetric between these 2 indices: for every antisymmetric tensor A_ij, the expression A_ij S^ijk_lmn is totally antisymmetric between the indices lmn.
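
This antisymmetry claim can be checked numerically. The following is a minimal sketch, assuming NumPy and dimension 4; the array names `S`, `A`, `B` and the storage order `S[i, j, k, l, m, n]` are choices made here for illustration.

```python
import numpy as np

dim = 4
delta = np.eye(dim)

# S[i, j, k, l, m, n] stands for S^ijk_lmn: the sum over the three
# cyclic permutations of (l, m, n), following the definition above.
S = (np.einsum('il,jm,kn->ijklmn', delta, delta, delta)
     + np.einsum('im,jn,kl->ijklmn', delta, delta, delta)
     + np.einsum('in,jl,km->ijklmn', delta, delta, delta))

# A random antisymmetric tensor A_ij.
rng = np.random.default_rng(0)
X = rng.normal(size=(dim, dim))
A = X - X.T

# B[k, l, m, n] = A_ij S^ijk_lmn
B = np.einsum('ij,ijklmn->klmn', A, S)

# Swapping any two of the lower indices l, m, n should flip the sign.
antisym_lm = np.allclose(B, -B.transpose(0, 2, 1, 3))
antisym_mn = np.allclose(B, -B.transpose(0, 1, 3, 2))
antisym_ln = np.allclose(B, -B.transpose(0, 3, 2, 1))
print(antisym_lm, antisym_mn, antisym_ln)
```

The upper index k is left free and plays no role in the antisymmetry.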

Okay, now let us take the Riemann tensor R^a_bij. We already know that it is antisymmetric between indices ij. We deduce that R^a_bij S^ijk_lmn is totally antisymmetric between the indices lmn.
This expression can be used to write the second Bianchi identity on the Riemann curvature, as

∇_k R^a_bij S^ijk_lmn = 0

The intuitive meaning of this identity is that for every small closed curve along which we consider a parallel transport, the rotation produced by this transport does not depend on the choice of surface bordered by this curve, over which we integrate the curvature to calculate this small rotation : moving this surface (with direction ij) towards the direction k but keeping the same border, has no effect.
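
For comparison with the usual textbook form, replacing S by its definition in the identity above gives the cyclic sum below (a direct expansion, with the indices a and b left untouched):

```latex
\nabla_k R^{a}{}_{bij}\, S^{ijk}{}_{lmn}
= \nabla_n R^{a}{}_{blm} + \nabla_l R^{a}{}_{bmn} + \nabla_m R^{a}{}_{bnl}
= 0
```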

Now, by raising the index b, we get the expression R^ab_ij S^ijk_lmn. As each of the two pairs (ab) and (lm) is antisymmetric, the contraction between these 2 pairs gives a redundancy factor of 2, which we can factor out without introducing fractions into the underlying operations on coordinates. And this is how the Einstein tensor is actually obtained:

G_n^k = −(1/2) R^ab_ij S^ijk_abn

Remark: another notation for this expression R^ab_ij S^ijk_abn would be by using the [ ] for antisymmetrization:

3 R^ab_[ab δ^k_n]

Indeed when replacing S by its definition, this expression develops as :

−2 G_n^k = R^ab_ab δ^k_n + R^kb_bn + R^ak_na

We see here the Ricci tensor R_n^k = − R^kb_bn = − R^ak_na and the scalar curvature R = g^ij R_ij = R^ab_ab. This gives the (slightly rewritten) previous formula of the Einstein tensor

G_n^k = R_n^k − (1/2) R δ^k_n
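
The agreement between the two expressions of the Einstein tensor can be checked numerically. This is a minimal sketch, assuming NumPy, dimension 4 and an orthonormal frame with Euclidean metric (so that upper and lower indices coincide numerically). The random tensor `R` is only required to be antisymmetric in each of its two index pairs, which is all the expansion above uses; the names `G_S` and `G_usual` are chosen here for illustration.

```python
import numpy as np

dim = 4
delta = np.eye(dim)

# S[i, j, k, l, m, n] stands for S^ijk_lmn (cyclic sum of Kronecker deltas).
S = (np.einsum('il,jm,kn->ijklmn', delta, delta, delta)
     + np.einsum('im,jn,kl->ijklmn', delta, delta, delta)
     + np.einsum('in,jl,km->ijklmn', delta, delta, delta))

# Random R[a, b, i, j] ~ R^ab_ij, antisymmetric in (a, b) and in (i, j).
rng = np.random.default_rng(1)
X = rng.normal(size=(dim,) * 4)
R = X - X.transpose(1, 0, 2, 3)
R = R - R.transpose(0, 1, 3, 2)

# Einstein tensor through S: G_n^k = -(1/2) R^ab_ij S^ijk_abn, stored as [n, k].
G_S = -0.5 * np.einsum('abij,ijkabn->nk', R, S)

# Usual route: Ricci tensor R_n^k = -R^ak_na and scalar curvature R = R^ab_ab.
ricci = -np.einsum('akna->nk', R)
scalar = np.einsum('abab->', R)
G_usual = ricci - 0.5 * scalar * delta

print(np.allclose(G_S, G_usual))
```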

The conservation of the Einstein tensor is directly deduced from the above Bianchi identity:

−2 ∇_k G_n^k = ∇_k R^ab_ij S^ijk_abn = 0

Now, what is the point of writing the Einstein tensor using S rather than the usual expression?
It presents this operation as a contraction between the Hodge duals of the antisymmetric pairs in the Riemann tensor (this contraction has order 1 = n−3, where n = 4 is the dimension of space-time, and 3 is the number of indices on which S operates), instead of between these pairs themselves.
Namely, it can be written using the Levi-Civita symbol ε as

G^nk = (1/4) R^a_bij ε_a^bn_l ε^ijkl

(with sign changed because of the odd signature of space-time)
Note that the lower position of ij in R^a_bij justifies having k in upper position (to use ε^ijkl independently of the metric), while the right position (up or down) of n is more ambiguous, as it depends on the more arbitrary choice of position of indices a and b.
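
The link between the ε expression and the S expression can be checked numerically. This is a minimal sketch, assuming NumPy, dimension 4 and Euclidean signature with an orthonormal frame, so that index positions do not matter and no sign flip occurs (in Lorentzian signature the product of the two ε picks up the extra minus sign mentioned above). Under these assumptions, contracting R with two Levi-Civita symbols over one shared index gives twice the contraction with S.

```python
import itertools
import numpy as np

dim = 4
delta = np.eye(dim)

# S[i, j, k, l, m, n] stands for S^ijk_lmn (cyclic sum of Kronecker deltas).
S = (np.einsum('il,jm,kn->ijklmn', delta, delta, delta)
     + np.einsum('im,jn,kl->ijklmn', delta, delta, delta)
     + np.einsum('in,jl,km->ijklmn', delta, delta, delta))

# Levi-Civita symbol: sign of the permutation, 0 on repeated indices.
eps = np.zeros((dim,) * 4)
for perm in itertools.permutations(range(dim)):
    inversions = sum(perm[x] > perm[y]
                     for x in range(dim) for y in range(x + 1, dim))
    eps[perm] = (-1) ** inversions

# Random R[a, b, i, j], antisymmetric in (a, b) and in (i, j).
rng = np.random.default_rng(2)
X = rng.normal(size=(dim,) * 4)
R = X - X.transpose(1, 0, 2, 3)
R = R - R.transpose(0, 1, 3, 2)

# Contraction with two epsilons sharing the index l, stored as [n, k].
eps_form = np.einsum('abij,abnl,ijkl->nk', R, eps, eps)
# Contraction with S, as in the definition of the Einstein tensor.
S_form = np.einsum('abij,ijkabn->nk', R, S)

print(np.allclose(eps_form, 2.0 * S_form))
```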

This explicitly shows that the curvature in a given pair of dimensions contributes to the Einstein tensor only in the orthogonal dimensions. This is more elegant than first writing the contribution as if it were in these dimensions (in the Ricci tensor), then making a global subtraction in all dimensions by (1/2) R g_ij, which just happens to cancel the contribution in these dimensions and leave an opposite contribution in the remaining dimensions.

Basic examples

The uniform unidirectional curvature

Take the 4-dimensional Riemannian manifold (= curved 4-dimensional space that is approximately Euclidean at small scales) defined as the cartesian product M = E×F, where E is a (2-dimensional) sphere and F is a Euclidean plane.
It has the remarkable combination of properties of being very simple, with only one nonzero component of its Riemann curvature tensor at each point (the one given by the Gaussian curvature of E), and yet general enough that any possible value of the Riemann curvature at a point of any 4-dimensional Riemannian manifold equals some superposition of a number of rotated images of this one (this superposition is defined to first approximation near the point, by adding up the geometric distortions observed on a given figure when it is embedded in such rotated images of this manifold).
So, it suffices for us, as a first step, to describe the Ricci tensor and the Einstein tensor in this manifold, as those in any other manifold can be deduced from this case by superposition of its rotated images.

The Ricci tensor at every point of this space M, can be imagined as a symmetric bilinear form on vectors from this point, with only 2 nonzero components, that are in the 2 dimensions of E. Like any symmetric bilinear form it can equivalently be represented by a quadratic form. That is, a field whose values are a quadratic function of the position. The sets of points where it takes any given value, are quadrics centered on the chosen point ; and for every (small) value taken as reference, the more intense the field, the smaller the quadric (more precisely, shrinking the size by 2, corresponds to multiplying the field by 4). Here in particular, these quadrics are just the cylinders, cartesian product of a small circle C in E, with F.

We can modify the intensity of the curvature just by changing the radius of E. Dividing its radius by 2 multiplies the curvature by 4, so that the cylinder also shrinks by 2. So the circle C, a small circle around a point of the sphere E, keeps the same proportion to E when the size of E changes. The ratio of C's radius to that of E is a small angle, and its square is the small dimensionless number that the quadratic form defined by the Ricci tensor takes on C×F.

Okay, now let us describe the Einstein tensor there.

If we look at its definition as G_ij = R_ij − 1/2 R g_ij, it would be represented by another quadratic form, that cancels in the direction of E and only varies as a function of the projection in F (because g_ij just gives the dot square function of vectors, and 1/2 R g_ij coincides with R_ij in the direction of E).

However, its other expression that we gave above, suggests to represent it as twice contravariant (i.e. as a combination of tensor squares of vectors, or a quadratic form on the space of covectors) rather than twice covariant (combination of tensor squares of covectors, or quadratic form on the space of vectors), which better fits with its identification with the stress-energy tensor (that is twice contravariant).
But the visual representation of a twice contravariant symmetric tensor differs from that of a twice covariant one. It is still represented as an ellipsoid (if it is positive definite), but its size is bigger when the tensor is multiplied by a scalar quantity a > 1 (it is dilated by the square root of a) instead of being smaller (shrunk by the square root of a). And when, as here for our manifold M = E×F, it is not positive definite (namely the matrix is diagonal but some diagonal coefficients vanish), this ellipsoid shrinks onto a disk, instead of extending as a cylinder.
Namely here, this disk extends in the (flat) direction of the plane F, with no extension in the (curved) direction of E. (The circularity of this disk comes from the implicit use of the metric when we lowered the index l to form the expression R^a_bij ε_a^bn_l ε^ijkl)
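
The description above can be illustrated with a small computation. This is a minimal sketch, assuming NumPy, Riemannian (Euclidean) signature and an orthonormal frame in which axes 0, 1 span the sphere E and axes 2, 3 span the plane F. The function name `tensors_of_product` is hypothetical, and the overall sign of the Einstein tensor depends on these conventions.

```python
import numpy as np

def tensors_of_product(radius, dim=4):
    """Ricci and Einstein tensors of E x F at a point, in an orthonormal frame.

    E is a sphere of the given radius (Gaussian curvature 1 / radius**2),
    F is a flat plane. Index positions are ignored (Euclidean metric).
    """
    K = 1.0 / radius**2
    R = np.zeros((dim,) * 4)
    # The single independent component R^01_01 = K, together with the
    # entries forced by antisymmetry in each index pair.
    R[0, 1, 0, 1] = R[1, 0, 1, 0] = K
    R[0, 1, 1, 0] = R[1, 0, 0, 1] = -K
    ricci = -np.einsum('akna->nk', R)      # R_n^k = -R^ak_na
    scalar = np.einsum('abab->', R)        # R = R^ab_ab
    einstein = ricci - 0.5 * scalar * np.eye(dim)
    return ricci, einstein

ricci_1, einstein_1 = tensors_of_product(radius=1.0)
ricci_half, einstein_half = tensors_of_product(radius=0.5)

print(np.diag(ricci_1))      # nonzero only along E (axes 0, 1)
print(np.diag(einstein_1))   # nonzero only along F (axes 2, 3)
print(np.allclose(ricci_half, 4.0 * ricci_1))  # halving the radius quadruples the curvature
```

The Ricci tensor is supported on the curved directions of E, while the Einstein tensor vanishes there and is supported on the flat directions of F.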

The isolated (dense or singular) axis of curvature

Let us modify the above example by replacing the sphere E by a surface that is only curved in one region, and flat outside. For example we can represent E by a sheet of paper forming a cone in the 3-dimensional Euclidean space, and whose vertex would have been smoothed, replaced by a spherical cap. So here the curved region is shaped like a disk, but it does not really matter which shape it has: it may be a triangle or anything. What matters is that it is a small region that is curved, and extended by a flat surface.
The "total curvature" (or surface integral of the curvature) of this region is measured by parallel transport around it, that is, by how the flat surface glues to itself. When we imagine it as made of a piece of paper cut to be glued as an extension of this region, like a cone around a spherical cap, this piece of paper is then glued back to itself, and the total curvature of this region is the angle by which the paper is glued to itself.

We can consider the limit case of a singular curvature, when the curved region shrinks to a point while its total curvature remains constant. Then the surface is just a cone, flat except at one point of singularity (the vertex), whose "total curvature" is the angle of difference between the full turn around it, and 2π radians.
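
The smoothed cone described above gives a simple consistency check: the total curvature of the spherical cap should equal the deficit angle of the cone that extends it. The following is a minimal sketch using only the standard library. The function names and the parameter `alpha` (the angular radius of the cap, seen from the centre of the sphere) are hypothetical, and the cone is assumed tangent to the sphere along the rim of the cap.

```python
import math

def cap_total_curvature(alpha):
    """Surface integral of the Gaussian curvature over a spherical cap.

    For a sphere of radius r the curvature is 1 / r**2 and the cap area is
    2*pi*r**2*(1 - cos(alpha)), so the result does not depend on r.
    """
    return 2.0 * math.pi * (1.0 - math.cos(alpha))

def cone_deficit_angle(alpha):
    """Difference between 2*pi and the full turn around the tangent cone.

    The cone surface makes the angle beta = pi/2 - alpha with its axis,
    and unrolls into a flat sector of angle 2*pi*sin(beta).
    """
    beta = math.pi / 2.0 - alpha
    full_turn = 2.0 * math.pi * math.sin(beta)
    return 2.0 * math.pi - full_turn

for alpha in (0.1, 0.5, 1.0):
    print(alpha, cap_total_curvature(alpha), cone_deficit_angle(alpha))
```

Shrinking the cap while keeping `alpha` fixed (that is, reducing the radius of the sphere) leaves both values unchanged, which matches the singular limit where only the vertex of the cone carries the curvature.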

Now let us still take the cartesian product of this surface E with the simple Euclidean plane F.

And describe the Einstein tensor of the result:

Inside the curved region, it is just the same as in the previous case: the Einstein tensor is shaped like a disk in the direction of F.
Outside, since the space is flat, the Einstein tensor vanishes too.

Natural, isn't it ? The forces are just flowing in the direction where the curvature extends and must be conserved. They cannot flow in any other direction.
Let us now describe this more concretely in the usual space-time language.

If the time dimension is one of the dimensions of F, we see a 3-dimensional space with a fixed curvature on an axis. In the case of a positive curvature (an angle of full turn around this axis smaller than 2π radians), this axis is endowed with a positive linear density of energy, together with an attractive force along this axis.
If, instead, the time dimension is one of the dimensions of E, we have an instantaneous "flash" of space-time curvature along a plane F' = {0}×F, where 0 is the singularity of curvature in E. Something happens only at a precise time on F', while no effect is noticed elsewhere. What happens is that if 2 objects were initially at rest relative to each other, and at the time of the flash they happen to be separated by F', then after the flash they turn out to be moving relative to each other, with a direction of speed orthogonal to F'.

The case of "repulsive acceleration" between sides of the space separated by F' (the acceleration is orthogonal to F') corresponds to the case of presence of an attractive force along F'. This is just like the surface tension that is found at the surface of a liquid, except that it is only at one time (while the surface of a liquid persists in time), and cannot come together with any energy density (because the energy, that needs to be preserved, has nowhere to go here).
The case of "attractive acceleration" between sides of F', corresponds to the presence of a repulsive force along F'.

(To be continued)
