A Little Tensor Geometry

    Enough, I hope, to give the reader a feel for what these weird mathematical constructs mean and what they do for us.

Surveying Space and Time

    We use rulers and clocks to measure space and time, distance and duration. But measurement by itself only gives us results analogous to an indeterminate metes and bounds description of a piece of land: it sits in isolation, referred only to on-site landmarks. In order to discern more general patterns (i.e. the laws of physics on a proper cosmic scale) we need to refer our measurements to a system of reference that, in concept at least, spans the cosmos and its history.

    To fill that need we must devise a means of describing the warp and weft of a four-dimensional tapestry, one on which we can embroider our measurements. That means consists of conceiving an infinite set of imaginary lines occupying and marking the continuum of space and time. We construct the set from four subsets, each subset consisting of lines that do not cross each other but which cross lines in all three of the other subsets. On each line of a subset we use real numbers to denumerate the locations of points on the line relative to an arbitrarily chosen zero point and we call those numbers the coordinates of the subset: we call the difference between two coordinates on any given line the distance between the points bearing those coordinates (calling the difference duration if the line represents the elapse of time). We call the entire set a coordinate grid and, in concept, we can lay it out by surveying it with our rulers and clocks (each clock tracing a line through time).

    Within the coordinate grid we identify geometric points (having only spatial coordinates) and instantaneous events (the spatio-temporal analogue of geometric points) and we designate them with three- or four-component vectors, q=(q1, q2, q3, q4), in which q1, q2, and q3 represent spatial coordinates and q4 represents a temporal coordinate. (Some physicists and mathematicians use 0, 1, 2, 3 as coordinate indices, with q0 representing the temporal coordinate. I prefer not to use that convention, having been brought up to think of time as the fourth, not the zeroth, dimension.)

    In order to make our grid useful, to enable ourselves to embroider our measurements onto it, we must give it a basis, a set of linearly independent vectors in terms of which we can represent any other vector on the grid as a finite linear combination. Ideally we want to use a finite-dimensional orthonormal basis; specifically, we want to use a finite set of mutually orthogonal unit vectors as our basis. The space must also support an inner product, the analogue of the vector dot product in three-space (which we calculate as the product of the magnitudes of the two vectors and the cosine of the angle between them). In pseudo-Euclidean four-space, then, we have the unit vectors as ê1=(1, 0, 0, 0), ê2=(0, 1, 0, 0), ê3=(0, 0, 1, 0), and ê4=(0, 0, 0, 1). For those vectors we have the inner product as

(Eqn 1)

in which δik represents the Kronecker delta and the minus sign applies only when i=k=4. Those unit vectors have magnitudes equal to one. Now we know that we can represent any vector in pseudo-Euclidean four-space (Hermann Minkowski's spacetime) as a linear sum,

(Eqn 2)

    The process of extracting components from a vector at a given point (or event) consists, in mathematical concept, of projecting the vector onto the coordinate lines passing through the point and taking the projections as the components. The inner product gives us the algebraic analogue of that geometric process; in essence, we project the vector onto the basis vectors. Because the basis vectors bi differ from the unit vectors only in not having a magnitude necessarily equal to one, we must normalize the inner product and the associated basis vector in order to obtain a correct description of the vector,

(Eqn 3)

    If, instead of directly measuring the vector itself, we measure its components, we can calculate the magnitude of the vector from the description in Equation 2. The first step in that calculation gives us the inner product of the vector with itself, the norm,

(Eqn 4)

In going to the last step in that equation I made use of Equation 1. Further analysis of Equation 4 gives us the metric tensor.
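
    As a rough numerical illustration of Equations 1 through 4, here is a minimal Python sketch. It assumes an orthonormal basis, components listed in the order (q1, q2, q3, q4), and the sign convention stated above (a minus sign for the temporal axis only). The function names and the sample vector are hypothetical, chosen only for illustration.

```python
# Signs of the inner products of the unit vectors with themselves:
# +1 for the three spatial axes, -1 for the temporal axis (an assumption
# taken from the sign convention described in the text).
SIGNATURE = (1.0, 1.0, 1.0, -1.0)

def inner(a, b):
    """Inner product of two four-vectors given by orthonormal components."""
    return sum(s * x * y for s, x, y in zip(SIGNATURE, a, b))

def unit_vector(i):
    """Unit vector along axis i (0-based here, so the time axis is index 3)."""
    return tuple(1.0 if k == i else 0.0 for k in range(4))

def components(v):
    """Project v onto each unit vector, normalizing by that vector's own inner product."""
    basis = [unit_vector(i) for i in range(4)]
    return tuple(inner(v, e) / inner(e, e) for e in basis)

v = (3.0, 4.0, 12.0, 5.0)   # hypothetical vector (q1, q2, q3, q4)
print(components(v))         # (3.0, 4.0, 12.0, 5.0)
print(inner(v, v))           # 9 + 16 + 144 - 25 = 144.0
```

    Dividing each projection by the inner product of the unit vector with itself is what recovers the temporal component with the correct sign.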

The Metric Tensor

We define the metric tensor through an invariant quantity derived from a generalized version of the Pythagorean theorem, which we see expressed in Equation 4. Assume that we have two events arbitrarily close together. We measure a differential vector ds = dqi êi extending straight from one event to the other with components dqi = <ds, êi>. In that case we get Equation 4 as

(Eqn 5)

In the last step of that equation I have tacitly replaced the summation sigma with the Einstein convention of automatically summing over repeated indices. We now have the metric tensor defined as

(Eqn 6)

    Hermann Minkowski's flat spacetime provides us with the simplest example of a metric tensor. For that four-dimensional analogue of Euclidean three-space we have

(Eqn 7)

That, in turn, gives us Equation 5 as

(Eqn 8)

In that formulation I have used the common Cartesian grid of rectangular coordinates, but I also have other options available to me.

    If we choose, for example, to make our measurements in spherical coordinates, with θ representing the co-latitude and ϕ representing the longitude, we have the associated metric tensor as

(Eqn 9)

so that we have Equation 5 as

(Eqn 10)

If we make a slight modification in that metric tensor, we get

(Eqn 11)

which represents the Schwarzschild solution of Einstein's field equation in the case of a region of space and time that has a uniform, simple, spherical body of mass M whose center occupies the origin of the coordinate grid.
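
    To give a feel for how slight that modification is in everyday circumstances, here is a hedged Python sketch of the Schwarzschild metric in its standard textbook form. The coordinate order (r, θ, ϕ, t), the placement of the factor c² in the temporal entry, and the approximate constants are my assumptions; they may not match the layout of Equation 11 exactly.

```python
import math

G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2 (approximate value)
C_LIGHT = 2.998e8      # m/s (approximate value)

def schwarzschild_radius(mass):
    """r_s = 2GM/c^2, the length that sets the scale of the modification."""
    return 2.0 * G_NEWTON * mass / C_LIGHT**2

def schwarzschild_metric(r, theta, mass):
    """Diagonal metric in the assumed coordinate order (r, theta, phi, t)."""
    f = 1.0 - schwarzschild_radius(mass) / r
    return [
        [1.0 / f, 0.0, 0.0, 0.0],
        [0.0, r**2, 0.0, 0.0],
        [0.0, 0.0, (r * math.sin(theta))**2, 0.0],
        [0.0, 0.0, 0.0, -f * C_LIGHT**2],
    ]

EARTH_MASS = 5.972e24    # kg (approximate value)
EARTH_RADIUS = 6.371e6   # m (approximate value)

g = schwarzschild_metric(EARTH_RADIUS, math.pi / 2, EARTH_MASS)
flat = schwarzschild_metric(EARTH_RADIUS, math.pi / 2, 0.0)  # M = 0: flat spacetime
print(schwarzschild_radius(EARTH_MASS))   # a little under 9 millimetres
print(g[0][0] - flat[0][0])               # tiny positive departure from flatness
```

    Setting the mass to zero reduces the array to the flat spherical form, and at Earth's surface the radial entry departs from flatness by only about one part in a billion.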

    In Equations 9 and 11 I seem to have violated the definition of Equation 6. However, a comparison of Equations 9 and 10 shows us that I've merely shifted the coefficients of the squared coordinates in the metric equation into the metric tensor. Making that shift alters the metric tensor, not so much in Equation 9 but certainly in Equation 11 and others like it, in a way that automatically encodes Einstein's equivalence principle into the metric tensor: to the extent that a metric tensor differs from the Minkowski tensor it represents a gravitational field. That fact, that space and time can be warped out of true, will necessitate the existence of other tensors and related entities.
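
    The claim that the spherical metric merely absorbs the coefficients of the squared coordinate differentials can be checked numerically. The following Python sketch assumes the coordinate order (r, θ, ϕ, t) and units in which c = 1; the event and the displacement are hypothetical values. It computes the interval of one small displacement twice, once with the spherical metric and once by converting to Cartesian coordinates and using the Minkowski metric.

```python
import math

MINKOWSKI = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def spherical_metric(r, theta):
    """Flat-spacetime metric in the assumed order (r, theta, phi, t), with c = 1."""
    return [[1, 0, 0, 0],
            [0, r**2, 0, 0],
            [0, 0, (r * math.sin(theta))**2, 0],
            [0, 0, 0, -1]]

def interval(g, dq):
    """ds^2 = g_ik dq^i dq^k, summed over both indices."""
    return sum(g[i][k] * dq[i] * dq[k] for i in range(4) for k in range(4))

def to_cartesian(q):
    """Convert (r, theta, phi, t) to (x, y, z, t)."""
    r, theta, phi, t = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta),
            t)

q = (2.0, 1.0, 0.5, 0.0)            # hypothetical event
dq = (1e-5, 2e-5, -1e-5, 1e-5)      # hypothetical small displacement
q2 = tuple(a + b for a, b in zip(q, dq))
dx = tuple(b - a for a, b in zip(to_cartesian(q), to_cartesian(q2)))

ds2_spherical = interval(spherical_metric(q[0], q[1]), dq)
ds2_cartesian = interval(MINKOWSKI, dx)
print(ds2_spherical, ds2_cartesian)  # agree to first order in the displacement
```

    The two numbers differ only by terms of higher order in the displacement, which is what we expect of an invariant interval described in two different coordinate grids.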

    We can see clearly now that the metric tensor, by taking the coefficients of the coordinates, tells us the shape of space and time. Also note that Equation 5 gives us a generalized version of the Pythagorean theorem. The version that I have written contains a subtle error: in order to multiply tensors together we must have one as covariant and the other as contravariant. In Equation 5 I have written all of the tensors in the product as covariant. Mathematical propriety requires that we rewrite Equation 5 as

(Eqn 12)

In that equation the subscripts mark the covariant tensor and the superscripts mark the contravariant tensors.

The Christoffel Symbols

    We have all heard that a straight line gives us the shortest distance between two fixed points. But if warped space lies between the points A and B, then the shortest distance between those points will come manifest as a curve. We call that curve a geodesic and describe it through the statement

(Eqn 13)

That statement looks like the principle of least action, so we can apply the calculus of variations to work out an explicit description of the geodesic.

    Define a function vi=dqi/dτ in which τ represents a parameter, usually taken as representing elapsed time. We can then rewrite Equation 5 as

(Eqn 14)

so we can rewrite Equation 13 as

(Eqn 15)

As in the case of Lagrangian dynamics, we have Euler-Lagrange equations

(Eqn 16)

In order to evaluate the derivatives directly we exploit the facts that

(Eqn 17)

(in which n=i or k) and that

(Eqn 18)

We thus get our Euler-Lagrange equation as

(Eqn 19)

We can rewrite that equation into a simpler form if we multiply it by the square root and exploit the facts that dτ=dqm/vm and that the metric tensor is not a function of the velocities, which makes ∂gik/∂vm=0. We then have the rewritten equation as

(Eqn 20)

    In going from the first line to the second line in that equation I have exploited the fact that gik=gki, that the metric tensor appears as a symmetric matrix, to combine the third and fourth terms on the first line into the single first term on the second line. For the first two terms on the first line we have

(Eqn 21)

In going from the first line of that equation to the second line I exploited the fact that the indices merely represent dummy numbers and, thus, can be interchanged so long as we do so consistently.

    Now Equation 20 gives us the mathematical description of a geodesic curve, the shortest distance between two points in distorted space. Writing it more simply, we have

(Eqn 22)

in which

(Eqn 23)

the Christoffel symbol of the first kind. (And though it's tempting to pronounce it Christ-awful, it's actually pronounced Krist-oh-FELL, named for Elwin Bruno Christoffel (1829 Nov 10 – 1900 Mar 15), the German mathematician who discovered these symbols.)

    If we multiply Equation 22 by gmp, we get

(Eqn 24)

in which

(Eqn 25)

represents the Christoffel symbol of the second kind. In going from Equation 22 to Equation 24 I exploited the fact that gmkgmp=δkp, the Kronecker delta, which, because it only has a non-zero value when k=p, converts vk into vp.

    Equation 24 shows us that the Christoffel symbol relates to the acceleration inherent in the metric gik and General Relativity, via the equivalence principle, relates that acceleration to gravitation. If we identify the metric tensor with the gravitational potential, albeit in somewhat spread out form, then the Christoffel symbols correspond to the gravitational forcefield.
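
    For readers who like to see numbers, here is a minimal Python sketch that estimates Christoffel symbols of the second kind directly from a metric. It uses the standard textbook expression (half the sum of two metric derivatives minus a third, with the index raised by the inverse metric), which I assume matches Equations 23 and 25 up to the naming of indices. To keep it short it handles only diagonal metrics and takes the derivatives by central differences; the function names and the sample point are hypothetical.

```python
import math

def spherical_diagonal(q):
    """Diagonal entries of the spatial metric for q = (r, theta, phi)."""
    r, theta, _ = q
    return [1.0, r**2, (r * math.sin(theta))**2]

def d_diagonal(metric, q, m, h=1e-6):
    """Central-difference estimate of d g_ii / d q^m for every diagonal entry."""
    up, down = list(q), list(q)
    up[m] += h
    down[m] -= h
    return [(a - b) / (2 * h) for a, b in zip(metric(up), metric(down))]

def christoffel_second_kind(metric, q):
    """gamma[p][i][k] approximates Gamma^p_ik for a diagonal metric."""
    n = len(q)
    g = metric(q)
    dg = [d_diagonal(metric, q, m) for m in range(n)]

    def d(m, i, k):
        # Partial derivative of g_ik with respect to q^m; off-diagonal entries vanish.
        return dg[m][i] if i == k else 0.0

    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for p in range(n):
        for i in range(n):
            for k in range(n):
                first_kind = 0.5 * (d(i, p, k) + d(k, p, i) - d(p, i, k))
                gamma[p][i][k] = first_kind / g[p]  # raise the index: g^pp = 1/g_pp
    return gamma

q = (2.0, 1.0, 0.3)   # hypothetical point (r, theta, phi)
gamma = christoffel_second_kind(spherical_diagonal, q)
print(gamma[0][1][1])  # Gamma^r_theta,theta, close to -r
print(gamma[1][0][1])  # Gamma^theta_r,theta, close to 1/r
```

    Note that these symbols do not vanish even though the space is flat: they reflect the curving of the coordinate lines as well as any curvature of the space itself.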

    Thus we have raw mathematical manipulation. But can we find a more intuitive way to understand the Christoffel symbols?

    Start by noting that the scalar product of two purely arbitrary four-vectors, S=AiBi=AiBkgik, remains invariant when we subject the vectors to parallel transport (see below under the Riemann Curvature Tensor). If we covariantly differentiate that scalar invariant, we must get zero:

(Eqn 26)

In this case the primed differentiation operator represents the covariant derivative. Because the only changes in the vectors come from parallel transport, the covariant derivatives of those vectors necessarily equal zero, which necessitates that the third term on the second line of Equation 26 also equal zero. And because we have chosen the two vectors arbitrarily, the covariant derivative of the metric tensor must necessarily equal zero,

(Eqn 27)

in which the upper-case omegas represent the Levi-Civita connection coefficients. The unprimed differentiation operator represents the ordinary partial derivative, of course.

    By permuting the indices of the derivative we can generate two other equations equivalent to Equation 27,

(Eqns 28)

Add those two equations together and subtract Equation 27 from their sum. Recognizing that the metric tensor and the connection coefficients possess transpose symmetry allows us to simplify the result to

(Eqn 29)

We can easily solve that equation for the connection coefficient if we remember that grmgmp=δpr. We get

(Eqn 30)

which tells us that the connection coefficients coincide precisely with the Christoffel symbols of the second kind.
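
    We can check that conclusion numerically on a simple curved surface. The sketch below, in Python, assumes the unit two-sphere with coordinates (θ, ϕ), its standard metric diag(1, sin²θ), and its well-known Christoffel symbols. It evaluates the covariant derivative of the metric (the ordinary derivative minus one connection term for each index) and confirms that every component comes out as zero to within the accuracy of the finite difference.

```python
import math

def metric(theta):
    """Metric of the unit two-sphere in coordinates (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(theta)**2]]

def christoffel(theta):
    """gamma[p][i][k] = Gamma^p_ik for the unit two-sphere (index 0 is theta, 1 is phi)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)
    return gamma

def covariant_derivative_of_metric(theta, h=1e-6):
    """result[m][i][k]: d_m g_ik - Gamma^p_mi g_pk - Gamma^p_mk g_ip."""
    g, gamma = metric(theta), christoffel(theta)
    g_up, g_down = metric(theta + h), metric(theta - h)
    result = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for m in range(2):
        for i in range(2):
            for k in range(2):
                # The metric depends only on theta, so the phi derivative is zero.
                partial = (g_up[i][k] - g_down[i][k]) / (2 * h) if m == 0 else 0.0
                correction = sum(gamma[p][m][i] * g[p][k] + gamma[p][m][k] * g[i][p]
                                 for p in range(2))
                result[m][i][k] = partial - correction
    return result

residual = covariant_derivative_of_metric(1.0)
print(max(abs(x) for plane in residual for row in plane for x in row))  # nearly zero
```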

The Riemann Curvature Tensor

    Given a vector field Vi, we want to differentiate it with respect to two different coordinates qm and qn, in essence differentiating the vector field with respect to a minuscule change in area. The covariant derivatives do not commute with each other (∇m∇nVi ≠ ∇n∇mVi), so we need to determine how the two double derivatives differ from each other. Let's start by calculating ∇m∇nVi.

    Because ∇nVi represents a second-rank tensor, we have the covariant derivative ∇m as

(Eqn 31)

And because Vi represents a first-rank tensor (a vector), we have the covariant derivative ∇n as

(Eqn 32)

If we use that equation to make the appropriate substitutions into Equation 31, we get

(Eqn 33)

To calculate ∇n∇mVi we merely interchange the indices m and n in that equation. If we subtract that equation from the index-exchanged version, the terms in the square brackets drop out (due to symmetry) and we get

(Eqn 34)

in which we have the Riemann curvature tensor,

(Eqn 35)

We thus have a fourth-rank tensor that is anti-symmetric with respect to the indices m and n; which means that Rpimn=-Rpinm. In four-dimensional space and time that tensor has 256 components organized into block, matrix, row, and column, which we might represent as Rbmrc in order to remember how we organize those elements.
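
    As a concrete instance, the following Python sketch builds the curvature tensor of the unit two-sphere from its Christoffel symbols. It assumes the common convention in which the tensor consists of the difference of two derivatives of the Christoffel symbols plus the difference of two products of them; the index order and overall sign in Equation 35 may differ. In two dimensions the array has 2×2×2×2 = 16 components rather than 256.

```python
import math

def christoffel(theta):
    """gamma[p][i][k] = Gamma^p_ik for the unit two-sphere (index 0 is theta, 1 is phi)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)
    return gamma

def riemann(theta, h=1e-5):
    """R[p][i][m][n] under the assumed convention:
    d_m Gamma^p_in - d_n Gamma^p_im + Gamma^p_mq Gamma^q_in - Gamma^p_nq Gamma^q_im."""
    g0 = christoffel(theta)
    g_up, g_down = christoffel(theta + h), christoffel(theta - h)

    def d(m, p, i, n):
        # Derivative of Gamma^p_in with respect to coordinate m; only theta matters.
        return (g_up[p][i][n] - g_down[p][i][n]) / (2 * h) if m == 0 else 0.0

    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for p in range(2):
        for i in range(2):
            for m in range(2):
                for n in range(2):
                    products = sum(g0[p][m][q] * g0[q][i][n] - g0[p][n][q] * g0[q][i][m]
                                   for q in range(2))
                    R[p][i][m][n] = d(m, p, i, n) - d(n, p, i, m) + products
    return R

R = riemann(1.0)
print(R[0][1][0][1], math.sin(1.0)**2)   # the two values should nearly agree
print(R[0][1][0][1] + R[0][1][1][0])     # antisymmetry in the last two indices
```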

    To gain some understanding of what the Riemann tensor does we need to take another look at the double differentiation through which we derived it. Establish a point P0 (and, yes, Euclidean elephants eat pee-noughts) and extend from it two differential line segments dxm and dyn, parallel to the appropriate axes, to thereby define points P1 and P2. From each of those latter points extend the alternate differential line segment to a point P3, thereby drawing a minuscule parallelogram.

    At the point P0 establish a constant contravariant vector Ai and subject it to two parallel transports. The first transport follows the path P0→P1→P3, which consists of a displacement dxm followed by a displacement dyn. The second transport follows the path P0→P2→P3, which consists of a displacement dyn followed by a displacement dxm.

    Parallel transport of Ai from P0 to P1 produces the vector

(Eqn 36)

Transport of that vector to P3 then produces the vector

(Eqn 37)

The Christoffel symbol at P1 differs from the one at P0 by a minuscule increment so that

(Eqn 38)

which lets us rewrite Equation 37 as

(Eqn 39)

In cobbling up that equation I have left off the term in the square of dxm as ignorably small. We obtain the equivalent equation for the parallel transport of Ai along the path P0P2P3 by interchanging the indices m and n.

    We can calculate the net change that we would make in Ai in taking it around the closed path P0→P1→P3→P2→P0 by calculating the difference

(Eqn 40)

in which we recognize that the remainder left behind by the subtraction coincides with the Riemann curvature tensor (multiplied by the appropriate factors). That gives us a result similar to the one of Equation 34. Because the product BiAi yields a scalar invariant, we can use it to work out the covariant equivalent of Equation 40. We write

(Eqn 41)

Note that in going from the second line to the third I interchanged the indices on Bi and Ak, justifying the change by noting that because we sum over those indices the interchange does not change the value of the expression. The third line in that equation only zeroes out necessarily when the coefficient of Ai equals zero. That criterion necessitates that

(Eqn 42)
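
    The loop in that argument is minuscule, but we can watch the same effect on a finite loop. The Python sketch below carries a vector by parallel transport once around a circle of constant co-latitude on a unit sphere, integrating the transport equation dAi/dϕ = -Γi(ϕk) Ak with a fourth-order Runge-Kutta step. The choice of surface, the step count, and the function name are my assumptions for illustration. For the unit sphere the vector should come back rotated through the angle 2π cos θ0, which equals, modulo a full turn, the area that the loop encloses.

```python
import math

def transport_around_latitude(theta0, steps=2000):
    """Parallel-transport A = (A^theta, A^phi), starting as (1, 0), once around
    the circle theta = theta0 on a unit sphere."""
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(A):
        # dA^theta/dphi = -Gamma^theta_phi,phi A^phi = sin(theta) cos(theta) A^phi
        # dA^phi/dphi   = -Gamma^phi_phi,theta A^theta = -cot(theta) A^theta
        return (s * c * A[1], -(c / s) * A[0])

    A = (1.0, 0.0)
    h = 2.0 * math.pi / steps
    for _ in range(steps):
        k1 = rhs(A)
        k2 = rhs((A[0] + 0.5 * h * k1[0], A[1] + 0.5 * h * k1[1]))
        k3 = rhs((A[0] + 0.5 * h * k2[0], A[1] + 0.5 * h * k2[1]))
        k4 = rhs((A[0] + h * k3[0], A[1] + h * k3[1]))
        A = (A[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             A[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return A

theta0 = math.pi / 3
A = transport_around_latitude(theta0)
# Components in an orthonormal frame: (A^theta, sin(theta0) * A^phi).
print(A[0], math.sin(theta0) * A[1])
print(math.cos(2.0 * math.pi * math.cos(theta0)))   # expected first component
```

    On the equator (θ0 = π/2), which is itself a geodesic, the same routine returns the vector unchanged.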

    Because we derive the Riemann curvature tensor (also known as the Riemann-Christoffel tensor) from displacing a vector around a closed loop enclosing a minuscule element of area on a surface, I conceive an analogy between the Riemann tensor and the curl operator of ordinary vector calculus. Of course a fundamental difference comes between the two operators: the curl involves a six-fold differentiation of the vector field while the Riemann tensor involves multiplication of the vector field by a 256-fold array of elements made by combining derivatives of the metric tensor. But both operators give us a measure of the curliness of the vector field, either due to the inherent curvature of the field itself or due to curvature induced by the curvature of the space and time in which the field exists.

    Let the constant vector Ak=dzk, a minuscule unit vector different from the minuscule unit vectors dxm and dyn, in Equation 40. The unit vectors dzk, dxm, and dyn mark the primary sides of the minuscule four-cube (tesseract) that serves as a fundamental unit of the coordinate grid (the variables x, y, z in this case do not represent the standard Cartesian coordinates: I use them here to represent the generalized coordinates q in a manner that avoids confusion). In describing the change in one unit vector Equation 40 also describes the difference between the volume of the unit four-cube in the curved space and time of the Riemann tensor and the volume of the corresponding unit four-cube of Minkowski's flat spacetime.

    To calculate the total difference in the unit volume we want to calculate the change in Equation 40 in all four directions in space and time. But the calculation in Equation 40 automatically permutes the variables x, y, and z, which makes the result six times as large as it should be. Thus we calculate the element of volume dVR in curved spacetime in proportion to the corresponding volume element dVM in Minkowski spacetime as

(Eqn 43)

Now we want to look at another way of treating volumes in curved space.

The Ricci Tensor and Scalar

    Here we get into the deepest curvature of space and time, the geometry of warped space. If we have a space described by a metric tensor, then in that space we have geodesic curves. Although we have already described a geodesic as the line that produces the minimum distance between two given points, the mathematical definition of a geodesic tells us that it is a curve that subjects any vector tangent to it to parallel transport automatically. That criterion necessitates that the covariant derivative of that vector equal zero,

(Eqn 44)

in which v represents the tangent vector in question.

    If we take elapsed time as a parameter and take vk=dxk/dt as representing the velocity at which a point moves along the geodesic, then we can convert the space derivatives into time derivatives by using the fact that vi∂if=df/dt for any function f defined in the space. If we multiply Equation 44 by vi, then we get the geodesic equation,

(Eqn 45)

which gives us what we derived above by minimizing the length of the curve.
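
    To see the geodesic equation at work, here is a Python sketch that integrates it on a unit sphere and checks that the resulting path is a great circle. It assumes the standard form of the equation (the second parameter derivative of each coordinate plus the Christoffel symbol contracted with two velocities equals zero), coordinates (θ, ϕ), and a fourth-order Runge-Kutta integrator; the starting point and velocity are hypothetical.

```python
import math

def geodesic_rhs(state):
    """state = (theta, phi, v_theta, v_phi) on the unit sphere."""
    theta, phi, v_th, v_ph = state
    a_th = math.sin(theta) * math.cos(theta) * v_ph**2               # -Gamma^theta_phi,phi v_phi^2
    a_ph = -2.0 * (math.cos(theta) / math.sin(theta)) * v_th * v_ph  # -2 Gamma^phi_theta,phi v_theta v_phi
    return (v_th, v_ph, a_th, a_ph)

def rk4_step(f, state, h):
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def embed(state):
    """Position of the point in ordinary three-space."""
    theta, phi = state[0], state[1]
    return (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))

def embedded_velocity(state):
    theta, phi, v_th, v_ph = state
    return (math.cos(theta) * math.cos(phi) * v_th - math.sin(theta) * math.sin(phi) * v_ph,
            math.cos(theta) * math.sin(phi) * v_th + math.sin(theta) * math.cos(phi) * v_ph,
            -math.sin(theta) * v_th)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def speed_squared(state):
    """g_ik v^i v^k for the unit sphere."""
    return state[2]**2 + (math.sin(state[0]) * state[3])**2

start = (math.pi / 3, 0.0, 0.3, 0.8)   # hypothetical initial point and velocity
normal = cross(embed(start), embedded_velocity(start))  # normal to the great-circle plane
state, max_offplane = start, 0.0
for _ in range(3000):
    state = rk4_step(geodesic_rhs, state, 0.001)
    max_offplane = max(max_offplane, abs(dot(normal, embed(state))))
print(max_offplane)                                   # stays essentially zero
print(speed_squared(start), speed_squared(state))     # the speed stays constant
```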

    Of course we can fill any given space with geodesics and assign the same parameter t to all of them. For any given value of t, then, we get a set of points that define a surface in our space. At least one line in that surface lies normal to the geodesics. On that line we have a tangent vector y and we define a parameter s that determines the location of a point on the line. Again we get parallel transport, this time of the tangent vector y.

    Pick two geodesics arbitrarily close to each other and find two points, xm(t) on one geodesic (written un-italicized) and xm(t) on the other (written in italics), that have the same value of the parameter t. Between those two points we can construct a minuscule vector, ym(t), which we call a deviation vector and define algebraically through the equation

(Eqn 46)

Because we have made ym(t) arbitrarily small, we can use a Taylor series expansion, drop all terms beyond those of first order in ym(t), and thereby calculate the difference between the Christoffel symbols evaluated at the two points as

(Eqn 47)

    In locating our two points arbitrarily close together we have put the un-italicized point at the center of a local inertial frame. On that minuscule patch of space the curvature of the space differs insignificantly from zero (think of a small area of the San Joaquin Valley, where the spherical Earth appears flat). At the point xa(t), then, the elements of the Christoffel symbol differ negligibly from zero, so we can treat the Christoffel symbol at that point as if it had zeroed out. Thus we have the geodesic equation at the two points as

(Eqns 48)

If we subtract the first of those equations from the second, we calculate the second parameter derivative of the deviation vector,

(Eqn 49)

    Next we increment the parameter by a minuscule amount, thereby bringing our attention to two new points (xa(t) and xa(t), again distinguished as un-italicized and italicized). That move also gives us a new deviation vector, ya(t), extending between the two new points. We can produce that new deviation vector by parallel transporting the old deviation vector from the old pair of points to the new one. In order to calculate a description of the new deviation vector from the old one we need to know the second covariant parameter derivative of the old deviation vector. Relating the covariant parameter derivative to the ordinary covariant derivative by Dv=vq∇q, we have

(Eqn 50)

    In devising that equation I have made several tacit algebraic moves. In writing the covariant parameter derivative operator as

(Eqn 51)

I have made use of the fact that d/dt=vc∂c and applied it to Equation 32. In going from the first line to the second and from the second to the third I have exploited the fact that in our local inertial frame the Christoffel symbols effectively zero out, even though their derivatives don't. And in going from the fourth line to the fifth I have replaced the difference between the derivatives of the Christoffel symbols with the corresponding Riemann curvature tensor as it exists in a local inertial frame.
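
    A simple case shows what that result implies. On a unit sphere, where the curvature equals one everywhere, the deviation equation reduces to the statement that the second parameter derivative of the separation equals minus the separation. The Python sketch below tests that statement for two neighboring meridians launched northward from the equator, with the arc length from the equator serving as the parameter t. The setup, the small angle eps, and the function names are assumptions made for illustration.

```python
import math

def separation(t, eps=1e-4):
    """Straight-line distance between points at the same parameter t on two
    meridians of a unit sphere whose longitudes differ by the small angle eps."""
    a = (math.cos(t), 0.0, math.sin(t))
    b = (math.cos(t) * math.cos(eps), math.cos(t) * math.sin(eps), math.sin(t))
    return math.dist(a, b)

def second_derivative(f, t, h=1e-3):
    """Central-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

t = 0.7
ratio = second_derivative(separation, t) / separation(t)
print(ratio)   # close to -1, the negative of the curvature
```

    The geodesics start out parallel at the equator and then converge toward the pole, which is the behavior that the curvature term describes.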

    So now we know how to calculate the length of the deviation vector extending between nearby geodesics, the paths that free particles follow. Imagine now that a minuscule patch occupies the un-italicized geodesic where the deviation vector meets it. The area vector associated with the patch stands perpendicular to any line drawn across the patch and also lies parallel to the deviation vector: the dot product of the area vector and the deviation vector thus defines an arbitrarily small element of volume associated with our geodesics. If we multiply Equation 50 by that area vector daa, we get

(Eqn 52)

In writing that equation I have invoked the rule that tells us to make one covariant index and one contravariant index the same when multiplying tensors to form an inner product. But ya already has a matching covariant index on the Riemann tensor, so to enable the proper multiplication of the area vector we must change the contravariant index on the Riemann tensor to match it; thus, we get

(Eqn 53)

in which Rbc represents the Ricci tensor. In going from the first line to the second line in that equation I used the fact that Rmbcm=-Rmbmc and then applied the Einstein summation rule, in accordance with the convention for calculating the trace of a fourth-rank tensor in a process called contraction (see appendix).

    The Ricci (REECH-chee) tensor, discovered by and named after Gregorio Ricci-Curbastro (1853 Jan 12 – 1925 Aug 06), thus tells us the degree to which the volume of a geodesic sphere differs from that of the same sphere in a Euclidean space. Because we can represent any geometric solid as a collection of cylinders, like the one we employed above, we can apply Equation 53 to any suitably small volume element.

    Equation 53 also shows us that the Ricci tensor plays a role analogous to the role played by the metric tensor in determining the length of a minuscule line segment (Equation 5). Just as the metric tensor, multiplied by paired differential coordinates, gives us the squared differential length element, so the Ricci tensor, multiplied by paired velocities, gives us the second parameter derivative of a ratio. And just as the product gikgik=δii gives us a measure of the dimension of the space described by the metric tensor, so the product

(Eqn 54)

gives us a measure of the scalar curvature (associated with the Ricci scalar R) of the space that the metric tensor describes. More specifically, we say that the Ricci scalar represents the degree to which the volume of a given small geometric solid in a curved Riemannian space differs from the volume of the same figure in Euclidean space, in this case independent of any motion of the figure.
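
    The two contractions can be carried out explicitly for the unit two-sphere. The Python sketch below assumes the convention in which the Ricci tensor comes from contracting the first and third indices of the curvature tensor (other conventions differ by a sign), and it enters the non-zero curvature components of the sphere by hand. The Ricci scalar should come out as 2, twice the Gaussian curvature of a unit sphere.

```python
import math

def riemann_unit_sphere(theta):
    """R[p][i][m][n] for the unit two-sphere (index 0 is theta, 1 is phi)."""
    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    s2 = math.sin(theta)**2
    R[0][1][0][1], R[0][1][1][0] = s2, -s2
    R[1][0][1][0], R[1][0][0][1] = 1.0, -1.0
    return R

def ricci_tensor(R):
    """Ric[i][k] = R^m_imk: contract the first index with the third."""
    n = len(R)
    return [[sum(R[m][i][m][k] for m in range(n)) for k in range(n)] for i in range(n)]

def ricci_scalar(ricci, inverse_metric):
    """R = g^ik Ric_ik, summed over both indices."""
    n = len(ricci)
    return sum(inverse_metric[i][k] * ricci[i][k] for i in range(n) for k in range(n))

theta = 1.0
ricci = ricci_tensor(riemann_unit_sphere(theta))
inverse_metric = [[1.0, 0.0], [0.0, 1.0 / math.sin(theta)**2]]
print(ricci)                                # diagonal entries 1 and sin^2(theta)
print(ricci_scalar(ricci, inverse_metric))  # very nearly 2
```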

    That latter fact leads to the Ricci scalar playing a central role in the Hilbert action, postulated in 1915 by David Hilbert and used by him to deduce Einstein's field equation via a variational principle. A discussion of that feature of tensor geometry goes beyond the scope of this essay: I will address it in a suitable essay in the Map of Physics. Thus, we now have the basic facts of tensor geometry, the geometry of warped spaces.

Appendix: Tensor Contraction

    Contraction of a tensor in index notation consists of setting two of a tensor's indices, one contravariant and the other covariant, equal to each other and then applying the Einstein summation convention. The resulting contracted tensor retains the other indices of the original tensor. For example, we can contract a fourth-rank tensor Tabcd of valence (2,2) on the second and third indices to create a new second-rank tensor Uad of valence (1,1). We write that contraction as

(Eqn A-1)

    Note that we cannot perform a contraction on a pair of indices that are either both contravariant or both covariant. In order to carry out such a contraction we must first raise or lower an index through calculating an inner product with an appropriate metric tensor, either covariant (g_ab, indices written as subscripts) or contravariant (g^ab, indices written as superscripts). We use the metric tensor, then, to raise or lower one of the indices, as needed, and then apply the operation of contraction. Mathematicians call that combined operation a metric contraction. We can thus conceive tensor contraction as a generalization of the trace of a matrix.
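
    The operations described in this appendix can be sketched in a few lines of Python, with nested lists standing in for tensors. The function names and the sample arrays are hypothetical, and the code does not track which indices are contravariant and which are covariant; that bookkeeping remains the user's responsibility.

```python
def contract_middle(T):
    """U[a][d] = sum over b of T[a][b][b][d]: contraction on the second and third indices."""
    n = len(T)
    return [[sum(T[a][b][b][d] for b in range(n)) for d in range(n)] for a in range(n)]

def trace(M):
    """Contraction of a second-rank mixed tensor on its only pair of indices."""
    return sum(M[i][i] for i in range(len(M)))

def metric_contraction(g, A, B):
    """g_ab A^a B^b: lower one index with the metric, then contract."""
    n = len(A)
    return sum(g[a][b] * A[a] * B[b] for a in range(n) for b in range(n))

# Hypothetical example: a fourth-rank tensor built as the product of two matrices.
P = [[1, 2], [3, 4]]
Q = [[0, 1], [1, 0]]
T = [[[[P[a][b] * Q[c][d] for d in range(2)] for c in range(2)]
      for b in range(2)] for a in range(2)]

print(contract_middle(T))   # [[2, 1], [4, 3]], the ordinary matrix product of P and Q
print(trace(P))             # 5

minkowski = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
A = (1, 2, 3, 4)
print(metric_contraction(minkowski, A, A))   # 1 + 4 + 9 - 16 = -2
```

    The first result illustrates a familiar special case: contracting the product of two matrices on their adjacent indices reproduces ordinary matrix multiplication.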

