Covariant and Contravariant

Yes, I know: it sounds like the title of an M.C. Escher etching. But in fact those words denote the two ways in which multiple-component mathematical entities, such as vectors, transform when we translate them from one coordinate frame of reference into another.

The terms covariant and contravariant were introduced by James Joseph Sylvester (1814 Sep 03 – 1897 Mar 15) in 1851 in order to study algebraic invariant theory. In this context, for instance, a system of simultaneous equations is contravariant in the variables. The use of both terms in the modern context of multilinear algebra is a specific example of corresponding notions in category theory, essentially the theory of analogies within mathematics.

In multilinear algebra and tensor analysis (which includes the vector analysis of modern physics), covariance and contravariance describe how the quantitative description of certain geometric or physical entities changes with a change of basis from one coordinate system to another. When one coordinate system is just a rotation or translation of the other, this distinction between covariant and contravariant remains invisible. However, when we consider more general coordinate systems such as skew coordinates, curvilinear coordinates (as we find in General Relativity), and coordinate systems on differentiable manifolds, the distinction becomes critically important.

For a vector (such as a direction vector or velocity vector) to represent something that does not depend inherently on the coordinate system, the components of the vector must vary with a change of basis in a way that compensates that change. That is, the components must vary in the way opposite to the change of basis, an inverse transformation; otherwise, the vector would change. Vectors (as opposed to dual vectors) are said to be contravariant. Examples of contravariant vectors include the position of an object relative to an observer, or any derivative of position with respect to time, such as velocity or acceleration. In Einstein notation, contravariant components have upper indices as in

**v** = v^{i} **ê**_{i}    (Eq’n 1)

in which the hatted e represents the unit vector of the i-th coordinate axis.

For a dual vector (such as a gradient) to be coordinate-system invariant, the components of the vector must vary in the same way as the change of basis does in order to keep the vector unchanged. That is, the components must vary by the same transformation that we apply to the change of basis. Dual vectors (as opposed to vectors) are said to be covariant. Examples of covariant vectors generally appear when taking a gradient of a function (effectively dividing by a vector), such as the electrostatic field derived from a potential. In Einstein notation, covariant components have lower indices as in

**α** = α_{i} **ê**^{i}    (Eq’n 2)

In physics, vectors often have units of distance or of distance times some other unit (such as velocity), whereas covectors have units of inverse distance or of inverse distance times some other unit. The distinction between covariant and contravariant vectors becomes particularly important in computations with tensors, which often have mixed variance. This means that they have both covariant and contravariant components, or both vector and covector factors. The duality between covariance and contravariance intervenes whenever someone represents a vector or tensor quantity by its components, although modern differential geometry uses more sophisticated index-free methods to represent tensors.

In physics, a vector typically comes from a measurement or
series of measurements. Physicists represent the vector as a list (or tuple) of
numbers, such as (v_{1},v_{2},v_{3}) for velocity. The
numbers on the list depend on the choice of the coordinate system against which
they’re measured. For example, if the vector represents some velocity with
respect to an observer, then the observer uses a coordinate system corresponding
to a system of rigid rods, or reference axes, measuring the components v_{1},
v_{2}, and v_{3} as projections of the velocity vector onto the
axes. For a vector to represent a proper geometric object, it must be possible
to describe how that vector looks in any other coordinate system. That means
that the components of the vector will transform in a certain way when we
translate the vector from one coordinate system to another. The transformation
reflects the fact that the axes of one coordinate frame are vectors in the other
frame.

A contravariant vector has components that change in the same way as the coordinates do (opposite to the way in which the reference axes change) under changes of the coordinate frame, such as rotation or dilation. The vector itself does not change under these operations; rather, the components of the vector change in a way that cancels the change in the spatial axes. So if the reference axes were rotated in one direction, the component representation of the vector would rotate in the opposite direction through the same angle: that’s true because an observer riding the coordinate frame would see the vector appear to rotate. Similarly, if the reference axes were stretched in one direction, the components of the vector, like the coordinates, would diminish in exactly the way that compensates for the stretch.

In mathematical terms, we say that if the coordinate
system undergoes a transformation that we describe by an invertible matrix M, so
that we transform a coordinate vector **x** to **x**' = M**x**, then we
must also transform a contravariant vector **v** to **v**' = M**v**.
This requirement distinguishes a contravariant vector from any other triple of
physically meaningful quantities. Examples of contravariant vectors used in
mathematical physics include displacement, velocity, momentum, force, and
acceleration.
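The rule **v**' = M**v** lends itself to a direct numerical check. The following sketch (assuming NumPy is available; the angle and components are arbitrary example values, not taken from the text) rotates the coordinates and confirms that the vector's geometric length is frame-independent:

```python
import numpy as np

# Hypothetical illustration: rotate the coordinates by 30 degrees.
# A contravariant vector's components transform with the same matrix M
# that transforms the coordinates:  x' = M x  and  v' = M v.
theta = np.pi / 6
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([3.0, 4.0])   # e.g. a velocity measured in the old frame
v_new = M @ v              # its components in the new frame

# The geometric length of the vector is frame-independent:
assert np.isclose(np.linalg.norm(v_new), np.linalg.norm(v))
```

Replacing the rotation with a stretch or skew works the same way; only the invertibility of M matters.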

A covariant vector has components that change in a way opposite to the way in which the coordinates do; equivalently, its components transform as the reference axes do. For example, the components of the gradient vector of a function

∇f = (∂f/∂x^{1}, ∂f/∂x^{2}, ..., ∂f/∂x^{n})    (Eq’n 3)

such as the electric field due to an electrostatic potential, transform as the reference axes themselves do. When we consider only rotations of the spatial reference frame, the components of contravariant and covariant vectors behave in the same way. It is only when we bring in other transformations that the difference becomes apparent.
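A small numerical sketch makes the contrast concrete. Assuming NumPy and arbitrary example values, stretching one reference axis shrinks the corresponding component of a displacement (contravariant) while growing the corresponding component of a gradient (covariant), leaving their pairing unchanged:

```python
import numpy as np

# Hypothetical illustration: stretch the first reference axis by a factor s.
# A displacement's first component then shrinks by 1/s (contravariant),
# while a gradient's first component grows by s (covariant).
s = 2.0
A = np.diag([s, 1.0])            # change-of-basis matrix acting on the axes

d = np.array([6.0, 3.0])         # displacement components in the old basis
grad = np.array([5.0, 7.0])      # gradient components in the old basis

d_new = np.linalg.inv(A) @ d     # contravariant: inverse transformation
grad_new = A.T @ grad            # covariant: same transformation as the axes

# The pairing grad . d (a pure number, e.g. a change in potential)
# is independent of the basis:
assert np.isclose(grad_new @ d_new, grad @ d)
```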

Definition

The general formulation of covariance and contravariance
refers to how the components of a coordinate vector transform under a change of
basis, a passive transformation; that is, we want to know how the components of
the vector change when we change the coordinate frame. Thus we let V represent a
vector space of dimension n over the field of scalars S, and let each of the
basis sets **f** = (X_{1},...,X_{n})
and **f'** = (Y_{1},...,Y_{n})
define a coordinate frame on V so that any vector consists of a linear
combination of the basis elements. Also, let the change of basis from **f**
to **f'** be given by

**f**' = **f**A    (Eq’n 4)

for some invertible n×n matrix A with entries a^{i}_{k}.
Here, each vector Y_{k}
of the **f'** basis is a linear combination of the vectors X_{i} of
the **f** basis, so that

Y_{k} = Σ_{i} a^{i}_{k} X_{i}    (Eq’n 5)

Contravariant Transformation:

We express a vector v in the vector space V uniquely as a
linear combination of the elements X_{i} of the **f** basis as

v = Σ_{i} v^{i}[**f**] X_{i}    (Eq’n 6)

where v^{i}[**f**]
are scalars in S, which scalars are the components of v in the **f** basis.
Write the column vector of components of v as

**v**[**f**] = (v^{1}[**f**], v^{2}[**f**], ..., v^{n}[**f**])^{T}    (Eq’n 7)

so that we can rewrite Equation 6 as a matrix product v=**f·v**[**f**].
We can also express the vector v in terms of the **f'** basis, so that v=**f’·v**[**f**’].
However, because the vector v itself is invariant under the choice of basis, we
have **f·v**[**f**]=v=**f’·v**[**f**’]. The invariance of v combined
with the relationship of Equation 4 between **f** and **f'** implies that
**f·v**[**f**]=**f**A**v**[**f**A], which gives us the
transformation rule **v**[**f**A]=A^{-1}**v**[**f**]. In
terms of components,

v^{i}[**f**A] = Σ_{k} ã^{i}_{k} v^{k}[**f**]    (Eq’n 8)

where the coefficients ã^{i}_{k} are the entries of the
inverse matrix of A.

Because the components of the vector v transform through the inverse of the matrix A, these components are said to transform contravariantly under a change of basis.
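The invariance argument above lends itself to a direct numerical check. In the sketch below (NumPy assumed; the basis, matrix, and components are arbitrary choices), the same geometric vector is recovered from its components in either basis:

```python
import numpy as np

# Numerical check of the contravariant rule v[fA] = A^{-1} v[f].
# Columns of F are the basis vectors X_i; A is an invertible change of basis.
F = np.array([[1.0, 1.0],
              [0.0, 2.0]])           # old basis f (as columns)
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])           # change of basis: f' = f A
F_new = F @ A

v_f = np.array([3.0, -1.0])          # components of v in the f basis
v_fA = np.linalg.inv(A) @ v_f        # contravariant rule

# The vector itself is unchanged:  f . v[f] = f' . v[f']
assert np.allclose(F @ v_f, F_new @ v_fA)
```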

Covariant Transformation:

A linear functional α on V is expressed uniquely in terms of its components (which are scalars in S) in the **f** basis as

α(v) = Σ_{i} α_{i}[**f**] v^{i}[**f**]    (Eq’n 9)

These components are the action of α on the basis vectors X_{i} of the **f** basis: α_{i}[**f**] = α(X_{i}).

Under the change of basis from **f** to **f'**, the components transform so that

α_{k}[**f**A] = Σ_{i} a^{i}_{k} α_{i}[**f**]    (Eq’n 10)

Denote the row vector of components of **α** by **α**[**f**]:

**α**[**f**] = (α_{1}[**f**], α_{2}[**f**], ..., α_{n}[**f**])    (Eq’n 11)

so that we can rewrite Equation 10 as the matrix product **α**[**f**A] = **α**[**f**]A. Because the components of the linear functional α transform with the matrix A, mathematicians say that these components transform covariantly under a change of basis.

Had we used a column vector representation instead, the transformation law would be the transpose

**α**[**f**A]^{T} = A^{T} **α**[**f**]^{T}    (Eq’n 12)
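The two transformation rules together guarantee that the scalar α(v) is basis-independent, since **α**[**f**]A · A^{-1}**v**[**f**] = **α**[**f**]·**v**[**f**]. A sketch with arbitrary values (NumPy assumed):

```python
import numpy as np

# Numerical check of the covariant rule alpha[fA] = alpha[f] A,
# using an arbitrary invertible change-of-basis matrix A.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])

alpha_f = np.array([4.0, -2.0])      # components of the functional alpha
v_f = np.array([3.0, 5.0])           # components of a vector v in the f basis

alpha_fA = alpha_f @ A               # covariant rule
v_fA = np.linalg.inv(A) @ v_f        # contravariant rule

# The scalar alpha(v) does not depend on the basis:
assert np.isclose(alpha_fA @ v_fA, alpha_f @ v_f)
```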

Coordinates:

The choice of basis **f** on the vector space V defines
uniquely a set of coordinate functions x^{i} on V. Those coordinates on
V are contravariant in the sense that when we apply the transformation matrix A
to the basis we get

x^{i}[**f**A] = Σ_{k} ã^{i}_{k} x^{k}[**f**]    (Eq’n 13)

A system of n quantities v^{i} that transform like the coordinates x^{i} on V defines a contravariant vector. A system of n quantities that transforms oppositely to the coordinates then constitutes a covariant vector.

This formulation of contravariance and covariance is often
more natural in applications in which there is a coordinate space (a manifold)
on which vectors reside as tangent vectors or cotangent vectors. Given a local
coordinate system x^{i}
on the manifold, the reference axes for the coordinate system are the vector
fields

X_{i} = ∂/∂x^{i}    (Eq’n 14)

This gives rise to the frame **f** = (X_{1},...,X_{n})
at every point of the coordinate patch.

If y^{i}
is a different coordinate system and

Y_{i} = ∂/∂y^{i}    (Eq’n 15)

then the frame **f'** is related to the frame **f** by the inverse of
the Jacobian matrix of the coordinate transition:

**f**' = **f**J^{-1},  where J^{i}_{k} = ∂y^{i}/∂x^{k}    (Eq’ns 16)

Or, in indices,

∂/∂y^{i} = Σ_{k} (∂x^{k}/∂y^{i}) ∂/∂x^{k}    (Eq’n 17)

A tangent vector is by definition a vector that is a linear combination of
the coordinate partials ∂/∂x^{i}.
Thus a tangent vector is defined by

v = Σ_{i} v^{i} ∂/∂x^{i}    (Eq’n 18)

Such a vector is contravariant with respect to change of frame. Under changes in the coordinate system, one has

**v**[**f**'] = **v**[**f**J^{-1}] = J**v**[**f**]    (Eq’n 19)

Therefore the components of a tangent vector transform via

v^{i}[**f**'] = Σ_{k} (∂y^{i}/∂x^{k}) v^{k}[**f**]    (Eq’n 20)

If a system of n quantities v^{i} depending on the coordinates transforms in this way when we pass from one coordinate system to another, we call it a contravariant vector.
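A familiar instance is the change from polar to Cartesian coordinates in the plane. The sketch below (NumPy assumed; the point and components are arbitrary) applies the Jacobian to a tangent vector's components and then inverts the transformation:

```python
import numpy as np

# Sketch using polar coordinates (r, phi) as x^i and Cartesian
# coordinates (X, Y) = (r cos(phi), r sin(phi)) as y^i.
# The Jacobian J^i_k = dy^i/dx^k carries tangent-vector components
# from one chart to the other.
r, phi = 2.0, np.pi / 3
J = np.array([[np.cos(phi), -r * np.sin(phi)],    # dX/dr, dX/dphi
              [np.sin(phi),  r * np.cos(phi)]])   # dY/dr, dY/dphi

v_polar = np.array([1.0, 0.5])        # components (v^r, v^phi)
v_cart = J @ v_polar                  # contravariant transformation

# Transforming back with the inverse Jacobian recovers the components:
assert np.allclose(np.linalg.inv(J) @ v_cart, v_polar)
```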

Covariant and contravariant components of a vector:

In a Euclidean space V, there is little distinction between covariant and contravariant vectors, because the dot product allows for covectors to be identified with vectors. That is, a vector v determines uniquely a covector α via the dot product

α(**w**) = **v**·**w**    (Eq’n 21)

for all vectors **w**. In other words, the covector α acting on **w** yields the projection of **w** onto **v**, scaled by the length of **v**. Conversely, each covector α determines a unique vector v by this equation. Because of this identification of vectors with covectors, one may speak of the covariant components or contravariant components of a vector; that is, the components are just representations of the same vector with respect to reciprocal bases.

If we have a basis **f** = (X_{1},...,X_{n})
of the vector space V, then we have a unique reciprocal basis **f**^{#}
= (Y^{1},...,Y^{n})
of V determined by requiring

Y^{i}·X_{j} = δ^{i}_{j}    (Eq’n 22)

the Kronecker delta. In terms of these bases, we can represent any vector v in two ways:

**v** = Σ_{i} v^{i}[**f**] X_{i} = Σ_{i} v_{i}[**f**] Y^{i}    (Eq’n 23)

The components v^{i}[**f**]
are the contravariant components of the vector v in the basis **f**, and the
components v_{i}[**f**]
are the covariant components of v in the basis **f**.

The contravariant components of a vector are obtained by projecting the vector onto the coordinate axes. The covariant components are obtained by projecting the vector onto the normal lines to the coordinate hyperplanes; that is, onto the lines that stand perpendicular to the hyperplanes defined by the coordinate axes other than the one associated with the component being determined.

Euclidean plane:

In the Euclidean plane, the dot product allows us to identify vectors with covectors. If **e**_{1}, **e**_{2} is a basis, then the dual basis **e**^{1}, **e**^{2} satisfies

**e**^{1}·**e**_{1} = 1,  **e**^{1}·**e**_{2} = 0,  **e**^{2}·**e**_{1} = 0,  **e**^{2}·**e**_{2} = 1    (Eq’ns 24)

Thus, **e**^{1} and **e**_{2} lie perpendicular to each other, as do **e**^{2} and **e**_{1}, and we have normalized the lengths of **e**^{1} and **e**^{2} against **e**_{1} and **e**_{2}, respectively. The covariant components are obtained by equating the two expressions for the vector v:

v_{1} = **v**·**e**_{1},  v_{2} = **v**·**e**_{2}    (Eq’n 25)
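This bookkeeping can be verified numerically for a skew (non-orthogonal) basis in the plane. In the sketch below (NumPy assumed, arbitrary basis and vector), the dual basis comes from a matrix inverse, which is just the Kronecker-delta condition written in matrix form:

```python
import numpy as np

# Columns of E are the basis vectors e_1 and e_2.  The condition
# e^i . e_j = delta^i_j says exactly that the rows of inv(E) are the
# dual basis vectors e^1 and e^2.
E = np.array([[1.0, 1.0],
              [0.0, 1.0]])            # e_1 = (1,0), e_2 = (1,1): a skew basis
E_dual = np.linalg.inv(E)             # rows: e^1, e^2

v = np.array([2.0, 3.0])
v_contra = E_dual @ v                 # v^i = v . e^i (contravariant components)
v_co = E.T @ v                        # v_i = v . e_i (covariant components)

# Both sets of components reconstruct the same vector:
assert np.allclose(E @ v_contra, v)
assert np.allclose(E_dual.T @ v_co, v)
```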

Three-dimensional Euclidean space:

In a three-dimensional Euclidean space, we can also
determine explicitly the dual basis, the contravariant equivalent, of a given
set of basis vectors **e**_{1},
**e**_{2},
**e**_{3}
that we do not assume to be orthogonal or of unit norm. The contravariant (dual)
basis vectors are:

**e**^{1} = (**e**_{2}×**e**_{3}) / [**e**_{1}·(**e**_{2}×**e**_{3})]
**e**^{2} = (**e**_{3}×**e**_{1}) / [**e**_{1}·(**e**_{2}×**e**_{3})]
**e**^{3} = (**e**_{1}×**e**_{2}) / [**e**_{1}·(**e**_{2}×**e**_{3})]    (Eq’ns 26)

Even when the **e**_{i}
and **e**^{i}
are not orthonormal, they are still mutually dual, in the sense that they are
related to each other through the Kronecker delta:

**e**_{i}·**e**^{j} = δ_{i}^{j}    (Eq’n 27)

We can obtain the contravariant coordinates of any vector **v** through
the dot product of **v** with the contravariant basis vectors:

v^{i} = **v**·**e**^{i},  i = 1, 2, 3    (Eq’ns 28)

Likewise, we can obtain the covariant components of **v** from the dot
product of **v** with covariant basis vectors,

v_{i} = **v**·**e**_{i},  i = 1, 2, 3    (Eq’ns 29)

Then **v** can be expressed in two (reciprocal) ways, viz.

**v** = Σ_{i} v^{i}**e**_{i} = Σ_{i} v_{i}**e**^{i}    (Eq’ns 30)

Combining the above relations, we have

**v** = Σ_{i} (**v**·**e**^{i})**e**_{i} = Σ_{i} (**v**·**e**_{i})**e**^{i}    (Eq’ns 31)

and we can convert the vector’s components between the covariant and contravariant bases with

v_{i} = Σ_{k} g_{ik} v^{k},  v^{i} = Σ_{k} g^{ik} v_{k},  where g_{ik} = **e**_{i}·**e**_{k}    (Eq’ns 32)

We use subscripts as the indices of covariant coordinates, vectors, and tensors. If the contravariant basis vectors (written with superscripts) are orthonormal, then they are equivalent to the covariant basis vectors and we have no need to distinguish between the covariant and contravariant coordinates.
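The cross-product construction and the two expansions of **v** can be checked directly. The basis vectors and components below are arbitrary example values, and NumPy is assumed:

```python
import numpy as np

# Dual basis of a general (non-orthogonal) basis e_1, e_2, e_3 in three
# dimensions, built from cross products and the scalar triple product.
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([1.0, 2.0, 0.0])
e3 = np.array([0.0, 1.0, 3.0])
vol = e1 @ np.cross(e2, e3)           # e_1 . (e_2 x e_3)

E1 = np.cross(e2, e3) / vol           # dual vectors e^1, e^2, e^3
E2 = np.cross(e3, e1) / vol
E3 = np.cross(e1, e2) / vol

# Mutual duality: e^i . e_j equals the Kronecker delta
assert np.isclose(E1 @ e1, 1.0) and np.isclose(E1 @ e2, 0.0)

v = np.array([4.0, 5.0, 6.0])
v_contra = np.array([E1 @ v, E2 @ v, E3 @ v])   # contravariant components
v_co = np.array([e1 @ v, e2 @ v, e3 @ v])       # covariant components

# Both expansions reproduce the same vector v:
assert np.allclose(v_contra[0]*e1 + v_contra[1]*e2 + v_contra[2]*e3, v)
assert np.allclose(v_co[0]*E1 + v_co[1]*E2 + v_co[2]*E3, v)
```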

Informal usage:

Physicists use the adjective covariant informally as a synonym for invariant. For example, the Schrödinger equation does not keep its written form under the Lorentz Transformation of Special Relativity. Thus, a physicist might say that the Schrödinger equation is not covariant. In contrast, the Klein-Gordon equation and the Dirac equation do keep their written form under these coordinate transformations. Thus, a physicist might say that these equations are covariant.

Despite the dominant usage of the word "covariant", it would be more accurate to say that the Klein-Gordon and Dirac equations are "invariant" and that the Schrödinger equation is "not invariant". Additionally, to remove ambiguity, we should indicate the specific transformation with respect to which we evaluate the invariance. Continuing with the above example, neither the Klein-Gordon equation nor the Dirac equation is invariant under arbitrary coordinate transformations (e.g. those of General Relativity), so an unambiguous description of these equations would say that they are invariant with respect to the Lorentz Transformation of Special Relativity.

Use in tensor analysis:

In computations with tensors, which often have mixed indices, the distinction between covariance and contravariance is particularly important. That statement means that the tensors have both covariant and contravariant components, or both vector and dual vector components, and we must take care to acknowledge the difference between them. In Einstein notation, covariant components have lower indices, while contravariant components have upper indices.

On a manifold, a tensor field will have multiple indices of two kinds. By convention, we write covariant indices as lower indices, whereas we write contravariant indices as upper indices, both on the right side of the symbol that represents the tensor. When the manifold has a metric tensor associated with it, covariant and contravariant indices become very closely related to one another. We can convert contravariant indices into covariant indices through matrix multiplication by the metric tensor. We can convert covariant indices into contravariant indices by multiplying the tensor by the matrix inverse of the metric tensor. Of course, if a space is not endowed with a metric tensor, we cannot carry out that conversion. We can see, from a more abstract standpoint, that a tensor simply exists and its components of either kind are only calculational artifacts whose values depend on the coordinates that we have chosen to impose upon the manifold.
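Raising and lowering indices with the metric can be sketched numerically as well. Here (NumPy assumed, arbitrary basis) the metric is built from a basis E as g = E^{T}E, so that g_{ik} = **e**_{i}·**e**_{k}:

```python
import numpy as np

# Index lowering and raising with a metric.  For a basis given by the
# columns of E, the metric g_ij = e_i . e_j is g = E^T E; it converts
# contravariant components to covariant ones, and its inverse converts back.
E = np.array([[1.0, 1.0],
              [0.0, 2.0]])            # an arbitrary non-orthogonal basis
g = E.T @ E                           # metric tensor g_ij

v_contra = np.array([3.0, -1.0])      # contravariant components v^i
v_co = g @ v_contra                   # lowering:  v_i = g_ij v^j
v_back = np.linalg.inv(g) @ v_co      # raising:   v^i = g^ij v_j

assert np.allclose(v_back, v_contra)
```

Without a metric, no such conversion is available, which is exactly the point made above.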
