Covariant and Contravariant
Yes, I know: it sounds like the title of an M.C. Escher etching. But in fact those words denote the two ways in which multiple-component mathematical entities, such as vectors, transform when we translate them from one coordinate frame of reference into another.
The terms covariant and contravariant were introduced by James Joseph Sylvester (1814 Sep 03 – 1897 Mar 15) in 1853 in order to study algebraic invariant theory. In this context, for instance, a system of simultaneous equations is contravariant in the variables. The use of both terms in the modern context of multilinear algebra is a specific example of corresponding notions in category theory, essentially the theory of analogies within mathematics.
In multilinear algebra and tensor analysis (which includes the vector analysis of modern physics), covariance and contravariance describe how the quantitative description of certain geometric or physical entities changes with a change of basis from one coordinate system to another. When one coordinate system is just a rotation or translation of the other, this distinction between covariant and contravariant remains invisible. However, when we consider more general coordinate systems such as skew coordinates, curvilinear coordinates (as we find in General Relativity), and coordinate systems on differentiable manifolds, the distinction becomes critically important.
For a vector (such as a direction vector or velocity vector) to represent something that does not depend inherently on the coordinate system, the components of the vector must vary with a change of basis in a way that compensates for that change. That is, the components must vary by the transformation opposite to the change of basis, an inverse transformation; otherwise, the vector itself would change. Vectors (as opposed to dual vectors) are said to be contravariant. Examples of contravariant vectors include the position of an object relative to an observer, or any derivative of position with respect to time, such as velocity or acceleration. In Einstein notation, contravariant components have upper indices, as in

    v = v^i ê_i ,     (1)
in which the hatted e represents the unit vector of the i-th coordinate axis.
For a dual vector (such as a gradient) to be coordinate-system invariant, the components of the vector must vary in the same way as the change of basis does in order to keep the vector unchanged. That is, the components must vary by the same transformation that we apply to the basis. Dual vectors (as opposed to vectors) are said to be covariant. Examples of covariant vectors generally appear when taking the gradient of a function (effectively dividing by a vector), such as the electrostatic field derived from a potential. In Einstein notation, covariant components have lower indices, as in

    α = α_i ê^i ,     (2)

in which the ê^i are the dual (reciprocal) basis vectors.
In physics, vectors often have units of distance, or of distance times some other unit (as velocity does), whereas covectors have units of inverse distance, or of inverse distance times some other unit. The distinction between covariant and contravariant vectors becomes particularly important in computations with tensors, which often have mixed variance: they have both covariant and contravariant components, or both vector and covector factors. The duality between covariance and contravariance intervenes whenever someone represents a vector or tensor quantity by its components, although modern differential geometry uses more sophisticated index-free methods to represent tensors.
In physics, a vector typically comes from a measurement or series of measurements. Physicists represent the vector as a list (or tuple) of numbers, such as (v1,v2,v3) for velocity. The numbers on the list depend on the choice of the coordinate system against which they’re measured. For example, if the vector represents some velocity with respect to an observer, then the observer uses a coordinate system corresponding to a system of rigid rods, or reference axes, measuring the components v1, v2, and v3 as projections of the velocity vector onto the axes. For a vector to represent a proper geometric object, it must be possible to describe how that vector looks in any other coordinate system. That means that the components of the vector will transform in a certain way when we translate the vector from one coordinate system to another. The transformation reflects the fact that the axes of one coordinate frame are vectors in the other frame.
A contravariant vector has components that change in the same way as the coordinates do (opposite to the way in which the reference axes change) under a change of the coordinate frame, such as a rotation or dilation. The vector itself does not change under these operations; rather, the components of the vector change in a way that cancels the change in the spatial axes. So if the reference axes were rotated in one direction, the component representation of the vector would rotate in the opposite direction through the same angle: an observer riding the coordinate frame would see the vector appear to rotate the other way. Similarly, if the reference axes were stretched in one direction, the components of the vector, like the coordinates, would diminish in exactly the way that compensates for the stretch.
In mathematical terms, we say that if the coordinate system undergoes a transformation that we describe by an invertible matrix M, so that we transform a coordinate vector x to x' = Mx, then we must also transform a contravariant vector v to v' = Mv. This requirement distinguishes a contravariant vector from any other triple of physically meaningful quantities. Examples of contravariant vectors used in mathematical physics include displacement, velocity, momentum, force, and acceleration.
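A minimal NumPy sketch of the rotation case described above (the angle and vector are arbitrary illustrative choices): when the axes rotate by +45°, the component representation of a fixed vector rotates by −45°.

```python
import numpy as np

theta = np.pi / 4                       # rotate the reference axes by 45°
c, s = np.cos(theta), np.sin(theta)

# New axes expressed in old coordinates: the columns of R.
R = np.array([[c, -s],
              [s,  c]])

# Passive transformation: coordinates in the rotated frame are x' = M x
# with M = R^{-1} = R^T, so the components "rotate the opposite way".
M = R.T

v = np.array([1.0, 0.0])                # a fixed vector along the old x-axis
v_new = M @ v                           # its components in the rotated frame

# In the rotated frame the same vector appears rotated by -45°.
assert np.allclose(v_new, [c, -s])
```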
A covariant vector has components that change in the same way as the reference axes do, opposite to the way in which the coordinates change. For example, the components of the gradient vector of a function f,

    ∇f = (∂f/∂x^1, ∂f/∂x^2, ..., ∂f/∂x^n) ,     (3)
such as the electric field due to an electrostatic potential, transform as the reference axes themselves do. When we consider only rotations of the spatial reference frame, the components of contravariant and covariant vectors behave in the same way. It is only when we bring in other transformations that the difference becomes apparent.
The general formulation of covariance and contravariance refers to how the components of a coordinate vector transform under a change of basis, a passive transformation; that is, we want to know how the components of the vector change when we change the coordinate frame. Thus we let V represent a vector space of dimension n over the field of scalars S, and let each of the basis sets f = (X_1,...,X_n) and f' = (Y_1,...,Y_n) define a coordinate frame on V so that any vector consists of a linear combination of the basis elements. Also, let the change of basis from f to f' be given by

    f → f' = f·A     (4)
for some invertible n×n matrix A with entries a^i_k. Here, each vector Y_k of the f' basis is a linear combination of the vectors X_i of the f basis, so that

    Y_k = Σ_i a^i_k X_i .     (5)
We express a vector v in the vector space V uniquely as a linear combination of the elements X_i of the f basis as

    v = Σ_i v^i[f] X_i ,     (6)
where the v^i[f] are scalars in S, which scalars are the components of v in the f basis. Write the column vector of components of v as

    v[f] = [v^1[f], v^2[f], ..., v^n[f]]^T ,     (7)
so that we can rewrite Equation 6 as a matrix product v = f·v[f]. We can also express the vector v in terms of the f' basis, so that v = f'·v[f']. However, because the vector v itself is invariant under the choice of basis, we have f·v[f] = v = f'·v[f']. The invariance of v combined with the relationship of Equation 4 between f and f' implies that f·v[f] = fA·v[fA], which gives us the transformation rule v[fA] = A^{-1}·v[f]. In terms of components,

    v^i[fA] = Σ_k ã^i_k v^k[f] ,     (8)
where the coefficients ã^i_k are the entries of the inverse matrix A^{-1}.
Because the components of the vector v transform through the inverse of the matrix A, these components are said to transform contravariantly under a change of basis.
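A quick numerical check of this rule, sketched in Python with NumPy (the basis vectors and the matrix A are arbitrary illustrative choices):

```python
import numpy as np

# Columns of F are the basis vectors X1, X2 of the frame f (illustrative).
F = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# Change of basis f' = f·A for an invertible matrix A (also illustrative).
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
F_new = F @ A                        # columns are the new basis vectors Y1, Y2

v = np.array([3.0, 5.0])             # a fixed geometric vector
comp_f  = np.linalg.solve(F, v)      # components v[f]  in the f  basis
comp_fA = np.linalg.solve(F_new, v)  # components v[fA] in the f' basis

# Contravariant rule: v[fA] = A^{-1}·v[f].
assert np.allclose(comp_fA, np.linalg.inv(A) @ comp_f)

# Both component lists describe the same vector: f·v[f] = f'·v[fA].
assert np.allclose(F @ comp_f, F_new @ comp_fA)
```

The vector never changes; only its component list does, and it does so through the inverse of the matrix that changed the basis.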
A linear functional α on V is expressed uniquely in terms of its components (which are scalars in S) in the f basis as

    α(v) = Σ_i α_i[f] v^i[f] .     (9)
These components are the action of α on the basis vectors X_i of the f basis: α_i[f] = α(X_i).
Under the change of basis from f to f', the components transform so that

    α_i[fA] = Σ_k a^k_i α_k[f] .     (10)
Denote the row vector of components of α by α[f]:

    α[f] = [α_1[f], α_2[f], ..., α_n[f]] ,     (11)
so that we can rewrite Equation 10 as the matrix product α[fA] = α[f]·A. Because the components of the linear functional α transform with the matrix A, mathematicians say that these components transform covariantly under a change of basis.
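The contrast between the two rules can also be checked numerically. In the following Python sketch (all numbers are illustrative), the covariant components pick up a factor of A while the contravariant components pick up A^{-1}, so the scalar α(v) stays put:

```python
import numpy as np

A = np.array([[2.0, 1.0],        # change of basis f' = f·A (illustrative)
              [1.0, 1.0]])

alpha  = np.array([4.0, -1.0])   # row vector of components α[f]
v_comp = np.array([3.0, 5.0])    # contravariant components v[f]

alpha_new = alpha @ A                # covariant rule:     α[fA] = α[f]·A
v_new = np.linalg.inv(A) @ v_comp    # contravariant rule: v[fA] = A^{-1}·v[f]

# The pairing α(v) = Σ_i α_i[f] v^i[f] is independent of the basis.
assert np.isclose(alpha @ v_comp, alpha_new @ v_new)
```

The A and A^{-1} factors cancel in the pairing, which is exactly why a covector applied to a vector yields a basis-independent scalar.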
Had we used a column-vector representation instead, the transformation law would be the transpose

    α[fA]^T = A^T·α[f]^T .     (12)
The choice of basis f on the vector space V defines uniquely a set of coordinate functions x^i on V. Those coordinates on V are contravariant in the sense that when we apply the transformation matrix A to the basis we get

    x^i[fA] = Σ_k ã^i_k x^k[f] .     (13)
A system of n quantities v^i that transform like the coordinates x^i on V defines a contravariant vector. A system of n quantities that transform oppositely to the coordinates then constitutes a covariant vector.
This formulation of contravariance and covariance is often more natural in applications in which there is a coordinate space (a manifold) on which vectors reside as tangent vectors or cotangent vectors. Given a local coordinate system x^i on the manifold, the reference axes for the coordinate system are the vector fields

    X_1 = ∂/∂x^1, ..., X_n = ∂/∂x^n .
This gives rise to the frame f = (X_1,...,X_n) at every point of the coordinate patch.
If y^i is a different coordinate system and

    Y_1 = ∂/∂y^1, ..., Y_n = ∂/∂y^n ,
then the frame f' is related to the frame f by the inverse of the Jacobian matrix of the coordinate transition:

    f' = f·J^{-1} ,  where J = (∂y^i/∂x^j) .
Or, in indices,

    ∂/∂y^i = Σ_j (∂x^j/∂y^i) ∂/∂x^j .
A tangent vector is by definition a vector that is a linear combination of the coordinate partials ∂/∂x^i. Thus a tangent vector is defined by

    v = Σ_i v^i ∂/∂x^i .
Such a vector is contravariant with respect to a change of frame. Under changes in the coordinate system, one has

    v[f·J^{-1}] = J·v[f] .
Therefore the components of a tangent vector transform via

    v^i[y] = Σ_j (∂y^i/∂x^j) v^j[x] .
A system of n quantities v^i that depend on the coordinates and transform in this way when we pass from one coordinate system to another constitutes a contravariant vector.
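This transformation law can be illustrated with the polar-to-Cartesian transition. The following Python sketch (values chosen arbitrarily) pushes the components of a tangent vector through the Jacobian and checks the result against the chain rule:

```python
import numpy as np

# A point and a tangent vector in polar coordinates (r, θ); values illustrative.
r, theta = 2.0, np.pi / 6
dr, dtheta = 0.5, 0.3            # components v^j in the polar chart

# Jacobian ∂(x, y)/∂(r, θ) of the transition x = r·cosθ, y = r·sinθ.
J = np.array([[np.cos(theta), -r * np.sin(theta)],
              [np.sin(theta),  r * np.cos(theta)]])

# Contravariant rule: v^i[y] = Σ_j (∂y^i/∂x^j)·v^j[x].
v_cart = J @ np.array([dr, dtheta])

# Check against differentiating x(t), y(t) directly by the chain rule.
dx = dr * np.cos(theta) - r * np.sin(theta) * dtheta
dy = dr * np.sin(theta) + r * np.cos(theta) * dtheta
assert np.allclose(v_cart, [dx, dy])
```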
Covariant and contravariant components of a vector:
In a Euclidean space V, there is little distinction between covariant and contravariant vectors, because the dot product allows covectors to be identified with vectors. That is, a vector v determines uniquely a covector α via the dot product

    α(w) = v · w
for all vectors w. In other words, the value of the covector on w is just the dot product of w with v, a scaled projection of w onto v. Conversely, each covector α determines a unique vector v by this equation. Because of this identification of vectors with covectors, one may speak of the covariant components or the contravariant components of a vector; the two component lists are just representations of the same vector with respect to reciprocal bases.
If we have a basis f = (X_1,...,X_n) of the vector space V, then we have a unique reciprocal basis f# = (Y^1,...,Y^n) of V determined by requiring

    Y^i · X_j = δ^i_j ,
the Kronecker delta. In terms of these bases, we can represent any vector v in two ways:

    v = Σ_i v^i[f] X_i = Σ_i v_i[f] Y^i .
The components v^i[f] are the contravariant components of the vector v in the basis f, and the components v_i[f] are the covariant components of v in the basis f.
The contravariant components of a vector are obtained by projecting the vector onto the coordinate axes. The covariant components are obtained by projecting the vector onto the normal lines to the coordinate hyperplanes; that is, onto the lines that stand perpendicular to the hyperplanes defined by the coordinate axes other than the one associated with the component being determined.
In the Euclidean plane, the dot product allows us to identify covectors with vectors. If e_1, e_2 is a basis, then the dual basis e^1, e^2 satisfies

    e^1·e_1 = 1 ,  e^1·e_2 = 0 ,  e^2·e_1 = 0 ,  e^2·e_2 = 1 .
Thus, e^1 lies perpendicular to e_2, as e^2 does to e_1, and we have normalized the lengths of e^1 and e^2 against e_1 and e_2, respectively. The covariant components are obtained by equating the two expressions for the vector v:

    v = v^1 e_1 + v^2 e_2 = v_1 e^1 + v_2 e^2 ,

whence v_1 = v·e_1 and v_2 = v·e_2.
Three-dimensional Euclidean space:
In a three-dimensional Euclidean space, we can also determine explicitly the dual basis, the contravariant equivalent, of a given set of basis vectors e_1, e_2, e_3 that we do not assume to be orthogonal or of unit norm. The contravariant (dual) basis vectors are

    e^1 = (e_2 × e_3) / (e_1 · (e_2 × e_3)) ,
    e^2 = (e_3 × e_1) / (e_1 · (e_2 × e_3)) ,
    e^3 = (e_1 × e_2) / (e_1 · (e_2 × e_3)) .
Even when the e_i and e^i are not orthonormal, they are still mutually dual, in the sense that they are related to each other through the Kronecker delta:

    e^i · e_j = δ^i_j .
We can obtain the contravariant coordinates of any vector v through the dot product of v with the contravariant basis vectors:

    v^1 = v·e^1 ,  v^2 = v·e^2 ,  v^3 = v·e^3 .
Likewise, we can obtain the covariant components of v from the dot product of v with the covariant basis vectors:

    v_1 = v·e_1 ,  v_2 = v·e_2 ,  v_3 = v·e_3 .
Then v can be expressed in two (reciprocal) ways, viz.

    v = v_i e^i = v^i e_i ,

with summation over the repeated indices implied.
Combining the above relations, we have

    v = (v·e_i) e^i = (v·e^i) e_i ,
and we can convert the vector's components between the covariant and contravariant bases with

    v_i = g_ij v^j  and  v^i = g^ij v_j ,

where g_ij = e_i·e_j is the metric tensor and g^ij its matrix inverse.
We use subscripts as the indices of covariant coordinates, vectors, and tensors. If the contravariant basis vectors (written with superscripts) are orthonormal, then they are equivalent to the covariant basis vectors and we have no need to distinguish between the covariant and contravariant coordinates.
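The dual-basis construction and the component conversions above can be checked numerically. A minimal sketch in Python with NumPy, for an arbitrary non-orthogonal basis (the numbers are illustrative):

```python
import numpy as np

# A non-orthogonal covariant basis e1, e2, e3 (values illustrative).
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([1.0, 1.0, 0.0])
e3 = np.array([0.0, 1.0, 2.0])

# Dual (contravariant) basis via the cross-product construction.
vol = np.dot(e1, np.cross(e2, e3))      # e1 · (e2 × e3)
E1 = np.cross(e2, e3) / vol
E2 = np.cross(e3, e1) / vol
E3 = np.cross(e1, e2) / vol

# Mutual duality: e^i · e_j = δ^i_j.
for i, Ei in enumerate((E1, E2, E3)):
    for j, ej in enumerate((e1, e2, e3)):
        assert np.isclose(Ei @ ej, 1.0 if i == j else 0.0)

v = np.array([2.0, -1.0, 3.0])          # an arbitrary vector

# Contravariant and covariant components by dot products.
v_contra = np.array([v @ E1, v @ E2, v @ E3])   # v^i = v · e^i
v_co     = np.array([v @ e1, v @ e2, v @ e3])   # v_i = v · e_i

# Both component lists reconstruct the same vector.
assert np.allclose(v_contra[0]*e1 + v_contra[1]*e2 + v_contra[2]*e3, v)
assert np.allclose(v_co[0]*E1 + v_co[1]*E2 + v_co[2]*E3, v)

# The metric g_ij = e_i · e_j converts between the two component sets.
E = np.vstack((e1, e2, e3))
g = E @ E.T
assert np.allclose(v_co, g @ v_contra)                  # lowering an index
assert np.allclose(v_contra, np.linalg.inv(g) @ v_co)   # raising an index
```

If the e_i were orthonormal, g would be the identity and the two component lists would coincide, which is why the distinction stays invisible in ordinary Cartesian coordinates.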
Physicists use the adjective covariant informally as a synonym for invariant. For example, the Schrödinger equation does not keep its written form under the Lorentz Transformation of Special Relativity. Thus, a physicist might say that the Schrödinger equation is not covariant. In contrast, the Klein-Gordon equation and the Dirac equation do keep their written form under these coordinate transformations. Thus, a physicist might say that these equations are covariant.
Despite the dominant usage of the word "covariant", it would be more accurate to say that the Klein-Gordon and Dirac equations are "invariant" and that the Schrödinger equation is "not invariant". Additionally, to remove ambiguity, we should indicate the specific transformation with respect to which we evaluate the invariance. Continuing with the above example, neither the Klein-Gordon equation nor the Dirac equation is universally invariant under arbitrary coordinate transformations (e.g. those of General Relativity), so an unambiguous description would say that these equations are invariant with respect to the Lorentz Transformation of Special Relativity.
Use in tensor analysis:
In computations with tensors, which often have mixed variance, the distinction between covariance and contravariance is particularly important: such tensors have both covariant and contravariant components, or both vector and dual vector factors, and we must take care to acknowledge the difference between them. In Einstein notation, covariant components have lower indices, while contravariant components have upper indices.
On a manifold, a tensor field will have multiple indices of two kinds. By convention, we write covariant indices as lower indices, whereas we write contravariant indices as upper indices, both on the right side of the symbol that represents the tensor. When the manifold has a metric tensor associated with it, covariant and contravariant indices become very closely related to one another. We can convert contravariant indices into covariant indices through matrix multiplication by the metric tensor. We can convert covariant indices into contravariant indices by multiplying the tensor by the matrix inverse of the metric tensor. Of course, if a space is not endowed with a metric tensor, we cannot carry out that conversion. We can see, from a more abstract standpoint, that a tensor simply exists and its components of either kind are only calculational artifacts whose values depend on the coordinates that we have chosen to impose upon the manifold.