General Relativity/Contravariant and Covariant Indices

Rank and Dimension

Now that we have talked about tensors, we need to figure out how to classify them. One important characteristic is the rank of a tensor, which is the number of indices needed to specify a component of the tensor. An ordinary matrix is a rank 2 tensor, a vector is a rank 1 tensor, and a scalar is rank 0. Tensors can, in general, have rank greater than 2, and often do.

Another characteristic of a tensor is the dimension of the tensor, which is the number of values each index can take. For example, if we have a matrix consisting of 3 rows, with 4 elements in each row (columns), then the matrix is a tensor of dimension (3,4), with 3 × 4 = 12 components in all.
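As a small illustration of rank and dimension, here is a sketch that stores a tensor's components as nested Python lists. The helper names `shape` and `rank` are made up for this example, and the code assumes the nesting is rectangular and non-empty.

```python
def shape(t):
    """Return the count of each index of a nested-list array as a tuple.

    Assumes a rectangular, non-empty nesting (an illustrative simplification).
    """
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)


def rank(t):
    """The rank is the number of indices needed to pick out one component."""
    return len(shape(t))


scalar = 5.0                             # no index needed
vector = [1.0, 2.0, 3.0]                 # one index
matrix = [[0.0] * 4 for _ in range(3)]   # two indices: 3 rows, 4 columns

for name, t in (("scalar", scalar), ("vector", vector), ("matrix", matrix)):
    print(name, "rank:", rank(t), "dimension:", shape(t))
```

The 3-by-4 matrix reports rank 2 and dimension (3, 4), matching the example in the text.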

The important thing about rank and dimension is that they are invariant to changes in the coordinate system. You can change the coordinate system all you want, and the rank and the dimensions don't change. This brings up the important question of how tensors do change when you change the coordinate system. One thing we shall find when we look at the question is that in reality there are two different types of vectors.

Contravariant and Covariant Vectors

Imagine that you are flying a bomber at 1,000 kilometers per hour to the east, or along the positive x-axis. We shall call your velocity vector v. For now, we will keep the vectors one-dimensional. Suddenly you realize that you are in a meter-ish mood, and so you want to figure out how fast you are going using meters instead of kilometers. Quickly changing your coordinate system, you find that you are traveling 1,000 × 1,000 = 1,000,000 meters per hour easterly. We will call this vector v'. No problem.

Now you decide to climb, and you notice the temperature changing. You then draw a map of how the temperature changes as you fly, and travel along the path of steepest ascent, or fastest cooling. At your current position, the temperature falls by 10 Celsius degrees per kilometer toward the east, so the temperature gradient is -10 Celsius degrees per kilometer. Let's call this temperature gradient vector w. Again, you go into a meter-ish mood. Doing a quick calculation, you figure out that the gradient of the temperature is -10/1000 = -0.01 Celsius degrees per meter. We shall call this vector w'.

Did you notice something interesting?

Even though we are talking about two vectors, we are treating them very differently when we change our coordinates. In the first case, the vector reacted to the coordinate change by a multiplication. That is to say, v' = k•v, with k = 1000. In the second case, we did a division: w' = (1/k)•w. In the first case, we were changing a vector that was distance per something, while in the second case, the vector was something per distance. These are two very different types of vectors. The graphic below depicts the vectors representing v, v', w, and w'.

The first set of vectors, representing velocity, is contravariant; as the scale decreases from kilometers to meters, the length of the vector increases. The second set of vectors, representing temperature gradient, is covariant; as the scale decreases from kilometers to meters, the length of the vector decreases as well.

The mathematical term for the first type of vector is a contravariant vector. The second type of vector is called a covariant vector. Sometimes a covariant vector is called a one-form.
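The two behaviors can be put side by side in a few lines of code. This is only a sketch of the one-dimensional example above. The constant `K` and the two function names are labels invented for illustration.

```python
K = 1000  # meters per kilometer: the factor by which the unit of length shrinks


def contravariant_to_meters(component_per_km_unit):
    """A 'distance per something' component is multiplied by K."""
    return component_per_km_unit * K


def covariant_to_meters(component_per_km_unit):
    """A 'something per distance' component is divided by K."""
    return component_per_km_unit / K


v = 1000.0   # velocity, kilometers per hour
w = -10.0    # temperature gradient, Celsius degrees per kilometer

v_prime = contravariant_to_meters(v)   # meters per hour
w_prime = covariant_to_meters(w)       # Celsius degrees per meter

print("v' =", v_prime)
print("w' =", w_prime)
```

The same change of units makes the velocity component a thousand times larger and the gradient component a thousand times smaller.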

Attempting a fuller explanation

It is easy to see why w is called covariant. Covariant simply means that the characteristic that w measures, change in temperature, increases in magnitude with an increase in displacement along the coordinate system. In other words, the further you travel from a fixed point, the more the temperature changes, or equivalently, change in temperature covaries with change in displacement.

Although it is a bit more difficult to see, v is called contravariant for precisely the opposite reason. Since v represents a velocity, or distance per unit time, we can think of v as the inverse of time per unit distance, meaning the amount of time that passes in traveling a certain fixed amount of distance. Time per unit distance is clearly covariant, because as you travel further and further from a fixed point, more and more time elapses. In other words, time covaries with displacement. Since velocity is the inverse of time per unit distance, it follows that velocity must be contravariant.

The difference is also evident in the units of measure. The units of measure for v are meters per hour, whereas the units for w are degrees Celsius per meter. The coordinate system is position in space, measured in units of meters. So again, we see that the coordinate system appears in the numerator of v, which suggests that v is contravariant (with inverse time in this case), whereas the coordinate system appears in the denominator of w, which indicates that w is covariant (with change in temperature).

Contravariant vectors describe those quantities where the distance unit appears in the numerator (like velocity), whereas covariant vectors describe those where the distance unit appears in the denominator (like temperature gradient).

These are, of course, just fancy mathematical names. As we can see, contravariant vectors and covariant vectors are very different from each other, and we want to avoid confusing them. To do this, mathematicians have come up with a clever notation. The components of a contravariant vector are represented by superscripts, while the components of a covariant vector are represented by subscripts. So the components of vector v are v¹ and v², while the components of vector w are w₁ and w₂.

Scale Invariance

Now that we have contravariant vectors and covariant vectors, we can do something very interesting and combine them. We have a contravariant vector that describes the direction and speed at which we are going. We have a covariant vector that describes the rate and direction at which the temperature changes. If we combine them using the dot product,

$f=\mathbf {v} \cdot \mathbf {w}$

dT/dt = 1000 · (-10) = -10,000 degrees Celsius per hour

we get the rate at which the temperature changes, f, as we move in a certain direction, with units of degrees Celsius per hour. The interesting thing about the units of f is that they do not include any units of distance, such as meters or kilometers. So now suppose we change the coordinate system from kilometers to meters. How does f change?

$f=\mathbf {v'} \cdot \mathbf {w'}$

dT/dt = 1,000,000 · (-0.01) = -10,000 degrees Celsius per hour

It doesn't. We call this characteristic scale invariance, and we say that f is a scale invariant quantity. The value of f is invariant with changes in the scale of the coordinate system.
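A quick numerical check of this cancellation, as a sketch: the contravariant components are multiplied by a scale factor k while the covariant components are divided by it, so the product is unchanged. The helper names `dot` and `rescale` and the particular values of k are assumptions chosen for illustration.

```python
import math


def dot(v_components, w_components):
    """Sum of products of matching components."""
    return sum(a * b for a, b in zip(v_components, w_components))


def rescale(v_components, w_components, k):
    """Shrink the unit of length by a factor k.

    Contravariant components are multiplied by k,
    covariant components are divided by k.
    """
    v_new = [k * c for c in v_components]
    w_new = [c / k for c in w_components]
    return v_new, w_new


v_km = [1000.0]   # kilometers per hour
w_km = [-10.0]    # Celsius degrees per kilometer
f_km = dot(v_km, w_km)

# k = 1000 is kilometers to meters; the other values are arbitrary.
for k in (1000, 100000, 0.001):
    v_new, w_new = rescale(v_km, w_km, k)
    f_new = dot(v_new, w_new)
    print("k =", k, " f =", f_new, " unchanged:", math.isclose(f_new, f_km))
```

The comparison uses `math.isclose` rather than exact equality because dividing and multiplying by k in floating point can introduce tiny rounding differences.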

So far we have been treating w as if it were just an odd type of vector. But there is another, more powerful way of thinking about w. Look at what we just did. We took v, combined it with w, and got something that doesn't change when you change the coordinate system. One way of thinking about this is to say that w is a function that takes v and converts it into a scale invariant value, f. In plain terms, w is the function that takes in any velocity of a particle and produces the change in temperature that the particle experiences each hour (for the specific temperature field declared earlier).

Vector Spaces and Basis Vectors

The fact that a covariant vector like w can convert any contravariant vector like v into a scale invariant value like f is summarized by saying that w is a linear functional.
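To make the idea of w as a function concrete, here is a minimal sketch in which a covariant vector is built as a Python function from its components. The name `make_functional` is invented for this example, and the extra velocity `v2` and the coefficients `a` and `b` are arbitrary test values, not part of the original story.

```python
def make_functional(components):
    """Build a function that maps a vector (list of components) to a number."""
    def functional(vector):
        return sum(w_i * v_i for w_i, v_i in zip(components, vector))
    return functional


# The temperature gradient, in Celsius degrees per kilometer.
w = make_functional([-10.0])

v1 = [1000.0]   # the bomber's velocity, kilometers per hour
v2 = [250.0]    # some other velocity, chosen arbitrarily

# Linearity: w(a*v1 + b*v2) equals a*w(v1) + b*w(v2).
a, b = 2.0, -3.0
combined = [a * x + b * y for x, y in zip(v1, v2)]
lhs = w(combined)
rhs = a * w(v1) + b * w(v2)

print("w(v1) =", w(v1))
print("w(a*v1 + b*v2) =", lhs)
print("a*w(v1) + b*w(v2) =", rhs)
```

Feeding w a velocity returns the hourly temperature change along that velocity, and the last two printed values agree, which is what "linear" means here.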

Let us be more precise about the word like. Mathematical operations, such as converting one sort of vector into another sort of vector, are done on vector spaces. See vector space for a careful definition of vector spaces. Here, loosely speaking, let us say that a vector space is a set of vectors which can be added together and multiplied by numbers and that the result is always another vector in the same vector space.

Let us define $V$ to be the vector space of contravariant vectors like v.

Then the set of all covariant vectors like w, which convert vectors like v from $V$ into scalars like f, is the set of all linear functionals on $V$ . We give this set the name $V^{*}$ and call it the dual space.

$V^{*}$ is also a vector space. Remember, we can view w as a vector or as a function, depending on which of its properties we wish to emphasize.

Now we can be more careful about the word like by saying which spaces w and v must be a member of: any vector w in $V^{*}$ (called a covariant vector, or a 1-form) can convert any vector v in $V$ (called a contravariant vector) into a scale invariant value like f. (We have not said what space or set f is a member of: in practice, we will usually only be interested in f as a member of the set of real numbers.)

Any vector space has a set of basis vectors. That is to say, if $\mathbf {v} \in V$ , then $\mathbf {v}$ may be written as $\sum _{\alpha }v^{\alpha }\mathbf {e} _{\alpha }$ where,

${\alpha }$ is an index ranging from 1 to the dimension of $V$ .
The set { $\mathbf {e} _{\alpha }$ } is the set of basis vectors of the vector space $V$ .
$v^{\alpha }$ is a scalar, the component of $\mathbf {v}$ along $\mathbf {e} _{\alpha }$ .

Note that although components of contravariant vectors are written with superscript ("upper") indices, the basis vectors are written with subscript ("lower") indices. If the set { $\mathbf {e} _{\alpha }$ } is a basis for $V$ , then $\mathbf {v} \in V$ is written as the linear combination $\mathbf {v} =v^{\mu }\mathbf {e} _{\mu }$ . (We are using Einstein summation notation, detailed in the next section; this is shorthand for $\sum _{\mu }v^{\mu }\mathbf {e} _{\mu }$ .)
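The expansion of a vector in a basis can be spelled out as an explicit loop over the repeated index. In this sketch the basis vectors are themselves written as pairs of numbers in some fixed reference frame. The particular two-dimensional basis and the components are arbitrary choices for illustration, and `linear_combination` is a made-up helper name.

```python
def linear_combination(components, basis):
    """Compute the sum over mu of v^mu * e_mu.

    `components` holds the numbers v^mu; `basis` holds the vectors e_mu,
    each written as a tuple of numbers in a fixed reference frame.
    """
    dim = len(basis[0])
    result = [0.0] * dim
    for v_mu, e_mu in zip(components, basis):   # the sum over mu
        for i in range(dim):
            result[i] += v_mu * e_mu[i]
    return result


# A deliberately non-orthogonal basis (an illustrative assumption).
basis = [(1.0, 0.0), (1.0, 1.0)]   # e_1, e_2
components = [2.0, 3.0]            # v^1, v^2

v = linear_combination(components, basis)
print("v =", v)   # 2*e_1 + 3*e_2
```

The components 2 and 3 only mean something together with the basis they refer to. With a different basis, the same vector would have different components.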

Before moving on to covariant vectors, we must define the notion of a dual basis. Remember that elements of $V^{*}$ are linear functionals on $V$ . So we can "apply" covariant vectors to contravariant vectors to get a scalar. For example, if $\mathbf {\sigma } \in V^{*}$ and $\mathbf {v} \in V$ , then $\mathbf {\sigma } (\mathbf {v} )$ returns a scalar. Now, the dual basis is defined as follows: if { $\mathbf {e} _{\alpha }$ } is a basis for $V$ , then the dual basis is a basis { $\mathbf {\omega } ^{\alpha }$ } for $V^{*}$ which satisfies $\mathbf {\omega } ^{\mu }(\mathbf {e} _{\nu })=\delta _{\nu }^{\mu }$ (where $\delta _{\nu }^{\mu }$ is the Kronecker delta) for every $\mu$ and $\nu$ .
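For a concrete feel for the dual basis, here is a sketch in two dimensions. It assumes each basis vector is given as a pair of numbers in a fixed reference frame and that a functional acts by summing products of components in that frame. Under those assumptions, the dual basis is read off from the rows of the inverse of the matrix whose columns are the basis vectors. The function names and the example basis are invented for illustration.

```python
def dual_basis_2d(e1, e2):
    """Return (omega1, omega2) with omega^mu(e_nu) = Kronecker delta.

    Uses the closed-form inverse of the 2x2 matrix with columns e1, e2;
    the rows of that inverse are the dual basis functionals.
    """
    det = e1[0] * e2[1] - e2[0] * e1[1]
    if det == 0:
        raise ValueError("e1 and e2 are not linearly independent")
    omega1 = (e2[1] / det, -e2[0] / det)
    omega2 = (-e1[1] / det, e1[0] / det)
    return omega1, omega2


def apply(functional, vector):
    """Apply a functional to a vector, both given in the reference frame."""
    return sum(f_i * v_i for f_i, v_i in zip(functional, vector))


e1, e2 = (1.0, 0.0), (1.0, 1.0)      # an arbitrary non-orthogonal basis
omega1, omega2 = dual_basis_2d(e1, e2)

# Print the table of omega^mu(e_nu); it should be the identity pattern.
for name, omega in (("omega^1", omega1), ("omega^2", omega2)):
    print(name, [apply(omega, e) for e in (e1, e2)])
```

Note that the dual basis functionals are not the same pairs of numbers as the basis vectors unless the basis is orthonormal.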

Now, the components of covariant vectors are written with subscript ("lower") indices. As { $\mathbf {\omega } ^{\alpha }$ } is a basis for $V^{*}$ , we can write a covariant vector $\mathbf {\sigma }$ as $\mathbf {\sigma } =\sigma _{\mu }\mathbf {\omega } ^{\mu }$ .

We can now evaluate any functional (covariant vector) applied to any vector (contravariant vector). If $\mathbf {\sigma } \in V^{*}$ and $\mathbf {v} \in V$ , then by linearity $\mathbf {\sigma } (\mathbf {v} )=\sigma _{\alpha }\mathbf {\omega } ^{\alpha }(v^{\beta }\mathbf {e} _{\beta })=\sigma _{\alpha }v^{\beta }\mathbf {\omega } ^{\alpha }(\mathbf {e} _{\beta })=\sigma _{\alpha }v^{\beta }\delta _{\beta }^{\alpha }=\sigma _{\alpha }v^{\alpha }$ . Finally, if we define $\mathbf {e} _{\alpha }(\mathbf {\omega } ^{\beta })=\delta _{\alpha }^{\beta }$ , we see that $\mathbf {v} (\mathbf {\sigma } )=\mathbf {\sigma } (\mathbf {v} )$ .
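The result $\mathbf {\sigma } (\mathbf {v} )=\sigma _{\alpha }v^{\alpha }$ can be checked numerically. The sketch below hard-codes an illustrative two-dimensional basis and its dual basis (both written in a fixed reference frame, an assumption of this example), builds the vector and the functional from their components, and compares the direct evaluation with the simple sum over components. All numeric values are arbitrary.

```python
def contract(sigma_components, v_components):
    """The sum over alpha of sigma_alpha * v^alpha."""
    return sum(s * v for s, v in zip(sigma_components, v_components))


def combine(components, basis):
    """Linear combination of basis elements with the given components."""
    dim = len(basis[0])
    return [sum(c * b[i] for c, b in zip(components, basis)) for i in range(dim)]


# Illustrative basis e_1, e_2 and its dual basis omega^1, omega^2,
# chosen so that omega^mu(e_nu) is the Kronecker delta.
basis = [(1.0, 0.0), (1.0, 1.0)]
dual = [(1.0, -1.0), (0.0, 1.0)]

v_components = [2.0, 3.0]        # v^1, v^2
sigma_components = [4.0, -1.0]   # sigma_1, sigma_2

# Build v and sigma in the reference frame, then apply sigma to v directly.
v_ref = combine(v_components, basis)
sigma_ref = combine(sigma_components, dual)
direct = sum(s * x for s, x in zip(sigma_ref, v_ref))

# The component formula needs no knowledge of the reference frame at all.
by_components = contract(sigma_components, v_components)

print("direct evaluation:", direct)
print("component formula:", by_components)
```

Both routes give the same number, which is the practical payoff of pairing a basis with its dual basis: applying a functional to a vector reduces to multiplying matching components and adding.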