The covariant derivative is a differential operator which plays an important role in differential geometry. It gives the rate of change, or total derivative, of a scalar field, vector field or general tensor field along some path through curved space. Consider specifying a vector field in terms of some coordinate basis vectors. In a generally curved space these basis vectors change from point to point, so to find the derivative of a vector field we must know not only how the components of the vector change, but also how the basis vectors change. Anyone who has seen expressions for ∇× (curl) or ∇² (Laplacian) in spherical, polar or other curvilinear coordinates has encountered the difficulties that arise when not dealing with a fixed coordinate basis.


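The covariant derivative is a linear operator $\nabla : T^m_n \to T^m_{n+1}$, where $T^m_n$ is a type $(m,n)$ tensor field and $T^m_{n+1}$ is a type $(m,n+1)$ tensor field, that, for any two tensor fields $A$ and $B$, satisfies the following properties:

  • Linearity: $\nabla_a(A + B) = \nabla_a A + \nabla_a B$
  • Leibniz rule: $\nabla_a(AB) = (\nabla_a A)B + A(\nabla_a B)$
  • Commutativity with contraction: contracting a pair of indices and then differentiating gives the same result as differentiating and then contracting
  • Action on a scalar field $f$: $\nabla_a f = (\partial_\alpha f)\, e^\alpha_a$, the gradient of $f$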

(Throughout this document the notation $\partial_\alpha \equiv \partial/\partial x^\alpha$ is used.)

Note: In the above definition I chose to suppress the indices of $A$ and $B$ in favour of readability. However, $A$ and $B$ are general tensors, e.g. $A^{bcde\ldots}{}_{pqrst\ldots}$. The covariant derivative of this tensor is then $\nabla_a A^{bcde\ldots}{}_{pqrst\ldots}$, wherein the indices begin to obscure the intended meaning. Throughout the rest of this document I will also take tensor to mean tensor field for brevity, since a tensor field associates a tensor with every point on a manifold, and it is properly the tensor field upon which the covariant derivative operates. Thanks to krimson for suggesting that I point this out.

The covariant derivative is a derivative of tensors that takes into account the curvature of the manifold in which these tensors live, as well as the way the coordinate basis vectors vary from point to point. In Cartesian coordinates on flat space, the covariant derivative is simply the partial derivative $\partial_\alpha$. In spherical coordinates, for example, the coordinate basis vectors change between different points, so the derivative of a vector expressed in terms of these basis vectors must take this into account.

The covariant derivative is also known as the semi-colon derivative and is written as $A_{;a} = \nabla_a A$.

If we further require that, for any vectors $u, v \in V$, $v^a \nabla_a u^b = \nabla_v(u)$, where $\nabla_v$ is the affine connection, then we can completely specify the action of $\nabla$ on any tensor. To do this we will also place a torsion-free condition on the connection.

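$$v^a \nabla_a u^b = (v^\alpha e^a_\alpha)(e^\beta_a \nabla_\beta)(u^\gamma e^b_\gamma)$$

$$v^a \nabla_a u^b = v^\alpha \delta^\beta_\alpha \nabla_\beta (u^\gamma e^b_\gamma)$$

$$v^a \nabla_a u^b = v^\alpha \nabla_\alpha (u^\gamma e^b_\gamma)$$

$$v^a \nabla_a u^b = v^\alpha \nabla_\alpha (u^\beta e^b_\beta) \quad \text{(relabelling indices)}$$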

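Now $\nabla_v(u) = v^\alpha \partial_\alpha(u^\beta)\, e^b_\beta + v^\alpha u^\beta \Gamma^\gamma_{\alpha\beta}\, e^b_\gamma$, where $\Gamma^\gamma_{\alpha\beta}$ is the Christoffel symbol of the 2nd kind.

We can exchange $\beta$ and $\gamma$ in the second term to obtain $\nabla_v(u) = v^\alpha \partial_\alpha(u^\beta)\, e^b_\beta + v^\alpha u^\gamma \Gamma^\beta_{\alpha\gamma}\, e^b_\beta$, which is equal to $v^\alpha(\partial_\alpha u^\beta + \Gamma^\beta_{\alpha\gamma} u^\gamma)\, e^b_\beta$.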

Thus we have:

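$v^\alpha \nabla_\alpha(u^\beta e^b_\beta) = v^\alpha(\partial_\alpha u^\beta + \Gamma^\beta_{\alpha\gamma} u^\gamma)\, e^b_\beta$, which is true for all $v^\alpha$, so that we can write:

$$\nabla_a(u^b) = e^\alpha_a (\partial_\alpha u^\beta + \Gamma^\beta_{\alpha\gamma} u^\gamma)\, e^b_\beta, \text{ or}$$

$$\nabla_a(u^b) = (\partial_\alpha u^\beta + \Gamma^\beta_{\alpha\gamma} u^\gamma)\, e^\alpha_a e^b_\beta$$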
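As a sanity check on the vector formula, here is a minimal numerical sketch in plane polar coordinates (r, φ). It assumes the standard polar Christoffel symbols (Γ^r_φφ = -r, Γ^φ_rφ = Γ^φ_φr = 1/r), which are not derived in this document, and it uses central finite differences for the partial derivatives. The helper names are my own. The field tested is the constant Cartesian unit vector e_x, whose polar components are (cos φ, -sin φ / r): its partial derivatives do not vanish, but its covariant derivative should.

```python
import math

def christoffel_polar(r, phi):
    """G[b][a][c] = Gamma^b_{a c} for plane polar coordinates (index 0 = r, 1 = phi)."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -r          # Gamma^r_{phi phi}
    G[1][0][1] = 1.0 / r     # Gamma^phi_{r phi}
    G[1][1][0] = 1.0 / r     # Gamma^phi_{phi r}
    return G

def partial(f, x, a, h=1e-5):
    """Central-difference estimate of the partial derivative of f along x^a."""
    xp, xm = list(x), list(x)
    xp[a] += h
    xm[a] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

def cov_deriv_vector(u, x):
    """D[a][b] = partial_a u^b + Gamma^b_{a c} u^c at the point x."""
    G = christoffel_polar(*x)
    ux = u(x)
    D = [[0.0, 0.0], [0.0, 0.0]]
    for a in range(2):
        for b in range(2):
            d = partial(lambda y: u(y)[b], x, a)
            D[a][b] = d + sum(G[b][a][c] * ux[c] for c in range(2))
    return D

def unit_x_field(x):
    """Polar components (u^r, u^phi) of the constant Cartesian field e_x."""
    r, phi = x
    return [math.cos(phi), -math.sin(phi) / r]

point = [2.0, 0.7]
D = cov_deriv_vector(unit_x_field, point)
# The partial derivative of u^r along phi alone is -sin(phi), which is not zero.
raw = partial(lambda y: unit_x_field(y)[0], point, 1)
```

Every entry of `D` comes out zero to within finite-difference error, while `raw` does not: the Christoffel term is exactly what compensates for the rotating basis vectors.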

Armed with this information, we can find the covariant derivative of any tensor. The method in each case will be the same: contract the tensor with objects for which the covariant derivative is known, in such a way that the result is a scalar. Compute the covariant derivative of the product using both the Leibniz rule for the covariant derivative and the Leibniz rule for partial derivatives, keeping in mind that the covariant derivative of a scalar is merely the gradient of that scalar.

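As an example, consider the covariant derivative of a oneform $\omega_b$, $\nabla_a \omega_b$. Contracting $\omega_b$ with a vector $v^b$ yields a scalar, $\omega_\beta v^\beta$. Thus we can compute $\nabla_a(\omega_b v^b)$ in two ways:

Firstly, $\nabla_a(\omega_b v^b) = v^b \nabla_a(\omega_b) + \omega_b \nabla_a(v^b)$ (Leibniz rule)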

Since we already know the covariant derivative's action on vectors, we can expand the second term:

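$$\nabla_a(\omega_b v^b) = v^b \nabla_a(\omega_b) + \omega_\beta(\partial_\alpha v^\beta + \Gamma^\beta_{\alpha\gamma} v^\gamma)\, e^\alpha_a.$$

Secondly, $\nabla_a(\omega_b v^b) = \nabla_a(\omega_\beta v^\beta) = \partial_\alpha(\omega_\beta v^\beta)\, e^\alpha_a$, which is equal to $(\omega_\beta \partial_\alpha v^\beta + v^\beta \partial_\alpha \omega_\beta)\, e^\alpha_a$ by the Leibniz rule.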

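We can now equate these two results and solve for the term we want, $\nabla_a(\omega_b)$:

$$v^b \nabla_a(\omega_b) + \omega_\beta(\partial_\alpha v^\beta + \Gamma^\beta_{\alpha\gamma} v^\gamma)\, e^\alpha_a = (\omega_\beta \partial_\alpha v^\beta + v^\beta \partial_\alpha \omega_\beta)\, e^\alpha_a$$

$$v^b \nabla_\alpha(\omega_b)\, e^\alpha_a = (v^\beta \partial_\alpha \omega_\beta - \omega_\beta \Gamma^\beta_{\alpha\gamma} v^\gamma)\, e^\alpha_a$$

We can exchange $\gamma$ and $\beta$ in the second term, to give:

$$v^b \nabla_\alpha(\omega_b)\, e^\alpha_a = v^\beta(\partial_\alpha \omega_\beta - \omega_\gamma \Gamma^\gamma_{\alpha\beta})\, e^\alpha_a$$

We can commute with respect to contraction with $\beta$ on the left hand side, giving $v^\beta \nabla_\alpha(\omega_\beta)\, e^\alpha_a$. This result must be true independent of the choice of $v^b$, so:

$$\nabla_a(\omega_b) = (\partial_\alpha \omega_\beta - \Gamma^\gamma_{\alpha\beta} \omega_\gamma)\, e^\alpha_a e^\beta_b$$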

To summarize:

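  • $\nabla_a(f) = (\partial_\alpha f)\, e^\alpha_a$
  • $\nabla_a(u^b) = (\partial_\alpha u^\beta + \Gamma^\beta_{\alpha\gamma} u^\gamma)\, e^\alpha_a e^b_\beta$
  • $\nabla_a(\omega_b) = (\partial_\alpha \omega_\beta - \Gamma^\gamma_{\alpha\beta} \omega_\gamma)\, e^\alpha_a e^\beta_b$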
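The three rules in the summary can be checked against each other numerically. The sketch below again assumes plane polar coordinates with their standard Christoffel symbols, and uses two arbitrary made-up fields `v` and `w` purely for illustration. It computes the gradient of the scalar formed by contracting the oneform with the vector, and compares it with the Leibniz expansion built from the vector and oneform rules. If the sign of the Christoffel term in the oneform rule were wrong, the two would disagree.

```python
import math

def gamma(b, a, c, r):
    """Gamma^b_{a c} for plane polar coordinates (index 0 = r, 1 = phi)."""
    if (b, a, c) == (0, 1, 1):
        return -r
    if (b, a, c) in ((1, 0, 1), (1, 1, 0)):
        return 1.0 / r
    return 0.0

def partial(f, x, a, h=1e-5):
    """Central-difference estimate of the partial derivative of f along x^a."""
    xp, xm = list(x), list(x)
    xp[a] += h
    xm[a] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

def grad_scalar(f, x):
    """Components partial_a f of the gradient of a scalar field."""
    return [partial(f, x, a) for a in range(2)]

def cov_vector(u, x):
    """D[a][b] = partial_a u^b + Gamma^b_{a c} u^c."""
    return [[partial(lambda y: u(y)[b], x, a)
             + sum(gamma(b, a, c, x[0]) * u(x)[c] for c in range(2))
             for b in range(2)] for a in range(2)]

def cov_oneform(w, x):
    """D[a][b] = partial_a w_b - Gamma^c_{a b} w_c."""
    return [[partial(lambda y: w(y)[b], x, a)
             - sum(gamma(c, a, b, x[0]) * w(x)[c] for c in range(2))
             for b in range(2)] for a in range(2)]

def v(x):
    """Hypothetical vector field (v^r, v^phi), chosen arbitrarily."""
    r, phi = x
    return [r * math.cos(phi), math.sin(phi)]

def w(x):
    """Hypothetical oneform (w_r, w_phi), chosen arbitrarily."""
    r, phi = x
    return [r ** 2, r * phi]

def leibniz_gap(x):
    """Largest mismatch between grad(w_b v^b) and v^b D_a w_b + w_b D_a v^b."""
    scalar = lambda y: sum(w(y)[b] * v(y)[b] for b in range(2))
    lhs = grad_scalar(scalar, x)
    Dv, Dw = cov_vector(v, x), cov_oneform(w, x)
    rhs = [sum(v(x)[b] * Dw[a][b] + w(x)[b] * Dv[a][b] for b in range(2))
           for a in range(2)]
    return max(abs(lhs[a] - rhs[a]) for a in range(2))

gap = leibniz_gap([1.5, 0.4])
```

Here `gap` is zero up to finite-difference error, because the two Christoffel terms cancel in the contraction exactly as in the derivation of the oneform rule.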

This method generalizes to any tensor, for example:

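$$\nabla_a(T_{bc}{}^{de}) = (\partial_\alpha T_{\beta\gamma}{}^{\delta\epsilon} - \Gamma^\kappa_{\alpha\beta} T_{\kappa\gamma}{}^{\delta\epsilon} - \Gamma^\kappa_{\alpha\gamma} T_{\beta\kappa}{}^{\delta\epsilon} + \Gamma^\delta_{\alpha\kappa} T_{\beta\gamma}{}^{\kappa\epsilon} + \Gamma^\epsilon_{\alpha\kappa} T_{\beta\gamma}{}^{\delta\kappa})\, e^\alpha_a e^\beta_b e^\gamma_c e^d_\delta e^e_\epsilon$$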

For each raised index we contract with a lowered index on the Christoffel symbol, and for each lowered index we contract with a raised index on the Christoffel symbol, whilst taking a negative sign.
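The index rule just stated is mechanical enough to write down as a short routine. The following is only a sketch under my own conventions: components at a single point are stored in dictionaries keyed by index tuples, the partial derivatives are supplied by the caller, and a string such as "ud" marks each slot as raised ('u') or lowered ('d'). The function `polar_gamma` again assumes the standard plane polar Christoffel symbols.

```python
import itertools

def cov_deriv_components(T, dT, gamma, positions, dim):
    """
    Covariant derivative components of a tensor at one point.

    T         : dict, index tuple -> component value
    dT        : dict, (a,) + index tuple -> partial_a of that component
    gamma     : function (k, a, b) -> Gamma^k_{a b}
    positions : string with one character per slot, 'u' raised or 'd' lowered
    Returns a dict keyed by (a,) + index tuple.
    """
    out = {}
    rank = len(positions)
    for a in range(dim):
        for idx in itertools.product(range(dim), repeat=rank):
            val = dT[(a,) + idx]
            for slot, pos in enumerate(positions):
                for k in range(dim):
                    swapped = idx[:slot] + (k,) + idx[slot + 1:]
                    if pos == "u":
                        # raised index: add Gamma^{index}_{a k} T^{...k...}
                        val += gamma(idx[slot], a, k) * T[swapped]
                    else:
                        # lowered index: subtract Gamma^k_{a index} T_{...k...}
                        val -= gamma(k, a, idx[slot]) * T[swapped]
            out[(a,) + idx] = val
    return out

def polar_gamma(r):
    """Gamma^k_{a b} for plane polar coordinates at radius r (0 = r, 1 = phi)."""
    def gamma(k, a, b):
        if (k, a, b) == (0, 1, 1):
            return -r
        if (k, a, b) in ((1, 0, 1), (1, 1, 0)):
            return 1.0 / r
        return 0.0
    return gamma

dim, r = 2, 2.0
# Kronecker delta with one raised and one lowered index; its partials vanish.
delta = {(b, c): 1.0 if b == c else 0.0 for b in range(dim) for c in range(dim)}
d_delta = {(a, b, c): 0.0 for a in range(dim) for b in range(dim) for c in range(dim)}
nabla_delta = cov_deriv_components(delta, d_delta, polar_gamma(r), "ud", dim)
```

For the Kronecker delta the positive term from the raised slot and the negative term from the lowered slot cancel, so every entry of `nabla_delta` is zero.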

Finally, if we require that the covariant derivative be torsion free, which means that covariant derivatives of a scalar field commute, then

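$$\nabla_\alpha \nabla_\beta(f) = \nabla_\beta \nabla_\alpha(f)$$

We can expand either side since $\nabla_\alpha(f) = \partial_\alpha f$ are the components of a oneform (gradient). Then

$$\partial_\alpha \partial_\beta f - \Gamma^\gamma_{\alpha\beta} \partial_\gamma f = \partial_\beta \partial_\alpha f - \Gamma^\gamma_{\beta\alpha} \partial_\gamma f$$

Since partial derivatives commute, the torsion free condition requires that $\Gamma^\gamma_{\alpha\beta} = \Gamma^\gamma_{\beta\alpha}$.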

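More formally, the torsion free condition requires that $\nabla_u v - \nabla_v u = [u,v]$ for all vector fields $u$ and $v$, where $[u,v]$ is the Lie bracket or commutator of $u$ and $v$. Acting on a scalar field $f$, this is equivalent to $(\nabla_u \nabla_v - \nabla_v \nabla_u)f = [u,v]f$ for all $f$, where $(\nabla_u \nabla_v)f$ is short-hand for $u^a \nabla_a(v^b \nabla_b f)$. The commutator is such that if $u = d/d\lambda$ and $v = d/d\mu$ then $[u,v]f = \partial^2 f/\partial\lambda\partial\mu - \partial^2 f/\partial\mu\partial\lambda$. If $\lambda$ and $\mu$ are coordinates then the partial derivatives commute and the commutator vanishes. Above we only considered this special case, with coordinate basis vectors acting on scalar fields, in order to discover the constraint imposed on $\Gamma^\gamma_{\alpha\beta}$. Kudos to krimson for helping me out with this one.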

Note that the original definition of the covariant derivative was made without fixing its effect on general tensors. However, by requiring that it behave like the affine connection, and by requiring that the connection be torsion free, we were able to completely specify the covariant derivative in terms of the Christoffel symbols. The result is what is usually meant by the term "covariant derivative", but it is not the only one. Normally, though, the metric induces a unique torsion-free, metric compatible connection, which in turn specifies the covariant derivative uniquely. Again, thanks to krimson for suggesting this clarification.


Using the covariant derivative we can succinctly express various concepts in differential geometry. For example, consider parallel transporting the vector $u^b$ along the curve $\Gamma(\lambda)$ with tangent vector $v^a = (d/d\lambda)^a$. Using the affine connection we would require that $\nabla_v u = 0$, i.e. the vector remains parallel to its original direction as it is moved across the manifold. Using the covariant derivative we can write the equivalent expression $v^a \nabla_a u^b = 0$. Thus for a given curve, once the Christoffel symbols are known, this expression results in a set of differential equations that describe how the components of $u$ change as it is parallel transported along the curve.
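To make the parallel transport equations concrete, here is a small sketch that integrates them on the unit 2-sphere around a circle of constant colatitude θ₀, using φ itself as the curve parameter. It assumes the standard Christoffel symbols of the unit sphere (Γ^θ_φφ = -sin θ cos θ, Γ^φ_θφ = Γ^φ_φθ = cot θ), which are not derived in this document, and a hand-written fourth-order Runge-Kutta step. With tangent vector (0, 1) the transport condition reduces to du^β/dλ = -Γ^β_φγ u^γ.

```python
import math

def transport_rhs(theta0, u):
    """du/dlambda for transport along theta = theta0, phi = lambda on the unit sphere."""
    u_theta, u_phi = u
    g_t_pp = -math.sin(theta0) * math.cos(theta0)   # Gamma^theta_{phi phi}
    g_p_pt = math.cos(theta0) / math.sin(theta0)    # Gamma^phi_{phi theta}
    # Tangent vector is (0, 1), so du^b/dlambda = -Gamma^b_{phi c} u^c.
    return (-g_t_pp * u_phi, -g_p_pt * u_theta)

def parallel_transport(theta0, u0, lam_end, steps=2000):
    """Integrate the transport equations with classical RK4 from 0 to lam_end."""
    h = lam_end / steps
    u = tuple(u0)
    for _ in range(steps):
        k1 = transport_rhs(theta0, u)
        k2 = transport_rhs(theta0, tuple(u[i] + 0.5 * h * k1[i] for i in range(2)))
        k3 = transport_rhs(theta0, tuple(u[i] + 0.5 * h * k2[i] for i in range(2)))
        k4 = transport_rhs(theta0, tuple(u[i] + h * k3[i] for i in range(2)))
        u = tuple(u[i] + h / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
                  for i in range(2))
    return u

def norm(theta0, u):
    """Length of u using the unit-sphere metric diag(1, sin^2 theta)."""
    return math.sqrt(u[0] ** 2 + (math.sin(theta0) * u[1]) ** 2)

theta0 = math.pi / 3
u_start = (1.0, 0.0)
u_end = parallel_transport(theta0, u_start, 2.0 * math.pi)
```

In an orthonormal frame these equations describe a rotation at rate cos θ₀, so after one full circuit the vector is rotated by 2π cos θ₀. For θ₀ = π/3 that is a half turn: the vector returns pointing the opposite way with its length unchanged, which is a direct sign that the sphere is curved.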

We can also obtain the geodesic equations, which describe curves that are geodesics: curves between two points for which the arclength is an extremum (local minimum/maximum). These curves can be thought of as "straight lines" in curved space, for example great circles on the surface of a 2-sphere. It can be shown that a geodesic is a curve that parallel transports its own tangent vector. Thus for a curve $\Gamma(\lambda)$ with tangent vector $v^a = (d/d\lambda)^a$, we require that $v^a \nabla_a v^b = 0$.
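Written in components with the vector rule, the geodesic condition becomes d²x^β/dλ² + Γ^β_αγ (dx^α/dλ)(dx^γ/dλ) = 0. The sketch below integrates this on the unit 2-sphere, again assuming the standard sphere Christoffel symbols and a hand-written Runge-Kutta step, and then checks the great-circle claim: embedded in three dimensions, every point of the curve should lie in the plane through the origin spanned by the initial position and initial velocity. The function names and the initial conditions are my own choices.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of the geodesic equations on the unit 2-sphere."""
    theta, phi, dtheta, dphi = state
    # Gamma^theta_{phi phi} = -sin cos, Gamma^phi_{theta phi} = cot (used twice)
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)

def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step."""
    n = len(y)
    k1 = f(y)
    k2 = f(tuple(y[i] + 0.5 * h * k1[i] for i in range(n)))
    k3 = f(tuple(y[i] + 0.5 * h * k2[i] for i in range(n)))
    k4 = f(tuple(y[i] + h * k3[i] for i in range(n)))
    return tuple(y[i] + h / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
                 for i in range(n))

def embed(theta, phi):
    """Cartesian position of the point (theta, phi) on the unit sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def embed_velocity(theta, phi, dtheta, dphi):
    """Cartesian velocity corresponding to (dtheta, dphi) at (theta, phi)."""
    return (math.cos(theta) * math.cos(phi) * dtheta - math.sin(theta) * math.sin(phi) * dphi,
            math.cos(theta) * math.sin(phi) * dtheta + math.sin(theta) * math.cos(phi) * dphi,
            -math.sin(theta) * dtheta)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def max_plane_deviation(state0, lam_end=3.0, steps=3000):
    """Largest |n . x| along the curve, n being normal to the initial great-circle plane."""
    normal = cross(embed(state0[0], state0[1]), embed_velocity(*state0))
    h = lam_end / steps
    state, worst = state0, 0.0
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, h)
        x = embed(state[0], state[1])
        worst = max(worst, abs(sum(normal[i] * x[i] for i in range(3))))
    return worst

# Start on the equator, heading partly east and partly towards the north pole.
deviation = max_plane_deviation((math.pi / 2, 0.0, 0.5, 1.0))
```

The deviation stays at the level of the integration error, so the solution of the geodesic equations is indeed an arc of a great circle. The initial conditions were picked so that the curve stays well away from the poles, where these coordinates are singular.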

Although the Christoffel symbols that define the covariant derivative may be chosen with some degree of freedom, we can choose them in a way that is compatible with the metric $g$. Consider parallel transporting two vectors $u^b$ and $w^c$ along a curve with tangent vector $v^a$. We first require that $v^a \nabla_a u^b = 0$ and $v^a \nabla_a w^c = 0$. Since the dot product of $u$ and $w$, $g_{bc} u^b w^c$, should remain unaltered by the parallel transport, we then require $v^a \nabla_a(g_{bc} u^b w^c) = 0$. Then:

$$v^a \nabla_a(g_{bc} u^b w^c) = v^a \nabla_a(g_{bc})\, u^b w^c + v^a \nabla_a(u^b)\, g_{bc} w^c + v^a \nabla_a(w^c)\, g_{bc} u^b = 0$$

The second and third terms vanish since $v^a \nabla_a u^b = 0$ and $v^a \nabla_a w^c = 0$, and since $v^a$, $u^b$ and $w^c$ were arbitrarily chosen, it must be true that $\nabla_a(g_{bc}) = 0$. Using this equation, along with the torsion condition, $\Gamma^\gamma_{\alpha\beta}$ can be determined in terms of $g_{\alpha\beta}$:

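First, we expand the components of $\nabla_a(g_{bc})$:

$$\nabla_\alpha(g_{\beta\gamma}) = \partial_\alpha g_{\beta\gamma} - \Gamma^\kappa_{\alpha\beta} g_{\kappa\gamma} - \Gamma^\kappa_{\alpha\gamma} g_{\beta\kappa} = 0$$

We can then cyclically permute the indices $\alpha$, $\beta$ and $\gamma$ to obtain two more such equations. Adding the first two and subtracting the third gives:

$$\begin{aligned}
&\partial_\alpha g_{\beta\gamma} - \Gamma^\kappa_{\alpha\beta} g_{\kappa\gamma} - \Gamma^\kappa_{\alpha\gamma} g_{\beta\kappa} \\
{}+{} &\partial_\beta g_{\gamma\alpha} - \Gamma^\kappa_{\beta\gamma} g_{\kappa\alpha} - \Gamma^\kappa_{\beta\alpha} g_{\gamma\kappa} \\
{}-{} &\partial_\gamma g_{\alpha\beta} + \Gamma^\kappa_{\gamma\alpha} g_{\kappa\beta} + \Gamma^\kappa_{\gamma\beta} g_{\alpha\kappa} = 0
\end{aligned}$$

Now, since the metric is symmetric, $g_{\alpha\beta} = g_{\beta\alpha}$, and since we have imposed the torsion free condition that $\Gamma^\gamma_{\alpha\beta} = \Gamma^\gamma_{\beta\alpha}$, the terms containing $\Gamma^\kappa_{\alpha\gamma}$ and $\Gamma^\kappa_{\beta\gamma}$ cancel, and we have:

$$\partial_\alpha g_{\beta\gamma} - \Gamma^\kappa_{\alpha\beta} g_{\kappa\gamma} + \partial_\beta g_{\gamma\alpha} - \Gamma^\kappa_{\beta\alpha} g_{\gamma\kappa} - \partial_\gamma g_{\alpha\beta} = 0$$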

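The two remaining connection terms are equal, so $2\Gamma^\kappa_{\alpha\beta} g_{\gamma\kappa} = \partial_\alpha g_{\beta\gamma} + \partial_\beta g_{\gamma\alpha} - \partial_\gamma g_{\alpha\beta}$. Multiplying both sides by the inverse metric $g^{\gamma\lambda}$ and using $g^{\gamma\lambda} g_{\gamma\kappa} = \delta^\lambda_\kappa$ gives the metric compatible connection:

$$\Gamma^\lambda_{\alpha\beta} = \tfrac{1}{2}\, g^{\lambda\gamma}(\partial_\alpha g_{\beta\gamma} + \partial_\beta g_{\gamma\alpha} - \partial_\gamma g_{\alpha\beta})$$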
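The formula for the metric compatible connection is easy to evaluate numerically. The sketch below is restricted to two dimensions so that the inverse metric can be written out by hand, takes the metric derivatives by central differences, and uses the round metric of the unit sphere, diag(1, sin²θ), as a test case. All function names here are my own.

```python
import math

def sphere_metric(x):
    """Metric components g_{ab} of the unit 2-sphere at x = (theta, phi)."""
    theta, phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the components g^{ab}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def d_metric(metric, x, a, h=1e-5):
    """Central-difference estimate of partial_a g_{bc}, returned as a 2x2 list."""
    xp, xm = list(x), list(x)
    xp[a] += h
    xm[a] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[b][c] - gm[b][c]) / (2.0 * h) for c in range(2)] for b in range(2)]

def christoffel(metric, x):
    """G[k][a][b] = 1/2 g^{kc} (d_a g_{bc} + d_b g_{ca} - d_c g_{ab})."""
    ginv = inverse_2x2(metric(x))
    dg = [d_metric(metric, x, a) for a in range(2)]   # dg[a][b][c] = d_a g_{bc}
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for k in range(2):
        for a in range(2):
            for b in range(2):
                G[k][a][b] = 0.5 * sum(
                    ginv[k][c] * (dg[a][b][c] + dg[b][c][a] - dg[c][a][b])
                    for c in range(2))
    return G

theta = 1.0
G = christoffel(sphere_metric, [theta, 0.3])
```

Evaluating the formula by hand for this metric gives Γ^θ_φφ = -sin θ cos θ and Γ^φ_θφ = Γ^φ_φθ = cot θ, with all other components zero, and the numerical `G` agrees with that up to finite-difference error.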

As an example of using the covariant derivative, we can write the divergence of a vector field as:

$$\nabla \cdot v = \nabla_a v^a$$
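In components the divergence is the partial-derivative sum plus a Christoffel term, ∂_α v^α + Γ^α_αγ v^γ. As a small sketch, assuming plane polar coordinates once more (where the only contributing symbol is Γ^φ_φr = 1/r), the function below evaluates this for two simple test fields of my own choosing: the position field r ∂_r, which in Cartesian form is x ∂_x + y ∂_y, and the rotation field ∂_φ.

```python
def divergence_polar(v, x, h=1e-5):
    """Covariant divergence of v = (v^r, v^phi) in plane polar coordinates."""
    r = x[0]
    total = 0.0
    for a in range(2):
        # partial_a v^a by central differences
        xp, xm = list(x), list(x)
        xp[a] += h
        xm[a] -= h
        total += (v(xp)[a] - v(xm)[a]) / (2.0 * h)
    # Gamma^a_{a c} v^c: only Gamma^phi_{phi r} = 1/r contributes
    return total + v(x)[0] / r

def position_field(x):
    """Components of r d/dr, i.e. the Cartesian field (x, y)."""
    return [x[0], 0.0]

def rotation_field(x):
    """Components of d/dphi, i.e. the Cartesian field (-y, x)."""
    return [0.0, 1.0]

point = [1.7, 0.3]
div_position = divergence_polar(position_field, point)
div_rotation = divergence_polar(rotation_field, point)
```

The position field has divergence 2 and the rotation field has divergence 0, matching what one gets in Cartesian coordinates. Without the Christoffel term the first result would come out as 1.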

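In three dimensions we can also write the curl of a vector field, $\nabla \times v$, as:

$$(\nabla \times v)^\alpha = \epsilon^{\alpha\lambda\mu} \nabla_\lambda v_\mu$$

where $\epsilon^{\alpha\lambda\mu}$ is the volume form with its indices raised by the metric.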