The generalized name for mathematical objects that consist of multi-dimensional arrays of numbers. A scalar (what most of us know as "number") is a tensor of rank 0, a vector is a tensor of rank 1, a matrix is a tensor of rank 2, and so forth.

Many people get very confused when they first start dealing with tensors, partly because they've been told that they are "like" scalars, vectors and matrices, and because they think they know what these things - in some sense - are. The notion of a difference between covariant and contravariant vectors (for example) will thus completely confuse them.

While reading this writeup, please bear in mind that tensors are defined by what they do, not what they are.


Having said that, I can now define what a tensor is.

Structure

The structure of a tensor can be described by two things: its rank and its type.

Rank

If you think of a tensor as a grid-like array of elements, then the rank describes the number of dimensions the array possesses. For example, this could represent a rank 2 tensor:

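$$A_{ij} = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots \\ a_{21} & a_{22} & a_{23} & \cdots \\ a_{31} & a_{32} & a_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$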

because it's a 2-dimensional array; the elements are arranged in two directions. Geddit? A rank 3 tensor could have its elements arranged in a cube, and so on. Note that tensors are usually defined over a space with n dimensions; this number n gives the size of the array in every direction (I've just used dots to show the array extending arbitrarily above). A rank 2 tensor in a 3-dimensional space would be a 3-by-3 grid, for example.

That wasn't too painful now, was it?

Type

This is a bit more tricky. You saw in the example that we could locate any given element by means of its indices; i and j above. For obvious reasons (and practically by definition) a rank n tensor will have n indices. However, it turns out that these indices can behave in two different ways. We call those of one type contravariant indices (I'll define what that means later), and those of the other type covariant indices. The contravariant ones are written as superscripts, the covariant as subscripts; for this reason, they're often referred to as "upper" and "lower" indices, respectively. Thus, a tensor with one contravariant (upper) and two covariant (lower) indices would be written as $A^i{}_{jk}$.

What does this have to do with type? Well, the type of a tensor is an ordered pair of numbers which state how many of each index it has. For example, $A^i{}_{jk}$ has type (1,2), and a general tensor with r contravariant indices and s covariant ones has type (r,s).

Of course, type makes rank redundant since a tensor of type (r,s) will necessarily have rank r+s.


Behaviour

This is the real meat of tensor theory. It's also quite difficult to write, involving as it does a lot of mathematical notation, including any number of partial derivatives. I'll represent these using an operator notation, to make life easier for both of us. The partial derivative of f with respect to x, normally written something like:

  $\frac{\partial f}{\partial x}$

will be written as

  $\partial_x f$

I'll also employ the standard operator notation for full differentiation if it comes up:

  $D_x f$

Scalars

Okay, these are fairly easy to understand. A scalar is a rank 0 tensor, which means it extends in no directions; it has only one element, no matter what. The elements of tensors can be thought of as functions; for any position in the n-dimensional space, the elements will have particular values. Read the function node if you aren't sure what that means; no point me reproducing it all here.

Anyway. You might then suppose (as most people do) that any single function is thus a scalar. This is not completely true (scalars are always relative tensors, but not necessarily absolute ones; don't worry about what that means for now). Basically, a function constitutes a scalar if its value at any position does not depend on the co-ordinate system used to locate that position. The polar co-ordinate system, in particular, has a tendency to produce "false infinities" at the origin. For all practical purposes, you can probably think of a scalar as a function.

Vectors

This is where things get tricky. Forget what you know about vectors; it'll only cause you confusion. There are two types of vectors; covariant vectors $A_i$, and contravariant ones $A^i$; both consist of a linear (one-dimensional) array. A linear array of functions (the elements of the tensor) is a vector if and only if it obeys one of these tensor transformation laws when converting the vector $A_i$ (or $A^i$, depending on which transformation law we are checking) from one set of co-ordinates $x^i$ to a different set $y^i$ (where we call the transformed vector $B_i$ or $B^i$ to distinguish it):

Covariant vectors

$B_i = \sum_{j=1}^{n} \partial_{y^i} x^j \cdot A_j$

Contravariant vectors

$B^i = \sum_{j=1}^{n} \partial_{x^j} y^i \cdot A^j$

Remembering that we defined $\partial_x$ so that

$\partial_{x^j} y^i = \frac{\partial y^i}{\partial x^j}$

If it does not obey either law, it is not a vector. This is the part that confuses people; basically, we've defined vector to mean something that transforms in this way. Those summation signs really get in the way, so we usually just dispense with them and say something like:

$B^i = \partial_{x^j} y^i \cdot A^j$

This is called the Einstein summation convention, because you-know-who came up with it. Any index that appears twice in a single term (like j above) is summed over.

So that's vectors for you.
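If you'd like to see the two vector laws do something, here is a minimal Python sketch. It assumes a linear change of co-ordinates y^i = M[i][j] x^j in two dimensions, so the partial derivatives are just the constant entries of M and its inverse. The matrix M, the sample components and all the function names are made up for illustration.

```python
# Assumed example: a linear change of co-ordinates y^i = M[i][j] x^j in 2 dimensions.
n = 2
M = [[2.0, 1.0],
     [0.0, 1.0]]
M_inv = [[0.5, -0.5],
         [0.0, 1.0]]   # inverse of M, worked out by hand


def dy_dx(i, j):
    """Partial derivative of y^i with respect to x^j (constant for a linear map)."""
    return M[i][j]


def dx_dy(j, i):
    """Partial derivative of x^j with respect to y^i."""
    return M_inv[j][i]


def transform_contravariant(A_up):
    # B^i = sum over j of (dy^i/dx^j) A^j
    return [sum(dy_dx(i, j) * A_up[j] for j in range(n)) for i in range(n)]


def transform_covariant(A_down):
    # B_i = sum over j of (dx^j/dy^i) A_j
    return [sum(dx_dy(j, i) * A_down[j] for j in range(n)) for i in range(n)]


A_up = [3.0, 4.0]      # contravariant components in the x co-ordinates
A_down = [1.0, -2.0]   # covariant components in the x co-ordinates
B_up = transform_contravariant(A_up)
B_down = transform_covariant(A_down)

# Summing an upper index against a lower one gives the same number in both systems.
contraction_x = sum(A_down[j] * A_up[j] for j in range(n))
contraction_y = sum(B_down[i] * B_up[i] for i in range(n))
print(B_up, B_down, contraction_x, contraction_y)
```

The same starting numbers end up different depending on which law you apply, but the contraction of a covariant with a contravariant vector comes out the same in both co-ordinate systems.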

Rank 2 tensors

Contrary to popular belief, these are not all matrices; only the type (1,1) tensors are. Rank 2 tensors can be split into three categories, according to which of these transformation laws they obey:

Type (0,2) ("Covariant tensor")

$B_{ij} = \partial_{y^i} x^a \, \partial_{y^j} x^b \cdot A_{ab}$

Type (2,0) ("Contravariant tensor")

$B^{ij} = \partial_{x^a} y^i \, \partial_{x^b} y^j \cdot A^{ab}$

Type (1,1) ("Mixed tensor")

$B^i{}_j = \partial_{x^a} y^i \, \partial_{y^j} x^b \cdot A^a{}_b$

I won't go into the details of why only the (1,1) tensors are like matrices here, preferring to save that for the tensor algebra node.
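To see that the three laws really are different, here is a small Python sketch that feeds the same 2-by-2 grid of numbers through each of them. It again assumes a linear change of co-ordinates y = M x, with M and the grid A invented for the example. One visible difference, offered only as a hint and not as the full story: of the three results, only the mixed one keeps the trace of the original grid.

```python
n = 2
idx = range(n)
M = [[2.0, 1.0], [0.0, 1.0]]        # assumed linear map y = M x
M_inv = [[0.5, -0.5], [0.0, 1.0]]   # its inverse, worked out by hand


def dy_dx(i, j):
    return M[i][j]       # dy^i/dx^j


def dx_dy(j, i):
    return M_inv[j][i]   # dx^j/dy^i


A = [[1.0, 2.0],
     [3.0, 4.0]]         # the same grid of numbers, read three different ways

# Type (0,2): B_ij = (dx^a/dy^i)(dx^b/dy^j) A_ab
B_cov = [[sum(dx_dy(a, i) * dx_dy(b, j) * A[a][b] for a in idx for b in idx)
          for j in idx] for i in idx]

# Type (2,0): B^ij = (dy^i/dx^a)(dy^j/dx^b) A^ab
B_con = [[sum(dy_dx(i, a) * dy_dx(j, b) * A[a][b] for a in idx for b in idx)
          for j in idx] for i in idx]

# Type (1,1): B^i_j = (dy^i/dx^a)(dx^b/dy^j) A^a_b
B_mix = [[sum(dy_dx(i, a) * dx_dy(b, j) * A[a][b] for a in idx for b in idx)
          for j in idx] for i in idx]

trace_A = sum(A[i][i] for i in idx)
trace_mix = sum(B_mix[i][i] for i in idx)
print(B_cov, B_con, B_mix, trace_A, trace_mix)
```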

General tensors

I expect you've noticed the pattern in the above transformation laws; the pairs of "dummy" indices used in the summations cancel out in pairs (eg, one covariant b cancels with one contravariant b), and whatever indices are left must match with those on the other side of the equals sign (including their type). We can put together a general transformation law for any tensor of type (r,s) as follows. I haven't written them all, but there are r contravariant (upper) indices and s covariant (lower) ones. We are transforming a tensor $T^{ij\ldots}{}_{kl\ldots}$ in the co-ordinate system $x^i$ into $S^{ij\ldots}{}_{kl\ldots}$ in $y^i$:

$S^{ij\ldots}{}_{kl\ldots} = (\partial_{x^a} y^i \, \partial_{x^b} y^j \cdots)(\partial_{y^k} x^c \, \partial_{y^l} x^d \cdots) \, T^{ab\ldots}{}_{cd\ldots}$

And that's it. Anything which transforms in that way is a tensor of type (r,s).

By their transformations, thou shalt know them.
--Unknown*

If it looks like a duck, walks like a duck, and quacks like a duck, it's a duck.
--Ancient duck proverb

What is a tensor? Some would say that a tensor is merely an array or set of numbers, represented by using an indexed notation, like $T = \{T_{ijkl}\}$. This definition is not only unenlightening, it is simply wrong. I will attempt to give a better (if not comprehensive) description.

At the risk of sounding completely circular, a tensor is a mathematical object which transforms "like a tensor" under rotations and parity** (reflections through the origin). In an attempt to decrypt what I've just said, I offer a few examples, classified by rank.

A rank-1 tensor is a vector. That is, it is an object which transforms like a vector under rotations and parity. By this I mean, with any rotation or parity transformation of a coordinate system, there is associated a particular orthogonal matrix, R. All vectors should transform like v → Rv under this transformation. For example, under parity transformations, v → -v.

Why is this different from saying a vector is just a column of numbers? Because most columns of numbers don't have this property! A good counterexample is angular momentum. A simple definition of angular momentum would be L = r × p (where "×" refers to the cross product). It is the cross product of two vectors, which would seem to be a good candidate for a vector. How does it transform under parity?

Since r → -r and p → -p, L → (-r) × (-p) = r × p = L.

In other words, L does not change under parity! However, it does transform properly under rotations, and for this reason, it is called a pseudovector. Another example of a pseudovector is a magnetic field.
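The parity argument is easy to check numerically. The following Python sketch uses made-up sample values for r and p; the helper names are mine.

```python
def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def parity(v):
    """Parity transformation of a true vector: v -> -v."""
    return [-c for c in v]


r = [1.0, 2.0, 3.0]    # position (sample values)
p = [0.5, -1.0, 2.0]   # momentum (sample values)

L_before = cross(r, p)
L_after = cross(parity(r), parity(p))   # rebuild L from the reflected r and p
print(L_before, L_after)
```

Both r and p flip sign, but the two printed lists are identical: the two minus signs cancel inside the cross product.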

Moving backwards down the rank ladder, a rank-zero tensor is known as a scalar. One may easily think of scalars simply as single numerical values (as opposed to a set or array of numbers), but once again, a scalar is not just any number. It must "transform" like a scalar, which really means it must not change at all under rotations and parity. Scalars can be created by taking the dot product of vectors. Since both transform like vectors,

$v^T v \to v^T R^T R v = v^T R^{-1} R v = v^T v$,

so the dot product is invariant. The reason "scalars" are often thought of as "just numbers" is because normally we don't think of "numbers" as things that transform under rotations. Allow me to give a counterexample:

When you take the dot product of a vector with a pseudovector, you get an object which transforms like a scalar under rotations, but picks up a minus sign under parity. Any number which transforms this way is called a pseudoscalar. For example, the dot product of a particle's momentum with its spin is a pseudoscalar, known as the particle's helicity.
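Here is a rough numerical illustration of the scalar/pseudoscalar distinction in Python. The sample vectors, the rotation angle and the choice of a rotation about the z axis are all assumptions made for the example; the pseudovector is built as a cross product, as in the angular momentum discussion.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def rotate_z(v, angle):
    """Rotate a 3-component vector about the z axis by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1], v[2]]


def parity(v):
    return [-x for x in v]


r = [1.0, 2.0, 3.0]
p = [0.5, -1.0, 2.0]
q = [1.0, 0.0, 2.0]
angle = 0.7

# Scalar: dot product of two true vectors.
scalar = dot(r, p)
scalar_rotated = dot(rotate_z(r, angle), rotate_z(p, angle))
scalar_reflected = dot(parity(r), parity(p))

# Pseudoscalar: dot product of the vector q with the pseudovector r x p.
# Under each transformation the pseudovector is rebuilt from the transformed r and p.
pseudo = dot(q, cross(r, p))
pseudo_rotated = dot(rotate_z(q, angle),
                     cross(rotate_z(r, angle), rotate_z(p, angle)))
pseudo_reflected = dot(parity(q), cross(parity(r), parity(p)))

print(scalar, scalar_rotated, scalar_reflected)
print(pseudo, pseudo_rotated, pseudo_reflected)
```

Up to rounding, the first line prints the same number three times, while the last entry of the second line has the opposite sign from the other two.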

To generalize to higher-rank tensors, we need only put together combinations of smaller-rank tensors. Roughly, a rank-n tensor is an object which transforms like n vectors. Thus, a rank-2 tensor transforms like two vectors, and so on. So how do two vectors transform? In order to make a transformation on two vectors, each vector must be transformed, requiring two of the same orthogonal matrix ($M \to R M R^T$ in the case where M is a matrix, one possible type of rank-2 tensor).
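One way to convince yourself of the "two copies of R" rule is to build a rank-2 object out of two vectors and rotate it both ways. This Python sketch assumes a 2-dimensional rotation and made-up vectors u and v; the matrix M is their outer product, which is only one simple example of a rank-2 tensor.

```python
import math


def rotation(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]


R = rotation(0.3)
u = [1.0, 2.0]
v = [-3.0, 0.5]

M = outer(u, v)                                      # M_ij = u_i v_j
M_rotated = matmul(matmul(R, M), transpose(R))       # R M R^T
M_from_vectors = outer(matvec(R, u), matvec(R, v))   # rotate each vector first

max_difference = max(abs(M_rotated[i][j] - M_from_vectors[i][j])
                     for i in range(2) for j in range(2))
print(max_difference)
```

The printed difference is zero up to rounding: rotating the matrix with two copies of R is the same as rotating each of the two vectors once.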

By analogy, there are also pseudotensors of every rank, and you can probably guess how they transform (like a tensor, but with an additional minus sign under parity). The most common example of a pseudotensor is the totally antisymmetric epsilon "tensor" ($\epsilon_{ijk}$ being the rank-3 example).

The point is, most "arrays of numbers" don't transform in any special way. It is those sets of numbers which have a particular relationship under transformations that achieve special titles like "vector", "scalar" and "tensor".


*Unknown meaning unknown to me. If anyone can cite this, I would appreciate it.
**Usually in mathematics, the requirement that tensors properly transform under parity (and hence the distinction between tensors and pseudotensors) is ignored. This just implies a slightly different definition, which offers fewer "nice" counterexamples. Actually, while I'm on this footnote, I might as well note that since I didn't explicitly define my vector space, "rotation" should be replaced by "coordinate transformation", because we can't do a completely general treatment in terms of orthogonal transformations. However, only doing orthogonal transformations allows me to tiptoe around the subject of covariant vs. contravariant tensor components, which is good news for me. So, for the sake of simplicity and/or clarity, let's just talk about O(N) rotations in nice, flat, euclidean N-dimensional space, and tack it on to the list of things I've swept under the rug. In any case, everything I say here generalizes nicely to curved manifolds and nonorthogonal transformations. For a general treatment, see the above writeup by grey knight, if you dare...
There are "better" definitions for angular momentum, but this one is fine.

A tensor is something that transforms like a tensor.
--Your aged physics professor

The above is perhaps among the silliest statements you will hear regularly come out of the mouth of an otherwise pretty sharp person. It's not that the statement is wrong but rather that it's entirely useless to someone just trying to learn what a tensor is, and, I think, it misses the heart of the matter. So, let's go ahead, then, and answer the question of what a tensor is in a straightforward way. Then we can return to what that statement about transformations is supposed to mean in the section "Transforming like a Tensor". I'm going to assume that you know at least a bit about vectors, vector fields, and matrices. Throughout this write-up I will only be talking about finite dimensional spaces, whether vector spaces or surfaces.

In Physics and Mathematics, the idea of a vector is extremely useful. It allows you to organize systems of equations and model quantities with both a magnitude and direction (like position and momentum). Another concept that's pretty useful is the linear transformation, a linear function that maps vectors to vectors. This can model the behavior of a lot of simple systems and comes up often. The idea of tensors is to generalize these concepts, to have functions that can take many vectors as their arguments or can produce many vectors as their result. Furthermore, you can define a tensor field, which is a generalization of the idea of a vector field. This concept is used in many areas of physics and, probably more extensively, in the geometry of smooth surfaces. Now that we know why we want a tensor, we can start to try to define a tensor, but before we proceed we need to build up one small piece of groundwork.

A Prelude: One-forms

One can go into some detail about one-forms, and indeed someone has, but let me make a brief introduction to just the basics here. A one-form is a linear function that takes a vector as its argument and returns a scalar (meaning a plain old number), or, put another way, it's a map from vectors to scalars. One-forms are also sometimes known as co-vectors or linear functionals. As a convention, throughout this write-up I'll denote vectors by bold Latin letters, like v, and one-forms by bold Greek letters, like μ, unless otherwise specified. In Euclidean space, if you pick a row vector z then you can define the one-form μ acting on a vector v simply by taking the dot product of z with v to get the scalar value μ(v). If z is a unit vector then that's sometimes also known as taking the projection of v along z. This might be as simple as finding the x component in a certain set of coordinates. So a one-form isn't anything fancy; it's just a linear function that eats a vector and spits out a number.

The set of one-forms that act on a vector space, V, themselves form a new vector space called the dual space, often denoted V*. It's a vector space because you can sensibly define what it means to add two one-forms or multiply a one-form by a scalar in the same way one does for functions. For one-forms μ and σ, vectors v and z, and scalar λ

(μ + σ)(v) = μ(v) + σ(v) and (λ μ)(v) = λ (μ(v))

There is a one-to-one correspondence between vectors and one-forms, which means, among other things, that the space of one-forms has the same dimension as the space of vectors.
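If it helps to see one-forms as actual functions, here is a small Python sketch of the Euclidean construction described above: pick a row vector z and let the one-form be "dot with z". The function names and sample numbers are invented for the example.

```python
def make_one_form(z):
    """Return the one-form mu(v) = z . v for a fixed row vector z."""
    def mu(v):
        return sum(zi * vi for zi, vi in zip(z, v))
    return mu


def add_forms(mu, sigma):
    """(mu + sigma)(v) = mu(v) + sigma(v)"""
    return lambda v: mu(v) + sigma(v)


def scale_form(lam, mu):
    """(lam mu)(v) = lam * mu(v)"""
    return lambda v: lam * mu(v)


x_component = make_one_form([1.0, 0.0, 0.0])   # picks out the x component
sigma = make_one_form([2.0, -1.0, 0.5])

v = [3.0, 4.0, 5.0]
w = [1.0, 0.0, -2.0]
combo = [2.0 * vi + wi for vi, wi in zip(v, w)]   # the vector 2v + w

print(x_component(v))
print(add_forms(x_component, sigma)(v))
print(scale_form(3.0, sigma)(v))
print(sigma(combo), 2.0 * sigma(v) + sigma(w))   # linearity: these two agree
```

Sums and scalar multiples of one-forms are again functions from vectors to numbers, which is all the dual space structure amounts to here.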

Tensors Defined

Once we know what a one-form is, defining what a tensor is becomes simple, though understanding the import of what tensors are may take a bit more work.

Definition: A tensor $T(\mu_1, \mu_2, \ldots, \mu_M, v_1, v_2, \ldots, v_N)$ of type (M,N) is a multilinear function from M one-forms and N vectors to a scalar value.

So a tensor takes some one-forms and some vectors and spits out a number. Multilinear just means that it's linear for each argument individually, so

$T(\mu_1, \mu_2, \ldots, \mu_M, v_1, \alpha v_2, \ldots, v_N) = \alpha \, T(\mu_1, \mu_2, \ldots, \mu_M, v_1, v_2, \ldots, v_N)$

and

$T(\mu_1, \mu_2, \ldots, \mu_M, v_1, v_2 + z, \ldots, v_N) = T(\mu_1, \mu_2, \ldots, \mu_M, v_1, v_2, \ldots, v_N) + T(\mu_1, \mu_2, \ldots, \mu_M, v_1, z, \ldots, v_N)$

where α is a scalar and z is a vector [1]. People often say such a tensor has rank (M+N), and they may also say that it is N times covariant and M times contravariant. We'll get to what those mean a little later. So I've given you the definition, but so far it's probably not very enlightening, and it's also probably not clear what this has to do with what I was talking about in the beginning, so let's go look at some examples of tensors.
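To make "multilinear" concrete, here is a minimal Python sketch of the simplest case with two slots: a type (0,2) tensor, which takes two vectors and no one-forms. The coefficients in G and the sample vectors are made up; the last function is a deliberately broken example that maps two vectors to a number but is not linear.

```python
G = [[2.0, 1.0],
     [1.0, 3.0]]   # made-up coefficients


def T(u, v):
    """A type (0,2) tensor: takes two vectors, returns a number."""
    return sum(u[j] * G[j][k] * v[k] for j in range(2) for k in range(2))


def scale(alpha, v):
    return [alpha * c for c in v]


def add(v, z):
    return [a + b for a, b in zip(v, z)]


u = [1.0, 2.0]
v = [3.0, -1.0]
z = [0.5, 4.0]
alpha = 2.5

# The two multilinearity conditions, checked in the second slot.
homogeneous = abs(T(u, scale(alpha, v)) - alpha * T(u, v)) < 1e-12
additive = abs(T(u, add(v, z)) - (T(u, v) + T(u, z))) < 1e-12


def not_a_tensor(u, v):
    """Maps two vectors to a number, but the constant offset breaks linearity."""
    return T(u, v) + 1.0


fails_scaling = abs(not_a_tensor(u, scale(alpha, v)) - alpha * not_a_tensor(u, v)) > 1e-12
print(homogeneous, additive, fails_scaling)
```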

Vectors, One-forms, and Linear Transformations

The simplest sort of tensor is a type (0,0) tensor, which takes no arguments, so it's just a scalar value, a number. From the definition above, we can see that a tensor of type (0,1) is a function that takes one vector and gives a scalar value. If you recall, that's the definition of a one-form, so a one-form is a sort of tensor. A tensor of type (1,0) is a function that takes a one-form and gives you a number, and I claim that you can think of such a thing as a vector. If you choose a specific vector, then you can take any one-form and act on the vector to get a number, so you can define a tensor T(μ) = μ(v). So, from now on we can think of a (1,0) tensor as representing a vector and vice versa. A type (1,1) tensor takes a vector and a one-form as its arguments, so if we imagine only plugging the vector in but not the one-form, we'd get T( ,v). That's now a type (1,0) tensor, which we just said was a vector. So, we can think of a type (1,1) tensor as a linear transformation; it takes a vector v and relates it to a vector T( ,v). Other (M,N) type tensors are more general versions of these ideas.
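The "plug in the vector but not the one-form" step can be sketched in Python as partial application. This assumes Euclidean space with the standard basis, a one-form represented by its list of components, and a made-up matrix F (a quarter-turn rotation) defining the (1,1) tensor.

```python
F = [[0.0, -1.0],
     [1.0, 0.0]]   # made-up example: rotation by 90 degrees


def T(mu, v):
    """A type (1,1) tensor: one one-form slot and one vector slot."""
    return sum(mu[j] * F[j][k] * v[k] for j in range(2) for k in range(2))


def fill_vector_slot(v):
    """Return T( , v): a function still waiting for a one-form, i.e. a (1,0) tensor."""
    return lambda mu: T(mu, v)


basis_one_forms = [[1.0, 0.0], [0.0, 1.0]]   # each one reads off one component

v = [3.0, 4.0]
partial = fill_vector_slot(v)

# Feeding the basis one-forms to T( , v) reads off the components of the image vector.
image = [partial(omega) for omega in basis_one_forms]
print(image)
```

The components recovered this way are exactly those of F applied to v, which is the sense in which the (1,1) tensor is a linear transformation.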

Coordinate Representations

In the old days it used to be much more common to talk about everything only in terms of coordinates, which is where this "a tensor is something that transforms like a tensor" business comes from. So far I've been talking about things in a coordinate free way, but eventually you need to calculate things which usually means describing them in some set of coordinates.

We know that any N dimensional vector space can be described in terms of a set of basis vectors

$A = \{e_1, e_2, \ldots, e_N\}$

so that $v = v^j e_j$

In that equation, and for the rest of this write up, I'm using the Einstein summation convention, which says that any index that is repeated in an expression is supposed to be summed over, so that equation for v is supposed to have a sum over all possible values of j. When we represent v in this way, the set of numbers $v^j$ are called the coordinates in coordinate system A (the one defined by the basis A).

Because the set of one-forms is also a vector space, we can describe every one-form in terms of coordinates, using a basis of one-forms. We can choose this set of basis one-forms $\{\omega^1, \omega^2, \ldots, \omega^N\}$ by the rule that

$\omega^j(e_k) = \delta^j{}_k$

which just means it's 1 if j = k and 0 if j ≠ k. Then, of course, any one-form μ can be written in terms of that basis as

$\mu = \mu_j \omega^j$

Notice that the components of a one-form are written with lowered indices in distinction from the components of a vector.

Armed with this coordinate system for vectors and one-forms, we can describe any tensor by a set of numbers, which we get just by plugging in our basis elements as arguments to the tensor T in every possible combination.

$T^{i,j,\ldots,k}{}_{l,m,\ldots,n} = T(\omega^i, \omega^j, \ldots, \omega^k, e_l, e_m, \ldots, e_n)$

This is analogous to the procedure you're probably used to for describing a linear transformation. This might be easiest to see by recalling that if you represent a linear transformation on Euclidean space by a matrix M, then you can find the matrix elements of M by the formula $M_{ij} = e_i^T M e_j$, where the superscript T represents the transpose. Note that for tensors it's conventional to have the indices for the one-form arguments be superscripts (since the one-form basis elements have superscripts) while the indices for the vector arguments are subscripts (since the basis vectors have subscripts). Since we said that a tensor that acts on a vector is a one-form and a tensor that acts on a one-form is a vector, that convention will agree with our conventions for writing the elements of vectors and one-forms from earlier. Also note that a type (0,0) tensor is just a scalar and is the same in any coordinate system.

In linear algebra, when you represent a linear transformation F in a certain basis and F(v) = z, then you can write down the action of F on v using the matrix multiplication formula. You can write the same sort of coordinate formula down for the action of a tensor. I'll demonstrate how to get the formula for a type (1,1) tensor, and then I'll write down the general result.

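$T(\mu, v) = T(\mu_j \omega^j, v^k e_k)$,

and using the linearity of the tensor

$T(\mu, v) = \mu_j \, T(\omega^j, e_k) \, v^k = \mu_j T^j{}_k v^k$.

In the general case

$T(\mu, \rho, \ldots, \sigma, v, y, \ldots, z) = \mu_i \rho_j \cdots \sigma_k \, T^{i,j,\ldots,k}{}_{l,m,\ldots,n} \, v^l y^m \cdots z^n$.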

The order the terms are written in doesn't matter (since each element is just a number); I've just written them in that order for clarity.
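Here is a short Python sketch of that recipe for a type (1,1) tensor: treat the tensor as a black-box function, read off its components by plugging in basis elements, and then check that the component formula reproduces the function. The particular function T is invented for the example, and vectors and one-forms are represented by their component lists in the standard basis.

```python
def T(mu, v):
    """A made-up type (1,1) tensor, written without reference to its components."""
    return mu[0] * (2.0 * v[0] + v[1]) + mu[1] * (3.0 * v[1] - v[0])


N = 2
basis_vectors = [[1.0, 0.0], [0.0, 1.0]]     # e_1, e_2 as component lists
basis_one_forms = [[1.0, 0.0], [0.0, 1.0]]   # omega^1, omega^2, with omega^j(e_k) = delta^j_k

# Components T^j_k = T(omega^j, e_k)
components = [[T(basis_one_forms[j], basis_vectors[k]) for k in range(N)]
              for j in range(N)]

mu = [1.5, -2.0]
v = [4.0, 0.5]

# Action rebuilt from the components: mu_j T^j_k v^k, summed over j and k
from_components = sum(mu[j] * components[j][k] * v[k]
                      for j in range(N) for k in range(N))
direct = T(mu, v)
print(components, from_components, direct)
```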

According to the way we have developed for representing tensors in a coordinate system, a type (1,0) tensor in N dimensional space would require N numbers, $T^j$, to describe it, which is what we expect since we've claimed that we can think of this object like a vector. A type (1,1) tensor requires an N by N matrix of elements, $T^j{}_k$, to describe it, and, as I've said, this procedure gives us just the result we expect for a linear transformation, which is what we claimed it represents. In general, a rank 1 tensor (a vector or a one-form) takes an array of N elements to describe it; a rank 2 tensor (that's types (2,0), (0,2), (1,1)) takes a two dimensional, N by N, array to describe it; a rank 3 tensor takes $N^3$ numbers (which you might try to arrange in an N by N by N cube); and a rank r tensor would take $N^r$ numbers to describe it. Now, although a (2,0) tensor, a (1,1) tensor, and a (0,2) tensor can all be represented by an N by N matrix of numbers, only the coordinate representation of a (1,1) tensor transforms the way the representation of a linear transformation should.

Transforming Like a Tensor

Sometimes people talk about "the way a matrix transforms", but, of course, a matrix is just an array of numbers, which don't "transform" in any particular way. What these people really mean is that there is a way in which the matrix representation of a linear transformation must transform in order to make sense. A linear transformation is a geometrical object; that means that you can talk about it without using any particular coordinate system, and you should be able to describe it properly in any coordinate system. Suppose we have a linear transformation F that maps the vector v to the vector z (look ma, no coordinates :-). Now, if you wanted to describe that in terms of a coordinate system A, you might have

$z^j = F^j{}_k v^k$

(note that k is summed over, by the summation convention, but j is not because it doesn't appear twice on one side of the equation, so the equation is true for each value of j). If you ignore all the raised and lowered index business, you'll see this is just the formula for a matrix multiplying a vector. In a different set of coordinates, B, you could also write down a formula

$z'^{m} = F'^{m}{}_l v'^{l}$

where the primed elements for the new coordinate system will have different values from the unprimed ones. Then you can ask, "how are the two related?" Since each basis must, by definition, span the vector space, each basis vector should be able to be represented by a linear combination of the other basis; thus, the two bases A and B should be related by a linear transformation L, called the coordinate transformation, which must be invertible, since we can go back and forth between the two representations. The basis vectors of A, $e_j$, can be written as a linear combination of the basis vectors of B, $e'_k$, as

$e_j = L^k{}_j e'_k$ or alternately $e'_k = (L^{-1})^j{}_k e_j$

where $L^{-1}$ is the inverse of L. Now if we try to represent the vector v in each basis, we'll get

$v = v'^{j} e'_j = v^j e_j = v^j L^k{}_j e'_k$

using our expression for the relationship of the bases. Now, in order for that equation to be true, we know the coordinate representation of v must transform like

$v'^{l} = L^l{}_k v^k$ and likewise $z'^{m} = L^m{}_j z^j$

If we plug these expressions for $z'^{m}$ and $v'^{l}$ into the equation for the action of F in coordinate system B, then we get

$L^m{}_j z^j = F'^{m}{}_l L^l{}_k v^k$

Multiplying each side from the left by $L^{-1}$ (the inverse of L) gives

$z^j = (L^{-1})^j{}_m F'^{m}{}_l L^l{}_k v^k$

In order for that equation to be true for any vector v that F might act on, and for the original equation we wrote for the action of F in terms of $F^j{}_k$ to hold, we must conclude that

$F^j{}_k = (L^{-1})^j{}_m F'^{m}{}_l L^l{}_k$

or to write it the opposite way

$F'^{j}{}_k = L^j{}_m F^m{}_l (L^{-1})^l{}_k$

Thus, simply by writing down the coordinate representation of this geometric object in different sets of coordinates we figured out how the numbers representing it must transform.
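As a sanity check on that derivation, here is a Python sketch with made-up 2-by-2 matrices for L and F. It computes z = F v in basis A, moves v and z to basis B with L, and confirms that the transformed matrix F' = L F L^-1 maps v' to z'. The helper names are mine.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]


def inverse_2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


L = [[2.0, 1.0], [1.0, 1.0]]    # made-up invertible coordinate transformation
F = [[0.0, -1.0], [1.0, 0.0]]   # components of the linear transformation in basis A
L_inv = inverse_2x2(L)

F_prime = matmul(matmul(L, F), L_inv)   # F' = L F L^-1

v = [3.0, 4.0]
z = matvec(F, v)            # z^j = F^j_k v^k in basis A
v_prime = matvec(L, v)      # v' = L v
z_prime = matvec(L, z)      # z' = L z
z_prime_via_F_prime = matvec(F_prime, v_prime)
print(z_prime, z_prime_via_F_prime)
```

The two printed vectors agree, so the same geometric statement "F maps v to z" holds in both sets of coordinates.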

Since we defined the coordinate representation of a tensor in terms of the basis vectors and basis one-forms, we have to write down how the old basis and new basis relate. We already said how the basis vectors relate, and we can do a similar thing for the basis one-forms. Again, we just write an expression down in each coordinate representation

$\omega^j(e_k) = \delta^j{}_k = \omega'^{j}(e'_k)$

$\omega^j(e_k) = \omega^j(L^l{}_k e'_l) = L^l{}_k \, \omega^j(e'_l)$

using our expression for the relationship between the bases and the fact that the one-form is linear. This leads us to

$L^l{}_k \, \omega^j(e'_l) = \omega'^{j}(e'_k)$

In order for this to be true for all j and k, a little algebra shows us that it must be the case that

$\omega'^{k} = L^k{}_j \omega^j$

Here's the algebra, which you can read if you like, or you can trust me and skip it:

By definition there's some $M^j{}_n$ such that $\omega^j = M^j{}_n \omega'^{n}$.

$L^l{}_k M^j{}_n \omega'^{n}(e'_l) = \delta^j{}_k$

$L^l{}_k M^j{}_n \delta^n{}_l = \delta^j{}_k$

Now, we can replace n by l everywhere on the left hand side, since n and l are summed over and all terms where n ≠ l are zero from the δ term. This gives, after rearranging the terms (which we can do, since they're just numbers)

$M^j{}_l L^l{}_k = \delta^j{}_k$

But that is just the matrix equation $M L = 1$ (by 1 I mean the identity matrix). Since we know already that L is invertible, we then know that $M = L^{-1}$, confirming the result above.
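If you skipped the algebra, a quick numerical check may be reassuring. This Python sketch takes a made-up invertible L, writes the new basis vectors and new basis one-forms in terms of the old ones (using components in basis A, where the old basis elements are just unit columns and rows), and confirms that each new one-form gives 1 on its own basis vector and 0 on the others.

```python
N = 2
L = [[2.0, 1.0], [1.0, 1.0]]         # made-up coordinate transformation, det L = 1
L_inv = [[1.0, -1.0], [-1.0, 2.0]]   # its inverse, worked out by hand

# New basis vectors: e'_k = (L^-1)^j_k e_j, so e'_k has components given by column k of L^-1.
new_basis_vectors = [[L_inv[j][k] for j in range(N)] for k in range(N)]

# New basis one-forms: omega'^k = L^k_j omega^j, so omega'^k has components given by row k of L.
new_basis_one_forms = [[L[k][j] for j in range(N)] for k in range(N)]


def pair(one_form, vector):
    """Act with a one-form on a vector, both given by components in basis A."""
    return sum(a * b for a, b in zip(one_form, vector))


# pairing[k][m] = omega'^k(e'_m), which should be the Kronecker delta
pairing = [[pair(new_basis_one_forms[k], new_basis_vectors[m]) for m in range(N)]
           for k in range(N)]
print(pairing)
```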

Armed with the relationships between the basis vectors and basis one-forms, we are prepared to figure out how the coordinate representation of a tensor transforms. Recall that the representation of a tensor T in basis A is

$T^{i,j,\ldots,k}{}_{l,m,\ldots,n} = T(\omega^i, \omega^j, \ldots, \omega^k, e_l, e_m, \ldots, e_n)$,

and it follows that in basis B the tensor T would be represented by

$T'^{i,j,\ldots,k}{}_{l,m,\ldots,n} = T(\omega'^{i}, \omega'^{j}, \ldots, \omega'^{k}, e'_l, e'_m, \ldots, e'_n)$.

By plugging in the expressions for B basis vectors in terms of the A basis vectors we can get the relationship between the two representations of T.

$T'^{i,j,\ldots,k}{}_{l,m,\ldots,n} = T(L^i{}_r \omega^r, L^j{}_s \omega^s, \ldots, L^k{}_t \omega^t, (L^{-1})^a{}_l e_a, (L^{-1})^b{}_m e_b, \ldots, (L^{-1})^c{}_n e_c)$
$= L^i{}_r L^j{}_s \cdots L^k{}_t \, T^{r,s,\ldots,t}{}_{a,b,\ldots,c} \, (L^{-1})^a{}_l (L^{-1})^b{}_m \cdots (L^{-1})^c{}_n$

That's the general rule, and, yes, I know it's a lot of indices. Let's go back and look at some specific cases. Here are several sorts of tensors, the name of the object associated with each, and the transformation rule in each case:

Type (0,0) Scalar

T' = T

Type (1,0) Vector

$T'^{j} = L^j{}_k T^k$

Type (0,1) One-Form

$T'_j = T_k (L^{-1})^k{}_j$

Type (1,1) Linear Transformation

$T'^{j}{}_k = L^j{}_l T^l{}_m (L^{-1})^m{}_k$

Type (0,2) Quadratic Form or Bilinear Form

$T'_{jk} = T_{lm} (L^{-1})^l{}_j (L^{-1})^m{}_k$

Type (2,0) Bivector

$T'^{jk} = L^j{}_l L^k{}_m T^{lm}$

You can see that the coordinate transformation for the elements of a (1,0) tensor is the same as the one for the components of a vector, and of the rank 2 tensors only the elements of the (1,1) tensor representation transform like those of a linear transformation, agreeing with our identification of that as a linear transformation earlier.
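To see why it matters which rule you use, here is a Python sketch with a type (0,2) tensor. It assumes a made-up transformation L and takes the ordinary dot product as the bilinear form in basis A. With the (0,2) rule the number g(u, v) is the same in both bases; if you wrongly transform the same array with the (1,1) rule, it is not.

```python
N = 2
idx = range(N)
L = [[2.0, 1.0], [1.0, 1.0]]         # made-up coordinate transformation
L_inv = [[1.0, -1.0], [-1.0, 2.0]]   # its inverse, worked out by hand

g = [[1.0, 0.0], [0.0, 1.0]]   # components g_lm in basis A (here, the dot product)
u = [3.0, 4.0]
v = [1.0, -2.0]


def apply_form(form, a, b):
    """Evaluate form_jk a^j b^k."""
    return sum(a[j] * form[j][k] * b[k] for j in idx for k in idx)


# Vector components transform as u'^j = L^j_k u^k.
u_prime = [sum(L[j][k] * u[k] for k in idx) for j in idx]
v_prime = [sum(L[j][k] * v[k] for k in idx) for j in idx]

# Correct (0,2) rule: g'_jk = g_lm (L^-1)^l_j (L^-1)^m_k
g_prime = [[sum(g[l][m] * L_inv[l][j] * L_inv[m][k] for l in idx for m in idx)
            for k in idx] for j in idx]

# Wrong rule for this object: treating the same array as a (1,1) tensor.
g_as_mixed = [[sum(L[j][l] * g[l][m] * L_inv[m][k] for l in idx for m in idx)
               for k in idx] for j in idx]

value = apply_form(g, u, v)
value_prime = apply_form(g_prime, u_prime, v_prime)
value_wrong = apply_form(g_as_mixed, u_prime, v_prime)
print(value, value_prime, value_wrong)
```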

I should pause here to talk about terminology. The indices that are raised and correspond to the part of the tensor that transforms like the elements of a vector are often called the contravariant indices. The indices that are lowered and transform like the elements of a one-form are called the covariant indices. A type (p,q) tensor is sometimes said to be "p times contravariant" and "q times covariant".

We're almost done with this section, so I want to emphasise that the point here, even if you did not follow the algebra, is that a tensor is a geometrical object that we can discuss without using coordinates, but because it can be described in any system of coordinates, we know that a coordinate representation of it must follow certain transformation rules, which we have derived. When people say that something "transforms like a tensor", they simply mean it follows the general transformation rule we wrote down above. Given a set of elements in two different bases, you can check if they are related by the tensor rule to find out if those things are representations of a tensor. This can be useful, because then you know that you can use that tensor quantity in any coordinate system. I would look at these transformation laws as a consequence of a tensor's geometrical nature not as the definition of a tensor. The advantage of looking at it this way is that, I hope, it's a bit less mysterious than the sort of seemingly circular definition that began this write-up.

At this point, you may wonder, "if you have a set of elements in one set of coordinates, why not just choose the tensor that they correspond to?" meaning, define the array in other coordinates using the tensor transformation rule. Voila, a tensor! What's all this business of checking if it transforms? It's true that if you have an array of $N^r$ numbers in a particular coordinate system, you can pick out a rank r tensor that has that array as its representation in that coordinate system. This is no longer so simple, however, when we talk about tensor fields.

Tensor Fields

What is a tensor field? First let's go back and think about what a vector field is. A vector field is an assignment of a vector to each point in a space (or to each point of what mathematicians would call a manifold), which is just a function that maps points in the space to vectors. At each point p we could assign any vector in a vector space, known as the tangent plane at p. A one-form field would be an assignment of a one-form at each point in space. We could have a one-form field act on a vector field, which would just mean having the one-form we assigned to a point act on the vector we assigned to that point, producing a number at that point. Another way of putting that is that a one-form field acting on a vector field produces a function. A tensor field, then, is nothing more than the assignment of a tensor to each point. If we have a tensor field of type (p,q) defined on a space, then it can act on p one-form fields and q vector fields to produce a scalar function.

Again, in order to describe the geometric objects in this space, we use coordinates [2]. Suppose you have a set of coordinates describing the space. Now, when we talk of coordinates describing a general space or surface, we're no longer talking about describing a vector space with a basis; these could be any arbitrary set of coordinates, as long as they make sense. Think, for example, of describing positions in Euclidean space using cylindrical coordinates or describing where you are on the surface of a sphere using spherical polar coordinates [3]. The point is that now our coordinate transformations are just some smooth invertible mapping from one set of coordinates into another [4]. When I say the mappings must be smooth, what I mean is that they must be differentiable, which tells us that if we look in a small enough local neighborhood of one point then the coordinate transformation does look like just an invertible linear transformation. That local, linear map is called the differential map of the transformation. Since the vectors and tensors are defined at a point, it is this local, linear transformation that acts on their coordinate representation. The direction of increasing coordinate $x^j$ defines the basis vector $e_j$ at each point. This is exactly how the $e_r$ and $e_\theta$ unit vectors are defined in polar coordinates. The differential map of the coordinate transformation does the change of basis from the one described by $x^j$ to the new one described by the direction of increasing $x'^{k}$.

If we have a vector field defined as a function of some coordinates $x^j$ and we transform to new coordinates $x'^k$, then

$v'^j = (\partial x'^j/\partial x^k)\, v^k$

In other words, all the transformation laws for the elements of a tensor in coordinate representation will be the same as the ones we derived before, except at each point the transformation is

$L^j{}_k = \partial x'^j/\partial x^k$ evaluated at point p.

Of course the transformation will be different at each point, since the derivatives of the map are different at each point, in general. If the coordinate transformation is linear, then L is just the coordinate representation of the map at each point, and everything agrees with what we said earlier for a tensor defined only at one point.
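Here is a rough Python sketch of that point-dependence, using the map from Cartesian to polar coordinates on the plane. The helper `numerical_jacobian` is my own illustrative name; it estimates the matrix of partial derivatives by central differences rather than by any formula from this writeup.

```python
import math


def to_polar(x, y):
    """Coordinate transformation (x, y) -> (r, theta)."""
    return (math.hypot(x, y), math.atan2(y, x))


def numerical_jacobian(transform, point, h=1e-6):
    """Estimate L[j][k] = d x'^j / d x^k at `point` by central differences."""
    n = len(point)
    L = [[0.0] * n for _ in range(n)]
    for k in range(n):
        forward = list(point)
        backward = list(point)
        forward[k] += h
        backward[k] -= h
        new_plus = transform(*forward)
        new_minus = transform(*backward)
        for j in range(n):
            L[j][k] = (new_plus[j] - new_minus[j]) / (2.0 * h)
    return L


# The same coordinate transformation gives a different linear map at each point.
L_a = numerical_jacobian(to_polar, (1.0, 0.0))
L_b = numerical_jacobian(to_polar, (0.0, 2.0))
print(L_a)
print(L_b)
```

At (1, 0) the matrix comes out close to the identity, while at (0, 2) it is close to [[0, 1], [-0.5, 0]], so a vector with the same Cartesian components gets different polar components depending on where it sits.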

One could easily write down a function in some coordinates that gives an array of numbers for each point but does not behave correctly when you transform coordinates. Let's look at an example of a function that represents a vector field and one that doesn't. We'll work in good old 2-D Euclidean space. Suppose I use ordinary Cartesian coordinates to define the function

$f(x,y) = x\, e_x + y\, e_y$

That's just a set of vectors that point away from the origin, and the magnitude of the vector at (x,y) is equal to the distance from the origin. Now suppose that I try to express this in polar coordinates. First we need the relationship between basis vectors at each point in polar coordinates and the Cartesian ones:

$e_r = \cos\theta\, e_x + \sin\theta\, e_y$ and $e_\theta = -\sin\theta\, e_x + \cos\theta\, e_y$

Now we can write down the corresponding function in the new coordinates

$f(x(r,\theta), y(r,\theta)) = f'(r,\theta) = r\cos\theta\, e_x + r\sin\theta\, e_y = r\, e_r$

That matches the description I gave earlier, of course. Now let's see if this transformed version we just wrote out matches the transformation rule that we have already derived for the coordinate representation of a vector field. First, we need to know the partial derivatives of the coordinate transformation. The transformations are

$r = \sqrt{x^2 + y^2}$ and $\theta = \tan^{-1}(y/x)$

$\partial r/\partial x = x/\sqrt{x^2 + y^2} = r\cos\theta/r = \cos\theta$

$\partial r/\partial y = y/\sqrt{x^2 + y^2} = r\sin\theta/r = \sin\theta$

$\partial\theta/\partial x = -y/(x^2 + y^2) = -r\sin\theta/r^2 = -\sin\theta/r$

$\partial\theta/\partial y = x/(x^2 + y^2) = r\cos\theta/r^2 = \cos\theta/r$

Now, our transformation rule for the components of a vector field (a (1,0) tensor field) is

$v'^j = (\partial x'^j/\partial x^k)\, v^k$

which in this case means

$v^r = (\partial r/\partial x)\, v^x + (\partial r/\partial y)\, v^y$ and $v^\theta = (\partial\theta/\partial x)\, v^x + (\partial\theta/\partial y)\, v^y$

Plugging in the components of the vector in the original basis, $v^x = x$ and $v^y = y$, our transformation rule tells us that

$v^r = x\cos\theta + y\sin\theta = r\cos^2\theta + r\sin^2\theta = r$

$v^\theta = -x\sin\theta/r + y\cos\theta/r = -r\sin\theta\cos\theta/r + r\sin\theta\cos\theta/r = 0$

$v = v^j e_j = r\, e_r$

So we find that the transformation rule for the vector field agrees with what we get from the coordinate transformation. This means that this function actually represents a vector field, since it behaves the way we found that a coordinate representation of a vector field must behave.
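If you'd rather see numbers than trigonometry, the same check can be sketched in a few lines of Python. The function names are mine, and the point (3, 4) is an arbitrary choice; the matrix entries are just the four partial derivatives worked out above.

```python
import math


def jacobian_polar(x, y):
    """L[j][k] = d(r, theta)/d(x, y), from the partial derivatives above."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]


def transform_components(L, v):
    """Apply the rule v'^j = L^j_k v^k."""
    return [sum(L[j][k] * v[k] for k in range(len(v))) for j in range(len(L))]


x, y = 3.0, 4.0                      # arbitrary sample point, r = 5
v_cartesian = [x, y]                 # v^x = x, v^y = y
v_polar = transform_components(jacobian_polar(x, y), v_cartesian)
print(v_polar)                       # should be close to [r, 0]
```

Up to rounding, the result is (5, 0): the radial component equals r and the θ component vanishes, as in the hand calculation.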

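A slightly modified example shows where you have to be careful when making this comparison. Let

$f(x,y) = x\, e_x + xy\, e_y$.

Doing the coordinate transformation to polar coordinates the same way as before gives

$f(x(r,\theta), y(r,\theta)) = f'(r,\theta) = r\cos\theta\, e_x + r^2\sin\theta\cos\theta\, e_y = (r\cos^2\theta + r^2\sin^2\theta\cos\theta)\, e_r + (-r\sin\theta\cos\theta + r^2\sin\theta\cos^2\theta)\, e_\theta$.

On the other hand, using our transformation rule for a vector field gives

$v^r = (\partial r/\partial x)\, v^x + (\partial r/\partial y)\, v^y = x\cos\theta + xy\sin\theta = r\cos^2\theta + r^2\sin^2\theta\cos\theta$

and

$v^\theta = (\partial\theta/\partial x)\, v^x + (\partial\theta/\partial y)\, v^y = -x\sin\theta/r + xy\cos\theta/r = -\sin\theta\cos\theta + r\sin\theta\cos^2\theta$

Comparing these components to the r and θ components above, you'll notice that the r components match, but the θ components differ by a factor of r. That factor is not a failure of the transformation rule. The rule gives components with respect to the coordinate basis vector pointing in the direction of increasing θ, and that vector has length r, whereas the $e_\theta$ used in the substitution is a unit vector. Multiply $v^\theta$ by r and the two expressions agree, so f is a perfectly good vector field. In fact, any smooth choice of components in one coordinate system defines a vector field, as long as the components in every other system are obtained from the transformation rule. (In the first example $v^\theta = 0$, so the distinction never showed up.)

What can fail is a rule that hands you components separately in each coordinate system. For example, take the rule "the components of the vector at a point are the coordinates of that point". In Cartesian coordinates this gives $v^x = x$ and $v^y = y$, the field from the first example, whose polar components we found to be $v^r = r$ and $v^\theta = 0$. Applied directly in polar coordinates, however, the same rule gives $v^r = r$ and $v^\theta = \theta$. This means that when you change coordinates, using the rule gives you a different set of vectors. The vector field should remain the same, since changing coordinates should only be a change in the description, so this rule does not give a correct description of a vector field.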
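As a numerical sanity check on the two sets of components worked out above for f(x,y) = x e_x + xy e_y, here is a small Python sketch evaluated at one arbitrarily chosen point. The variable names are mine. Besides comparing the components, it also compares the θ component along the unit vector with r times the θ component from the transformation rule.

```python
import math

x, y = 2.0, 1.0                        # arbitrary sample point
r, theta = math.hypot(x, y), math.atan2(y, x)
c, s = math.cos(theta), math.sin(theta)

fx, fy = x, x * y                      # Cartesian components of f

# Components along the unit vectors e_r and e_theta (direct substitution).
unit_r = fx * c + fy * s
unit_theta = -fx * s + fy * c

# Components from the rule v'^j = (dx'^j/dx^k) v^k.
rule_r = c * fx + s * fy
rule_theta = (-s / r) * fx + (c / r) * fy

print(unit_r, rule_r)                  # the r components agree
print(unit_theta, rule_theta)          # the theta components do not
print(unit_theta, r * rule_theta)      # but they agree after scaling by r
```

The r components agree, and the two θ components differ by exactly a factor of r, which is the length of the coordinate basis vector in the direction of increasing θ.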

Thus, if you write down a function of coordinates that gives you an array of $N^r$ elements, it's not necessarily a representation of a tensor. You also have to verify that it transforms as a representation of a tensor must transform, according to the transformation rule we talked about earlier. This might sound fairly confusing, but it is often not so bad; given some valid expressions for tensors, it is easy to construct others. For example, if you have a valid expression for a vector field and one for a one-form field, then letting the one-form field act on the vector field will give you a valid scalar field, a type (0,0) tensor field that just assigns a scalar value to each point in space. I said that, at a point, a type (0,0) tensor is "just a number", and the same is true at each point for a type (0,0) field, but the important thing is that this function will still give the same value at each point after a coordinate transformation. Just as with vector fields, you could write down many functions of coordinates that would not give the same values after a coordinate transformation. Putting together two tensors to form a valid tensor of lower rank is called contracting the two tensors together.
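The claim that a contraction gives the same number in every coordinate system can be checked with a short Python sketch. I'm assuming here, as in the earlier single-point discussion, that vector components transform with the matrix L while one-form components transform with its inverse; the sample components and function names are invented for the example.

```python
import math


def jacobian_polar(x, y):
    """L[j][k] = d(r, theta)/d(x, y) at the point (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]


x, y = 2.0, 1.0                        # arbitrary sample point
v = [x, x * y]                         # hypothetical vector components (Cartesian)
w = [y, 3.0]                           # hypothetical one-form components (Cartesian)

L = jacobian_polar(x, y)
L_inv = inverse_2x2(L)                 # L_inv[k][j] = d x^k / d x'^j

# Vectors pick up L, one-forms pick up the inverse of L.
v_new = [sum(L[j][k] * v[k] for k in range(2)) for j in range(2)]
w_new = [sum(L_inv[k][j] * w[k] for k in range(2)) for j in range(2)]

scalar_old = sum(w[k] * v[k] for k in range(2))
scalar_new = sum(w_new[j] * v_new[j] for j in range(2))
print(scalar_old, scalar_new)
```

The individual components change, but the contracted value is 8 in both coordinate systems (up to rounding), which is what it means for the result to be a type (0,0) tensor.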

In practice, most of the time when you want to talk about tensors, you'll be talking about tensor fields. This is so much the case that people usually refer to tensor fields simply as "tensors". Some people wouldn't even bother to define a tensor acting only at one point, especially if they are defining tensors in terms of the way their elements change under a coordinate transformation. It's certainly sensible to talk about tensors at a point, and I hope it was helpful to separate out the basic properties of tensors and to see how the behavior of their representations under coordinate transformations relates to the behavior of the elements of, say, a linear transformation under a change of basis. The point is simply that you must be aware that some people will mean what I've called a "tensor field" when they say "tensor", but it will generally be clear from the context.

Metrics: Raising and Lowering indices

In many cases the space your tensor, or tensor field, is defined on will have an inner product (or at least some kind of quadratic form)[5]. In Riemannian geometry, and most of the time in Physics, the space will have a metric tensor, which is a type (0,2) tensor often denoted g, where g(v,v) gives the square of the length of the vector v (though this may not be the normal length as measured in Euclidean space). You're already familiar with at least one metric tensor, because in Euclidean space the metric is just the dot product. More importantly for our current purposes, a metric allows you to relate each vector to a one-form in a meaningful way. I said earlier that there is a one-to-one correspondence between vectors and one-forms, which means you could map each vector to a one-form; however, there are an infinite number of such maps, so without a metric there's no real way to say that for a vector v any particular one-form is more closely related to v than any other. The metric can give us such a specific relation.

Given a vector v, we can define a one-form, which I will write in a new notation v*, by the rule

v*(z) = g(v,z) for any vector z

This one-form v* that we have associated with v is called the dual of v. As I've been using throughout, the components of a one-form in a coordinate system are written with lowered indices, which is how you tell the components of v* from the components of v. In coordinate representation the above definition of the dual reads simply

$v_j z^j = g_{jk} v^k z^j$

Or, since this must hold for any z,

$v_j = g_{jk} v^k$

Notice that in coordinate notation, the components of the vector and its dual are just related by a matrix of numbers, the components $g_{jk}$. If that matrix is invertible (i.e. if its determinant is non-zero), then the mapping is one-to-one and we can invert it to get a mapping from one-forms to vectors as well, so to each one-form μ we can associate a vector μ*, which is also known as the dual of μ.

μ(z) = g(μ*,z) for any vector z

We now have a dual mapping that we can use to go back and forth between vectors and one-forms in a useful way. The relation of a one-form to its dual is written in coordinate notation as

$\mu^j = g^{jk} \mu_k$

The $g^{jk}$ are the elements of the inverse of the matrix $g_{jk}$, and they are the coordinate components of a new, type (2,0) tensor that I will write (for the coordinate-free object) as $g^{(2,0)}$. The rule defining this object is that

$g^{(2,0)}(\mu,\sigma) = g(\mu^*,\sigma^*)$

using the dual map from one-forms to vectors. Since the elements $g^{jk}$ are defined by taking the inverse of the matrix with elements $g_{jk}$, we also have the important relationship that

$g^{jl} g_{lk} = \delta^j{}_k$
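To see the dual mapping do something non-trivial, here is a minimal Python sketch using plane polar coordinates. It assumes the standard metric for those coordinates, with components 1 and r² on the diagonal in the coordinate basis; that metric is not derived anywhere in this writeup, and the function names and sample numbers are my own.

```python
def polar_metric(r):
    """Assumed g_jk for plane polar coordinates (r, theta), coordinate basis."""
    return [[1.0, 0.0],
            [0.0, r * r]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]


def lower_index(g, v_up):
    """v_j = g_jk v^k: map a vector to its dual one-form."""
    return [sum(g[j][k] * v_up[k] for k in range(2)) for j in range(2)]


def raise_index(g_inv, w_down):
    """mu^j = g^jk mu_k: map a one-form back to its dual vector."""
    return [sum(g_inv[j][k] * w_down[k] for k in range(2)) for j in range(2)]


r = 2.0
g = polar_metric(r)
g_inv = inverse_2x2(g)

v_up = [3.0, 0.5]                      # hypothetical components (v^r, v^theta)
v_down = lower_index(g, v_up)          # components of the dual one-form
round_trip = raise_index(g_inv, v_down)

# Check that g^jl g_lk is the Kronecker delta.
delta = [[sum(g_inv[j][l] * g[l][k] for l in range(2)) for k in range(2)]
         for j in range(2)]
print(v_down, round_trip, delta)
```

Lowering the index turns (3, 0.5) into (3, 2), so the components of a vector and its dual really are different here, and raising the index again recovers the original vector.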

Because of the notation of using raised indices for the components of vectors and lowered ones for the components of a one-form, the process of mapping one-forms to vectors is sometimes called "raising an index", and mapping from vectors to one-forms is called "lowering an index". Using the dual mapping you can raise or lower an index in any sort of tensor just by applying the dual mapping to the corresponding argument. For example,

$F^{(2,0)}(\mu,\sigma) = F^{(1,1)}(\mu,\sigma^*)$

In a more general case, the component notation expression would look like

$T^{i j \ldots k}{}_{l m \ldots n} = g^{js}\, T^{i}{}_{s}{}^{\ldots k}{}_{l m \ldots n}$

As I said above, when working in Cartesian coordinates in Euclidean space, the metric is just the dot product, which is the metric whose matrix of elements $g_{jk}$ is the identity matrix. That means that the components of a vector and its dual are the same. The components of the vector can be thought of as the elements of a column vector, and the dual can be thought of as the corresponding row vector, which of course has the same elements. Because the dual mapping is so easy in this case, we just always represent any rank 2 tensor as a type (1,1) tensor by taking the transpose of one of the vectors, which is why we aren't used to worrying about the different sorts of transformation rules. In more complicated coordinates, and especially in curved spaces, however, the metric is not so simple and, consequently, neither is the dual mapping. In that case we must be very careful to distinguish between the components of a vector and those of its dual, as well as between (1,1), (2,0), and (0,2) type tensors.

Tensors in Physics and Mathematics

Tensors are useful for modeling multilinear maps on a space or surface which may be described in many different sets of coordinates. This is useful in mathematics for, among other things, describing geometry, which deals with the intrinsic properties of a space. You may have been tipped off to this when I kept calling a tensor a geometrical object. The field of differential geometry is described largely in terms of tensors, with perhaps the most important one being the Riemann curvature tensor. From the Riemann tensor one can also derive an important type (0,0) tensor, the Gaussian curvature. The important feature of tensors is that they allow you to write equations that are true in all coordinate systems, which is what you need if you're going to try to express intrinsic properties of space.

Since I'm a physicist, not a mathematician, I'll move on to applications of tensors in Physics. Well, what I really mean, of course, is that they're useful for representing physical quantities, so really they could come up in any science or in Engineering. The examples are surely too numerous to mention, so I'll only talk about a few that come to mind.

The first context in which I ever heard about a tensor was in Newtonian mechanics, when talking about the moment of inertia of a rigid rotating body. You can go to that node if you want to know more about the physical meaning of the moment of inertia, but the important part for our current discussion is that the moment of inertia, I, relates the angular momentum vector, L, to the angular velocity vector, ω.

L = I(ω)

(Note, ω here is the angular velocity vector, not a one-form). In the simple situation that one might study in an introductory mechanics course, ω and L are always parallel thanks to symmetry, so the moment of inertia can be modeled simply as a number. In more general situations, though, this is not the case and we have to use the full form as a linear transformation, which can also be thought of as a (1,1) tensor. This is perhaps one of the few instances where we're actually concerned with a tensor at a point and not a tensor field. The reason that people will often refer to this explicitly as a tensor is that it's often advantageous to change coordinates when doing a rotational problem. You sometimes want to work in the co-rotating frame (meaning, pretend you're spinning with the object and work the problem from that point of view). When you do this transformation, it's important to be able to write down the correct form of the moment of inertia in that new set of coordinates, which means treating it as a tensor. Another interesting point is that the moment of inertia can also be used to calculate the kinetic energy of a rotating object with the formula

K = 1/2 I(ω, ω)

but now with I as a type (0,2) tensor, meaning we've lowered one of the indices. Again, in Cartesian coordinates on Euclidean space this just amounts to taking the transpose of ω and dotting it into the equation for L (and multiplying by 1/2).
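Here is a small Python sketch of both uses of the moment of inertia, as a linear map giving L and as a quadratic form giving K. It assumes the standard point-mass formula for the components in Cartesian coordinates, I_jk = Σ m (|x|² δ_jk − x_j x_k), which I haven't derived in this writeup; the two-mass "dumbbell" and all the names are invented for the example.

```python
def inertia_tensor(masses, positions):
    """Assumed point-mass formula: I_jk = sum of m * (|x|^2 delta_jk - x_j x_k)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, p in zip(masses, positions):
        r2 = sum(c * c for c in p)
        for j in range(3):
            for k in range(3):
                I[j][k] += m * ((r2 if j == k else 0.0) - p[j] * p[k])
    return I


def angular_momentum_of(I, omega):
    """L = I(omega): the moment of inertia used as a linear transformation."""
    return [sum(I[j][k] * omega[k] for k in range(3)) for j in range(3)]


def kinetic_energy_of(I, omega):
    """K = 1/2 I(omega, omega): the same array used as a quadratic form."""
    return 0.5 * sum(omega[j] * I[j][k] * omega[k]
                     for j in range(3) for k in range(3))


# Hypothetical dumbbell lying along the diagonal of the x-y plane.
masses = [1.0, 1.0]
positions = [(1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
omega = [0.0, 1.0, 0.0]                # spinning about the y axis

I = inertia_tensor(masses, positions)
angular_momentum = angular_momentum_of(I, omega)
kinetic_energy = kinetic_energy_of(I, omega)
print(angular_momentum, kinetic_energy)
```

With ω along the y axis the angular momentum comes out as (-2, 2, 0), which is not parallel to ω, so a single number could not do the job here. The kinetic energy is 1, matching what you get by adding up ½mv² for the two masses directly.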

Another place that tensors come up in Newtonian mechanics is in the mechanics of continuous systems (meaning solids and fluids). To understand the motion of 3D systems, like fluids and solids, you need to think about not just pressure and tension but also shear stress. You can combine these into a rank two tensor on the three dimensions of space called the stress tensor. Mechanical engineers have to worry a lot about stress tensors, but to a physicist this is just another classical field theory, which in general is the study of the mechanics of extended objects.

When studying a field theory, physicists like to expand those three directions of space to four dimensions of space and time. In that case, we add energy density and energy fluxes to the stress tensor to create the stress-energy tensor, also known as the energy-momentum tensor. Besides fluid mechanics, another field theory in which the energy-momentum tensor comes up frequently is the theory of electromagnetism. Conservation of energy, for example, is best described in terms of the energy-momentum tensor in E&M. Looking at things in terms of space and time together in this form makes even more sense when you add special relativity into the mix. Of course, other tensorial quantities come up in the study of electromagnetism and other field theories, such as the Maxwell field tensor, which contains all the components of the electric and magnetic fields. This is necessary because in relativistic E&M, neither the electric nor the magnetic field makes sense alone, only when viewed together as the Maxwell field tensor. Indeed, in relativistic E&M Maxwell's equations are written in terms of tensors. I've said that tensors are useful in classical field theory, but this, of course, applies to quantum field theory as well.

While you can learn quite a bit of physics without tensors, once you get to relativity they become practically indispensable. In special relativity you are concerned with how observations relate in different inertial reference frames, which are just different coordinate systems on spacetime. As a result, transformation properties become supremely important. Much of the early study in SR is spent figuring out which quantities "transform like vectors" and are, therefore, said to be 4-vectors. We also finally have to worry about duals, since distances in spacetime are measured in a different way. Spacetime is not a Euclidean space but a Minkowski space, so now there's a difference between a tensor and its dual even in simple coordinates.
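To illustrate why duals can no longer be ignored, here is a minimal Python sketch in Minkowski space. It assumes the sign convention (-,+,+,+) for the metric and units in which c = 1, neither of which is fixed anywhere in this writeup (the opposite sign convention is just as common), and it uses the standard Lorentz boost along x. The names and the sample event are invented for the example.

```python
import math

# Assumed Minkowski metric, signature (-,+,+,+), in units where c = 1.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def lower_index(v):
    """v_j = eta_jk v^k: components of the dual of a 4-vector."""
    return [sum(ETA[j][k] * v[k] for k in range(4)) for j in range(4)]


def interval(v):
    """The invariant v_j v^j, which plays the role of the squared length."""
    return sum(a * b for a, b in zip(lower_index(v), v))


def boost_x(v, beta):
    """Standard Lorentz boost along x with velocity beta (as a fraction of c)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    t, x, y, z = v
    return [gamma * (t - beta * x), gamma * (x - beta * t), y, z]


event = [2.0, 1.0, 0.0, 0.0]           # hypothetical components (t, x, y, z)
lowered = lower_index(event)
s2 = interval(event)
s2_boosted = interval(boost_x(event, 0.6))
print(lowered, s2, s2_boosted)
```

Even in these simple coordinates, the dual has components (-2, 1, 0, 0) rather than (2, 1, 0, 0), and the contraction of the vector with its dual is the same before and after the boost (up to rounding).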

Moving on to general relativity we are concerned not only with inertial reference frames but any coordinate system, so now everything must be expressed in terms of tensors. In fact, the fundamental equation of GR, the Einstein equation, is written in terms of tensors. GR is an application of differential geometry, so here we see the Riemann curvature tensor as well.

I hope this section has made it clear that tensors aren't just something to play with on the blackboard; they pop up constantly when trying to describe nature, for applied mathematicians, physicists, engineers, and probably others as well.

Some of the Stuff I've Left Out

Because tensors are so useful and are essentially the language in which differential geometry and much of physics is spoken, there are a lot of related topics I haven't discussed; this has really just been the tip of the iceberg. In no particular order, here is a list of other topics that I haven't really discussed: differential forms, the tensor product, the wedge product, exterior derivatives, parity and pseudotensors, tensor densities, countless concepts from differential geometry (like the covariant derivative), and generally a lot more could be said about tensor algebra and tensor analysis.


  1. Here we would generally think of it as a column vector. That's how we'll think of all vectors unless we explicitly say otherwise.
  2. In fact, we use coordinates to describe the geometry of the space!
  3. Actually, spherical polar coordinates are singular at the poles, so they're not really quite what we want, but they do for getting the general idea.
  4. On an arbitrary smooth surface things are a bit more complex, because we may actually need several coordinate patches to describe the whole surface in a properly smooth and non-singular way.
  5. In general relativity, you don't have an inner product, but you do have a "metric" quadratic form, since it's a pseudo-Riemannian manifold.

Special thanks to eien_meru for helpful discussions and generally being my guinea pig for this node. For anyone who is wondering, I haven't actually gone insane; when I started this node I didn't intend it to be nearly this long, but I found this was the length I needed to coherently explain what needed to be explained. That'll teach me to say, "I think I can explain this."

Ten"sor (?), n. [NL. See Tension.]

1. Anat.

A muscle that stretches a part, or renders it tense.

2. Geom.

The ratio of one vector to another in length, no regard being had to the direction of the two vectors; -- so called because considered as a stretching factor in changing one vector into another. See Versor.

 

© Webster 1913.
