A matrix of Jordan form (also Jordan normal form, Jordan canonical form and classical canonical form) is a block matrix containing Jordan blocks along the diagonal and zeros elsewhere. For example:

|3 1 0 0 0 0 0|
|0 3 1 0 0 0 0|
|0 0 3 1 0 0 0|
|0 0 0 3 0 0 0|
|0 0 0 0 5 0 0|
|0 0 0 0 0 2 1|
|0 0 0 0 0 0 2|

All matrices are similar to a matrix in Jordan form. The process of finding the matrix in Jordan form that is similar to a given matrix is called Jordan matrix decomposition. Such matrices are useful for computional purposes. If you wanted to find the nth power of a matrix A, then you could compute it easily by first finding the matrices S and J such that:

A = SJS-1
An = (SJS-1)n = SJS-1SJS-1SJS-1...SJS-1 = SJIJIJ...S-1 = SJnS-1

Jn can be found by computing the nth power of each of of the individual Jordan blocks. The ability to easily compute large powers of matrices is useful for computing the long term behavior of a system of equations.


Sources:
http://en.wikipedia.org/wiki/Jordan_normal_form
Eric W. Weisstein et al. "Jordan Canonical Form." From MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/JordanCanonicalForm.html

Jordan forms — grow your own!

I remember how infuriating it was to read about the Jordan form when I was first learning linear algebra. Nobody ever wrote down what the algorithm was! They all blabbed on about generalised eigenvectors, generalised eigenspaces, minimal polynomial, algebraic multiplicity,and geometric multiplicity, and kept beating around the bush with all the abstraction without ever writing down a complete and concise procedure on how to put a matrix in Jordan form. Eventually, after reading at least three linear algebra textbooks (that actually went deep enough into the subject to talk about the Jordan form) I managed to decipher what the algorithm should be. I knew that it should be almost the same as the diagonalisation algorithm for a matrix, but with a little twist and a few more steps. It took a little effort. Eventually, I got it.

My purpose here is to save you the deciphering effort and to just tell you what the damn algorithm is. Plain language, or as plain as linear algebra ever gets. Warning: the algorithm that I present is the stupid algorithm. It's only meant to be used by hard-working adult humans and is not particularly recommended for computers or small children. I'm sure there are smarter numerical methods for computing Jordan forms if this is the sort of thing you need to do for a living (although matrices that cannot be diagonalised and require Jordan forms instead are relatively rare in Nature). In fact, one such numerical method can be found here:

http://portal.acm.org/citation.cfm?id=355912

The stupid algorithm I will be describing instead plays to this tune:

  1. Compute the characteristic polynomial.
    1. Subtract a variable down the diagonal of your matrix.
    2. Compute the determinant of resulting matrix
    3. Simplify resulting polynomial if necessary. This is the characteristic polynomial.
  2. Factor the characteristic polynomial.
    1. A difficult problem in general. Consult polynomial factoring algorithms.
    2. The roots of the characteristic polynomial are the eigenvalues of the matrix.
  3. Find the eigenvectors for each eigenvalue.
    1. Subtract one at a time each of the eigenvalues from the previous step down the diagonal of the matrix.
    2. Row-reduce the resulting matrices.
    3. Interpet row-reduced matrices as homogeneous systems of linear equations where all the constants on the right are zero, and write down as many linearly independent solutions as possible. These are the eigenvectors.
  4. Find generalised eigenvectors.
    1. If there aren't as many eigenvectors as size of the matrix, proceed with this step.
    2. Again subtract each eigenvalue down the diagonal of the matrix, and augment the resulting matrix on the right with the corresponding eigenvector. Row-reduce again.
    3. Interpret the row-reduced matrix as a system of inhomogeneous linear equations and solve. If there is more than one free variable, cut down the degrees of freedom by specifying that the solution must be orthogonal to one or more eigenvectors belonging to this eigenvalue (use enough orthogonality conditions until there is only one free variable.) A solution to this problem is a generalised eigenvector.
    4. Repeat step 4.c if necessary, augmenting the same matrix instead with the solution just found. Stop when there are as many solution vectors as size of matrix, or if no solution to the inhomogeneous system exists. In the latter case, pick another eigenvector and find its generalised eigenvectors.
  5. Form the transition matrix and the Jordan form.
    1. Put each eigenvector followed by its generalised eigenvectors in the order they were found as column vectors in a matrix. This is the transition matrix
    2. Conjugate the original matrix by the transition matrix (meaning multiply the original matrix on the left by the inverse of the transition matrix and on the right by the transition matrix). This is the Jordan form, which might be the diagonalisation of a matrix in case that Step 4 was superfluous.

I am going to demonstrate this algorithm below. We will be working throughout with the following nontrivial three (yikes!) examples of sizes 4 × 4, 3 × 3, and 2 × 2 because we are hard-core:

       A                  B                  C

[ -1 -3  3  0 ]     [ -2 -2  2 ]          [ -1 -6 ]
[ -2 -1  0 -3 ],    [ -2  1  1 ],  and    [  2  6 ].
[ -2  0 -1 -3 ]     [ -9 -7  7 ]
[  0  2 -2 -1 ]

Let's get to work. *cracks knuckles*

Step NĂºmero Uno: Compute the characteristic polynomial

An ominous melody slowly builds up.

To compute the characteristic polynomial, take your matrix, subtract a variable down the diagonal (traditionally λ or x; I'll use λ) and compute the determinant of the resulting matrix. In this case we are computing the determinants of

                               

              [ -1 - λ     -3        3        0   ]  
     A - λI = [   -2     -1 - λ      0       -3   ]  
              [   -2        0     -1 - λ     -3   ],    
              [    0        2       -2     -1 - λ ]

 


              [ -2 - λ    -2       2    ]                        [  -1 - λ    -6    ]
     B - λI = [   -2     1 - λ     1    ],     and      C - λI = [     2     6 - λ  ].
              [   -9      -7     7 - λ  ] 

"Determinant of a 4 × 4?" I hear you whine, maggot? Oh, boo-hoo! Real Mathematicians™ do this three times before breakfast in the morning. As I once remarked to a colleague, "I am not a pussy" (and you should say it too before going on), so we shall proceed with the computations. Now get down and give me 4!

For matrix A we begin with a Laplace expansion along the first row to obtain

                      |-1 - λ      0      -3  |     |-2     0      -3  |   |-2  -1 - λ    -3  |
det(A - λI) = (-1 - λ)|   0     -1 - λ    -3  |  + 3|-2  -1 - λ    -3  | +3|-2     0      -3  |.
                      |   2       -2    -1 - λ|     | 0    -2    -1 - λ|   | 0     2    -1 - λ|

Now we compute the three minor 3 × 3 determinants that arose.

det(A - λI) = (-1-λ)[(-1 - λ)3 + 6(-1 - λ) -6(1 - λ)] + 3[-2(-1 - λ)2 - 12 + 12] + 3[12 + 2(1 - λ)2 -12]

Most of this cancels to yield, in a pretty factorised form,

det(A - λI) = (-1 - λ)4.

With similar efforts but slightly fewer computations, we find that the other two characteristic polynomials are

det(B - λI) = 8 - 12λ + 6λ2 - λ3,
det(C - λI) = 6 - 5λ + λ2.

Le Step Deux: Factor the characteristic polynomial

A lively French interlude plays.

That is, take the characteristic polynomial and write it as a product of linear factors (which can be done in this case by the fundamental theorem of algebra, right?) so that we may find its roots. The roots of the characteristic polynomial are the eigenvalues of the corresponding matrix. We need these eigenvalues before we can proceed with the Jordan form.

Factoring polynomials is in itself a difficult problem in general (and actually one of the more stupid ways to find eigenvalues). In this case, however, it is actually rather easy. The characteristic polynomial for the A matrix came pre-factored during its computation, and the other two aren't hard to factor. This is because I deliberately chose computationally easy examples to work with so as to highlight the steps in the algorithm and not get bogged down with too many computations. It is not hard to see that

det(A - λI) = (-1 - &lambda)4,
det(B - λI) = (2 - λ)3,
det(C - λI) = (2 - λ)(3 - λ),

so that the eigenvalues are -1 and 2 for matrices A and B correspondingly, while matrix C has the two eigenvalues 2 and 3.

Step The Third: Find the eigenvectors for each eigenvalue

A recognisable and familiar hymn leads us on.

This means, take each eigenvalue, subtract it down the diagonal of the corresponding matrix, and row reduce the resulting matrix using the Gaussian algorithm. Then interpret this row reduced matrix as a homogeneous linear system and find as many linearly independent solution vectors as possible. These vectors are the eigenvectors corresponding to the eigenvalue you subtracted down the diagonal.


For matrix A this is done thusly:


          [  0 -3  3  0 ]                [ 0  -3  3   0]
          [ -2  0  0 -3 ]    R3-R2       [-2   0  0  -3]
  A + I = [ -2  0  0 -3 ]  -------->     [ 0   0  0   0]
          [  0  2 -2  0 ]  R4-(2/3)R1    [ 0   0  0   0]

  -R1     [ 2  0  0  3 ]             [ 2  0  0  3]
------->  [ 0  3 -3  0 ]  -------->  [ 0  1 -1  0]
  -R2     [ 0  0  0  0 ]  (1/3)R3    [ 0  0  0  0]
R1<->R2   [ 0  0  0  0 ]             [ 0  0  0  0]

Ok, so I did several row operations in one step, and I didn't do them in the strict order that the Gaussian row-reduction algorithm requires. Also, because I like integers, I didn't divide the first row by its leading term in order to get a leading one. These are merely cosmetic enhancements for human eyes. It should nevertheless be clear that except for dividing each row by its leading term, I have already put the matrix into reduced row-echelon form, and that we are not going to be getting any more rows of zeroes.

Now we reinterpret the row reduced matrix as a homogeneous system of linear equations where all the constant terms on the right are zero (we could have done this from the beginning), and see that there should be two linearly independent solutions and that

      u1       u2
     [ 3 ]  [  0 ] 
     [ 0 ]  [  1 ]
     [ 0 ]  [  1 ]
     [-2 ]  [  0 ]

are two such solutions (that is, matrix A + I sends those two vectors into zero when it multiplies them from the left).


In a similar manner, for matrix B we subtract 2 down the diagonal to obtain

            [ -4  -2   2 ]
   B - 2I = [ -2  -1   1 ]
            [ -9  -7   5 ]

and row reduce it to the form

   [  5   0  -2 ]
   [  0   5  -1 ]
   [  0   0   0 ]

where I have again opted for the cosmetic enhancement to not divide each row by its leading term. This homogeneous system only has one linearly independent solution, such as

     v1
   [  2 ]
   [  1 ]
   [  5 ]

For matrix C, we first subtract eigenvalues 2 and 3 down the diagonal to obtain

         [ -3  -6 ]             [ -4  -6]
C - 2I = [  2   4 ]    C - 3I = [  2   3],

which row-reduce to

      [ 1  2 ]    [ 2  3 ]
      [ 0  0 ]    [ 0  0 ]

yielding the eigenvectors

     w1      w2
    [ 2 ]  [  3 ]
    [-1 ]  [ -2 ].

Jordanian Step: Find generalised eigenvectors

An apocryphal extended piece fills the room.

For this step, if we still don't have enough eigenvectors (as many eigenvectors as the size of the matrix), then for each eigenvector vk, we solve the inhomogeneous linear systems (M - λI)vk + 1 = vk successively in order. Yes, more Gaussian algorithm, I'm afraid. If the system has a solution, it should have one-dimensional solutions, meaning, one free variable after we row-reduce, or one leading 1 less than the size of the matrix. It might have more, which will be the case only if there is more than one Jordan block with the same eigenvalue. In that case, we impose the further restriction that the solution vk+1 must be orthogonal to one or more of the eigenvectors belonging to this eigenvalue.

The system might also have no solution, which means that this particular eigenvector doesn't have more generalised eigenvectors. Basically, what happens is that each eigenvector will have a chain of generalised eigenvectors, and all the chains of generalised eigenvectors taken together will give us as many generalised eigenvectors as the size of the matrix. If a particular chain has stopped yielding eigenvectors, it is time to look at another eigenvector and see how many generalised eigenvectors it yields. We will do all of this with our ongoing examples.

All of this step and its various substeps may be superfluous if the matrix is already diagonalisable (has as many eigenvectors as its size).


We begin with matrix A. We have found in the previous step two eigenvectors, dubbed u1 and u2 Let us begin by looking for generalised eigenvectors for u1.

This amounts to solving the following inhomogeneous linear system. Observe that I augmented the matrix A + I on the right by the eigenvector u1 and added another row which says that the solution has to be orthogonal to the eigenvector u1 (I could have also chosen the solution to be orthgonal to u2 instead.)

      [  0 -3  3  0 |  3 ]
      [ -2  0  0 -3 |  0 ]
      [ -2  0  0 -3 |  0 ]
      [  0  2 -2  0 | -2 ]
      [  3  0  0 -2 |  0 ]

After computations that look awfully familiar, this row-reduces to

      [  1  0  0  0 |  0 ]
      [  0  1 -1  0 | -1 ]
      [  0  0  0  1 |  0 ] .
      [  0  0  0  0 |  0 ]
      [  0  0  0  0 |  0 ]

Observe that there is only one free variable, the third one from left to right. From this we can read that

      u3
    [ 0 ]
    [ 1 ]
    [ 2 ]
    [ 0 ]

is a generalised eigenvector.

We keep looking for generalised eigenvectors, this time augmenting the matrix with u3. But the resulting matrix

      [  0 -3  3  0 |  0 ]
      [ -2  0  0 -3 |  1 ]
      [ -2  0  0 -3 |  2 ]
      [  0  2 -2  0 |  0 ]
      [  3  0  0 -2 |  0 ]

row reduces to

      [  1  0  0  0 |  0 ]
      [  0  1 -1  0 |  0 ]
      [  0  0  0  1 |  0 ] ,
      [  0  0  0  0 |  1 ]
      [  0  0  0  0 |  0 ]

which is an inconsistent set of equations with no solution (the fourth row says that 0 = 1). Therefore, u1 only has the generalised eigenvector u3 and it follows that the last generalised eigenvector must come from u2.

We augment the same matrix with u2, and for variety's sake, we now require the solution to be orthgonal to u2 instead of u1:

      [  0 -3  3  0 |  0 ]
      [ -2  0  0 -3 |  1 ]
      [ -2  0  0 -3 |  1 ].
      [  0  2 -2  0 |  0 ]
      [  0  1  1  0 |  0 ]

The row-reduced matrix this time is

      [  2  0  0  3 | -1 ]
      [  0  1  0  0 |  0 ]
      [  0  0  1  0 |  0 ] ,
      [  0  0  0  0 |  0 ]
      [  0  0  0  0 |  0 ]

yielding the solution

     u4
   [  1 ]
   [  0 ]
   [  0 ].
   [ -1 ]

Since that gives us already four generalised eigenvectors total, there is no need to check if u4 will give us another eigenvector, because we know it won't. This concludes step 4 for matrix A.


Matrix B only has one eigenvector, so we know that there must be two generalised eigenvectors for this one eigenvector. Also, because there is only one eigenvector for this eigenvalue, there is no need to impose orthogonality conditions.

Proceeding as before, we augment the matrix B - 2I by the vector v1 on the right (look back to Step the Third), yielding

  [ -4  -2   2  | 2 ]
  [ -2  -1   1  | 1 ],
  [ -9  -7   5  | 5 ]

and the corresponding row reduction is

  [ 5  0  -2 | -2 ]
  [ 0  5  -1 | -1 ].
  [ 0  0   0 |  0 ]

A possible solution is

     v2
    [ 2 ]
    [ 1 ] ,
    [ 6 ]

which we use again to augment the matrix B - 2I like so:

  [ -4  -2   2  | 2 ]
  [ -2  -1   1  | 1 ],
  [ -9  -7   5  | 6 ]

reducing to

  [ 5  0  -2 | -1 ]
  [ 0  5  -1 | -3 ]
  [ 0  0   0 |  0 ]

and suggesting the solution

         v3
       [ 3 ] 
       [ 1 ] .
       [ 8 ]

Since that gives us three generalised eigenvectors for B of size three, we are done with the Jordanian step for the matrix B.


Since matrix C is of size two and has two ordinary eigenvectors, there is no need to perform this step on this matrix.

Grand Finale: Form transition matrix and the Jordan form

Cue dramatic music!

The hard work is done. How much of that did you follow? We can enjoy our dessert now, which simply consists of putting the (generalised) eigenvectors we found above into a pretty matrix as columns, taking care to put the generalised eigenvectors in the order that we found them following their ordinary eigenvector. This is the transition matrix. Then we conjugate the original matrix by this transition matrix (multiply the original matrix by the transition matrix on the right and by the inverse of the transition matrix on the left), and this is the Jordan form we sought.


The transition matrix for A is

 
        u1 u3 u2 u4
      [ 3  0  0  1 ]
  U = [ 0  1  1  0 ]
      [ 0  2  1  0 ]
      [-2  0  0 -1 ]

where u1 is preceded by u3 and u2 by u4 because u3 is a generalised eigenvector belonging to u1 and u4 belongs to u2. The Jordan form is therefore

JA = U-1AU

or

           [ -1  1  0  0 ]
           [  0 -1  0  0 ]
    JA  =  [  0  0 -1  1 ].
           [  0  0  0 -1 ]

Observe that there are two Jordan blocks, one for each eigenvector, and the lone eigenvalue -1 runs down the diagonal.


For B we find the transition matrix to be

          v1 v2 v3
        [ 2  2  3 ] 
    V = [ 1  1  1 ]
        [ 5  6  8 ]

where again observe that we put the two generalised eigenvectors following their ordinary eigenvector in the order that we found them. The Jordan form is JB = V-1BV or

         [ 2 1 0 ]
    JB = [ 0 2 1 ] .
         [ 0 0 2 ]

One Jordan block because B only had one eigenvector.


The last transition matrix is

        w1 w2
      [ 2  3 ]
  W = [-1 -2 ],

and the Jordan form JC = W-1CW is

          [ 2  0 ] 
     JC = [ 0  3 ].

Two trivial Jordan blocks because there were two eigenvectors, and the matrix is in fact diagonal because we didn't have to worry about generalised eigenvectors for this one.

Epilogue: Who cares about all this matrix crap?

Actually, not many people, to be frank. Numerical analysts have little use for the algorithm that I presented above, because as I said, they have better methods. In fact, the Jordan form is a bit of an artificial theoretical gadget, because practically all matrices can be diagonalised and don't need the Jordanian step above of finding generalised eigenvectors. Nothing wrong with theoretical gadgets, except that fewer loons are interested in them.

The Jordan form, although useful, is of more limited scope than the ordinary diagonalised form that most matrices have. If there is indeed a naturally-occuring matrix arising from practical problem out there that needs a Jordan form, (for example, some difference equations, discrete cousins of differential equations, may give rise to matrices that are not diagonalisable), then there is something very special and particular about this situation that merits very careful analysis.

So why did we go through all the above work with three different matrices, if it's all mostly useless? Because people have asked me how to compute Jordan forms by hand, and this writeup is my answer. Because theory is pretty. And because we can, and therefore we should.

Log in or registerto write something here or to contact authors.