It's amazing what people will nodeshell.

Georg Cantor used the diagonal argument as a way of explaining how the cardinality of the real numbers is somehow greater than that of the natural numbers, that is, that there are "more" real numbers than natural numbers.

It goes like this:

Assume that you can find a function that puts the real numbers into a one-to-one correspondence with the positive integers. Each real number has a decimal expansion. For example:

f(1) = 0.1231243412331212...
f(2) = 0.4359348729384792...
f(3) = 0.3304923870398473...
f(4) = ....


and so on.

Now, consider the real number whose decimal expansion whose first digit is obtained by taking the first digit of f(1) and adding 1, whose second digit is f(2)+1, and so on, always taking the nth digit of f(n) and adding 1. If a digit is 9, use 0 instead of adding 1.

This number is different from all the real numbers that f maps to, yet in order for f to be a one-to-one correspondence, it must exhaust the real numbers! This means that f cannot be a one-to-one correspondence. Since we placed no conditions on what f might be, this means no function between the integers and the real numbers can be a one-to-one correspondence.

Please note that this argument is Cantor's way of explaining his proof that for every set there exists a larger set in layman's terms. Kalmakka's little parlor trick is briefly interesting but harmless. The general proof isn't that hard to understand, but that's for another node.

Although the diagonal theory is an excellent one some people may find it conceptually difficult, so here is a way of showing that there are infinitely more real numbers than rational numbers.

Suppose you made a number line that was 1 foot wide and contained all numbers; it would be infinitely long and therefore it would have infinite area.

Now suppose you took a list of all the fractions (for how to list them see "infinity", and now suppose you took 1 square foot of paper. Cut it in half and put one half over the first fraction in the list, namely 0/1 or just 0. Now you could take the remaining paper, cut it in half, and put one half over the next fraction; 1/1.

You could continue this process until you had covered all the fractions in the list, although you would soon be adding strips of paper less than a billionth of an inch wide. They would still be wide enough because on a number line no rational number has any width at all.

This shows that while you need an infinite amount of paper to cover all real numbers on a number line you can use a finite amount of paper to cover all fractions.

If you need infinitely more paper to cover all real numbers than you do to cover all rational numbers then there must be infinitely more real numbers than natural numbers.

This doesn't constitute a mathematical proof but I found it made it easier to believe the diagonal argument.

The reasoning behind the following result is also sometimes known as the diagonal argument.

Proposition:

Let (X, d) be a complete metric space without isolated points (i.e. every ball contains a point other then the one it is centred on). Then X is uncountable.

Proof:

Suppose otherwise. X must be infinite, since otherwise every point would necessarily be isolated. Therefore there is a bijection f : x → N.

We now construct a sequence x0, x1, ... of points in X and a sequence r0, r1, ... of positive real numbers. Let x0 = f -1(0) and r0 = 1.

Given xk, rk the set B = {x ∈ X : d(x, xk) ≤ rk} \ {xk} is non-empty. Let xk+1 be the point in B which minimises f. Let rk+1 = min {rk/2, d(xk, xk+1)}.

The sequence xn is a Cauchy sequence, so since X is complete it converges to some limit x ∈ X. Inductively f(xk) ≥ k for all k. Hence x ≠ f -1(k) for any k, since xn lies in a closed ball not containing f -1(k) for n > k. This contradicts the bijectiveness of f.

Hence X is uncountable.


The reason why this is called the diagonal argument is that it is essentially an abstraction and generalisation of the proof for the uncountability of the reals given in the write-up above. We obtain a sequence by successively poking a point to ensure that it is distinct from each point of a given list, and then invoke completeness (this is implicit when we use decimal expansions) to show that the sequnce converges to a point not in the list.

Most references to Cantor's diagonal argument, such as Gorgonzola's, seem to give the appearance that it is impossible to create a list of real numbers that contains its own modified diagonal. To clarify (or confuse) matters, I will show that such a list is possible but that list will still be incomplete.

Example:
f(0)=.10000000000...
f(1)=.19999999999...
f(2)=.12931234134...
f(3)=.77799999999...
f(4)=.14159231415...
f(5)=.31245931234...
(Boldface digits for clarification)
By increasing the diagonal digits by one (wrapping around from 9 to 0), we get the number 0.200000000... which is the same as 0.19999999... = f(1).

Now, every number in this list, except the first, has the digit 9 in at one place or another. Therefore the list doesn't include numbers such as 1/9 = 0.11111... or 1/3 = 0.333333... . It's still incomplete.

Using the digit 9 all the way in the diagonal in unavoidable, as the only real numbers with dual representation are those with infinite trailing nines / zeroes.

Cantor diagonalization is a method of proof developed by the mathematician Georg Cantor in the late 1800's. It's a simple, elegant, and powerful form of proof by contradiction (a.k.a. reductio ad absurdum), and it has been used in one form or another in many of the most important mathematical results of the last century. (It's also known as diagonal proof, diagonal argument, or just plain old diagonalization. It's not to be confused with this, which is something totally different.) In their abstract form, diagonal proofs can be rather confusing, so we'll start with a simple and relatively concrete example: Cantor's proof that there are more real numbers than there are whole numbers.

"But wait," I hear you say, "aren't there an infinity of both whole numbers and real numbers? How can there possibly be more of one than the other?"

Well, it's true that there's an infinity of both. It's also true that there's an infinity of integers, an infinity of odd numbers, an infinity of perfect squares -- and it's even true that there are as many of each of those as there are of whole numbers. (If you're wondering how or why that's so, then I strongly recommend reading this node before proceeding.) Nontheless, there are MORE real numbers than there are of any of these, and we're going to prove it, using Cantor diagonalization. Before we begin, however, it's important that we review exactly what whole numbers and real numbers are:

The whole numbers, also known as the the positive integers, are simply the numbers we use to count:
{1, 2, 3, 4, 5, 6,...}

The real numbers are simply all of the rational numbers and all of the irrational numbers. You can find a good definition over at the real number node (and you can go here or here [or even here] for a more technical sort of definition), but the important thing to note is that all real numbers can be represented as non-terminating decimals.1 So, for example, all of the following are real numbers:
3 = 3.0000000000000.. .
42 = 42.000000000…
1/9 = 0.11111111111…
22/7 = 3.142857142857142857…
π = 3.141592653589793238…
√2 = 1.41421356...
e = 2.718281828...
√3 = 1.7320508...
φ = 1.6180339887...

The first few digits of other, less famous ones can be seen by typing a decimal point and then mashing your hand into the keypad for several minutes.

Now we can begin our proof. Since it's a proof by contradiction, we'll start by stating the opposite of what we want to prove, and then we will, in essence, go for the jugular: we will show that such an assumption leads directly to a contradiction.

Assumption: There are as many real numbers as there are whole numbers.

If this is the case, then it must be possible to make a numbered list of ALL the real numbers, thus setting up a one-to-one correspondence between the two sets of numbers. So, let's say we've got this list. Here it is:2

1) .14159265...
2) .41421356...
3) .34354357...
4) .25302211...
5) .34978923...
6) .11453934...
7) .12333222...
8) .34568285...
.
.
.

Now, have a look at the primary diagonal of the list; the first digit of the first number, the second digit of the second number, and so on:

1) .14159265...
2) .41421356...
3) .34354357...
4) .25302211...
5) .34978923...
6) .11453934...
7) .12333222...
8) .34568285...
.
.
.

Now, this diagonal is itself a real number, which we'll call b:

b = .11308925...

For those of you who find this "using the diagonal of the picture to define a number" thing somewhat wishy-washy (which it most certainly is not, but I won't argue the point here), here's an alternate definition of b:

Nth digit of b = Nth digit of the real number at place N on the list.

In other words, the first digit of b is the first digit of the first number on the list, and the second digit of b is the second digit of the second number on the list, etc. Of course, this is exactly what the diagonal is, and thus we have here a totally equivalent definition of b.

Now, go back and look at what we've done. Specifically, we've defined an real number (namely, b) in terms of the list that we have above of all real numbers.3 Ergo, the number we've created MUST be on our list somewhere, since our list is a list of all real numbers, and b is as real as they come. Since b must be on our list, it's got to have a place number, just like every other number on the list. Let's call this number B. Now we ask the question: what about the Bth digit of b? Using our second definition of b, we see that:

Bth digit of b = Bth digit of the real number at place B on the list.

Note the phrase in italics. The real number at place B on the list is, by definition, b. So we now have:

Bth digit of b = Bth digit of b.

Terribly convenient, don't you think? So far, no problems. But if you have the sense that something is about to go horribly, horribly wrong, you're absolutely correct. Here it comes:

Let's define another number in terms of our list. Specifically, let's take b and add 1 to every digit -- unless the digit is 9, in which case we take it back to 0. Technically, this operation is called "adding 1 mod 10," but there's no need for you to worry about such things. It makes more sense when you see it:

a = .22419036...

Compare this to b, and you'll see what I mean. All we did was add one to each digit, except for 9, which cycles back to 0. Simple enough. Here's our totally equivalent definition of a:

Nth digit of a = (Nth digit of the real number at place N on the list) + 1.

Now, a, just like b, is a real number. Therefore, just like b, it must appear on our list somewhere, and must therefore have a place number. Let's call that number A. Now we ask the same question as before: What about the Ath digit of a? Just like before, using our second definition of a, we see that:

Ath digit of a = (Ath digit of the real number at place A on the list) + 1.

Once again, note the phrase in italics. The real number at place A on the list is, by definition, a. So we now have:

Ath digit of a = (Ath digit of a) + 1.

Well, shit. That can't be good. Look at that. Or better yet, look at what happens when you subtract the "Ath digit of a" from both sides of the equation:

0 = 1

...so black is white, and we go get ourselves killed at the next zebra crossing. Seriously, though, we've derived a contradiction from our initial assumption that the real numbers are no more numerous than the whole numbers, and therefore our initial assumption must be false: there are, in fact, MORE real numbers than there are whole numbers. You can try to list them all, but once you make such a list, you'll always be able to find a real number that is NOT on the list through the magic of diagonalization.4

But don't change that dial, folks. There's more. This is just one example of a diagonal proof. It's the easiest to start with, but it's not the most elegant one around (just have a look at the footnotes to see what I mean). If you like, you can go home safe in your knowledge that the set of all reals is nonenumerable, and you'll have a pretty good idea of what Cantor diagonalization is. But if you want to see a truly elegant diagonal proof, stick around. We're just getting started.


Still here? Good. I knew you'd stay for the witty, non-interactive banter if for nothing else. Anyhow, for this bit, we need to define a bit of set-theory terminiology. Please don't run -- it's fairly simple. A subset of some set S is defined as a set with no members that are not in S. For example, let's look at a very simple set:

S = {Calvin, Hobbes}

S has four subsets:

  1. {Calvin, Hobbes}
  2. {Calvin}
  3. {Hobbes}
  4. {}, or Ø, the empty or null set.

We can also represent all possible subsets of S using a table:


   Calvin  Hobbes
1)   Y      Y
2)   Y      N
3)   N      Y
4)   N      N

So subset 1 has both Calvin and Hobbes in it, 2 has only Calvin, etc. It's the same as the list right above it, and both show that set S has 4 subsets.

Now we're ready. Let's go back to our old friend, the set of all whole numbers. We're going to prove, using Cantor diagonalization, that the number of subsets of the set of all whole numbers is bigger than the set of all whole numbers itself. In other words, we're going to prove that the set of all sets of whole numbers is nonenumerable. This result is known as Cantor's Theorem.

We start by assuming that there are not more subsets of the set of all whole numbers than there are whole numbers themselves. This means we can make a numbered list of all the subsets of the set of all whole numbers:

Is X in this set?
     1 2 3 4 5 6 7 8...
  1) Y N Y N Y N Y N...
2) N N N N N Y N N...
3) N Y N Y N Y N Y...
4) Y Y Y Y Y Y Y Y...
  5) N N N N N N N N...
6) Y N N Y N N N N...
  7) N Y Y N Y N Y N...
  8) Y Y N Y N N N Y...
.
.
.

This list is a bit confusing, so let me take a moment to explain. It's just like the Calvin and Hobbes chart, just infinitely big and with numbers. The numbers running down the left are the list numbers of the subsets. Running off to the right of each one is a series of Ys and Ns. Each Y or N answers the question, "Is the number at the top of this column a member of the subset that has the list number at the far left of this row?" So, for example, subset number 1) looks like this:

1) = {1, 3, 5, 7...}

Hopefully this makes the meaning of the chart clear to you. Now have a look at the diagonal of the chart:

Is X in this set?
     1 2 3 4 5 6 7 8...
  1) Y N Y N Y N Y N...
2) N N N N N Y N N...
3) N Y N Y N Y N Y...
4) Y Y Y Y Y Y Y Y...
  5) N N N N N N N N...
6) Y N N Y N N N N...
  7) N Y Y N Y N Y N...
  8) Y Y N Y N N N Y...
.
.
.

Let's define a set D based on the diagonal:

D = {1, 4, 7, 8...}

And here's a totally equivalent definition of D:

For any whole number x, x is a member of D if and only if x is a member of the xth set on the list.

This should be looking mighty familiar. Now, let's consider the complement of D, the set that contains all the whole numbers that D does not and vice versa:

C = {2, 3, 5, 6...}

And a totally equivalent definition of C:

For any whole number x, x is a member of C iff x is NOT a member of the xth set on the list.

This last definition is the key. Make sure you're clear on it before you read on. We're nearly done, don't worry. Now, C, being a set containing solely whole numbers, must be a subset of the set of all whole numbers. Thus, it must be on our list somewhere, and therefore it must have a place number on our list. Let's call this number c. Now, c must be a whole number, since it's one of our place numbers on our list. This brings us to the central question: is c a member of C? Let's use our definition of C to find out:

c is a member of C iff c is NOT a member of the cth set on the list.

Note the clause in italics VERY carefully. The cth set on the list is C, by definition. Therefore, we can substitute C in for the clause in italics, which leaves us with the following gem:

c is a member of C iff c is NOT a member of C.

This is, of course, a contradiction, and a particularly evil-looking one at that. Ergo, our initial assumption was false, and therefore we have proven exactly what we set out to prove; namely, Cantor's Theorem.

Diagonal proofs, despite the name, are generally done without pretty pictures and diagonal lines; they tend to be given in a wholly abstract form. While this makes them a bit harder to understand (as well as making the name rather strange-sounding until you see the pretty pictures), it does make it possible to give a diagonal proof in a (very short) paragraph. So, for your reading pleasure, I give you the above proof, minus the pictures.

Cantor's Theorem: The set of all sets of whole numbers (a.k.a. S) is nonenumerable.

Proof: Assume the opposite: that S is enumerable. If so, it is possible to uniquely assign each member of S a unique whole number as an index number. Now, consider the set D: for any whole number x, x is a member of D iff x is NOT a member of sx, the member of S for which it (x) is the index number. All members of D are whole numbers; therefore, D must be a member of S, and must therefore have an index number itself. Let D's index number be d. By the definition given above, d is a member of D iff d is not a member of the set for which it is the index number. However, the set for which d is the index number is D itself, and so we have:
d is a member of D iff d is not a member of D;
which is a contradiction. Thus, our initial assumption must be false, and therefore S is nonenumerable. QED.

That's Cantor diagonalization, in a nutshell. Pretty, isn't it?

Note that the whole thing feels rather deliciously twisted, somewhat like a snake eating its own tail. It's with good reason. The whole proof relies on indirect self-reference -- by using the list numbers, we've made it possible for a set to talk about itself. Have a look back at that paradoxical sentence in bold just above. This sentence, which is false, should remind you of it. Sorry. Sorry. What I meant to say is that the bold sentence above should be reminiscent of the sentence "This sentence is false". If c is a member of C, then c is not a member of C, in which case c is a member of C after all, &c. Both of these paradoxes stem from the fact that self-reference opens the door to all sorts of nasty business -- the same reason that there's a whole host of paradoxes and "limitative theorems" which are related to Cantor diagonalization.

Like I said, pretty.


Footnotes:
1: This isn’t to say that every real number has a UNIQUE representation as an infinite decimal. Certain rational numbers have two decimal expansions: one ending in an infinite string of zeroes, and one ending in an infinite string of nines. For example, 3 also equals 2.9999999..&c. This will be important later on, in footnote 4.

2: You've probably noticed that we're only using real numbers between zero and one. The reasons for our doing so will become clear once you've gone through the whole proof, but as it turns out, the number of real numbers between zero and one is the same as the number of real numbers everywhere. There's a nice proof of this here, though it will make more sense if you read this and this first.

3: Between zero and one, that is, but since b has to be on this interval too, we're safe.

4: Sharp-eyed readers may notice that this whole proof assumes that each real number has a UNIQUE decimal expansion, which we said was not the case in footnote 1. This is true. However, this doesn't kill diagonalization dead: to see why, have a look at Kalmakka's excellent and brief W/U immediately above this one.

Log in or registerto write something here or to contact authors.