The Ugly Duckling Theorem demonstrates that classification is impossible without some sort of bias. It is named for Hans Christian Andersen's famous story of The Ugly Duckling, because it shows that, all things being equal, an ugly duckling is just as similar to a swan as two swans are to each other. It was proposed and proved by Satosi Watanabe in 1969.

Here's the basic idea: say there are n things in the universe, and we want to put them in some sort of classes or categories. But we have *no* preconceived ideas or biases about what sorts of categories are "natural" or "normal" and what aren't. So we have to consider *all* the possible classes that could be, all the possible ways of making sets out of the n objects. There are 2^{n} such ways, the size of the power set of n objects. Maybe we can use that to measure the similarity between two objects: we'll see how many sets they have in common. Woops. We can't. *Any* two objects have *exactly* the same number of classes in common with one another, namely 2^{n-1} (half the total number of classes there are). To see this is so, you can imagine each class is a represented by an n-bit binary number, with a zero for each element not in the class and a one for each element in the class. As you can see, there are 2^{n} such numbers, which makes sense, of course. Because all possible choices of zeros and ones are there, any two bit-positions will agree exactly half the time. If you have trouble seeing that, pick two elements and reorder the bits so they're the first two, and imagine the numbers sorted lexicographically. If you think about it, the first 2^{n}/2 numbers will have bit #1 set to zero, and the second 2^{n}/2 will have it set to one. Within each of those blocks, the top 2^{n}/4 will have bit #2 set to zero and the other 2^{n}/4 will have it as one... so they agree on two blocks of 2^{n}/4 or on half of all the cases. No matter which two elements we pick! So if we have no preconceived bias about which categories are better, then by perfectly fair means of measuring, *everything* is equally similar (or equally dissimilar). The number of predicates simultaneously satisfied by two non-identical elements is constant over all such pairs. And is the same as the number of those satisfied by one. Yow.

So you can see we need some kind of inductive bias to make such judgments; some way to prefer certain categories over others.