The compression metaphor above is formalized by information entropy. Without entering equation territory, the question we ask comes in two parts:

  • For a given word size (two base pairs, three base pairs, ...), how unique is a given word in your genome?
  • What's the "total uniqueness" across all of those words?

Information entropy answers both questions in a way that's beautifully simple yet intricate, and it's closely connected to the compression answer.
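To make the two parts concrete, here's a minimal sketch in Python (the toy sequence and function names are my own illustrations, not anything from a real pipeline): the surprisal of one word, -log2 of its frequency, measures how unique that word is, and Shannon entropy is the frequency-weighted average of those surprisals, i.e. the "total uniqueness" per word.

```python
import math
from collections import Counter

def word_counts(genome: str, k: int) -> Counter:
    """Count every overlapping word (k-mer) of length k in the sequence."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

def surprisal(counts: Counter, word: str) -> float:
    """How unique one word is: -log2 of its frequency, in bits."""
    total = sum(counts.values())
    return -math.log2(counts[word] / total)

def shannon_entropy(counts: Counter) -> float:
    """'Total uniqueness': average surprisal over all words, in bits per word."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

genome = "ACGTACGTTTACGGA"          # toy stand-in for a real genome
counts = word_counts(genome, k=3)
print(surprisal(counts, "ACG"))     # uniqueness of one word (~2.12 bits)
print(shannon_entropy(counts))      # entropy of the whole word distribution
```

For k = 3 the entropy maxes out at log2(4^3) = 6 bits per word, which a uniformly random genome would hit; the further below that ceiling a real genome falls, the more compressible it is.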

If we know something about the structure, we might have to trot out Kolmogorov complexity instead: the digits of pi, for instance, look like high-entropy noise that won't compress well, but the knowledge that it's pi we're talking about gives us well-behaved series that regenerate the data from a program fitting in a really small space. If we don't know the structure, though, that's the question you should be asking: not how many bits of data, but how many bits of entropy?
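Here's a small demonstration of that gap (a sketch, not a proof: the generator below is Gibbons' well-known unbounded spigot algorithm, and the zlib ratio is just a sanity probe). A handful of lines regenerate as many digits of pi as you like, so the Kolmogorov complexity of the digit stream is tiny, yet a general-purpose compressor can't push the digits below the log2(10) ≈ 3.32 bits-per-digit entropy floor of random-looking decimal digits.

```python
import zlib
from itertools import islice

def pi_digits():
    """Gibbons' unbounded spigot: yields decimal digits of pi one at a time."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is certain: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough precision yet: fold in the next series term.
            q, r, t, n, k, l = (q * k, (2 * q + r) * l, t * l,
                                (q * (7 * k + 2) + r * l) // (t * l),
                                k + 1, l + 2)

digits = "".join(str(d) for d in islice(pi_digits(), 2_000))
ratio = len(zlib.compress(digits.encode())) / len(digits)
# Expect roughly 0.45: zlib only exploits that just 10 of 256 byte values
# occur (log2(10)/8 ≈ 0.42) and finds no structure beyond that, even though
# this whole program is a tiny generator for exactly the same data.
print(f"compressed/original: {ratio:.2f}")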