A rule that describes how gene expression sometimes works.

A gene is a section of DNA that we say "codes for" something, such as an enzyme. An enzyme is a protein (that is, a large molecule made of amino acids) that catalyzes some reaction.

In 1902, Sir Archibald Garrod came up with the idea while researching a disease called alkaptonuria, which runs in families and causes the patient's urine to turn black or brown when exposed to air (for example, babies with it would have black urine stains in their diapers). Garrod proposed that this happened because these patients had a missing or broken version of an enzyme that normally breaks down alkapton (homogentisic acid, the chemical that was turning the urine dark). His reasoning: if each gene codes for one enzyme, then these patients had a broken or missing enzyme because the gene for that enzyme was broken or missing.

In the 1940s, George Beadle and Edward Tatum worked with the fungus Neurospora crassa, a bread mold, and showed that each of their many auxotrophic mutants (strains that cannot grow unless a particular nutrient is added to the medium) had a defect in one enzyme and a mutation in one gene. They established a one-to-one correspondence between genes and enzymes: One Gene, One Enzyme.

It was an important finding at the time, but as is often the case in science, we now know that it's not exactly true.

As summarized in the Central Dogma of Molecular Biology, DNA codes for (is used as a blueprint to make) a messenger RNA (mRNA) transcript, which in turn codes for a polypeptide or protein. The central dogma can be thought of as a more detailed statement of "one gene, one enzyme", but even it is an overgeneralization.
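The two steps of the Central Dogma can be sketched as a toy program. This is only an illustration under loud assumptions: the 12-base "gene" is invented, the input is taken to be the coding (sense) strand so that transcription reduces to swapping T for U (a real cell reads the complementary template strand), and the codon table holds only the handful of codons this example needs.

```python
# Partial codon table: only the codons used by this toy example.
# None marks a stop codon. Any other codon raises KeyError.
CODON_TABLE = {
    "AUG": "Met", "GCC": "Ala", "UGG": "Trp",
    "UAA": None, "UAG": None, "UGA": None,
}

def transcribe(coding_dna):
    """DNA -> mRNA, assuming the input is the coding strand."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """mRNA -> list of amino acids, reading codons until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue is None:  # stop codon: release the polypeptide
            break
        peptide.append(residue)
    return peptide

gene = "ATGGCCTGGTAA"  # invented sequence, not a real gene
mrna = transcribe(gene)
print(mrna)                       # AUGGCCUGGUAA
print("-".join(translate(mrna)))  # Met-Ala-Trp
```

One input sequence gives exactly one mRNA and one polypeptide here, which is the simple picture that the following paragraphs complicate.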

Although a gene does code for an mRNA, the gene also contains other sequences that don't make it into the mRNA, such as the promoter and the introns (introns are mainly a feature of eukaryotes; prokaryotes, meaning bacteria and archaea, almost never have them in their protein-coding genes).

In fact, in eukaryotes one gene can code for many mRNAs through the magic of alternative splicing: to remove the introns, the RNA has to be cut apart and spliced back together. By cutting and splicing in different places, some genes can generate more than one mRNA. The record holder usually cited is the fruit fly gene Dscam, which can in principle produce about 38,000 different mRNAs.
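To see how quickly the number of possible mRNAs grows, here is a minimal sketch that enumerates the splice variants of a made-up gene. The exon names and layout are hypothetical, and the model assumes every choice is independent of the others, which real splicing does not guarantee.

```python
from itertools import product

# Hypothetical gene layout: each slot lists the options splicing can
# choose from at that position. An empty string means "exon skipped".
EXON_SLOTS = [
    ["E1"],                 # constitutive exon, always included
    ["E2a", "E2b", "E2c"],  # mutually exclusive: exactly one of three
    ["E3"],                 # constitutive exon
    ["E4a", "E4b"],         # mutually exclusive: exactly one of two
    ["E5", ""],             # cassette exon: included or skipped
]

def splice_isoforms(slots):
    """Return every mRNA obtained by picking one option per slot."""
    return ["-".join(exon for exon in choice if exon)
            for choice in product(*slots)]

isoforms = splice_isoforms(EXON_SLOTS)
print(len(isoforms))  # 1 * 3 * 1 * 2 * 2 = 12
print(isoforms[0])    # E1-E2a-E3-E4a-E5
```

The count is just the product of the number of options at each position, so a gene with a few large clusters of alternative exons can reach tens of thousands of variants.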

Furthermore, each RNA doesn't have to become a polypeptide. Some RNAs are useful as they are, such as tRNAs that help in translation and snRNAs, which help in splicing.

Also, when mRNA is translated into a polypeptide, that polypeptide doesn't have to form a whole protein by itself. For example, the protein hemoglobin is made of four polypeptides: two identical alpha chains that come from one gene, and two identical beta chains that come from another. (There are also heme groups, but that's another story.) So many polypeptides can form one protein.

To make it even worse, a protein doesn't have to be an enzyme. For example, trypsin is an enzyme because it catalyzes a reaction: specifically, it cuts other proteins into pieces. But collagen is not an enzyme. It is a structural protein: it's there to make the extracellular matrix strong and resilient. So, not all proteins are enzymes.

Finally, just to make your life harder, some enzymes are actually made partially of protein and partially of RNA. The ribosome, which translates mRNA into protein, is one of these, as are the snRNPs (pronounced "snurps"), which are involved in splicing.