The idea behind DNA computing is that a strand of DNA contains information. The four bases of DNA (A, T, G, C) can store information, just like the 26 letters of the english alphabet, or the 2 letters (1 and 0) that computers use. Within our cells, phenomenally amazing things are done with this information, but all the machinery is specialized to the kinds of things cells do, like make proteins and replicate DNA. The idea behind DNA computing is to put our own information onto DNA, and make our own ways of reading and manipulating that information, to solve problems that normally would require computers.

The idea for DNA computing may come from the fact that, in vivo, the proteins that "read" and "write" DNA bear some resemblance to a turing machine in which DNA is the "tape". DNA has four bases, comparable to the turing machine's bits, and proteins like DNA polymerase can chug along the strand, performing actions based on the sequence of the bases: DNA polymerase can add bases to a single strand or excise bases from a double strand. Meanwhile, RNA polymerase can read the strand and construct a separate chain (this time of RNA) based on what it reads. One can imagine a computer scientist reading about this and exclaiming "There's got to be a computer in there somewhere!"

But why? What good would a DNA computer be? Well, DNA molecules are small. You can have billions, trillions, quadrillions of them in one milliliter, or cc. One recent study was getting 6.646x1010 operations per second for just one ml. That's equivalent to a 66 GHz computer chip - for comparison, the fastest chip you can easily buy is around 3 GHz, and the world's fastest supercomputer (the NEC Earth-Simulator 5120) is 36 teraHz. Speaking purely in terms of gigahertz it would be beat out by a 20-ounce bottle of well-engineered DNA soup. So we see the potential.

All those billions of strands of DNA in a test tube can do their thing at the same time: they are massively parallel, unlike a typical computer, which generally has to wait for each operation to finish before starting the next one. SMP and beowulf clusters are nothing compared to the parallelization of DNA.

Another advantage of DNA computers is that DNA is a very compact storage medium. DNA takes one cubic nanometer per bit, whereas tape storage takes a whopping 1012 cubic nanometers per bit.

Unfortunately, nobody has yet done anything useful with a DNA computer, and it's probably too early to tell whether this is because DNA computing is still in its infancy, or whether the DNA computer will forever be vaporware. At present, even though the computations themselves are very fast, to synthesize the strands of DNA and to analyze the results takes many days of lab time, so that overall the process takes much longer than solving the same problems on a computer. So far all that's been done are proof-of-concept problems that hint at what DNA computers may one day be able to do.

The first, classic experiment by Leonard Adleman, in 1994, solved the problem of finding solutions to the "traveling salesman problem". If you have a number of cities ("vertices") to visit, and each is only connected to certain others, what is the shortest route that hits all the cities?

The methodology for this experiment relies on the fact that DNA hybridizes with DNA of a complimentary sequence. Each vertex was represented by a 20-base DNA sequence. Then, 20-base sequences were made that had the last ten bases of one vertex and the first ten bases of another. These were made in all allowable combinations (representing, for the traveling salesman, a train that can go from city B to city F, if the sequence has the last ten bases of city B and the first ten of city F). These connecting sequences, and the sequences representing the vertices, were mixed together. When the bases paired up, strands of DNA resulted that represented routes between the cities. A method called graduated PCR can find the order of the sequences in the strands, and is used to "print" the results.

But being massively parallel has its downsides: Martyn Amos of the University of Exeter, UK, has been quoted as saying:

"It has been estimated that if you scaled up the Hamilton Path Problem to 200 cities from Adleman's seven, then the weight of DNA required to represent all the possible solutions would exceed the weight of the earth."

Also, even though Adleman's reaction mixture produced results within the test tube at a rate of 100 teraflops, the experiment required about 7 days of lab work all told.

Another problem is that DNA hybridizing is not perfect; Adleman had to throw out certain solutions that were obviously wrong, such as result strands that were extremely short, extremely long, or were found to encode paths that were not possible according to the way the problem was laid out. A larger, more complicated experiment would have produced a larger, more complicated result set to be analyzed, and the human effort required to weed out the errors would probably be much greater.

So what now? Some other combinatorial NP-complete problems have been solved with DNA, such as a simple example of the graph coloring problem. Other researchers are trying to find a way to make DNA act like logic gates, or are attempting to make a turing machine out of restriction endonucleases and ligase. Ehud Shapiro is working on the latter, but not with a goal of making something like a conventional computer out of DNA; his idea is to make tiny biological computers that can be inserted into individual cells. The tiny "computers" might be able to recognize specific molecules or situations in the cell, and respond by synthesizing other molecules that could help the situation - in this case, acting as a therapy - or that could serve as a detectable flag to chemically signal what's going on. Thinking up huge long lists of potential medical and laboratory applications is left as an exercise for the reader. One of Shapiro's most recent papers (March 2003) involved a DNA molecule that acted as both data and fuel for the enzyme "hardware". Another researcher, Eric Winfree, is working on "programming" the edges of tiny DNA tiles, with the goal of making nanotech-sized structures out of them.

Selected references and further reading:
Benenson Y, Adar R, Paz-Elizur T, Livneh Z, Shapiro E. DNA molecule provides a computing machine with both data and fuel. Proc Natl Acad Sci U S A. 2003 Mar 4;100(5):2191-6.

Adleman LM. Molecular computation of solutions to combinatorial problems. Science. 1994 Nov 11;266(5187):1021-4.

Parker J. Computing with DNA. EMBO Rep. 2003 Jan;4(1):7-10. Review.

Liu W, Zhang F, Xu J. A DNA algorithm for the graph coloring problem. J Chem Inf Comput Sci. 2002 Sep-Oct;42(5):1176-8.