Scientific field started by Leonard Adleman's 1994 paper in the scientific journal Science that described a solution to the Hamiltonian Path Problem using strands of DNA in solution. In general, research in the field aims to use DNA molecules to compute. DNA computing is suitable for problems that require massively parallel processing of information (parallel computing). Recent approaches have begun to use RNA or surface-based DNA to solve problems.

People involved in the field include Leonard Adleman, Richard Lipton, Erik Winfree, Lila Kari, and Laura Landweber.

DNA is an extraordinary storage device which has stood up to at least 3 billion years of testing. Recent experiments by the (brilliant) Leonard Adleman have shown that DNA can be used as a computing device as well. Another experiment by Carter Bancroft illustrated the possibility of performing addition using the molecule. The advantages of using DNA to perform calculations is that rather than doing many different operations in sequential order, it can perform many of the same operation simulatenously. This is essentially the process of parallel computing, and can be very useful in solving problems such as the Hamiltonian Path Problem. The importance of being able to solve a Hamiltonian Path Problem may seem negligible, but any Hamiltonian Path Problem can be reduced to an NP-Hard problem, but more importantly any NP-Hard problem can be reduced to a Hamiltonian Path Problem. The Hamiltonian Path Problem is also known as The Travelling Salesman Problem.

In fact, Adleman solved a very small instance of a rather easy NP complete problem. 7 vertex Hamiltonian path is just too easy to bother with. Modern approximations routinely "solve" TSP with tens of cities.

Of course, we're not always sure we got an optimum (although branch and bound sometimes guarantees that, too). And DNA computing doesn't really scale. Adleman used 20-base oligomers; a bigger problem would require longer oligomers (length grows linearly with the size of the instance). And since "all" combinations are tried (in fact that's not true either, since there's nothing to guarantee this will happen for very large problems), exponentially many combinations would still have to happen to find the right one. And the number (read: amount) of DNA molecules produced would grow exponentially, too. (I don't mention error correction, which is also a real problem, as it is solvable).

So it's not clear that DNA computing is a real advance in solving hard problems. But it's neat as hell!

The idea behind DNA computing is that a strand of DNA contains information. The four bases of DNA (A, T, G, C) can store information, just like the 26 letters of the english alphabet, or the 2 letters (1 and 0) that computers use. Within our cells, phenomenally amazing things are done with this information, but all the machinery is specialized to the kinds of things cells do, like make proteins and replicate DNA. The idea behind DNA computing is to put our own information onto DNA, and make our own ways of reading and manipulating that information, to solve problems that normally would require computers.

The idea for DNA computing may come from the fact that, in vivo, the proteins that "read" and "write" DNA bear some resemblance to a turing machine in which DNA is the "tape". DNA has four bases, comparable to the turing machine's bits, and proteins like DNA polymerase can chug along the strand, performing actions based on the sequence of the bases: DNA polymerase can add bases to a single strand or excise bases from a double strand. Meanwhile, RNA polymerase can read the strand and construct a separate chain (this time of RNA) based on what it reads. One can imagine a computer scientist reading about this and exclaiming "There's got to be a computer in there somewhere!"

But why? What good would a DNA computer be? Well, DNA molecules are small. You can have billions, trillions, quadrillions of them in one milliliter, or cc. One recent study was getting 6.646x1010 operations per second for just one ml. That's equivalent to a 66 GHz computer chip - for comparison, the fastest chip you can easily buy is around 3 GHz, and the world's fastest supercomputer (the NEC Earth-Simulator 5120) is 36 teraHz. Speaking purely in terms of gigahertz it would be beat out by a 20-ounce bottle of well-engineered DNA soup. So we see the potential.

All those billions of strands of DNA in a test tube can do their thing at the same time: they are massively parallel, unlike a typical computer, which generally has to wait for each operation to finish before starting the next one. SMP and beowulf clusters are nothing compared to the parallelization of DNA.

Another advantage of DNA computers is that DNA is a very compact storage medium. DNA takes one cubic nanometer per bit, whereas tape storage takes a whopping 1012 cubic nanometers per bit.

Unfortunately, nobody has yet done anything useful with a DNA computer, and it's probably too early to tell whether this is because DNA computing is still in its infancy, or whether the DNA computer will forever be vaporware. At present, even though the computations themselves are very fast, to synthesize the strands of DNA and to analyze the results takes many days of lab time, so that overall the process takes much longer than solving the same problems on a computer. So far all that's been done are proof-of-concept problems that hint at what DNA computers may one day be able to do.

The first, classic experiment by Leonard Adleman, in 1994, solved the problem of finding solutions to the "traveling salesman problem". If you have a number of cities ("vertices") to visit, and each is only connected to certain others, what is the shortest route that hits all the cities?

The methodology for this experiment relies on the fact that DNA hybridizes with DNA of a complimentary sequence. Each vertex was represented by a 20-base DNA sequence. Then, 20-base sequences were made that had the last ten bases of one vertex and the first ten bases of another. These were made in all allowable combinations (representing, for the traveling salesman, a train that can go from city B to city F, if the sequence has the last ten bases of city B and the first ten of city F). These connecting sequences, and the sequences representing the vertices, were mixed together. When the bases paired up, strands of DNA resulted that represented routes between the cities. A method called graduated PCR can find the order of the sequences in the strands, and is used to "print" the results.

But being massively parallel has its downsides: Martyn Amos of the University of Exeter, UK, has been quoted as saying:

"It has been estimated that if you scaled up the Hamilton Path Problem to 200 cities from Adleman's seven, then the weight of DNA required to represent all the possible solutions would exceed the weight of the earth."

Also, even though Adleman's reaction mixture produced results within the test tube at a rate of 100 teraflops, the experiment required about 7 days of lab work all told.

Another problem is that DNA hybridizing is not perfect; Adleman had to throw out certain solutions that were obviously wrong, such as result strands that were extremely short, extremely long, or were found to encode paths that were not possible according to the way the problem was laid out. A larger, more complicated experiment would have produced a larger, more complicated result set to be analyzed, and the human effort required to weed out the errors would probably be much greater.

So what now? Some other combinatorial NP-complete problems have been solved with DNA, such as a simple example of the graph coloring problem. Other researchers are trying to find a way to make DNA act like logic gates, or are attempting to make a turing machine out of restriction endonucleases and ligase. Ehud Shapiro is working on the latter, but not with a goal of making something like a conventional computer out of DNA; his idea is to make tiny biological computers that can be inserted into individual cells. The tiny "computers" might be able to recognize specific molecules or situations in the cell, and respond by synthesizing other molecules that could help the situation - in this case, acting as a therapy - or that could serve as a detectable flag to chemically signal what's going on. Thinking up huge long lists of potential medical and laboratory applications is left as an exercise for the reader. One of Shapiro's most recent papers (March 2003) involved a DNA molecule that acted as both data and fuel for the enzyme "hardware". Another researcher, Eric Winfree, is working on "programming" the edges of tiny DNA tiles, with the goal of making nanotech-sized structures out of them.