Traditional SETI (Search for Extraterrestrial Intelligence) programs use one central computer to analyze electromagnetic data in real time as it's read from a radio telescope. Because the computer must be ready for new data as it comes in, it can only do limited analysis of the signal. That is, it can neither analyze particular sections of the electromagnetic spectrum in great depth, nor attend to all the possible sections which might contain electromagnetic signs of extraterrestrial life. SETI@home is a project managed by the Space Sciences Laboratory at the University of California, Berkeley that breaks this computational bottleneck and does deep enough analysis that hopefully no signal will go unnoticed.

Instead of using a central computer that briefly stores and parses the incoming data stream, SETI@home harnesses the power of distributed computing to search the stream for telltale signals. One computer does sit on the telescope's incoming data line, but instead of processing the data, it spools it onto 35-gigabyte tapes which are shipped to Berkeley. There, more computers split the data into much smaller chunks and send them to individual users across the internet. These users, who sign up through the setiathome.ssl.berkeley.edu website, run a screensaver or daemon program which examines the data. Since there is no hope of doing this in real time anyway, each user's computer can take as long as it needs (up to the time-out threshold of a week or two) to work with the data. Once a chunk has been examined in full, it is sent back to the SETI@home servers, which collate the returned packets and test whether any of them contain interesting findings.
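
The hand-out-and-time-out cycle described above can be sketched in a few lines of Python. This is only an illustration, not SETI@home's actual server code: the Dispatcher class, its method names, and the two-week default are all made up for the example (the text only says "a week or two").

```python
from dataclasses import dataclass, field

# Hypothetical default; the writeup only says "a week or two".
TIMEOUT_SECONDS = 14 * 24 * 3600


@dataclass
class Dispatcher:
    """Toy work-unit server: hands out units and reissues ones that time out."""
    pending: list                                   # unit ids waiting to be sent
    timeout: float = TIMEOUT_SECONDS
    in_flight: dict = field(default_factory=dict)   # unit id -> deadline
    results: dict = field(default_factory=dict)     # unit id -> returned result

    def _reclaim_expired(self, now):
        # Any unit whose deadline has passed goes back on the pending list.
        for unit_id, deadline in list(self.in_flight.items()):
            if deadline <= now:
                del self.in_flight[unit_id]
                self.pending.append(unit_id)

    def checkout(self, now):
        """Give the next unit to a client, or None if nothing is waiting."""
        self._reclaim_expired(now)
        if not self.pending:
            return None
        unit_id = self.pending.pop(0)
        self.in_flight[unit_id] = now + self.timeout
        return unit_id

    def submit(self, unit_id, result, now):
        """Accept a result only if the unit is still checked out and on time."""
        self._reclaim_expired(now)
        if unit_id not in self.in_flight:
            return False
        del self.in_flight[unit_id]
        self.results[unit_id] = result
        return True
```

A client that never reports back costs nothing but time: its unit simply returns to the pending list at the next checkout or submit after the deadline, and another user gets it.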

A full 35-gigabyte tape's worth of data comes into the Arecibo radio telescope (which is used by SETI@home) in Puerto Rico every day. The telescope is only active 70% of the day; the rest of the time is taken up by calibrating it and by sending data off the island on radio frequencies which would interfere with its use. To save money, SETI@home has its receiver attached to the antenna of the telescope, but doesn't actually purchase any time to control the antenna for itself. Instead it uses the piggyback concept pioneered by Berkeley's SERENDIP projects, and picks up data while other researchers use the telescope for their work. In other words, the SETI@home receiver pays attention to whatever part of the sky the telescope happens to be pointed at, storing only that part of the sky for analysis. While it's difficult to cover the whole sky in such a haphazard manner, SETI@home hopes to have all the sky that's visible to Arecibo covered three times by the end of the project.

Tapes arriving at Berkeley each hold about 15 hours of data covering the 2.5 MHz of frequency surrounding 1420 MHz, the 21 cm hydrogen line that's popular with SETI researchers. The signal from these tapes is first fed into a data splitter cluster which uses fast Fourier transforms to divide the data into 256 separate 9,766 Hz subbands, each of which is then split temporally into pieces lasting about 107 seconds. These 250K chunks, along with error-checking and collation data, are packed into 340K work-units which are sent to a storage device. From there, a data server consults the user database and sends work-units out from storage to all of the SETI@home users on the internet. The finished units return to this same server, packaged with all the data derived from them by the internet users.
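
The figures in this paragraph hang together, which a little arithmetic shows. The sketch below uses only the numbers quoted above plus one assumption the text does not state: that each complex sample takes 2 bits (1 bit real, 1 bit imaginary). The function names are made up for the example.

```python
# Figures quoted in the text.
BAND_HZ = 2_500_000      # 2.5 MHz recorded around the 1420 MHz hydrogen line
SUBBANDS = 256
CHUNK_SECONDS = 107
TAPE_HOURS = 15

# Assumption, not stated in the text: 2 bits per complex sample.
BITS_PER_SAMPLE = 2


def subband_width_hz(band_hz=BAND_HZ, subbands=SUBBANDS):
    """Width of one subband when the band is divided evenly."""
    return band_hz / subbands


def chunk_size_bytes(seconds=CHUNK_SECONDS, bits_per_sample=BITS_PER_SAMPLE):
    """Raw size of one chunk; a complex-sampled band of B Hz gives B samples/s."""
    samples = subband_width_hz() * seconds
    return samples * bits_per_sample / 8


def work_units_per_tape(hours=TAPE_HOURS, seconds=CHUNK_SECONDS, subbands=SUBBANDS):
    """Whole chunks per subband times subbands, ignoring any overlap between chunks."""
    return int(hours * 3600 // seconds) * subbands


if __name__ == "__main__":
    print("subband width (Hz):", subband_width_hz())
    print("chunk size (KiB):", chunk_size_bytes() / 1024)
    print("work-units per tape:", work_units_per_tape())
```

Under that 2-bit assumption a chunk comes out at roughly 255 KiB, in line with the "250K" figure, and one tape yields on the order of 129,000 work-units.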

Each returning work-unit is error-checked against and merged with the others, and the result is compared with a scientific database. That database contains information on frequencies generated by terrestrial radio interference, which are removed from analysis. It also weights signals higher if they come from a direction that has a cataloged nearby star, and does other weighting based on SETI algorithms. If a signal passes enough of the tests done in the science database, it is turned over to a human researcher for further analysis, and (thus far, unfortunately) always proves to be a signal of terrestrial origin.
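
As a rough illustration of the filtering and weighting just described, here is a toy scoring function. Everything in it is hypothetical: the Candidate fields, the score function, the factor-of-two boost, and the crude one-degree box used to decide whether a star is "nearby" are stand-ins, not the real science database's rules.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    freq_hz: float    # frequency of the detected signal
    power: float      # signal strength reported by the client
    ra_deg: float     # sky position: right ascension
    dec_deg: float    # sky position: declination


def score(candidate, rfi_bands, star_positions, radius_deg=1.0, star_boost=2.0):
    """Return a weighted score, or None if the signal sits in a known RFI band."""
    # Drop anything at a frequency known to carry terrestrial interference.
    for low_hz, high_hz in rfi_bands:
        if low_hz <= candidate.freq_hz <= high_hz:
            return None

    weighted = candidate.power
    # Boost signals that come from the direction of a cataloged star.
    # A flat box comparison is a simplification of real sky geometry.
    for ra_deg, dec_deg in star_positions:
        if (abs(ra_deg - candidate.ra_deg) <= radius_deg
                and abs(dec_deg - candidate.dec_deg) <= radius_deg):
            weighted *= star_boost
            break
    return weighted
```

A real pipeline would apply many more tests than this before bothering a human researcher; the point is only that interference is rejected outright while direction changes the weight.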

SETI@home tries to be more than a neato science project for its users, providing a common interest and experience base for an online community. The setiathome.ssl.berkeley.edu site has a message board for participants, as well as polls and customizable user profiles. The team also encourages a sort of informal competition among users by ranking them in order of computational hours donated to the project.

SETI@home clients for the Microsoft Windows family of operating systems provide a wicked cool display of what your computer is doing with the data, and look like the kind of computer display you only see in movies. The Linux client runs in text mode only or as an invisible daemon, and doesn't have the spiffy graphics unless you use a SETI@home add-on for X. Both clients use process management to limit the amount of processor time used for signal processing, so no slowdown should be noticeable while the client runs in the background. See also Getting the most out of SETI@home on your Mac for information about that platform.


Thanks to Albert Herring for help with this writeup.