There are several ways to obtain information about a protein. Which methods and databases you use will depend largely on the kind of information you want.

Databases such as Swiss-Prot can be used to obtain the primary structures (amino acid sequences) of proteins. As with nucleic acids, there are two basic methods for searching sequence databases:

  1. keyword searches look for a particular keyword label (e.g., "cytochrome c"). PIR (at http://pir.georgetown.edu/) in particular allows a variety of different keyword characteristics to be searched;

  2. similarity searches use search engines to hunt for sequences that are similar to one another. As an analogy, you could search a phone book for the number (sequence) associated with a particular name (protein), or you could search the phone book for the names of all the people whose phone numbers end in 7675.
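The difference between the two search methods can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the entry names and sequences are invented (they are not real database records), the function names are hypothetical, and the similarity score (the fraction of shared three-residue fragments) is only a crude stand-in for the alignment methods that real search engines use.

```python
# Toy comparison of keyword search and similarity search.
# The names and sequences below are invented, not real database entries.
TOY_DB = {
    "toy cytochrome c (species 1)": "ACDEFGHIKLMNPQRSTVWY",
    "toy cytochrome c (species 2)": "ACDEFGHIKLMNPQRSTVWA",
    "toy lysozyme": "YWVTSRQPNMLKIHGFEDCA",
}


def keyword_search(db, keyword):
    """Return the names of entries whose label contains the keyword."""
    keyword = keyword.lower()
    return [name for name in db if keyword in name.lower()]


def kmers(seq, k=3):
    """Return the set of all overlapping fragments of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Fraction of k-residue fragments shared by two sequences (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def similarity_search(db, query, k=3):
    """Score every entry against the query sequence, best match first."""
    scored = [(similarity(query, seq, k), name) for name, seq in db.items()]
    return sorted(scored, reverse=True)


# Look up by name (the "name to number" direction of the phone book).
print(keyword_search(TOY_DB, "cytochrome c"))

# Look up by sequence (the "number to names" direction).
for score, name in similarity_search(TOY_DB, "ACDEFGHIKLMNPQRSTVWY"):
    print(f"{score:.2f}  {name}")
```

The keyword search only ever inspects the labels, while the similarity search only ever inspects the sequences, which is why the two methods can return quite different sets of entries.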

Databases such as Blocks (http://www.blocks.fhcrc.org/) or ProDom (http://prodes.toulouse.inra.fr/prodom/doc/prodom.html) provide information about sequence and structural patterns in proteins. These databases group proteins that contain similar active sites or substructures, and thus differ from search engines that blindly compare all primary sequences. They are particularly useful because the structure of a protein determines its function, yet there are no good ways to compare overall structures. As an analogy, if you wanted to know who was related to whom in a town, it might be easier to look at last names in a phone book than at pictures in a yearbook.
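As a rough illustration of grouping proteins by a shared pattern rather than by whole-sequence comparison, the sketch below scans sequences for one short motif. It assumes the ATP/GTP-binding "P-loop" consensus [AG]-x(4)-G-K-[ST], written as a regular expression. The protein names and sequences are invented, the function names are hypothetical, and this is not a description of how Blocks or ProDom actually build their groups.

```python
import re

# P-loop consensus [AG]-x(4)-G-K-[ST] expressed as a regular expression:
# A or G, any four residues, then G, K, and finally S or T.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")

# Invented sequences, used only to exercise the pattern.
TOY_PROTEINS = {
    "toy kinase": "MLLPAGESGAGKTVVN",
    "toy GTPase": "MGDPNSGKSTLLQWER",
    "toy protease": "MDWQHCEFLLRPYAVN",
}


def find_motif(seq, pattern=P_LOOP):
    """Return the 0-based start of the first motif match, or None."""
    match = pattern.search(seq)
    return match.start() if match else None


def group_by_motif(proteins, pattern=P_LOOP):
    """Split protein names into those that contain the motif and those that do not."""
    groups = {"with motif": [], "without motif": []}
    for name, seq in proteins.items():
        key = "with motif" if find_motif(seq, pattern) is not None else "without motif"
        groups[key].append(name)
    return groups


print(group_by_motif(TOY_PROTEINS))
```

The two sequences that share the motif have little else in common, so they land in the same group on the strength of the pattern alone.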

Databases such as NRL-3D (http://www.psc.edu/general/software/packages/nrl_3d/nrl_3d.html) and Entrez (http://www.ncbi.nlm.nih.gov/Entrez/) contain information about the overall three-dimensional structures of proteins. As an analogy, if you looked up a name in the phone book, these databases would be the yearbook that shows that person's picture.

To find out almost anything about an enzyme, use the EC Enzyme Database (http://www.expasy.ch/enzyme/). The database can be searched by enzyme name, and a search returns a list of enzymes with their associated EC numbers. Clicking on a particular type of enzyme (EC number) leads to a page with links to other types of information, including the reaction catalyzed, associated metabolic diseases (OMIM), and what is known about the enzyme in particular organisms. Clicking in turn on an enzyme from a particular organism leads to a page with links to the protein sequence, patterns, and structure for that enzyme.
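An EC number has four dot-separated fields (class, subclass, sub-subclass, and serial number), so the first field alone indicates the general type of reaction. The sketch below splits an EC number into those fields. The function name and the returned dictionary layout are hypothetical choices for this example, and the code does not query the database itself.

```python
# Top-level EC classes, keyed by the first field of the EC number.
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
    7: "translocases",
}


def parse_ec_number(ec):
    """Split an EC number such as 'EC 1.1.1.1' into its four fields."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec!r}")
    top = int(parts[0])  # raises ValueError if the first field is not a number
    if top not in EC_CLASSES:
        raise ValueError(f"unknown top-level EC class in {ec!r}")
    # The lower fields stay as strings, because partial EC numbers
    # use '-' as a placeholder (for example '3.4.-.-').
    return {
        "class": top,
        "class_name": EC_CLASSES[top],
        "subclass": parts[1],
        "sub_subclass": parts[2],
        "serial": parts[3],
    }


print(parse_ec_number("EC 1.1.1.1"))  # alcohol dehydrogenase, an oxidoreductase
```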


From the BioTech Project at http://biotech.icmb.utexas.edu/. For further information see the BioTech homenode.