One thing that has always impressed me about HIV is how SMALL it is. The DNA strand that codes for it takes up only 8379 base pairs, or 2793 base pair triplets coding for 2793 amino acid blocks. Since a base pair triplet can code for any of 21 symbols (the 20 amino acid types plus the 'end' instruction), each one is equal to ln(21)/ln(2) ≈ 4.4 bits of information. This means that the code for HIV only takes up about (2793 * 4.4 / 8) = 1537 bytes.
It's kind of unnerving to know that you can write Hello World programs in certain programming languages that take up more data than a virus that can hijack a human immune system.
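For anyone who wants to check the arithmetic, here is a minimal Python sketch of the estimate above. The constant and function names are my own, chosen just for illustration; the 8379 base pair figure and the 21-symbol alphabet come from the writeup. Using the unrounded value of log2(21) gives a slightly smaller number than using 4.4 bits.

```python
import math

GENOME_BP = 8379   # base pairs, figure taken from the writeup
SYMBOLS = 21       # 20 amino acids plus the 'end' instruction


def codon_count(base_pairs: int) -> int:
    """Number of complete base pair triplets in a sequence of this length."""
    return base_pairs // 3


def size_in_bytes(base_pairs: int, bits_per_codon: float) -> float:
    """Size estimate if every triplet costs bits_per_codon bits."""
    return codon_count(base_pairs) * bits_per_codon / 8


codons = codon_count(GENOME_BP)
exact_bits = math.log2(SYMBOLS)                      # roughly 4.39 bits per triplet
rounded_estimate = size_in_bytes(GENOME_BP, 4.4)     # the writeup's rounded figure
exact_estimate = size_in_bytes(GENOME_BP, exact_bits)

print(f"triplets: {codons}")
print(f"bits per triplet: {exact_bits:.3f}")
print(f"bytes with 4.4 bits per triplet: {math.ceil(rounded_estimate)}")
print(f"bytes with unrounded log2(21): {math.ceil(exact_estimate)}")
```

Either way the result stays in the neighborhood of 1.5 kilobytes, so the point of the comparison does not depend on the rounding.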
Update: CrazyIvan pointed out to me that any DNA sequence has 6 possible reading frames: it can be read from one of three possible offsets, in either direction (that is, along either strand). This means that we cannot compress each 6-bit triplet down to one of 21 amino acid symbols, and must treat each base pair as an incompressible 2 bits of information. Now our code size is 8379 base pairs * 1 byte/4 base pairs = 2,095 bytes. (Although the implementation becomes 6 times more complex, most of this complexity is in the processor mechanism, and not in the encoding.)
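To make the six reading frames and the 2-bits-per-base encoding concrete, here is a small Python sketch. It is only an illustration on a toy sequence: the function names, the particular bit assignment (A=0, C=1, G=2, T=3), and the zero-padding of the last byte are my own assumptions, not anything the virus or the writeup prescribes.

```python
# Hypothetical helpers for illustration only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}  # arbitrary 2-bit code per base


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read in its own direction."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq: str) -> list[list[str]]:
    """Split a sequence into triplets for all 3 offsets on both strands."""
    frames = []
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            triplets = [strand[i:i + 3]
                        for i in range(offset, len(strand) - 2, 3)]
            frames.append(triplets)
    return frames


def pack_2bit(seq: str) -> bytes:
    """Pack bases into bytes, 4 bases per byte, zero-padding the last byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_BITS[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a partial final chunk
        out.append(byte)
    return bytes(out)


toy = "ATGGCCTAA"  # made-up 9-base example, not a real HIV sequence
for number, frame in enumerate(six_frames(toy), start=1):
    print(number, frame)
print(len(pack_2bit(toy)), "bytes for", len(toy), "bases")
```

The same bases yield a different list of triplets in each frame, which is why the per-triplet compression no longer applies once more than one frame is in use, while the packed size depends only on the number of bases.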
The envelope of the viral particle has proteins gp41* and gp120*, which are recognition particles that bind to specific receptors on certain types of cells. It should be noted that not all CD4+ T-cells are infected, and not all T-cells that do NOT carry the CD4 surface protein are infected, either. The variance lies in the coreceptors CXCR4 and CCR5, which allow proper "docking" of the virus and encourage its entry into the cell. Without these coreceptors present, the T-cell cannot take up the virus; thus, there are people who can carry and transmit the human immunodeficiency virus without ever showing any symptoms themselves.
Incidentally, HIV can also infect macrophages and other lymphocytes besides T-cells.
The genome is actually quite impressive. Through alternative splicing, frameshift induction, and overlapping genes, the virus has managed to squeeze 9 different known genes, resulting in at least a dozen or so mature proteins, into about 9.8 kilobases of RNA.
The genome looks something like so:
                    | Vif |                 | U |
    | Gag |                 | R |                 | Env |
LTR-------------------------------------------------------------------------------LTR
          | Pol |                 | Tat |                 |T|     |Nef|
                                      |Re|                |Rev|