This is a very intricate topic that will most likely bore the bejeezus out of most people. This is intended to be a general reference addition, like Webster's. If links don't work yet, I will be filling them in eventually. If you can fill one in yourself (in reasonable detail), do so! I am trying to include all I can remember, but I am bound to leave something out, so if I do, tell me! Consider this a work in progress; it will be updated as I find mistakes or as new research comes to light. This is pretty difficult to do without pictures, but here goes nothing...

HIV is the virus generally accepted to cause AIDS. It is an enveloped virus, meaning it is surrounded by a membrane derived from the membrane of a host cell, with viral proteins added (the most important of which are gp120 and gp41). Other examples of enveloped viruses are influenza virus and the herpesviruses. The other major type of virion is naked, where there is simply a protein coat surrounding the virus. Naked viruses include picornaviruses and bacteriophages, among others.

HIV is also a retrovirus. This means that it has a single-stranded RNA genome that is converted to a double-stranded cDNA by reverse transcriptase. This DNA is then imported into the nucleus and incorporated into the host genome at a region of active transcription by the enzyme integrase. It should be noted that integrase is a fairly unique enzyme; nothing like it exists in humans. It is therefore a very promising target for antiviral therapy. The integrated viral DNA is now known as a provirus. It contains a promoter and three important genes, known as gag, pol, and env, as well as a variety of accessory genes that won't be dealt with here.
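
To make the RNA-to-cDNA step a little more concrete, here is a toy sketch in Python. It only models base pairing: the template sequence is made up, the function names are my own, and the real enzyme does a good deal more (priming, strand transfers, RNase H digestion of the template) that is ignored here.

```python
# Toy model of reverse transcription: base pairing only.
# The sequence and function names are invented for illustration.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna: str) -> str:
    """Return the DNA strand complementary to an RNA template, written 5'->3'.

    The new strand is antiparallel to the template, so the template is
    walked in reverse.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))


def second_strand(dna: str) -> str:
    """Return the DNA strand complementary to a DNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(dna.upper()))


rna_template = "AUGGGUGCGAGAGCGUCA"      # made-up sequence, not from HIV
minus_strand = first_strand_cdna(rna_template)
plus_strand = second_strand(minus_strand)

# The second DNA strand carries the same message as the RNA, with T for U.
print(plus_strand == rna_template.replace("U", "T"))
```

The point of the round trip is that the double-stranded cDNA ends up carrying the same sequence as the RNA genome, which is what lets integrase paste it into the host genome as a provirus.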

These genes are transcribed as one long RNA. Some of the resulting transcripts are spliced to form the mRNAs for the accessory genes, whose products translocate unspliced viral RNA to the cytoplasm, where it is translated as one long gag-pol polyprotein by a process that involves ribosomal frameshifting. The gag-pol polyprotein is cleaved by the viral protease into smaller structural proteins (viral matrix, capsid, and nucleocapsid proteins), as well as the viral enzymes (protease, integrase, and reverse transcriptase). The viral surface proteins gp120 and gp41 are produced as the gp160 polyprotein, which is then cleaved by cellular proteases. Other accessory proteins regulate the combination of nucleocapsid proteins and viral RNA. This assembly then buds off from the cell as a mature virion.
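
Ribosomal frameshifting is easier to see with a toy example. The sketch below reads a made-up RNA in triplets, then backs up one base at a chosen position and carries on in the new frame, so one base gets read twice. The sequence, the slip position, and the function names are all assumptions for illustration; nothing here models why or how often the ribosome actually slips.

```python
# Toy illustration of a -1 ribosomal frameshift.
# Sequence, slip position, and names are invented for illustration.

def read_codons(rna: str, start: int = 0) -> list[str]:
    """Split rna into complete triplets, beginning at index `start`."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]


def read_with_minus_one_shift(rna: str, slip_index: int) -> list[str]:
    """Read in frame 0 up to slip_index, then back up one base and continue.

    slip_index should be a multiple of 3 so that the first part ends on a
    codon boundary.
    """
    before = read_codons(rna[:slip_index])
    after = read_codons(rna, slip_index - 1)
    return before + after


rna = "AUGGCAUUUUUUCGGGAAGAUC"   # made-up sequence with a run of U's

print(read_codons(rna))                      # no frameshift
print(read_with_minus_one_shift(rna, 12))    # slip after the run of U's
```

Everything downstream of the slip is read as a different set of codons, which is how a single RNA can yield the gag proteins in one frame and the pol enzymes in another.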

Mature HIV virions infect CD4+ T-helper cells. They bind when the viral envelope protein gp120 associates with the CD4 receptor. Gp41 then mediates the fusion of the viral envelope with the cell membrane, and the virion enters the cell, thus beginning the cycle over again.

See also: AZT, protease inhibitor
You can see HIV-1's genetic code for yourself, courtesy of the National Institute of Health at .

One thing that has always impressed me about HIV is how SMALL it is. The DNA strand that codes for it takes up only 8379 base pairs, or 2793 base pair triplets, each coding for one amino acid block. Since a base pair triplet can code any of 21 amino acid types (counting the 'end' instruction), each one is equal to (ln(21)/ln(2)) = 4.4 bits of information. This means that the code for HIV only takes up (2793 * 4.4 / 8) = 1537 bytes.
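
The arithmetic is easy to check for yourself. Here is a quick Python sketch using the figures quoted above; the variable names are mine, and the whole estimate rests on the assumption that every triplet is an independent choice among 21 symbols.

```python
import math

# Figures quoted in the text; names are my own.
GENOME_BASE_PAIRS = 8379
SYMBOLS_PER_TRIPLET = 21      # 20 amino acids plus the 'end' instruction

triplets = GENOME_BASE_PAIRS // 3
bits_per_triplet = math.log2(SYMBOLS_PER_TRIPLET)

# Estimate without rounding the bits-per-triplet figure.
exact_bytes = triplets * bits_per_triplet / 8

# Estimate as done in the text, with 4.4 bits per triplet, rounded up.
rounded_bytes = math.ceil(triplets * 4.4 / 8)

print(triplets)
print(round(bits_per_triplet, 2))
print(round(exact_bytes), rounded_bytes)
```

Rounding to 4.4 bits costs a few bytes compared with the unrounded logarithm, but either way the answer is about a kilobyte and a half.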

It's kind of unnerving to know that you can write Hello World programs in certain programming languages that take up more data than a virus that can hijack a human immune system.

CrazyIvan pointed out to me the fact that any DNA sequence has 6 possible frames for reading: Each strand can be read from one of three possible offsets, in either direction. This means that we cannot make the 6-bit to 21-amino acid compression, and must treat each base pair as an uncompressible 2 bits of information. Now, our code size is 8329 base pairs * 1 byte/4 base pairs = 2,083 bytes. (Although the implementation becomes 6 times more complex, most of this complexity is in the processor mechanism, and not in the encoding)
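
For anyone who wants to see the six frames spelled out, here is a small Python sketch. The test sequence is made up and the function names are my own; the byte count at the end simply packs each base into 2 bits, using the base pair figure quoted in the paragraph above.

```python
import math

# Names and the sample sequence are invented for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, written in its own reading direction."""
    return dna.translate(COMPLEMENT)[::-1]


def six_frames(dna: str) -> list[list[str]]:
    """Return the triplets for all six reading frames.

    Frames 0-2 are the three offsets on the given strand,
    frames 3-5 are the three offsets on the opposite strand.
    """
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            frames.append(
                [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            )
    return frames


def packed_size_bytes(n_base_pairs: int) -> int:
    """Bytes needed at 2 bits per base pair, rounded up to a whole byte."""
    return math.ceil(n_base_pairs * 2 / 8)


for frame in six_frames("ATGGCCATTG"):
    print(frame)

print(packed_size_bytes(8329))
```

The same ten bases give six different lists of triplets, which is why the triplet-level compression no longer applies once more than one frame is in use.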

HIV falls under the class of lentiviruses, which are a type of retrovirus that shows a long "lag time" between the time of infection and the rise of first symptoms. Other lentiviruses include SIV (simian immunodeficiency virus) and FIV (feline immunodeficiency virus -- anyone who has a cat with this knows it's misery).

The envelope of the viral particle has proteins gp41* and gp120*, which are recognition particles that bind to specific receptors on certain types of cells. It should be noted that not all CD4+ T-cells are infected, and not all T-cells that do NOT carry the CD4 surface protein are infected, either. The variance lies in the coreceptors CXCR4 and CCR5, which allow proper "docking" of the virus and encourage its entry into the cell. Without these coreceptors present, the T-cell cannot take up the virus; thus, there are people who can carry and transmit the human immunodeficiency virus without ever showing any symptoms themselves.

Incidentally, it should be noted that HIV can also infect macrophages and other lymphocytes besides T-cells.

The genome is actually quite impressive. Through alternative splicing, frameshift induction, and overlapping genes, the virus has managed to squeeze 9 different known genes, resulting in at least a dozen or so mature proteins, into about 9.8 kilobases of RNA.

The genome looks something like so:

                                       | Vif  |       | U |
     |    Gag     |                          | R |      |         Env         |
               |       Pol             |       | Tat |              |T|      |Nef|
                                                  |Re|              |Rev|

  • Gag = group-specific antigen = matrix, nucleocapsid, capsid proteins
  • Pol = polymerase = protease, integrase, reverse transcriptase, RNase H
  • Env = envelope proteins = gp41, gp120
  • Tat + T = transcriptional activator = regulatory gene; its product comes from splicing the Tat and T sections together
  • Rev + Re = RNA binding protein = shifts gene production; its product comes from splicing the Re and Rev sections together
  • Nef = negative factor = regulation of cellular activities
  • R = Vpr = protein incorporated in virion = allows entry into nondividing cells
  • U = Vpu = membrane phosphoprotein = downmodulates CD4, promotes virion release
  • Vif = essential for growth only in certain cell lines
  • LTR = long terminal repeat = required for delineation of viral genome when incorporated into host genome; can allow repeating of viral genome within host by linking LTRs

Due to the error-prone viral RTase, HIV is very prone to mutation. (The mutation rate is approximately 1 in 10^6 bases.) Taking this and the rate of viral production into account, it seems that it would be very easy for the virus to mutate and become resistant to any form of treatment used. For this reason, HAART (highly active anti-retroviral therapy), which usually involves a "cocktail" of two reverse transcriptase inhibitors and a protease inhibitor, was developed. The probability of developing mutations sufficient to render all three drugs ineffective is 1 in 10^18, or very small. In fact, HAART is sufficient to keep serum levels of HIV particles low enough for a patient to survive for several years. However, the entire regimen must be carefully maintained, as the virus lies latent and will begin replicating again once treatment is stopped.
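
The three-drug figure is just the single-drug figure multiplied by itself three times. Here is that step as a short Python sketch. It assumes, as a deliberate simplification, that escaping each drug takes one specific mutation at a rate of 1 in 10^6 and that the three events are independent; the names are my own.

```python
from fractions import Fraction

# Simplifying assumption: one specific mutation defeats one drug,
# at the per-base rate quoted in the text, independently for each drug.
PER_DRUG_ESCAPE = Fraction(1, 10**6)


def joint_escape_probability(n_drugs: int,
                             per_drug: Fraction = PER_DRUG_ESCAPE) -> Fraction:
    """Probability that one genome carries escape mutations for all drugs."""
    return per_drug ** n_drugs


for n in (1, 2, 3):
    p = joint_escape_probability(n)
    print(f"{n} drug(s): 1 in {p.denominator}")
```

Each extra drug multiplies the odds against the virus by another factor of a million, which is the whole argument for a cocktail over a single drug.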
* -- gp stands for glycoprotein.

