Selection against CpG dinucleotides in lentiviral genes: a possible role of methylation in regulation of viral expression.

Extremely low frequencies of CpG dinucleotides are found in the genomes of the lentivirus subfamily of retroviruses, including the human, simian and feline immunodeficiency viruses (HIV1, HIV2, SIV, and FIV, respectively), equine infectious anemia virus (EIAV), and the ovine lentivirus, Visna. CpG dinucleotides occur more often in the 2-3 (NCG) codon-defined frame than in the 1-2 (CGN) frame, and more often in the gag and env genes than in the more conserved pol gene. These differences suggest that CpG depletion in lentiviruses results from selection against CpG rather than from mutational bias, which is responsible for the low CpG frequencies in vertebrate genomes. CpG levels in the onco-retrovirus subfamily are reduced to a lesser extent, principally due to mutational bias. The difference between the retrovirus subfamilies appears to reflect their evolutionary origin: lentiviruses have no known endogenous counterparts, whereas most oncoviruses have endogenous cellular counterparts with which they can undergo recombination. Furthermore, we suggest that the number of CpG dinucleotides in a lentiviral genome determines the maximum potential DNA methylation level of the provirus, which in turn affects viral transcription in host cells.
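The two quantities discussed above, overall CpG depletion and CpG occurrence by codon frame, can be illustrated with a minimal Python sketch. It is not the authors' method: the function names `cpg_obs_exp` and `cpg_by_codon_frame` are hypothetical, the observed/expected ratio uses one commonly used definition (CpG count times sequence length, divided by the product of the C and G counts), and the input is assumed to be an in-frame coding sequence starting at codon position 1.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Assumed definition for illustration; values well below 1
    indicate CpG depletion relative to base composition.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # no CpG is possible without both bases
    return seq.count("CG") * len(seq) / (c * g)


def cpg_by_codon_frame(cds: str) -> dict:
    """Count CpG at codon positions 1-2 (CGN) and 2-3 (NCG).

    Assumes `cds` is in frame; a trailing partial codon is ignored.
    """
    cds = cds.upper()
    counts = {"1-2": 0, "2-3": 0}
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon[:2] == "CG":   # CGN codon
            counts["1-2"] += 1
        if codon[1:] == "CG":   # NCG codon
            counts["2-3"] += 1
    return counts


if __name__ == "__main__":
    toy = "ATGCGAACGTAA"  # toy sequence, not real viral data
    print(cpg_obs_exp(toy))
    print(cpg_by_codon_frame(toy))
```

Comparing the "1-2" and "2-3" counts across genes such as gag, env and pol would mirror the comparison described in the abstract, provided each gene is supplied in its own reading frame.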