Codon and amino acid usage in retroviral genomes is consistent with virus-specific nucleotide pressure.

Retroviral RNA genomes are known to have a biased nucleotide composition. For instance, the plus-strand RNA of human immunodeficiency virus (HIV) is A-rich and the genome of human T cell leukemia virus (HTLV) is C-rich, whereas other retroviruses have a U-rich or G-rich genome. The biased composition of these genomes is most likely caused by directional mutational pressure of the respective reverse transcriptase enzymes. Using a set of retroviral genomes that differ in nucleotide composition, we performed skew analyses of the nucleotide bias along the complete viral genome. Each virus showed a distinct nucleotide signature, and this pattern was generally conserved across its genome. Furthermore, we demonstrate that this virus-specific nucleotide bias, combined with a profound discrimination against the CpG dinucleotide, strongly influences the codon usage of the retroviruses directly and their amino acid usage indirectly. The close coupling of both codon usage and amino acid usage to genome composition has important practical implications. For instance, the virus-specific trends in nucleotide usage could influence the molecular phylogenetic reconstruction of the family Retroviridae.
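As an illustration of the two quantities discussed above, the following minimal Python sketch computes base frequencies, a running skew between two bases along a sequence, and a CpG observed/expected ratio. It is not the procedure used in the study: the function names, the +1/-1 running-count definition of skew, and the observed/expected normalization are assumptions made for this example, and the toy sequence is invented rather than taken from any viral genome.

```python
from collections import Counter


def base_frequencies(seq):
    """Return the fraction of A, C, G and U in a sequence (T is read as U)."""
    seq = seq.upper().replace("T", "U")
    counts = Counter(b for b in seq if b in "ACGU")
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in "ACGU"}
    return {b: counts[b] / total for b in "ACGU"}


def cumulative_skew(seq, x="A", y="U"):
    """Running count along the sequence: +1 for base x, -1 for base y.

    Assumed definition for illustration; a steadily rising curve means
    x outnumbers y throughout the sequence.
    """
    seq = seq.upper().replace("T", "U")
    running = 0
    curve = []
    for base in seq:
        if base == x:
            running += 1
        elif base == y:
            running -= 1
        curve.append(running)
    return curve


def cpg_obs_exp(seq):
    """Observed CpG frequency divided by the frequency expected from C and G content.

    Values well below 1 indicate that CpG is under-represented.
    """
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    if n < 2 or c == 0 or g == 0:
        return 0.0
    observed = seq.count("CG") / (n - 1)   # n - 1 dinucleotide positions
    expected = (c / n) * (g / n)
    return observed / expected


if __name__ == "__main__":
    toy = "AAUGAAAGCAUAACAGAAAUGGAAAAGGAAGGGAAAAUU"  # invented A-rich example
    print(base_frequencies(toy))
    print(cumulative_skew(toy, "A", "U")[-1])
    print(cpg_obs_exp(toy))
```

Plotting the list returned by `cumulative_skew` against position gives a curve whose slope reflects the local excess of one base over the other, which is one simple way to check whether a bias is uniform along a genome.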
