Retroviral Oligonucleotide Distributions Correlate with Biased Nucleotide Compositions of Retrovirus Sequences, Suggesting a Duplicative Stepwise Molecular Evolution

Abstract. A computer-assisted analysis was performed on 24 complete nucleotide sequences selected from the vertebrate retroviruses to represent the ten viral groups. The conclusions extend and strengthen the hypothesis previously proposed for the Moloney murine leukemia virus: the nucleotide sequence appears to have evolved mainly through at least three overlapping levels of duplication. (1) The distributions of overrepresented (3–6)-mers are consistent with the universal rule of a trend toward TG/CT excess and with the persistence of a certain degree of symmetry between the two strands of DNA. This suggests one or several original tandemly repeated sequences and some inverted duplications. (2) The existence of two general core consensuses at the level of these (3–6)-mers supports the hypothesis of a common evolutionary origin of vertebrate retroviruses. Consensuses more specific to certain sequences are compatible with independently established phylogenetic trees and could correspond to intermediate evolutionary stages. (3) Most of the (3–6)-mers with a significantly higher than average frequency appear to be internally repeated (with monomeric or oligomeric internal iterations) and seem to be at least partly the cause of the bias in retroviral nucleotide composition observed by other researchers. These internal repeats suggest a third evolutionary stage of slippage-like stepwise local duplications.
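As an illustration of what a survey of overrepresented (3–6)-mers can look like in practice, the following minimal Python sketch counts k-mers in a sequence and compares each observed count with the count expected from base composition alone. The abstract does not state which statistical model was used, so the zero-order (independent bases) expectation, the `min_ratio` threshold, and all function names here are assumptions made for illustration only. A small `reverse_complement` helper is included because strand symmetry can be probed by comparing an oligomer's count with that of its reverse complement.

```python
from collections import Counter
from math import prod

BASES = "ACGT"


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return Counter(w for w in windows if set(w) <= set(BASES))


def overrepresented(seq, k, min_ratio=2.0):
    """Return {k-mer: observed/expected} for k-mers at or above min_ratio.

    Assumption: the expected count comes from a zero-order model in which
    each base occurs independently with its overall frequency in seq.
    """
    seq = seq.upper()
    total = sum(seq.count(b) for b in BASES)
    base_freq = {b: seq.count(b) / total for b in BASES}
    counts = kmer_counts(seq, k)
    n_windows = sum(counts.values())
    ratios = {}
    for kmer, observed in counts.items():
        expected = n_windows * prod(base_freq[b] for b in kmer)
        ratio = observed / expected
        if ratio >= min_ratio:
            ratios[kmer] = ratio
    return ratios


def reverse_complement(kmer):
    """Reverse complement of a DNA oligomer (for strand-symmetry checks)."""
    return kmer.translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    # Toy input, not a retroviral sequence: two tandem dinucleotide repeats.
    toy = "TG" * 10 + "AC" * 10
    for kmer, ratio in sorted(overrepresented(toy, 2).items()):
        print(kmer, round(ratio, 2), "revcomp:", reverse_complement(kmer))
```

On a real genome one would run this for each k from 3 to 6 and would probably prefer a higher-order background model (for example, one conditioned on shorter oligomer frequencies), since a zero-order model tends to flag any longer word that contains an already frequent shorter one.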
