Patterns of Evolution and Host Gene Mimicry in Influenza and Other RNA Viruses

It is well known that the dinucleotide CpG is under-represented in the genomic DNA of many vertebrates. This is commonly attributed to the methylation of cytosine residues in this dinucleotide and the correspondingly high rate of deamination of 5-methylcytosine, which lowers the frequency of CpG in DNA. Surprisingly, many single-stranded RNA (ssRNA) viruses that replicate in these vertebrate hosts also have very few CpG dinucleotides in their genomes. Viruses are obligate intracellular parasites, and the evolution of a virus is inextricably linked to the nature and fate of its host. One therefore expects virus and host genomes to share common features. In this work, we compare evolutionary patterns in the genomes of ssRNA viruses and their hosts. In particular, we have analyzed dinucleotide patterns and found that the same patterns are pervasively over- or under-represented in many RNA viruses and their hosts, suggesting that many RNA viruses evolve by mimicking some of the features of their host's genes (DNA) and likely also the corresponding mRNAs. When a virus crosses a species barrier into a different host, the pressure to replicate, survive, and adapt leaves a footprint in its dinucleotide frequencies. For instance, since human genes seem to be under higher pressure to eliminate CpG dinucleotide motifs than avian genes, this pressure might be reflected in the genomes of human viruses (both DNA and RNA viruses) when compared to those of the same viruses replicating in avian hosts. To test this idea, we have analyzed the evolution of the influenza virus since 1918. We find that the influenza A virus, which originated from an avian reservoir and has been replicating in humans over many generations, evolves in a direction strongly selected to reduce the frequency of CpG dinucleotides in its genome. Consistent with this observation, we find that the influenza B virus, which has spent much more time in the human population, has adapted to its human host and exhibits an extremely low CpG dinucleotide content. We believe that these observations directly show that the evolution of RNA viral genomes can be shaped by pressures observed in the host genome. As a possible explanation, we suggest that the strong selection pressures acting on these RNA viruses are most likely related to the innate immune response and to nucleotide motifs in the host DNA and RNAs.
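
The abstract does not say which statistic was used to call a dinucleotide over- or under-represented. One common choice, assumed here purely for illustration, is the odds ratio of the observed dinucleotide frequency to the frequency expected from its two constituent bases, rho(XY) = f(XY) / (f(X) f(Y)). Values well below 1 indicate under-representation. The function name and the handling of U as T are choices made for this sketch, not details taken from the paper.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq: str, dinuc: str = "CG") -> float:
    """Observed/expected ratio for one dinucleotide on a single strand.

    Illustrative assumption: expected frequency is the product of the
    two mononucleotide frequencies. RNA input is accepted by mapping
    U to T before counting.
    """
    seq = seq.upper().replace("U", "T")
    dinuc = dinuc.upper().replace("U", "T")
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-letter motif")

    mono = Counter(seq)
    # Count overlapping adjacent pairs: positions (0,1), (1,2), ...
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))

    f_x = mono[dinuc[0]] / n
    f_y = mono[dinuc[1]] / n
    f_xy = pairs[dinuc] / (n - 1)

    if f_x == 0 or f_y == 0:
        return float("nan")  # ratio is undefined if either base is absent
    return f_xy / (f_x * f_y)
```

Applied to a series of dated genome segments (for example, influenza A isolates ordered by year), a declining CpG ratio over time would be the kind of footprint the abstract describes; real analyses would also need to control for overall base composition and codon usage.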
