Rampant C→U Hypermutation in the Genomes of SARS-CoV-2 and Other Coronaviruses: Causes and Consequences for Their Short- and Long-Term Evolutionary Trajectories

ABSTRACT The pandemic of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has motivated an intensive analysis of its molecular epidemiology following its worldwide spread. To understand the early evolutionary events following its emergence, a data set of 985 complete SARS-CoV-2 sequences was assembled. Variants showed a mean of 5.5 to 9.5 nucleotide differences from each other, consistent with a midrange coronavirus substitution rate of 3 × 10⁻⁴ substitutions/site/year. Almost one-half of sequence changes were C→U transitions, with an 8-fold base frequency-normalized directional asymmetry between C→U and U→C substitutions. Elevated ratios were observed in other recently emerged coronaviruses (SARS-CoV, Middle East respiratory syndrome [MERS]-CoV), and decreasing ratios were observed in other human coronaviruses (HCoV-NL63, -OC43, -229E, and -HKU1) proportionate to their increasing divergence. C→U transitions underpinned almost one-half of the amino acid differences between SARS-CoV-2 variants and occurred preferentially in both 5′ U/A and 3′ U/A flanking sequence contexts comparable to favored motifs of human APOBEC3 proteins. Marked base asymmetries observed in nonpandemic human coronaviruses (U ≫ A > G ≫ C) and low G+C contents may represent long-term effects of prolonged C→U hypermutation in their hosts. The evidence that much of the sequence change in SARS-CoV-2 and other coronaviruses may be driven by a host APOBEC-like editing process has profound implications for understanding their short- and long-term evolution. Repeated cycles of mutation and reversion in favored mutational hot spots and the widespread occurrence of amino acid changes with no adaptive value for the virus represent a quite different paradigm of virus sequence change from neutral and Darwinian evolutionary frameworks and are not incorporated by standard models used in molecular epidemiology investigations.

IMPORTANCE The wealth of accurately curated sequence data for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), its long genome, and its low substitution rate provides a relatively blank canvas with which to investigate effects of mutational and editing processes imposed by the host cell. The finding that a large proportion of sequence change in SARS-CoV-2 in the initial months of the pandemic comprised C→U mutations in a host APOBEC-like context provides evidence for a potent host-driven antiviral editing mechanism against coronaviruses more often associated with antiretroviral defense. In evolutionary terms, the contribution of biased, convergent, and context-dependent mutations to sequence change in SARS-CoV-2 is substantial, and these processes are not incorporated by standard models used in molecular epidemiology investigations.
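
The base frequency-normalized C→U/U→C asymmetry mentioned in the abstract can be illustrated with a minimal sketch. It assumes the ratio is the number of C→U changes per available C divided by the number of U→C changes per available U, counted against a single aligned reference sequence. That definition, the function name `normalized_asymmetry`, and the toy sequences are illustrative assumptions, not the authors' actual pipeline.

```python
from collections import Counter


def normalized_asymmetry(reference: str, variants: list[str]) -> float:
    """Ratio of C->U to U->C changes, normalized by reference base counts.

    Assumed definition (hypothetical, for illustration only):
        (n_CU / n_C) / (n_UC / n_U)
    where n_C and n_U are the counts of C and U in the reference.
    Sequences must already be aligned and of equal length.
    """
    ref = reference.upper().replace("T", "U")  # accept DNA-style input
    base_counts = Counter(ref)
    c_to_u = 0
    u_to_c = 0
    for seq in variants:
        seq = seq.upper().replace("T", "U")
        if len(seq) != len(ref):
            raise ValueError("variant is not aligned to the reference")
        for ref_base, var_base in zip(ref, seq):
            if ref_base == "C" and var_base == "U":
                c_to_u += 1
            elif ref_base == "U" and var_base == "C":
                u_to_c += 1
    if u_to_c == 0 or base_counts["C"] == 0:
        raise ValueError("ratio undefined: no U->C changes or no C in reference")
    # Algebraically equal to (c_to_u / n_C) / (u_to_c / n_U).
    return (c_to_u * base_counts["U"]) / (u_to_c * base_counts["C"])


# Toy data: the reference has 2 C and 6 U; the variants carry
# two C->U changes and one U->C change in total.
toy_reference = "CCUUUUUU"
toy_variants = ["UCUUUUUU", "CUUUUUUU", "CCCUUUUU"]
toy_ratio = normalized_asymmetry(toy_reference, toy_variants)
```

Normalizing by base counts matters because coronavirus genomes are U-rich and C-poor, so a raw count ratio would understate how often each available C is changed relative to each available U.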
