Recombination and lineage-specific mutations linked to the emergence of SARS-CoV-2

The emergence of SARS-CoV-2 underscores the need to better understand the evolutionary processes that drive the emergence and adaptation of zoonotic viruses in humans. In the betacoronavirus genus, which also includes SARS-CoV and MERS-CoV, recombination frequently encompasses the receptor-binding domain (RBD) of the Spike protein, which is responsible for viral binding to host cell receptors. Here, we find evidence of a recombination event in the RBD involving lineages ancestral to both SARS-CoV and SARS-CoV-2. Although we cannot identify either the recombinant or the parental strains, likely because of the antiquity of the event and potential undersampling, our statistical analyses in the space of phylogenetic trees support such an ancestral recombination. As a consequence of this event, SARS-CoV and SARS-CoV-2 share an RBD sequence that includes two insertions (positions 432-436 and 460-472), as well as the variants 427N and 436Y. Both 427N and 436Y belong to a helix that interacts directly with the human ACE2 (hACE2) receptor. Reconstruction of ancestral states, combined with protein-binding affinity analyses using the physics-based trRosetta algorithm, reveals that the recombination event involving ancestral strains of SARS-CoV and SARS-CoV-2 led to an increased affinity for hACE2, and that alleles 427N and 436Y significantly enhanced affinity as well. Structural modeling indicates that ancestors of SARS-CoV-2 may have acquired the ability to infect humans decades ago. The binding affinity for the human receptor was subsequently boosted in SARS-CoV and SARS-CoV-2 through further mutations in the RBD. In sum, we report an ancestral recombination event affecting the RBD of both SARS-CoV and SARS-CoV-2 that was associated with an increased binding affinity for hACE2.
Importance: This paper addresses critical questions about the origin of SARS-CoV-2: what evolutionary mechanisms led to the emergence of the virus, and how can we leverage that knowledge to assess the potential of SARS-like viruses to become pandemic strains? In this work, we demonstrate two mechanisms common to the emergence of human-infecting SARS-like viruses: first, the acquisition of a shared haplotype in the RBD through recombination, and second, increased specificity for the human ACE2 receptor through lineage-specific mutations. We also show that the ancestors of SARS-CoV-2 already had the potential to infect humans at least a decade ago, suggesting that SARS-like viruses currently circulating in wild animal species constitute a source of potential pandemic re-emergence.
