Defining the Pandemic at the State Level: Sequence-Based Epidemiology of the SARS-CoV-2 virus by the Arizona COVID-19 Genomics Union (ACGU)

In December 2019, a novel coronavirus, SARS-CoV-2, emerged in the city of Wuhan, China, causing severe morbidity and mortality. Since then, the virus has swept across the globe, causing millions of confirmed infections and hundreds of thousands of deaths. To better understand the nature of the pandemic and the introduction and spread of the virus in Arizona, we sequenced viral genomes from clinical samples tested at the TGen North Clinical Laboratory, provided to us by the Arizona Department of Health Services, and at Arizona State University and the University of Arizona, collected as part of community surveillance projects. Phylogenetic analysis of the 79 genomes we generated from across Arizona revealed a minimum of 9 distinct introductions throughout February and March. We show that >80% of our sequences descend from clades that initially circulated widely in Europe but have since dominated the outbreak in the United States. In addition, we show that the first reported case of community transmission in Arizona descended from the Washington state outbreak that was discovered in late February. Notably, none of the observed transmission clusters are epidemiologically linked to the original travel-related cases in the state, suggesting successful early isolation and quarantine. Finally, we use molecular clock analyses to demonstrate a lack of identifiable, widespread cryptic transmission in Arizona prior to the middle of February 2020.
