Phylogenetic analysis of SARS-CoV-2 in Boston highlights the impact of superspreading events

Phylogenetics of superspreading

One important characteristic of coronavirus epidemiology is the occurrence of superspreading events. These are marked by a disproportionate number of cases originating from individuals who are often asymptomatic. Using a rich sequence dataset from the early stages of the Boston outbreak, Lemieux et al. identified superspreading events in specific settings and analyzed them phylogenetically (see the Perspective by Alizon). Using ancestral trait inference, the authors identified several importation events, further investigated the context and contribution of particular superspreading events to the establishment of local and wider SARS-CoV-2 transmission, and used viral phylogenies to describe sustained transmission.

Science, this issue p. eabe3261; see also p. 574

Phylogenetic analysis of complete SARS-CoV-2 genomes provides evidence that superspreading profoundly influenced epidemic spread.

INTRODUCTION

We used genomic epidemiology to investigate the introduction and spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the Boston area across the first wave of the pandemic, from March through May 2020, including high-density sampling early in this period. Our analysis provides a window into the amplification of transmission in an urban setting, including the impact of superspreading events on local, national, and international spread.

RATIONALE

Superspreading is recognized as an important driver of SARS-CoV-2 transmission, but the determinants of superspreading (why apparently similar circumstances can lead to very different outcomes) are poorly understood. The broader impact of such events, both on local transmission and on the overall trajectory of the pandemic, can also be difficult to determine.
Our dataset includes hundreds of cases that resulted from superspreading events with different epidemiological features, which allowed us to investigate the nature and effect of superspreading events in the first wave of the pandemic in the Boston area and to track their broader impact.

RESULTS

Our data suggest that there were more than 120 introductions of SARS-CoV-2 into the Boston area, but that only a few of these were responsible for most local transmission: 29% of the introductions accounted for 85% of the cases. At least some of this variation results from superspreading events amplifying some lineages and not others. Analysis of two superspreading events in our dataset illustrates how some introductions can be amplified by superspreading. One occurred in a skilled nursing facility, where multiple introductions of SARS-CoV-2 were detected in a short time period. Only one of these led to rapid and extensive spread within the facility and to significant mortality in this vulnerable population, but there was little onward transmission. A second superspreading event, at an international business conference, led to sustained community transmission, including outbreaks in homeless and other higher-risk communities, and was exported domestically and internationally, ultimately resulting in hundreds of thousands of cases. The two events also differed substantially in the genetic variation they generated, possibly suggesting varying transmission dynamics in superspreading events. Our results also show how genomic data can be used to support cluster investigations in real time, in this case by ruling out connections between contemporaneous cases at Massachusetts General Hospital, where nosocomial transmission was suspected.
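
A concentration figure such as "29% of the introductions accounted for 85% of the cases" can be computed from per-introduction case counts. The following is a minimal sketch, assuming each introduction has already been assigned a number of descendant cases; the function name `share_of_cases` and the example counts are hypothetical and are not the study's data.

```python
def share_of_cases(cluster_sizes, top_fraction):
    """Return the fraction of all cases that fall in the largest
    `top_fraction` of introductions (ranked by descendant case count)."""
    if not cluster_sizes:
        raise ValueError("cluster_sizes must be non-empty")
    ordered = sorted(cluster_sizes, reverse=True)
    # Number of introductions in the top group (always at least one).
    k = max(1, round(top_fraction * len(ordered)))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical case counts per introduction (illustrative only).
example_sizes = [50, 20, 10, 5, 5, 4, 3, 1, 1, 1]
top_share = share_of_cases(example_sizes, 0.3)
print(f"Top 30% of introductions account for {top_share:.0%} of cases")
```

In practice the per-introduction counts would come from the phylogeny itself, so the result depends on how introductions are delimited and on sampling density.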
CONCLUSION

Our results provide powerful evidence of the importance of superspreading events in shaping the course of this pandemic and illustrate how some introductions, when amplified under unfortunate circumstances, can have an outsized effect with devastating consequences that extend far beyond the initial events themselves. Our findings further highlight the close relationships between seemingly disconnected groups and populations during a pandemic: Viruses introduced at an international business conference seeded major outbreaks among individuals experiencing homelessness; spread throughout the Boston area, including to other higher-risk communities; and were exported extensively to other domestic and international sites. They also illustrate an important reality: Although superspreading among vulnerable populations has a larger immediate impact on mortality, the cost to society is greater for superspreading events that involve younger, healthier, and more mobile populations because of the increased risk of subsequent transmission. This is relevant to ongoing efforts to control the spread of SARS-CoV-2, particularly if vaccines prove to be more effective at preventing disease than blocking transmission.

Schematic outline of this genomic epidemiology study. Illustrated are the numerous introductions of SARS-CoV-2 into the Boston area; the minimal spread of most introductions; and the local, national, and international impact of the amplification of one introduction by a large superspreading event.

Analysis of 772 complete severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes from early in the Boston-area epidemic revealed numerous introductions of the virus, a small number of which led to most cases. The data revealed two superspreading events.
One, in a skilled nursing facility, led to rapid transmission and significant mortality in this vulnerable population but little broader spread, whereas other introductions into the facility had little effect. The second, at an international business conference, produced sustained community transmission and was exported, resulting in extensive regional, national, and international spread. The two events also differed substantially in the genetic variation they generated, suggesting varying transmission dynamics in superspreading events. Our results show how genomic epidemiology can help to understand the link between individual clusters and wider community spread.
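
The introductions described above were identified by ancestral trait inference on the viral phylogeny. As a rough illustration of the idea, and not the authors' actual pipeline, the sketch below applies Fitch parsimony (one simple form of ancestral trait inference, chosen here only for brevity) to a small rooted binary tree whose tips are labeled by sampling location. The tree, the labels `"local"` and `"elsewhere"`, and the function names are all hypothetical.

```python
def fitch_sets(node):
    """Bottom-up Fitch pass. A leaf is a location string; an internal
    node is a 2-tuple of child nodes. Returns (state_set, children)."""
    if isinstance(node, str):
        return ({node}, [])
    left, right = (fitch_sets(child) for child in node)
    common = left[0] & right[0]
    # Intersection if the children agree on some state, otherwise union.
    return (common or left[0] | right[0], [left, right])


def count_introductions(annotated, parent_state=None, target="local"):
    """Top-down Fitch pass that counts branches on which the inferred
    location changes from a non-target state into `target`."""
    states, children = annotated
    if parent_state in states:
        state = parent_state
    else:
        # Assumption: ambiguous nodes default to a non-target location.
        state = min(states, key=lambda s: (s == target, s))
    # A root inferred as `target` is not counted (it has no parent branch).
    introduced = int(state == target and parent_state not in (None, target))
    return introduced + sum(
        count_introductions(child, state, target) for child in children
    )


# Hypothetical toy tree: tips are sampling locations, not real genomes.
toy_tree = ((("local", "local"), "elsewhere"),
            ("elsewhere", ("local", "elsewhere")))
n_intro = count_introductions(fitch_sets(toy_tree))
print(n_intro)
```

On this toy tree the two `"local"` tips that share a parent collapse into a single inferred introduction, while the isolated `"local"` tip counts as a second one. Real analyses typically use time-resolved trees and likelihood-based models, and the inferred number of introductions is sensitive to how densely other regions are sampled.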
