Lineage frequency time series reveal elevated levels of genetic drift in SARS-CoV-2 transmission in England

Random genetic drift in the population-level dynamics of an infectious disease outbreak arises from the randomness of inter-host transmission and of host recovery or death. The strength of genetic drift has been found to be high for SARS-CoV-2 because of superspreading, and this is expected to substantially affect the epidemiology and evolution of the disease. Noise from the measurement process, such as biases in data collection across time or geographical areas, can confound estimates of genetic drift because both processes contribute “noise” to the data. To address this challenge, we develop and validate a method to jointly infer genetic drift and measurement noise from time-series lineage frequency data. We apply this method to over 490,000 SARS-CoV-2 genomic sequences from England collected between March 2020 and December 2021 by the COVID-19 Genomics UK (COG-UK) consortium. We find that, even after correcting for measurement noise, the strength of genetic drift is consistently higher over time than that expected from the observed number of COVID-19 positive individuals in England, by 1 to 3 orders of magnitude. Corrections that take epidemiological dynamics into account (susceptible-infected-recovered or susceptible-exposed-infected-recovered models) do not explain the discrepancy. Moreover, the levels of genetic drift that we observe are higher than the levels of superspreading estimated by modeling studies that incorporate data on actual contact statistics in England. We discuss how, even in the absence of superspreading, high levels of genetic drift can be generated by community structure in the host contact network. Our results suggest that further investigation of heterogeneous host contact structure may be important for understanding the high levels of genetic drift observed for SARS-CoV-2 in England.
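The intuition behind separating the two noise sources can be illustrated with a toy calculation. The sketch below is not the inference method of the paper, which is not specified in this abstract; it is a simplified moment-based illustration under stated assumptions: drift is approximated by Gaussian steps with variance f(1-f)/Ne per generation, measurement noise is additive and independent across time points, and all function names and parameter values are hypothetical. Under these assumptions, the variance of the observed frequency change grows linearly with the time lag because drift accumulates, while measurement noise adds only a constant offset, so a straight-line fit recovers both.

```python
import random


def simulate_observed(f0, n_eff, noise_var, n_steps, rng):
    """Simulate one observed lineage-frequency trajectory.

    Assumptions (illustrative only): drift is a Gaussian step with variance
    f(1-f)/n_eff per generation, and measurement noise is additive Gaussian
    with variance noise_var, independent at each time point.
    """
    f = f0
    observed = []
    for _ in range(n_steps + 1):
        # Measurement noise affects the observation, not the true frequency.
        observed.append(f + rng.gauss(0.0, noise_var ** 0.5))
        # Drift changes the true frequency and therefore accumulates over time.
        step_sd = (max(f * (1.0 - f), 0.0) / n_eff) ** 0.5
        f = min(max(f + rng.gauss(0.0, step_sd), 0.0), 1.0)
    return observed


def fit_drift_and_noise(trajectories, f0, max_lag):
    """Fit Var(f_obs[lag] - f_obs[0]) = lag * f0(1-f0)/Ne + 2 * noise_var.

    Returns (estimated Ne, estimated measurement-noise variance).
    Valid only while lag << Ne, so that frequencies stay close to f0.
    """
    lags = list(range(1, max_lag + 1))
    variances = []
    for lag in lags:
        diffs = [traj[lag] - traj[0] for traj in trajectories]
        mean = sum(diffs) / len(diffs)
        variances.append(sum((d - mean) ** 2 for d in diffs) / (len(diffs) - 1))
    # Ordinary least-squares line through (lag, variance).
    mean_x = sum(lags) / len(lags)
    mean_y = sum(variances) / len(variances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lags, variances))
    sxx = sum((x - mean_x) ** 2 for x in lags)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return f0 * (1.0 - f0) / slope, intercept / 2.0


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameter values chosen only for the demonstration.
    trajs = [simulate_observed(0.5, 1000.0, 1e-3, 10, rng) for _ in range(10000)]
    ne_hat, noise_hat = fit_drift_and_noise(trajs, 0.5, 10)
    print(f"estimated Ne: {ne_hat:.0f}, estimated noise variance: {noise_hat:.2e}")
```

In this toy setting, ignoring the intercept and attributing all of the lag-1 variance to drift would underestimate Ne, which is the kind of confounding the abstract describes.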
