o2geosocial: Reconstructing who-infected-whom from routinely collected surveillance data

Reconstructing the history of individual transmission events between cases is key to understanding what factors facilitate the spread of an infectious disease. Because extended contact-tracing investigations can be logistically challenging and costly, statistical inference methods have been developed to reconstruct transmission trees from onset dates and genetic sequences. However, these methods are less effective when the mutation rate of the virus is low or when sequencing data are sparse. We developed the R package o2geosocial to combine variables from routinely collected surveillance data with a simple transmission process model. The model reconstructs transmission trees when full genetic sequences are unavailable or uninformative. It uses the reported age group, onset date, location and genotype of infected cases to infer probabilistic transmission trees. The package also includes functions to summarise and visualise the inferred cluster size distribution. The results generated by o2geosocial can highlight regions where importations repeatedly caused large outbreaks, which may indicate higher regional susceptibility to infection. They can also be used to estimate the number of secondary transmissions caused by each case and to identify the features of individuals involved in high-transmission events. The package is available from the Comprehensive R Archive Network (CRAN) and GitHub.
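As an illustration of the two summaries mentioned above (cluster sizes and the number of secondary transmissions per case), the following minimal Python sketch computes both from a single transmission tree. It is not the o2geosocial API: the function name `summarise_tree`, the `infector` vector encoding and the example data are all hypothetical, chosen only to show the idea under the assumption that each case has at most one infector and that importations have none.

```python
from collections import Counter

def summarise_tree(infector):
    """Summarise one transmission tree (hypothetical encoding, not the package's).

    infector[i] is the index of the case that infected case i,
    or None if case i is an importation. Assumes the links form a
    tree (no cycles).
    """
    n = len(infector)

    # Count the secondary cases directly caused by each case.
    secondary = [0] * n
    for src in infector:
        if src is not None:
            secondary[src] += 1

    # Follow infector links back to the importation that seeded each case.
    def root(i):
        while infector[i] is not None:
            i = infector[i]
        return i

    # Cluster size = number of cases that trace back to the same importation.
    cluster_sizes = Counter(root(i) for i in range(n))
    return secondary, dict(cluster_sizes)

# Hypothetical tree: cases 0 and 4 are importations.
infector = [None, 0, 0, 1, None, 4]
secondary, clusters = summarise_tree(infector)
print(secondary)  # secondary cases per case
print(clusters)   # cluster size keyed by importing case
```

Because the inferred trees are probabilistic, summaries of this kind would in practice be computed over many sampled trees rather than a single one.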
