Statistical outbreak detection by joining medical records and pathogen similarity

We present a statistical inference model for detecting and characterizing outbreaks of hospital-associated infection. The approach combines patient exposures, determined from electronic medical records, with pathogen similarity, determined by whole-genome sequencing, to simultaneously identify probable outbreaks and their root causes. We show how the model can be used to target isolates for whole-genome sequencing, improving outbreak detection and characterization even without comprehensive sequencing. We also show how to learn model parameters from reference data of known outbreaks, and we evaluate model performance using semi-synthetic experiments.
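To make the idea of joining the two data sources concrete, the following is a minimal sketch and not the statistical inference model itself. It assumes exposures are available as a set of labels per patient and that pairwise SNP distances between isolates have already been computed. The function name `candidate_links`, the example exposure labels, and the threshold of 15 SNPs are all hypothetical choices made for illustration. A fixed threshold stands in here for the probabilistic treatment the abstract describes.

```python
from itertools import combinations


def candidate_links(exposures, snp_distance, max_snps=15):
    """Flag patient pairs that share an exposure AND have similar isolates.

    exposures:    dict mapping patient id -> set of exposure labels
                  (hypothetical labels taken from medical records)
    snp_distance: dict mapping frozenset({a, b}) -> pairwise SNP distance
    max_snps:     illustrative similarity cutoff (an assumption, not a
                  value from the paper)
    """
    links = []
    for a, b in combinations(sorted(exposures), 2):
        shared = exposures[a] & exposures[b]
        d = snp_distance.get(frozenset((a, b)))
        # Keep the pair only if both lines of evidence agree.
        if shared and d is not None and d <= max_snps:
            links.append((a, b, sorted(shared), d))
    return links


# Toy data: three patients, two kinds of exposure.
exposures = {
    "P1": {"ward_4A", "endoscope_7"},
    "P2": {"endoscope_7"},
    "P3": {"ward_4A"},
}
snp_distance = {
    frozenset(("P1", "P2")): 3,
    frozenset(("P1", "P3")): 120,
    frozenset(("P2", "P3")): 118,
}

links = candidate_links(exposures, snp_distance)
for a, b, shared, d in links:
    print(f"{a} and {b}: shared exposure {shared}, {d} SNPs apart")
```

In this toy data, P1 and P3 share a ward but their isolates are far apart, so only the P1 and P2 pair is flagged, with the shared endoscope as the candidate root cause. A pair with no sequenced isolate is simply skipped here, whereas the abstract indicates that the model can still reason about such cases and use them to prioritize which isolates to sequence.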
