TiTUS: Sampling and Summarizing Transmission Trees with Multi-strain Infections

Motivation: The combination of genomic and epidemiological data holds the potential to enable accurate inference of pathogen transmission histories. However, inferring outbreak transmission histories remains challenging because of factors such as within-host pathogen diversity and multi-strain infections. Current computational methods ignore within-host diversity, multi-strain infections, or both, and often fail to infer the transmission history accurately. Thus, there is a need for efficient computational methods for transmission tree inference that accommodate the complexities of real data.

Results: We formulate the Direct Transmission Inference (DTI) problem for inferring transmission trees that support multi-strain infections given a timed phylogeny and additional epidemiological data. We establish hardness for the decision and counting versions of the DTI problem. We introduce TiTUS, a method that uses SATISFIABILITY to almost uniformly sample from the space of transmission trees. We introduce criteria that prioritize parsimonious transmission trees, which we subsequently summarize using a novel consensus tree approach. We demonstrate TiTUS's ability to accurately reconstruct transmission trees on simulated data as well as on a documented HIV transmission chain.

Availability: https://github.com/elkebir-group/TiTUS

Contact: melkebir@illinois.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
