Real-Time Analysis and Visualization of Pathogen Sequence Data

ABSTRACT

The rapid development of sequencing technologies has led to an explosion of pathogen sequence data, which are increasingly collected as part of routine surveillance or clinical diagnostics. In public health, sequence data are used to reconstruct the evolution of pathogens, to anticipate future spread, and to target interventions. In clinical settings, whole-genome sequencing can identify pathogens at the strain level, predict phenotypes such as drug resistance and virulence, and inform treatment by linking closely related cases. While sequencing has become cheaper, the analysis of sequence data has become an important bottleneck. Deriving interpretable and actionable results from continuously updated data, for a large variety of pathogens that each have their own complexity, is a daunting task that requires flexible bioinformatic workflows and dissemination platforms. Here, we review recent developments in real-time analyses of pathogen sequence data, with a particular focus on the visualization and integration of sequence and phenotype data.
