BacWGSTdb 2.0: a one-stop repository for bacterial whole-genome sequence typing and source tracking

Abstract: The increasing prevalence of hospital-acquired infections and foodborne illnesses caused by pathogenic and multidrug-resistant bacteria has created a pressing need for benchtop computational techniques that rapidly and accurately classify bacteria from genomic sequence data and, on that basis, trace the source of infection. BacWGSTdb (http://bacdb.org/BacWGSTdb) is a free, publicly accessible database that we developed for bacterial whole-genome sequence typing and source tracking. It combines extensive bacterial genome sequencing data and the corresponding metadata with specialized bioinformatics tools that enable systematic characterization of bacterial isolates recovered from infections. Here, we present BacWGSTdb 2.0, which includes several major updates: (i) integration of the core genome multi-locus sequence typing (cgMLST) approach, which is highly scalable and appropriate for typing isolates belonging to different lineages; (ii) a multiple genome analysis module that can process dozens of user-uploaded sequences in batch mode; (iii) a new source tracking module for comparing user-uploaded plasmid sequences to those deposited in public databases; (iv) an increase in the number of species covered from 9 to 20, all bacterial pathogens of medical importance; and (v) a newly designed, user-friendly interface with a set of visualization tools. Overall, BacWGSTdb 2.0 continues to provide users, including epidemiologists, clinicians and bench scientists, with a one-stop solution for bacterial genome sequence analysis.
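To illustrate the gene-by-gene idea behind cgMLST-based source tracking, the following is a minimal sketch and not the BacWGSTdb implementation. It assumes each isolate is already summarized as an allele profile (a mapping from core-genome locus to allele number), and it ranks database isolates by the number of differing alleles. The names `allele_distance`, `closest_isolates`, the locus labels, and the distance threshold are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: locus name -> allele number (None = locus not called).
Profile = Dict[str, Optional[int]]


def allele_distance(a: Profile, b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ."""
    shared = a.keys() & b.keys()
    return sum(
        1
        for locus in shared
        if a[locus] is not None and b[locus] is not None and a[locus] != b[locus]
    )


def closest_isolates(
    query: Profile, database: Dict[str, Profile], max_distance: int
) -> List[Tuple[str, int]]:
    """Return (isolate, distance) pairs within max_distance, nearest first."""
    hits = [(name, allele_distance(query, prof)) for name, prof in database.items()]
    hits = [(name, dist) for name, dist in hits if dist <= max_distance]
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    query = {"locus1": 1, "locus2": 4, "locus3": None, "locus4": 7}
    database = {
        "isolateA": {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": 7},
        "isolateB": {"locus1": 1, "locus2": 5, "locus3": 2, "locus4": 8},
        "isolateC": {"locus1": 3, "locus2": 5, "locus3": 2, "locus4": 9},
    }
    for name, dist in closest_isolates(query, database, max_distance=2):
        print(name, dist)
```

Ignoring loci that are missing in either profile is one common convention for allele distances; an appropriate distance threshold depends on the species and scheme and is not specified here.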
