BacPipe: A Rapid, User-Friendly Whole-Genome Sequencing Pipeline for Clinical Diagnostic Bacteriology

Summary: Despite rapid advances in whole-genome sequencing (WGS) technologies, their integration into routine microbiological diagnostics has been hampered by the lack of standardized downstream bioinformatics analysis. We developed a comprehensive and computationally low-resource bioinformatics pipeline (BacPipe) enabling direct analyses of bacterial whole-genome sequences (raw reads or contigs) obtained from second- or third-generation sequencing technologies. A graphical user interface was developed to visualize real-time progression of the analysis. The scalability and speed of BacPipe in handling large datasets were demonstrated using 4,139 Illumina paired-end sequence files of publicly available bacterial genomes (2.9–5.4 Mb) from the European Nucleotide Archive. BacPipe is integrated in EBI-SELECTA, a project-specific portal (H2020-COMPARE), and is available as an independent Docker image that can be used across Windows- and Unix-based systems. BacPipe offers a fully automated “one-stop” bacterial WGS analysis pipeline to overcome the major hurdle of WGS data analysis in hospitals, in public health, and in infection control monitoring.
