Strain-Level Metagenomic Data Analysis of Enriched In Vitro and In Silico Spiked Food Samples: Paving the Way towards a Culture-Free Foodborne Outbreak Investigation Using STEC as a Case Study

Culture-independent diagnostics, such as shotgun metagenomic sequencing of food samples, could not only reduce the turnaround time of samples in an outbreak investigation but also allow the detection of multi-species and multi-strain outbreaks. A successful metagenomic foodborne outbreak investigation, however, requires bioinformatically separating the genomes of the individual strains present in a microbial community, including strains belonging to the same species, which has not previously been demonstrated for this application. The current work shows the feasibility of strain-level metagenomics of enriched food matrix samples using data analysis tools that classify reads against a sequence database. It includes a brief comparison of two database-based read classification tools, Sigma and Sparse, using a mock community obtained by in vitro spiking of minced meat with a Shiga toxin-producing Escherichia coli (STEC) isolate originating from a described outbreak. The better-performing tool, Sigma, was further evaluated using in silico simulated metagenomic data to explore the possibilities and limitations of this data analysis approach. The analysis allowed us to link the pathogenic strains from food samples to human isolates previously collected during the same outbreak, demonstrating that the metagenomic approach could be applied to the rapid source tracking of foodborne outbreaks. To our knowledge, this is the first study demonstrating a data analysis approach for the detailed characterization and phylogenetic placement of multiple bacterial strains of one species from shotgun metagenomic WGS data of an enriched food sample.
