A microbiome reality check: limitations of in silico-based metagenomic approaches to study complex bacterial communities.

In recent years, whole shotgun metagenomics (WSM) has become an established technology for compositional analysis of complex microbial communities. The approach relies heavily on bioinformatic pipelines to process and interpret the raw sequencing data. However, using such in silico pipelines for the taxonomic classification of short sequences may introduce significant errors into the compositional outputs deduced from these data. To evaluate the performance of such pipelines, we applied two commonly used bioinformatic tools, MetaPhlAn2 and Kraken2, to two metagenomic datasets originating from human and animal fecal samples. Using these bioinformatic programs, which taxonomically classify WSM data based on marker genes, we observed a tendency for them to depict a lower complexity of the microbial communities. Here, we assess the limitations of these commonly employed pipelines and, based on our findings, propose that such analyses should ideally be combined with experimentally based microbiological validation.
