MicrobiomeDB: a systems biology platform for integrating, mining and analyzing microbiome experiments

Abstract

MicrobiomeDB (http://microbiomeDB.org) is a data discovery and analysis platform that empowers researchers to fully leverage experimental variables to interrogate microbiome datasets. MicrobiomeDB was developed in collaboration with the Eukaryotic Pathogens Bioinformatics Resource Center (http://EuPathDB.org) and leverages the infrastructure and user interface of EuPathDB, which allows users to construct in silico experiments using an intuitive graphical ‘strategy’ approach. The current release of the database integrates microbial census data with sample details for nearly 14 000 samples originating from human, animal and environmental sources, including over 9000 samples from healthy human subjects in the Human Microbiome Project (http://portal.ihmpdcc.org/). Query results can be statistically analyzed and graphically visualized via interactive web applications launched directly in the browser, providing insight into microbial community diversity and allowing users to identify taxa associated with any experimental covariate.
