Calour: an Interactive, Microbe-Centric Analysis Tool

ABSTRACT Microbiome analyses can be challenging because microbial strains are numerous, and often, confounding factors in the data set are also numerous. Many tools reduce, summarize, and visualize these high-dimensional data to provide insight at the community level. However, they lose the detailed information about each taxon and can be misleading (for example, the well-known horseshoe effect in ordination plots). Thus, multiple methods at different levels of resolution are required to capture the full range of microbial patterns. Here we present Calour, a user-friendly data exploration tool for microbiome analyses. Calour provides a study-centric data model to store and manipulate sample-by-feature tables (with features typically being operational taxonomic units) and their associated metadata. It generates an interactive heatmap, allowing visualization of microbial patterns and exploration using microbial knowledge databases. We demonstrate the use of Calour by exploring publicly available data sets, including the gut and skin microbiota of habitat-switched fire salamander larvae, gut microbiota of Trichuris muris-infected mice, skin microbiota of different human body sites, gut microbiota of various ant species, and a metabolome study of mice exposed to intermittent hypoxia and hypercapnia. In these cases, Calour reveals novel patterns and potential contaminants of subgroups of microbes that are otherwise hard to find. Calour is open source under the Berkeley Software Distribution (BSD) license and available from https://github.com/biocore/calour.
IMPORTANCE Calour allows us to identify interesting microbial patterns and generate novel biological hypotheses by interactively inspecting microbiome studies and incorporating annotation databases and convenient statistical tools. Calour can be used as a first-step tool for microbiome data exploration.
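
The study-centric data model mentioned in the abstract, a sample-by-feature table kept together with its metadata, can be illustrated with a minimal sketch. This is not Calour's actual API: the `Experiment` class, its method names, and the toy values below are all hypothetical and only show why keeping the table and metadata in one object makes sorting and filtering safe.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Experiment:
    """Hypothetical container (not Calour's API): table plus metadata."""
    data: List[List[float]]                # rows = samples, columns = features
    sample_metadata: List[Dict[str, str]]  # one dict per sample (row)
    feature_ids: List[str]                 # one ID per feature (column)

    def sort_samples(self, field: str) -> "Experiment":
        """Reorder rows and their metadata together by a metadata field."""
        order = sorted(range(len(self.data)),
                       key=lambda i: self.sample_metadata[i][field])
        return Experiment([self.data[i] for i in order],
                          [self.sample_metadata[i] for i in order],
                          list(self.feature_ids))

    def filter_prevalence(self, min_fraction: float) -> "Experiment":
        """Keep features present (count > 0) in at least min_fraction of samples."""
        n = len(self.data)
        keep = [j for j in range(len(self.feature_ids))
                if sum(1 for row in self.data if row[j] > 0) / n >= min_fraction]
        return Experiment([[row[j] for j in keep] for row in self.data],
                          [dict(md) for md in self.sample_metadata],
                          [self.feature_ids[j] for j in keep])


# Toy values, invented for illustration only.
exp = Experiment(
    data=[[10, 0, 3], [0, 0, 7], [5, 1, 0]],
    sample_metadata=[{"id": "S1", "habitat": "stream"},
                     {"id": "S2", "habitat": "pond"},
                     {"id": "S3", "habitat": "pond"}],
    feature_ids=["otu_a", "otu_b", "otu_c"],
)
sorted_exp = exp.sort_samples("habitat")
common = exp.filter_prevalence(0.5)
```

Because every operation returns a new object whose rows, columns, and metadata stay aligned, a heatmap drawn from `sorted_exp` or `common` would always be labeled with the matching sample and feature information.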
