Balance Trees Reveal Microbial Niche Differentiation

ABSTRACT Advances in sequencing technologies have enabled novel insights into microbial niche differentiation, from analyzing environmental samples to understanding human diseases and informing dietary studies. However, identifying the microbial taxa that differentiate these samples can be challenging. This difficulty stems from the compositional nature of 16S rRNA gene data (or, more generally, taxon or functional gene data): a change in the relative abundance of one taxon influences the apparent abundances of the others. Here we acknowledge that inferring properties of individual bacteria is a difficult problem and instead introduce the concept of balances to infer meaningful properties of subcommunities, rather than properties of individual species. We show that balances can yield insights about niche differentiation across multiple microbial environments, including soil environments and lung sputum. These techniques have the potential to reshape how we carry out future ecological analyses aimed at revealing differences in relative taxonomic abundances across different samples.

IMPORTANCE By explicitly accounting for the compositional nature of 16S rRNA gene data through the concept of balances, balance trees yield novel biological insights into niche differentiation. The software to perform this analysis is available under an open-source license and can be obtained at https://github.com/biocore/gneiss.

Author Video: An author video summary of this article is available.
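The compositional effect described in the abstract, and the way a balance sidesteps it, can be illustrated with a small sketch. It assumes the standard isometric log-ratio definition of a balance from compositional data analysis, a scaled log ratio of the geometric means of two groups of taxa. The function names and the abundance values are hypothetical and chosen for illustration only; this is not the gneiss API.

```python
import math


def closure(counts):
    """Convert counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def balance(composition, numerator, denominator):
    """Balance between two groups of taxa, given as lists of indices.

    Assumes every proportion involved is strictly positive; zeros
    would need separate handling (for example, a pseudocount).
    """
    r, s = len(numerator), len(denominator)
    g_num = geometric_mean([composition[i] for i in numerator])
    g_den = geometric_mean([composition[i] for i in denominator])
    return math.sqrt(r * s / (r + s)) * math.log(g_num / g_den)


# Hypothetical absolute abundances of four taxa in two samples.
# Only taxon 0 differs: it is ten times more abundant in sample_b.
sample_a = [100, 200, 300, 400]
sample_b = [1000, 200, 300, 400]

prop_a = closure(sample_a)
prop_b = closure(sample_b)

# The proportions of taxa 1-3 shrink even though their absolute
# abundances are identical in both samples.
print("proportions a:", [round(p, 3) for p in prop_a])
print("proportions b:", [round(p, 3) for p in prop_b])

# A balance that excludes taxon 0 is unaffected by its growth.
bal_a = balance(prop_a, [1], [2, 3])
bal_b = balance(prop_b, [1], [2, 3])
print("balance {1} vs {2, 3}:", bal_a, bal_b)

# A balance that includes taxon 0 does register the change.
bal0_a = balance(prop_a, [0], [1, 2, 3])
bal0_b = balance(prop_b, [0], [1, 2, 3])
print("balance {0} vs {1, 2, 3}:", bal0_a, bal0_b)
```

In this toy case the proportions of the three unchanged taxa all drop, which could be misread as a decline, while the balance among those taxa is the same in both samples. Only the balance that contains taxon 0 shifts.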
