Analysis of composition of microbiomes: a novel method for studying microbial composition

Background: Understanding the factors regulating our microbiota is important but requires appropriate statistical methodology. When comparing two or more populations, most existing approaches either discount the underlying compositional structure in the microbiome data or use probability models such as the multinomial and Dirichlet-multinomial distributions, which may impose a correlation structure not suitable for microbiome data.

Objective: To develop a methodology that accounts for compositional constraints to reduce false discoveries in detecting differentially abundant taxa at an ecosystem level, while maintaining high statistical power.

Methods: We introduced a novel statistical framework called analysis of composition of microbiomes (ANCOM). ANCOM accounts for the underlying compositional structure in the data and can be used for comparing the composition of microbiomes in two or more populations. ANCOM makes no distributional assumptions and can be implemented in a linear model framework to adjust for covariates as well as to model longitudinal data. ANCOM also scales well to comparisons involving thousands of taxa.

Results: We compared the performance of ANCOM to the standard t-test and to a recently published method, the Zero Inflated Gaussian (ZIG) methodology (1), for drawing inferences on the mean taxa abundance in two or more populations. ANCOM controlled the false discovery rate (FDR) at the desired nominal level while also improving power, whereas the t-test and ZIG had inflated FDRs, in some instances as high as 68% for the t-test and 60% for ZIG. We illustrate the performance of ANCOM using two publicly available microbial datasets from the human gut, demonstrating its general applicability to testing hypotheses about compositional differences in microbial communities.

Conclusion: Accounting for compositionality using log-ratio analysis results in significantly improved inference in microbiota survey data.
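The compositional constraint and the appeal of log-ratios can be illustrated with a small numerical sketch. This is not the ANCOM procedure itself; it only shows, under made-up abundances, why relative abundances can suggest spurious differences while a log-ratio between two unchanged taxa does not. The function names `closure` and `log_ratio` and all numbers are hypothetical choices for illustration.

```python
import math


def closure(abundances):
    """Rescale absolute abundances to relative abundances that sum to 1."""
    total = sum(abundances)
    return [a / total for a in abundances]


def log_ratio(composition, i, j):
    """Natural log of the ratio of taxon i to taxon j.

    The normalizing total cancels in the ratio, so the value is the same
    whether it is computed from absolute or relative abundances.
    """
    return math.log(composition[i] / composition[j])


# Hypothetical absolute abundances for three taxa in two populations.
# Only taxon 0 truly differs; taxa 1 and 2 are identical in both.
pop_a = [100.0, 200.0, 300.0]
pop_b = [400.0, 200.0, 300.0]

rel_a = closure(pop_a)
rel_b = closure(pop_b)

# The relative abundance of taxon 1 drops in population B even though its
# absolute abundance is unchanged: an artifact of the sum-to-one constraint.
print("taxon 1 relative abundance:", rel_a[1], rel_b[1])

# The log-ratio of taxon 1 to taxon 2 agrees across the two populations.
print("log(taxon1/taxon2):", log_ratio(rel_a, 1, 2), log_ratio(rel_b, 1, 2))

# The log-ratio involving the truly changed taxon 0 does differ.
print("log(taxon0/taxon2):", log_ratio(rel_a, 0, 2), log_ratio(rel_b, 0, 2))
```

A test applied directly to the relative abundances here would flag taxa 1 and 2 as different between the populations, which is the kind of false discovery the abstract attributes to methods that ignore compositionality.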
