BioMiCo: a supervised Bayesian model for inference of microbial community structure

Background

Microbiome samples often represent mixtures of communities, where each community is composed of overlapping assemblages of species. Such mixtures are complex: the number of species is huge, and abundance information for many species is often sparse. Classical methods have limited value for identifying complex features within such data.

Results

Here, we describe a novel hierarchical model for Bayesian inference of microbial communities (BioMiCo). The model takes abundance data derived from environmental DNA and models the composition of each sample by a two-level hierarchy of mixture distributions constrained by Dirichlet priors. BioMiCo is supervised, using known features for samples and appropriate prior constraints to overcome the challenges posed by many variables, sparse data, and large numbers of rare species. The model is trained on a portion of the data, where it learns how assemblages of species are mixed to form communities and how assemblages are related to the known features of each sample. Training yields a model that can predict the features of new samples.

We used BioMiCo to build models for three serially sampled datasets and tested their predictive accuracy across different time points. The first model was trained to predict both body site (hand, mouth, and gut) and individual human host. It was able to reliably distinguish these features across different time points. The second was trained on vaginal microbiomes to predict both the Nugent score and individual human host. We found that women having normal and elevated Nugent scores had distinct microbiome structures that persisted over time, with additional structure within women having elevated scores. The third was trained for the purpose of assessing seasonal transitions in a coastal bacterial community. Application of this model to a high-resolution time series permitted us to track the rate and time of community succession and accurately predict known ecosystem-level events.

Conclusion

BioMiCo provides a framework for learning the structure of microbial communities and for making predictions based on microbial assemblages. By training on carefully chosen features (abiotic or biotic), BioMiCo can be used to understand and predict transitions between complex communities composed of hundreds of microbial species.
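To make the idea of a two-level hierarchy of mixtures with Dirichlet priors concrete, the following is a minimal generative sketch in Python. It is not the BioMiCo implementation and does no inference; it only simulates read counts under one plausible reading of the abstract, in which a sample mixes communities, each community mixes assemblages, and each assemblage is a distribution over species (OTUs). All names (`sample_dirichlet`, `simulate_sample`, `label_weights`, and so on), the dimensions, and the prior concentrations are assumptions chosen for illustration.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def simulate_sample(label_weights, community_mix, assemblage_otus, n_reads, rng):
    """Simulate OTU counts for one sample under the assumed two-level mixture."""
    counts = [0] * len(assemblage_otus[0])
    for _ in range(n_reads):
        community = sample_index(label_weights, rng)              # level 1: sample -> community
        assemblage = sample_index(community_mix[community], rng)  # level 2: community -> assemblage
        otu = sample_index(assemblage_otus[assemblage], rng)      # assemblage -> observed OTU
        counts[otu] += 1
    return counts


rng = random.Random(0)
n_otus, n_assemblages, n_communities = 20, 4, 3  # illustrative sizes only

# Each assemblage is a distribution over OTUs; a concentration below 1 favors sparse assemblages.
assemblage_otus = [sample_dirichlet([0.5] * n_otus, rng) for _ in range(n_assemblages)]

# Each community is a mixture over assemblages.
community_mix = [sample_dirichlet([0.5] * n_assemblages, rng) for _ in range(n_communities)]

# Supervision, as assumed here: a training sample with a known label (community 0)
# gets a prior that concentrates its mixing weights on that label.
label_weights = sample_dirichlet([50.0, 1.0, 1.0], rng)

counts = simulate_sample(label_weights, community_mix, assemblage_otus, 1000, rng)
print(counts)
```

In this sketch, supervision enters only through the asymmetric prior on `label_weights`. Fitting such a model to real abundance data would require a posterior sampler or similar inference procedure, which is outside the scope of this illustration.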
