Latent environment allocation of microbial community data

As data on microbial community structures from various environments have accumulated, studies have examined the relationship between the environmental labels assigned to microbial samples and their community structures. However, because environments change continuously over time and space, mixed states of environments and their effects on community formation should be considered, rather than only the effects of discrete environmental categories. Here we applied a hierarchical Bayesian model to paired datasets containing more than 30,000 samples of microbial community structures and their sample description documents. From the training results, we extracted latent environmental topics that associate co-occurring microbes with co-occurring word sets among samples. Topics are the core elements of environmental mixtures, and visualizing samples by their topic composition clarifies the connections among various environments. Based on the model training results, we developed a web application, LEA (Latent Environment Allocation), which provides a way to evaluate the typicality and heterogeneity of microbial communities in newly obtained samples without restricting the environmental categories to be compared. Because topics link words and microbes, LEA also enables semantic search for samples related to a query among the more than 30,000 microbiome samples.
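The idea that topics link words to samples, and so allow semantic search, can be illustrated with a toy sketch. This is not the LEA implementation or the hierarchical Bayesian model described above: the topic names, words, probabilities, and the query-likelihood ranking used here are all assumptions chosen for illustration, and the microbial side of the model is omitted.

```python
# Toy illustration only: every name and number below is invented.
# Each topic has a distribution over words; each sample is a mixture of topics.

TOPIC_WORD = {
    "gut":   {"feces": 0.60, "human": 0.30, "soil": 0.05, "river": 0.05},
    "soil":  {"feces": 0.05, "human": 0.05, "soil": 0.70, "river": 0.20},
    "water": {"feces": 0.05, "human": 0.10, "soil": 0.15, "river": 0.70},
}

SAMPLE_TOPICS = {
    "sample_A": {"gut": 0.90, "soil": 0.05, "water": 0.05},
    "sample_B": {"gut": 0.10, "soil": 0.60, "water": 0.30},
    "sample_C": {"gut": 0.05, "soil": 0.15, "water": 0.80},
}


def word_probability(sample, word):
    """P(word | sample) = sum over topics of P(topic | sample) * P(word | topic)."""
    mixture = SAMPLE_TOPICS[sample]
    return sum(weight * TOPIC_WORD[topic].get(word, 0.0)
               for topic, weight in mixture.items())


def rank_samples(query_words):
    """Rank samples by the product of per-word probabilities, highest first."""
    scores = {}
    for sample in SAMPLE_TOPICS:
        score = 1.0
        for word in query_words:
            score *= word_probability(sample, word)
        scores[sample] = score
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    print(rank_samples(["river"]))
    print(rank_samples(["feces", "human"]))
```

In this sketch a sample never needs to contain the query word in its own description: it ranks highly whenever its topic mixture gives weight to topics in which the word is probable, which is the sense in which a topic-based search is semantic.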
