Microbial community pattern detection in human body habitats via ensemble clustering framework

Background: The human body is a host in which microbial species evolve, function, and continue to evolve. Elucidating how microbial communities respond to human body habitats is a fundamental and critical task, because establishing baselines of the human microbiome is essential to understanding its role in human health and disease. Recent studies of the healthy human microbiome focus on particular body habitats, assuming that microbiomes develop similar structural patterns to perform similar ecosystem functions under the same environmental conditions. However, current studies usually overlook the complex and interconnected landscape of the human microbiome, and they restrict analysis to particular body habitats using learning models built on a single specific criterion. As a result, these methods cannot effectively capture the underlying real-world microbial patterns.

Results: To obtain a comprehensive view, we propose a novel ensemble clustering framework to mine the structure of microbial community patterns in large-scale metagenomic data. Specifically, we first build a microbial similarity network by integrating 1920 metagenomic samples from three body habitats of healthy adults. A novel symmetric Nonnegative Matrix Factorization (NMF) based ensemble model is then proposed and applied to the network to detect clustering patterns. Extensive experiments are conducted to evaluate the effectiveness of our model in deriving microbial communities with respect to body habitat and host gender. From the clustering results, we observe that each body habitat exhibits a strongly bound but non-unique microbial structural pattern. Meanwhile, the human microbiome shows different degrees of structural variation across body habitats and host genders.

Conclusions: In summary, our ensemble clustering framework can efficiently explore integrated clustering results to accurately identify microbial communities and provide a comprehensive view of a set of microbial communities. The clustering results indicate that the structure of the human microbiome varies systematically across body habitats and host genders. Such trends depict an integrated biography of microbial communities, which offers new insight towards uncovering pathogenic models of the human microbiome.
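
The two steps named in the abstract (a sample similarity network, then symmetric NMF based ensemble clustering) can be illustrated with a minimal sketch. This is not the authors' model: the damped multiplicative update, the co-association (consensus) matrix, and the names `sym_nmf` and `ensemble_sym_nmf` are assumptions chosen for illustration, and the toy similarity matrix stands in for a network built from real metagenomic samples.

```python
import numpy as np


def sym_nmf(A, k, n_iter=300, beta=0.5, seed=0, eps=1e-10):
    """Approximate a symmetric nonnegative matrix A by H @ H.T with H >= 0.

    Uses a damped multiplicative update (an assumed, commonly used scheme,
    not necessarily the update rule of the paper).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps  # eps guards against division by zero
        H *= (1.0 - beta) + beta * numer / denom
    return H


def ensemble_sym_nmf(A, k, n_runs=10, seed=0):
    """Combine several randomly initialised symmetric NMF runs.

    Each run yields a hard clustering (argmax over the columns of H).
    The consensus matrix records how often two samples share a cluster,
    and a final symmetric NMF on that matrix gives the ensemble labels.
    """
    n = A.shape[0]
    consensus = np.zeros((n, n))
    for r in range(n_runs):
        labels = sym_nmf(A, k, seed=seed + r).argmax(axis=1)
        consensus += labels[:, None] == labels[None, :]
    consensus /= n_runs
    final_labels = sym_nmf(consensus, k, seed=seed).argmax(axis=1)
    return final_labels, consensus


if __name__ == "__main__":
    # Hypothetical toy network: 3 "habitats" with 4 samples each, high
    # similarity within a habitat and low similarity between habitats.
    A = np.full((12, 12), 0.1)
    for start in (0, 4, 8):
        A[start:start + 4, start:start + 4] = 0.9
    np.fill_diagonal(A, 1.0)

    labels, consensus = ensemble_sym_nmf(A, k=3)
    print("ensemble labels:", labels)
```

Factorizing the consensus matrix rather than any single run is one common way to reduce sensitivity to random initialisation; the number of clusters `k` and the number of runs are free parameters in this sketch.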
