Nonparametric Bayesian Clustering Ensembles

Forming consensus clusters from multiple input clusterings can improve accuracy and robustness. Current clustering ensemble methods require the number of consensus clusters to be specified in advance, and a poor choice can lead to underfitting or overfitting. This paper proposes a nonparametric Bayesian clustering ensemble (NBCE) method, which can discover the number of clusters in the consensus clustering. Three inference methods are considered: collapsed Gibbs sampling, variational Bayesian inference, and collapsed variational Bayesian inference. A comparison of NBCE with several other algorithms demonstrates its versatility and superior stability.
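
To illustrate how a nonparametric Bayesian prior can leave the number of clusters open, the following minimal sketch samples a partition from a Chinese restaurant process, a standard construction associated with the Dirichlet process. It is not the NBCE model or any of its inference procedures; the abstract does not specify the prior, so the choice of process, the function name `sample_crp`, and the parameters `alpha` and `seed` are all assumptions made for illustration.

```python
import random


def sample_crp(n_items, alpha, seed=0):
    """Draw one partition of n_items from a Chinese restaurant process.

    Illustrative only: the prior and all names here are assumptions,
    not taken from the NBCE paper.
    """
    rng = random.Random(seed)
    assignments = []  # cluster index assigned to each item, in order
    counts = []       # current size of each cluster
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1    # join an existing cluster
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    for alpha in (0.5, 2.0, 10.0):
        _, counts = sample_crp(200, alpha, seed=1)
        print(f"alpha={alpha}: {len(counts)} clusters")
```

The number of clusters is not an input here: it emerges from the sampled partition, and larger values of `alpha` tend to produce more clusters.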
