Joint Cluster Based Co-clustering for Clustering Ensembles

This paper introduces a new method for the clustering ensemble problem, that is, combining multiple clusterings of a common dataset into a single, better final clustering. The ensemble is reduced to a graph whose vertices simultaneously model the original clusters in the ensemble and the joint clusters derived from them. Only edges linking vertices of different types are considered, so the graph is bipartite. The resulting graph can be partitioned efficiently to produce the final clustering. Finally, the proposed method is evaluated against two commonly used graph formulations.
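The construction described above can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several details are assumptions, since the abstract does not specify them: a joint cluster is taken to be a set of objects that receive the same label in every clustering; an edge between a joint cluster and an original cluster that contains it is weighted by the joint cluster's size; and the graph is split in two by a standard spectral bipartitioning of the bipartite weight matrix, one possible choice of partitioner. The function name `joint_cluster_bipartition` is hypothetical.

```python
import numpy as np

def joint_cluster_bipartition(ensemble):
    """Two-way consensus from a list of label vectors (illustrative sketch only)."""
    labels = np.asarray(ensemble).T  # shape: (n_objects, n_clusterings)

    # Assumed definition: a joint cluster groups objects whose labels agree
    # in every clustering of the ensemble.
    joint, joint_of_object, sizes = np.unique(
        labels, axis=0, return_inverse=True, return_counts=True
    )
    joint_of_object = joint_of_object.reshape(-1)
    n_joint, n_clusterings = joint.shape

    # One column per original cluster, identified by (clustering index, label).
    columns = {}
    for q in range(n_clusterings):
        for lab in np.unique(labels[:, q]):
            columns[(q, lab)] = len(columns)

    # Bipartite weight matrix: rows are joint clusters, columns are original
    # clusters. Assumed weighting: the size of the joint cluster.
    B = np.zeros((n_joint, len(columns)))
    for j in range(n_joint):
        for q in range(n_clusterings):
            B[j, columns[(q, joint[j, q])]] = sizes[j]

    # Spectral bipartitioning: normalize by vertex degrees, then split the
    # joint clusters by the sign of the rescaled second left singular vector.
    # Requires at least two joint clusters.
    d_rows = B.sum(axis=1)
    d_cols = B.sum(axis=0)
    Bn = B / np.sqrt(np.outer(d_rows, d_cols))
    U, _, _ = np.linalg.svd(Bn, full_matrices=False)
    side = (U[:, 1] / np.sqrt(d_rows) >= 0).astype(int)

    # Every object inherits the side of its joint cluster.
    return side[joint_of_object]

# Toy ensemble: three clusterings of six objects.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
consensus = joint_cluster_bipartition(ensemble)
print(consensus)
```

On this toy ensemble the sketch places objects 0 to 2 in one group and objects 3 to 5 in the other; which group is numbered 0 depends on the sign convention of the SVD routine. Working with joint clusters rather than individual objects keeps the graph small, because the number of vertices depends on the number of distinct label combinations rather than on the dataset size.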
