Structure ensemble based on fuzzy c-means

Clustering ensembles are an important technique in machine learning and contribute to applications in many areas. Most clustering ensemble methods focus on predicting cluster labels rather than on the structures of the clusters. However, learned cluster structures carry enough information to reconstruct the dataset and can replace redundant predicted cluster labels. In this paper, we introduce fuzzy theory into the structure ensemble framework and propose a novel double fuzzy c-means structure ensemble framework, named FCM2SE. FCM2SE uses cluster structure information instead of predicted labels to obtain a representative ensemble structure. We also design two novel labeling criteria to assign samples to their corresponding clusters. Empirical results on synthetic datasets and UCI machine learning datasets demonstrate the effectiveness of the proposed method.
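To make the idea of combining cluster structures rather than labels concrete, the following is a minimal sketch and not the FCM2SE algorithm itself, whose details and labeling criteria are not given in this abstract. It assumes that cluster "structure" can be approximated by fuzzy c-means centers, that ensemble members are built from bootstrap resamples, and that samples are labeled by their nearest ensemble center. The function names `fuzzy_c_means` and `structure_ensemble` and all parameters are hypothetical.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial membership matrix whose rows sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are membership-weighted means (weights u^m).
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distances from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def structure_ensemble(X, c, n_members=10, seed=0):
    """Illustrative two-level ("double") use of fuzzy c-means.

    ASSUMPTIONS (not taken from the paper): members come from bootstrap
    resamples, their centers are pooled and clustered again, and each
    sample is assigned to its nearest ensemble center.
    """
    rng = np.random.default_rng(seed)
    pooled = []
    for b in range(n_members):
        # First level: fuzzy c-means on one bootstrap resample.
        idx = rng.choice(len(X), size=len(X), replace=True)
        centers, _ = fuzzy_c_means(X[idx], c, seed=seed + b)
        pooled.append(centers)
    pooled = np.vstack(pooled)
    # Second level: fuzzy c-means on the pooled member centers.
    ens_centers, _ = fuzzy_c_means(pooled, c, seed=seed)
    # Placeholder labeling rule: nearest ensemble center.
    d = np.linalg.norm(X[:, None, :] - ens_centers[None, :, :], axis=2)
    return ens_centers, d.argmin(axis=1)
```

Calling `structure_ensemble(X, c=2)` on an array `X` of shape `(n_samples, n_features)` returns the ensemble centers and one integer label per sample. The nearest-center rule is only a stand-in for the two labeling criteria mentioned above.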
