A multi-clustering fusion scheme for data partitioning

A multi-clustering fusion method is presented that combines several runs of a clustering algorithm into a common partition. More specifically, the results of several independent runs of the same clustering algorithm are combined to obtain a distinct partition of the data that is not affected by initialization and overcomes the instabilities of clustering methods. Subsequently, a fusion procedure is applied to the clusters generated during the previous phase to determine the optimal number of clusters in the data set according to some predefined criteria.
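To make the idea of combining runs concrete, here is a minimal sketch of one common way to do it: a co-association (pairwise co-occurrence) matrix followed by thresholded linking. This is a swapped-in illustration, not necessarily the scheme the paper uses. The function names `co_association` and `fuse`, the 0.5 threshold, and the hand-written label vectors (standing in for runs of one algorithm from different initializations) are all assumptions made for the example.

```python
def co_association(runs):
    """Return, for every pair of points, the fraction of runs that
    place both points in the same cluster.

    `runs` is a list of label vectors, one per clustering run. Only
    "same cluster or not" is used, so label names need not agree
    across runs.
    """
    n = len(runs[0])
    counts = [[0] * n for _ in range(n)]
    for labels in runs:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(runs) for c in row] for row in counts]


def fuse(co, threshold=0.5):
    """Link points whose co-association exceeds `threshold` and return
    the connected components as consensus labels.

    The threshold is an assumed stand-in for the paper's unspecified
    fusion criteria.
    """
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first walk over sufficiently strong links
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical runs over four points. The second run uses swapped
# label names; the third disagrees about point 1.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]
co = co_association(runs)
consensus = fuse(co)
print(consensus)  # [0, 0, 1, 1]
```

In this sketch the number of consensus groups falls out of the threshold: raising it to 0.9 keeps only the pair that co-occurs in every run together, so points 0 and 1 become singletons. A criterion-driven fusion step, as described in the abstract, would replace that fixed threshold.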
