Visual stability analysis for model selection in graded possibilistic clustering

Abstract. Compared with standard clustering, fuzzy clustering provides a more flexible and powerful data representation. Most fuzzy methods require some parameters to be set. This is the case for the Graded Possibilistic c-Means clustering method, which has two parameters in addition to the number of centroids. However, no well-established criterion is available for this model selection task. Building on our own previous work on fuzzy clustering similarity indexes, we introduce a technique that evaluates the stability of clusterings by means of the fuzzy Jaccard index, and we use this procedure to select the most suitable parameter values. The experiments indicate that the procedure is effective for fuzzy and possibilistic clustering.
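
As an illustration of the idea summarized above, the following is a minimal sketch of a stability score built on a fuzzy Jaccard index. It is not taken from the paper: the co-membership rule (maximum over clusters of the minimum of two memberships), the pairwise min/max form of the index, and the function names `co_membership`, `fuzzy_jaccard`, and `stability` are all assumptions chosen for illustration. Each clustering is assumed to be given as a list of rows, one per cluster, holding the membership of every point in that cluster.

```python
from itertools import combinations


def co_membership(u):
    """Degree to which each pair of points shares a cluster.

    u[c][i] is the membership of point i in cluster c (assumed layout).
    Assumed rule: max over clusters of min(u[c][i], u[c][j]).
    """
    n = len(u[0])
    return {
        (i, j): max(min(row[i], row[j]) for row in u)
        for i, j in combinations(range(n), 2)
    }


def fuzzy_jaccard(u_a, u_b):
    """Fuzzy Jaccard index between two clusterings of the same points.

    Uses sum of pairwise minima over sum of pairwise maxima of the
    co-membership degrees. For crisp partitions this reduces to the
    classic pair-counting Jaccard index.
    """
    e_a = co_membership(u_a)
    e_b = co_membership(u_b)
    num = sum(min(e_a[p], e_b[p]) for p in e_a)
    den = sum(max(e_a[p], e_b[p]) for p in e_a)
    # If no pair is co-clustered in either clustering, treat them as identical.
    return 1.0 if den == 0 else num / den


def stability(runs):
    """Mean fuzzy Jaccard index over all pairs of clusterings in `runs`."""
    scores = [fuzzy_jaccard(a, b) for a, b in combinations(runs, 2)]
    return sum(scores) / len(scores)
```

Under these assumptions, one would run the clustering method several times for each candidate parameter setting (for example with different initializations or data perturbations), compute `stability` over the resulting membership matrices, and prefer the settings with the highest scores. Because the index works on pairs of points rather than on cluster labels, it does not depend on the order in which clusters are listed.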
