Rough clustering using generalized fuzzy clustering algorithm

In this paper, we present a rough k-means clustering algorithm based on minimizing a dissimilarity measure defined in terms of the squared Euclidean distances between data points and their closest cluster centers. We refer to this approach as the generalized rough fuzzy k-means (GRFKM) algorithm. The proposed method solves the divergence problem of existing approaches, in which the cluster centers may fail to converge to their final positions, and it reduces the number of user-defined parameters. The method is shown experimentally to converge. Compared with existing rough k-means clustering algorithms, it requires less computing time. Unlike existing approaches, its convergence does not depend on the chosen threshold value. Moreover, it yields better clustering results than RFKM on the handwritten digits, Landsat satellite, and synthetic data sets in terms of validity indices. On the handwritten digits data set, GRFKM can reduce the Xie-Beni index relative to MRKM and RFKM, where a lower Xie-Beni index indicates better clustering quality. The proposed method can be applied to real-life situations that require reasoning under uncertainty.
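To make the two quantities named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the GRFKM algorithm itself, whose update rules are not given here. The function names (`squared_distance`, `dissimilarity`, `xie_beni`), the toy data, and the fuzzifier default `m=2.0` are illustrative assumptions; the Xie-Beni formula follows the commonly cited definition (membership-weighted compactness divided by the number of points times the minimum squared separation between centers), which may differ in detail from the variant used in the paper.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dissimilarity(points, centers):
    """Sum over all points of the squared distance to the closest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni index (assumed standard form); lower means better clustering.

    memberships[i][k] is the membership of point k in cluster i.
    Requires at least two centers.
    """
    compactness = sum(
        (memberships[i][k] ** m) * squared_distance(points[k], centers[i])
        for i in range(len(centers))
        for k in range(len(points))
    )
    separation = min(
        squared_distance(centers[i], centers[j])
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return compactness / (len(points) * separation)


# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
centers = [(0.0, 0.5), (10.0, 0.5)]
# Crisp memberships used only for illustration.
memberships = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]

print(dissimilarity(points, centers))
print(xie_beni(points, centers, memberships))
```

In this toy case the clusters are compact and far apart, so the index is small; moving the two centers closer together or spreading the points out would raise it, which matches the reading that a lower value indicates better clustering quality.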
