Automatic Determination of the Number of Fuzzy Clusters Using Simulated Annealing with Variable Representation

In this article, a simulated annealing based approach is proposed for automatically clustering a data set into a number of fuzzy partitions. This contrasts with the widely used fuzzy C-Means (FCM) algorithm, which requires a priori knowledge of the number of clusters. The proposed approach uses a real-coded variable representation of the cluster centers, encoded as a state of the simulated annealing, and optimizes the Xie-Beni cluster validity index. To determine the number of clusters automatically, the perturbation operator is defined so that it can alter the cluster centers and can also increase or decrease the number of encoded cluster centers. The operators are designed using domain-specific information. The effectiveness of the proposed technique in determining the appropriate number of clusters is demonstrated on both artificial and real-life data sets.
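The idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the state is a variable-length array of real-valued centers, the cost is the Xie-Beni index computed from FCM memberships for those centers, and a perturbation either shifts, adds, or deletes a center. The move set, the geometric cooling schedule, and all names and parameters (`perturb`, `anneal`, `c_min`, `c_max`, `scale`, `alpha`, `iters`) are assumptions made for illustration; the domain-specific operators mentioned in the abstract are not reproduced here.

```python
import numpy as np

def fcm_memberships(X, V, m=2.0, eps=1e-12):
    """FCM membership matrix U (c x n) for fixed centers V (c x d)."""
    # Squared Euclidean distance from every center to every point.
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)  # each column sums to 1

def xie_beni(X, V):
    """Xie-Beni index: fuzzy compactness over (n * minimum center separation)."""
    U = fcm_memberships(X, V)
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    compactness = (U ** 2 * d2).sum()
    sep = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)  # ignore the distance of a center to itself
    if sep.min() == 0.0:           # coincident centers: treat as worst case
        return np.inf
    return compactness / (X.shape[0] * sep.min())

def perturb(V, X, rng, c_min=2, c_max=10, scale=0.1):
    """Assumed move set: shift one center, add a center, or delete a center."""
    move = rng.choice(["shift", "add", "delete"])
    V = V.copy()
    if move == "add" and len(V) < c_max:
        V = np.vstack([V, X[rng.integers(len(X))]])     # new center at a data point
    elif move == "delete" and len(V) > c_min:
        V = np.delete(V, rng.integers(len(V)), axis=0)  # drop a random center
    else:
        i = rng.integers(len(V))
        V[i] += rng.normal(0.0, scale * X.std(axis=0))  # small Gaussian shift
    return V

def anneal(X, c_init=2, t_max=1.0, t_min=1e-3, alpha=0.9, iters=50, seed=0):
    """Simulated annealing over variable-length center sets (assumed schedule)."""
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), size=c_init, replace=False)].copy()
    cost = xie_beni(X, V)
    best_V, best_cost = V, cost
    T = t_max
    while T > t_min:
        for _ in range(iters):
            cand = perturb(V, X, rng)
            cand_cost = xie_beni(X, cand)
            # Metropolis rule: always accept improvements, sometimes accept worse.
            if cand_cost < cost or rng.random() < np.exp(-(cand_cost - cost) / T):
                V, cost = cand, cand_cost
                if cost < best_cost:
                    best_V, best_cost = V, cost
        T *= alpha  # geometric cooling
    return best_V, best_cost
```

Calling `anneal(X)` on an `(n, d)` array returns the best center set found and its Xie-Beni value; the number of rows in the returned array is the estimated number of clusters under these assumptions.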
