A Comparative Study of Fuzzy C-Means Algorithm and Entropy-Based Fuzzy Clustering Algorithms

Fuzzy clustering is useful for mining complex, multi-dimensional data sets whose members have partial or fuzzy relations. Among the various techniques developed, the fuzzy C-means (FCM) algorithm is the most popular one; in it, each data point has a partial membership with each of a pre-defined number of cluster centers. In FCM, the cluster centers are virtual, that is, they are chosen at random and thus need not belong to the data set. The cluster centers and the membership values of the data points with them are updated iteratively. The entropy-based fuzzy clustering (EFC) algorithm, on the other hand, works based on a similarity-threshold value. Contrary to FCM, the cluster centers in EFC are real, that is, they are chosen from among the data points. In the present paper, the performances of these algorithms are compared on four data sets, namely IRIS, WINES, OLITOS and psychosis (the last collected with the help of forty doctors), in terms of the quality of the clusters obtained (that is, discrepancy factor, compactness and distinctness) and the computational time. Moreover, the best set of clusters is mapped into 2-D for visualization using a self-organizing map (SOM).
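The iterative FCM update described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: it assumes the standard FCM formulation with Euclidean distance, a fuzzifier m = 2, random initial memberships (which yield random virtual centers), and a stopping rule based on the largest change in membership. The function name `fcm` and all of its parameters are hypothetical.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Sketch of standard fuzzy C-means (assumed formulation, not the paper's code).

    points: list of equal-length numeric tuples; c: number of clusters;
    m: fuzzifier (> 1). Returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ik ** m.
        # The result is a "virtual" center, generally not a data point.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: u_ik proportional to d_ik ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dists = [
                sum((points[i][d] - centers[k][d]) ** 2 for d in range(dim)) ** 0.5
                for k in range(c)
            ]
            if min(dists) == 0.0:
                # Point coincides with a center: give it full membership there.
                new = [1.0 if dk == 0.0 else 0.0 for dk in dists]
            else:
                new = [dk ** (-2.0 / (m - 1.0)) for dk in dists]
            total = sum(new)
            new = [v / total for v in new]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        if shift < tol:  # memberships have stopped changing
            break

    return centers, u


# Toy usage on two well-separated groups of 2-D points (made-up data).
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
       (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers, toy_u = fcm(toy, c=2)
toy_labels = [row.index(max(row)) for row in toy_u]
print(toy_centers, toy_labels)
```

Each row of the returned membership matrix sums to one, and a hard label can be read off as the cluster with the largest membership. EFC differs in that its centers would be selected from the data points themselves rather than computed as weighted means.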
