A Modified Fuzzy K-means Clustering using Expectation Maximization

K-means is a popular unsupervised clustering algorithm, but it requires a large initial set of clusters (a large value of K) to start the clustering and does not guarantee convergence. Numerous modifications of K-means have been proposed to improve its performance. Expectation Maximization (EM) is a statistical technique for maximum likelihood estimation using mixture models; it searches for a local maximum of the likelihood and generally converges well. The proposed algorithm combines the two methods to generate optimum clusters that do not require a large value of K, in which each cluster attains a more natural shape, and for which convergence is guaranteed. The paper compares the new method with Fuzzy K-means on the benchmark Iris data set.
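To make the two ingredients concrete, the following is a minimal sketch of standard EM for a Gaussian mixture whose means are initialized by a few hard K-means iterations. It is not the modified algorithm proposed in the paper, whose details are not given in the abstract. The function names `kmeans_init` and `em_gmm`, the diagonal-covariance assumption, and the small stabilizing constant `eps` are illustrative choices made here, not taken from the source.

```python
import numpy as np


def kmeans_init(X, k, n_iter=10, seed=0):
    """Run a few hard K-means iterations to obtain starting centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers


def em_gmm(X, k, n_iter=50, seed=0, eps=1e-6):
    """EM for a diagonal-covariance Gaussian mixture (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    means = kmeans_init(X, k, seed=seed)
    variances = np.tile(X.var(axis=0) + eps, (k, 1))  # shape (k, d)
    weights = np.full(k, 1.0 / k)
    log_liks = []
    for _ in range(n_iter):
        # E-step: soft memberships (responsibilities), computed in log space.
        diff2 = (X[:, None, :] - means[None, :, :]) ** 2  # shape (n, k, d)
        log_p = (np.log(weights)
                 - 0.5 * np.log(2 * np.pi * variances).sum(axis=1)
                 - 0.5 * (diff2 / variances).sum(axis=2))  # shape (n, k)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        log_liks.append(log_norm.sum())
        # M-step: re-estimate weights, means, and variances from memberships.
        nk = resp.sum(axis=0) + eps  # eps guards against an empty component
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        diff2 = (X[:, None, :] - means[None, :, :]) ** 2
        variances = (resp[:, :, None] * diff2).sum(axis=0) / nk[:, None] + eps
    return weights, means, variances, resp, log_liks
```

Each row of `resp` holds one point's soft membership across the k components, which plays a role comparable to the membership degrees in Fuzzy K-means, while `log_liks` records the log-likelihood that EM drives toward a local maximum.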
