Likelihood Based Fuzzy Clustering for Data Sets of Mixed Features

A novel clustering algorithm is presented for data sets of mixed features: numerical, ordinal and nominal. The algorithm uses fuzzy clustering to reduce the negative effect of noise, and uses an iterative partitional scheme founded on an optimization function to reduce time complexity. The optimization function uses the likelihood of each individual feature as the criterion for the similarity, or likeliness, between patterns and clusters, unlike the distance-based fuzzy c-means clustering algorithm or the EM clustering algorithm. Hence the algorithm can quickly find fuzzy clusters whose distributions differ at the level of individual features. Simulations show the algorithm to be quite efficient.
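
The abstract does not state the optimization function itself, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a Gaussian likelihood for each numerical feature, smoothed category frequencies for each nominal feature, and a membership update that normalizes tempered likelihoods using a fuzzifier `m`. Ordinal features are not modeled separately here and would have to be passed as nominal codes. The function name `likelihood_fuzzy_cluster` and all of its parameters are hypothetical.

```python
import numpy as np

def likelihood_fuzzy_cluster(x_num, x_cat, n_clusters, m=2.0, n_iter=50, seed=0):
    """Illustrative likelihood-based fuzzy clustering for mixed features.

    x_num: (n, p) float array of numerical features.
    x_cat: (n, q) int array of nominal features coded 0..L-1.
    m:     fuzzifier, must be greater than 1.
    Returns an (n, n_clusters) membership matrix whose rows sum to 1.
    """
    rng = np.random.default_rng(seed)
    n = x_num.shape[0]
    n_levels = x_cat.max(axis=0) + 1

    # Random initial memberships, each row normalized to sum to 1.
    u = rng.random((n, n_clusters))
    u /= u.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        w = u ** m                      # fuzzified weights, as in fuzzy c-means
        w_sum = w.sum(axis=0)           # total weight per cluster, shape (k,)
        log_lik = np.zeros((n, n_clusters))

        # Numerical features: one Gaussian per cluster and feature (assumption).
        mean = (w.T @ x_num) / w_sum[:, None]               # (k, p)
        diff = x_num[:, None, :] - mean[None, :, :]         # (n, k, p)
        var = (w[:, :, None] * diff ** 2).sum(axis=0) / w_sum[:, None] + 1e-6
        log_lik += (-0.5 * (np.log(2 * np.pi * var)[None]
                            + diff ** 2 / var[None])).sum(axis=2)

        # Nominal features: weighted category frequencies with add-one smoothing.
        for j, levels in enumerate(n_levels):
            onehot = np.eye(levels)[x_cat[:, j]]            # (n, L)
            freq = (w.T @ onehot + 1.0) / (w_sum[:, None] + levels)  # (k, L)
            log_lik += np.log(freq)[:, x_cat[:, j]].T       # (n, k)

        # Membership update: tempered, normalized likelihoods (assumed rule).
        scaled = log_lik / (m - 1.0)
        scaled -= scaled.max(axis=1, keepdims=True)         # numerical stability
        u = np.exp(scaled)
        u /= u.sum(axis=1, keepdims=True)
    return u
```

Because each feature contributes its own log-likelihood term, clusters can differ in spread or category profile feature by feature, which is the property the abstract emphasizes. Each pass costs time linear in the number of patterns, clusters and features.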