Agglomerative Fuzzy K-Means Clustering Algorithm with Selection of Number of Clusters

In this paper, we present an agglomerative fuzzy K-means clustering algorithm for numerical data. It extends the standard fuzzy K-means algorithm by adding a penalty term to the objective function, which makes the clustering process insensitive to the initial cluster centers. The new algorithm produces more consistent clustering results from different sets of initial cluster centers. Combined with cluster validation techniques, it can also determine the number of clusters in a data set, which is a well-known problem in K-means clustering. Experimental results on synthetic data sets (2 to 5 dimensions, 500 to 5000 objects and 3 to 7 clusters), the BIRCH two-dimensional data set of 20000 objects and 100 clusters, and the WINE data set of 178 objects, 17 dimensions and 3 clusters from UCI demonstrate the effectiveness of the new algorithm in producing consistent clustering results and determining the correct number of clusters in different data sets, some of which contain overlapping inherent clusters.
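The abstract does not state the form of the penalty term, so the following is only a minimal sketch under an assumed choice: an entropy penalty $\lambda \sum_{i,j} u_{ij} \log u_{ij}$ added to the usual sum of membership-weighted squared distances. Under that assumption the membership update becomes a softmax of scaled negative distances. The names `entropy_fuzzy_kmeans`, `merge_close_centers`, `lam` and `tol` are hypothetical and chosen for illustration, and the merging rule is a simple stand-in for the agglomeration step rather than the procedure used in the paper.

```python
import math


def squared_distance(x, c):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def entropy_fuzzy_kmeans(points, centers, lam=1.0, max_iter=100, tol=1e-9):
    """Fuzzy K-means with an assumed entropy penalty (illustrative only).

    Alternately minimizes
        sum_ij u_ij * d_ij + lam * sum_ij u_ij * log(u_ij)
    subject to sum_j u_ij = 1, where d_ij is the squared distance
    from point i to center j.
    """
    centers = [list(c) for c in centers]
    dim = len(points[0])
    memberships = []
    for _ in range(max_iter):
        # Membership step: softmax of -d_ij / lam over the centers.
        memberships = []
        for x in points:
            d = [squared_distance(x, c) for c in centers]
            d_min = min(d)  # shift exponents for numerical stability
            w = [math.exp(-(dj - d_min) / lam) for dj in d]
            total = sum(w)
            memberships.append([wj / total for wj in w])
        # Center step: membership-weighted mean of all points.
        new_centers = []
        for j, old in enumerate(centers):
            mass = sum(u[j] for u in memberships)
            if mass <= 0.0:  # weights underflowed; keep the old center
                new_centers.append(old)
                continue
            new_centers.append([
                sum(u[j] * x[k] for u, x in zip(memberships, points)) / mass
                for k in range(dim)
            ])
        shift = max(squared_distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # Memberships are those computed in the last membership step.
    return centers, memberships


def merge_close_centers(centers, tol=1e-3):
    """Hypothetical agglomeration step: drop centers within tol of a kept one."""
    kept = []
    for c in centers:
        if all(squared_distance(c, k) > tol ** 2 for k in kept):
            kept.append(list(c))
    return kept


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centers, memberships = entropy_fuzzy_kmeans(points, [(0, 0), (11, 11)], lam=1.0)
```

In this sketch a larger `lam` flattens the memberships and pulls the centers toward each other, so a run started with more centers than clusters can be followed by `merge_close_centers` to collapse the centers that end up coinciding. The number of distinct centers left would still need to be checked with a validation index, as the abstract indicates.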
