Deterministic Initialization of the k-Means Algorithm using Hierarchical Clustering

K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to its gradient-descent nature, the algorithm is highly sensitive to the initial placement of the cluster centers. Numerous initialization methods have been proposed to address this problem. Many of them, however, have superlinear complexity in the number of data points, which makes them impractical for large data sets. Linear methods, on the other hand, are often random and/or order-sensitive, which renders their results unrepeatable. Recently, Su and Dy proposed two highly successful hierarchical initialization methods, Var-Part and PCA-Part, that are not only linear but also deterministic (nonrandom) and order-invariant. In this paper, we propose a discriminant-analysis-based approach that addresses a deficiency common to these two methods. Experiments on a large and diverse collection of data sets from the UCI Machine Learning Repository demonstrate that Var-Part and PCA-Part are highly competitive with one of the best random initialization methods to date, namely k-means++, and that the proposed approach significantly improves the performance of both hierarchical methods.
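To make the idea of a deterministic, hierarchical initialization concrete, the following is a minimal Python sketch of a Var-Part-style procedure. The abstract does not spell out the algorithm, so the details here are assumptions based on the commonly described form of the method: repeatedly pick the cluster with the largest within-cluster sum of squared errors, split it along its highest-variance coordinate axis at the mean of that coordinate, and use the centroids of the resulting k clusters as initial centers. The names `sse` and `var_part_init` are hypothetical.

```python
from statistics import fmean, pvariance


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    dims = range(len(points[0]))
    center = [fmean(p[d] for p in points) for d in dims]
    return sum((p[d] - center[d]) ** 2 for p in points for d in dims)


def var_part_init(points, k):
    """Return up to k initial centers by repeated variance-based splitting.

    Assumed form of the method; not taken verbatim from the paper.
    """
    clusters = [list(points)]
    while len(clusters) < k:
        # Choose the cluster with the largest within-cluster SSE.
        idx = max(range(len(clusters)), key=lambda i: sse(clusters[i]))
        target = clusters[idx]
        dims = range(len(target[0]))
        # Split along the axis of largest variance, at the mean of that axis.
        axis = max(dims, key=lambda d: pvariance([p[d] for p in target]))
        cut = fmean(p[axis] for p in target)
        left = [p for p in target if p[axis] <= cut]
        right = [p for p in target if p[axis] > cut]
        if not left or not right:
            break  # no further split is possible (all remaining points coincide)
        clusters[idx:idx + 1] = [left, right]
    # The centroid of each final cluster serves as an initial center.
    return [[fmean(p[d] for p in c) for d in range(len(c[0]))] for c in clusters]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 0), (10, 1)]
    print(var_part_init(data, 2))
```

No random numbers are involved, so repeated runs on the same data return the same centers. The abstract does not say which step the proposed discriminant-analysis-based approach changes, so the sketch covers only the baseline splitting scheme.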
