Effective Deterministic Initialization for $k$-Means-Like Methods via Local Density Peaks Searching

The $k$-means clustering algorithm is popular but has the following main drawbacks: 1) the number of clusters, $k$, must be provided by the user in advance; 2) with randomly selected initial centers, it easily converges to poor local minima; 3) it is sensitive to outliers; and 4) it can only handle well-separated hyperspherical clusters. In this paper, we propose a Local Density Peaks Searching (LDPS) initialization framework to address these issues. The LDPS framework has two basic components: the local density, which characterizes the density distribution of a data set, and the local distinctiveness index (LDI), which we introduce to characterize how distinctive a data point is compared with its neighbors. Based on these two components, we search for local density peaks, characterized by high local density and high LDI, to address drawbacks 1) and 2). Moreover, we detect outliers, characterized by low local density but high LDI, and exclude them before clustering begins. Finally, to fix the last drawback of $k$-means, we apply the LDPS initialization framework to $k$-medoids, a variant of $k$-means that chooses data samples as centers and admits similarity measures other than the Euclidean distance. Combining the LDPS initialization framework with $k$-means and $k$-medoids yields two novel clustering methods, called LDPS-means and LDPS-medoids, respectively. Experiments on synthetic data sets verify the effectiveness of the proposed methods, especially when the ground-truth number of clusters $k$ is large. Furthermore, experiments on several real-world data sets (Handwritten Pendigits, COIL-20, COIL-100, and the Olivetti Face Database) show that our methods outperform analogous approaches on both estimating $k$ and unsupervised object categorization.
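The abstract does not reproduce the paper's formulas, so the following is only a minimal sketch of the density-peaks idea it describes. It assumes a Gaussian-kernel local density, takes the LDI to be the distance to the nearest higher-density point, and ranks candidate centers by the product of the two; the cutoff `dc`, the score, and the outlier thresholds are all illustrative choices, not the paper's definitions.

```python
import numpy as np
from scipy.spatial.distance import cdist

def ldps_init(X, k, dc=None, density_thresh=None):
    """Hypothetical sketch of density-peaks-style initialization.

    Assumptions (not the paper's exact formulas): local density is a
    Gaussian kernel sum; the LDI is approximated by the distance to
    the nearest point of higher density.
    """
    D = cdist(X, X)                         # pairwise Euclidean distances
    if dc is None:                          # heuristic cutoff: 2% quantile
        dc = np.quantile(D[D > 0], 0.02)
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0  # local density (minus self)

    # LDI proxy: distance to the nearest higher-density point; the
    # global density peak gets the largest distance in its row.
    n = len(X)
    ldi = np.empty(n)
    order = np.argsort(-rho)                # indices by decreasing density
    ldi[order[0]] = D[order[0]].max()
    for i, idx in enumerate(order[1:], start=1):
        ldi[idx] = D[idx, order[:i]].min()

    # Density peaks: high local density AND high LDI.
    centers = np.argsort(-(rho * ldi))[:k]

    # Outliers: low local density but high LDI (isolated points).
    if density_thresh is None:
        density_thresh = np.quantile(rho, 0.05)
    outliers = np.where((rho < density_thresh) & (ldi > np.median(ldi)))[0]
    return centers, outliers
```

Under these assumptions, `X[centers]` could seed a standard $k$-means run (e.g. scikit-learn's `KMeans(n_clusters=k, init=X[centers], n_init=1)`) after removing the rows in `outliers`, mirroring the deterministic initialize-then-cluster pipeline the abstract describes.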
