An efficient k′-means clustering algorithm

This paper introduces the k′-means algorithm, which performs correct clustering without requiring the exact number of clusters to be assigned in advance. This is achieved by minimizing a proposed cost function that extends the mean-square-error cost function of k-means. The algorithm consists of two separate steps. The first is a pre-processing procedure that performs an initial clustering and assigns at least one seed point to each cluster. In the second step, the seed points are adjusted to minimize the cost function. The algorithm automatically penalizes rival seed points, reducing their chances of winning in subsequent iterations. When the cost function reaches a global minimum, the correct number of clusters is determined and the remaining seed points are located near the centres of the actual clusters. The simulated experiments described in this paper confirm the good performance of the proposed algorithm. © 2008 Elsevier B.V. All rights reserved.
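The abstract does not state the cost function, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a hypothetical penalized assignment rule, squared distance minus `penalty * log(share)`, where `share` is the fraction of points a seed point won in the previous iteration. The function name `penalized_kmeans`, the `penalty` parameter, and the rule for dropping seed points that win no points are all illustrative assumptions. The initial seed points are passed in directly, standing in for the pre-processing step.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def penalized_kmeans(points, seeds, penalty=1.0, max_iter=100):
    """Illustrative k-means variant that may end with fewer centres than seeds.

    ASSUMPTION: the assignment metric is
        squared_distance - penalty * log(share_of_points_won_last_round),
    which is a stand-in for the paper's cost function, not a copy of it.
    """
    centres = [tuple(float(v) for v in s) for s in seeds]
    shares = [1.0 / len(centres)] * len(centres)
    for _ in range(max_iter):
        # Assign each point to the seed point with the lowest penalized metric.
        groups = [[] for _ in centres]
        for p in points:
            winner = min(
                range(len(centres)),
                key=lambda i: sq_dist(p, centres[i]) - penalty * math.log(shares[i]),
            )
            groups[winner].append(p)
        # ASSUMPTION: a seed point that wins no points is removed.
        kept = [g for g in groups if g]
        new_centres = [tuple(sum(c) / len(g) for c in zip(*g)) for g in kept]
        new_shares = [len(g) / len(points) for g in kept]
        if new_centres == centres:
            break
        centres, shares = new_centres, new_shares
    return centres


# Two well-separated groups of four points, started with three seed points.
points = [(0, 0), (0, 2), (2, 0), (2, 2), (10, 10), (10, 12), (12, 10), (12, 12)]
seeds = [(0, 0), (3, 3), (11, 11)]
print(penalized_kmeans(points, seeds, penalty=4.0))
```

With `penalty=0.0` the rule reduces to ordinary k-means assignment and all three seed points survive on this toy data. With a larger penalty, the seed point that won only one point is penalized until it wins none and is dropped.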
