A clustering method combining differential evolution with the K-means algorithm

The present paper considers the problem of partitioning a dataset into a known number of clusters using the sum of squared errors criterion (SSE). A new clustering method, called DE-KM, which combines differential evolution algorithm (DE) with the well known K-means procedure is described. In the method, the K-means algorithm is used to fine-tune each candidate solution obtained by mutation and crossover operators of DE. Additionally, a reordering procedure which allows the evolutionary algorithm to tackle the redundant representation problem is proposed. The performance of the DE-KM clustering method is compared to the performance of differential evolution, global K-means method, genetic K-means algorithm and two variants of the K-means algorithm. The experimental results show that if the number of clusters K is sufficiently large, DE-KM obtains solutions with lower SSE values than the other five algorithms.

[1]  Anil K. Jain,et al.  Data clustering: a review , 1999, CSUR.

[2]  Pierre Hansen,et al.  J-MEANS: a new local search heuristic for minimum sum of squares clustering , 1999, Pattern Recognit..

[3]  C. A. Murthy,et al.  In search of optimal clusters using genetic algorithms , 1996, Pattern Recognit. Lett..

[4]  Ujjwal Maulik,et al.  Genetic algorithm-based clustering technique , 2000, Pattern Recognit..

[5]  Pierre Hansen,et al.  NP-hardness of Euclidean sum-of-squares clustering , 2008, Machine Learning.

[6]  Shokri Z. Selim,et al.  K-Means-Type Algorithms: A Generalized Convergence Theorem and Characterization of Local Optimality , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  James C. Bezdek,et al.  Clustering with a genetically optimized approach , 1999, IEEE Trans. Evol. Comput..

[8]  Michael R. Anderberg,et al.  Cluster Analysis for Applications , 1973 .

[9]  Shokri Z. Selim,et al.  A simulated annealing algorithm for the clustering problem , 1991, Pattern Recognit..

[10]  Andries Petrus Engelbrecht,et al.  Differential evolution methods for unsupervised image classification , 2005, 2005 IEEE Congress on Evolutionary Computation.

[11]  Janez Brest,et al.  Self-Adapting Control Parameters in Differential Evolution: A Comparative Study on Numerical Benchmark Problems , 2006, IEEE Transactions on Evolutionary Computation.

[12]  Nasser M. Nasrabadi,et al.  Image coding using vector quantization: a review , 1988, IEEE Trans. Commun..

[13]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[14]  David E. Goldberg,et al.  Optimizing Global-Local Search Hybrids , 1999, GECCO.

[15]  Alex Alves Freitas,et al.  A Survey of Evolutionary Algorithms for Clustering , 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[16]  Ali S. Hadi,et al.  Finding Groups in Data: An Introduction to Chster Analysis , 1991 .

[17]  Joseph C. Culberson,et al.  On the Futility of Blind Search: An Algorithmic View of No Free Lunch , 1998, Evolutionary Computation.

[18]  David E. Goldberg,et al.  Genetic Algorithms in Search Optimization and Machine Learning , 1988 .

[19]  Allen Gersho,et al.  Vector quantization and signal compression , 1991, The Kluwer international series in engineering and computer science.

[20]  J. H. Ward Hierarchical Grouping to Optimize an Objective Function , 1963 .

[21]  Swagatam Das,et al.  Automatic Clustering Using an Improved Differential Evolution Algorithm , 2007 .

[22]  Zbigniew Michalewicz,et al.  Genetic Algorithms + Data Structures = Evolution Programs , 1996, Springer Berlin Heidelberg.

[23]  Gerhard Reinelt,et al.  TSPLIB - A Traveling Salesman Problem Library , 1991, INFORMS J. Comput..

[24]  Rainer Storn,et al.  Differential Evolution – A Simple and Efficient Heuristic for global Optimization over Continuous Spaces , 1997, J. Glob. Optim..

[25]  Nikos A. Vlassis,et al.  The global k-means clustering algorithm , 2003, Pattern Recognit..

[26]  R. Fisher THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS , 1936 .

[27]  M. Narasimha Murty,et al.  Genetic K-means algorithm , 1999, IEEE Trans. Syst. Man Cybern. Part B.

[28]  Wojciech Kwedlo,et al.  Parallelizing Evolutionary Algorithms for Clustering Data , 2005, PPAM.

[29]  Dayou Liu,et al.  K-harmonic means data clustering with Differential Evolution , 2009, 2009 International Conference on Future BioMedical Information Engineering (FBIE).

[30]  Olli Nevalainen,et al.  Genetic Algorithms for Large-Scale Clustering Problems , 1997, Comput. J..

[31]  Pierre Hansen,et al.  An Interior Point Algorithm for Minimum Sum-of-Squares Clustering , 1997, SIAM J. Sci. Comput..

[32]  Michael J. Laszlo,et al.  A genetic algorithm that exchanges neighboring centers for k-means clustering , 2007, Pattern Recognit. Lett..

[33]  Sandra Paterlini,et al.  Differential evolution and particle swarm optimisation in partitional clustering , 2006, Comput. Stat. Data Anal..

[34]  C.-C. Jay Kuo,et al.  A new initialization technique for generalized Lloyd iteration , 1994, IEEE Signal Processing Letters.

[35]  Suresh Chandra Satapathy,et al.  Effective Image Clustering with Differential Evolution Technique , 2011 .