Application of the cross-entropy method to clustering and vector quantization

We apply the cross-entropy (CE) method to problems in clustering and vector quantization. The CE algorithm for clustering involves the following iterative steps: (a) generate random clusters according to a specified parametric probability distribution, and (b) update the parameters of this distribution according to the Kullback–Leibler cross-entropy. Through various numerical experiments, we demonstrate the high accuracy of the CE algorithm and show that it can generate near-optimal clusters for fairly large data sets. We compare the CE method with well-known clustering and vector quantization methods such as K-means, fuzzy K-means and linear vector quantization, and apply each method to benchmark and image analysis data.
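
The two iterative steps can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not fix the parametric family or the loss, so the sketch assumes independent Gaussian distributions over centroid coordinates, a sum-of-squared-distances loss, and an update that refits the Gaussians to the best-scoring ("elite") samples, which is the usual form of the cross-entropy update for a Gaussian family. The names `clustering_loss` and `ce_clustering` and the parameters `n_samples`, `n_elite` and `n_iter` are hypothetical.

```python
import random


def clustering_loss(data, centroids):
    """Sum over all points of the squared distance to the nearest centroid."""
    return sum(
        min(sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centroids)
        for x in data
    )


def ce_clustering(data, k, n_samples=100, n_elite=10, n_iter=50, seed=0):
    """Sketch of CE clustering with Gaussian sampling of k centroids (assumed setup)."""
    rng = random.Random(seed)
    dim = len(data[0])

    # Assumed initial distribution: centred on the data's bounding box,
    # with a standard deviation equal to the range of each coordinate.
    lo = [min(x[j] for x in data) for j in range(dim)]
    hi = [max(x[j] for x in data) for j in range(dim)]
    mu = [[(lo[j] + hi[j]) / 2 for j in range(dim)] for _ in range(k)]
    sigma = [[(hi[j] - lo[j]) or 1.0 for j in range(dim)] for _ in range(k)]

    best, best_loss = None, float("inf")
    for _ in range(n_iter):
        # Step (a): generate random candidate centroid sets from the current distribution.
        samples = [
            [[rng.gauss(mu[c][j], sigma[c][j]) for j in range(dim)] for c in range(k)]
            for _ in range(n_samples)
        ]
        scored = sorted(
            ((clustering_loss(data, s), s) for s in samples), key=lambda t: t[0]
        )
        if scored[0][0] < best_loss:
            best_loss, best = scored[0]

        # Step (b): update the parameters from the elite samples. For a Gaussian
        # family this amounts to taking the elite sample mean and standard deviation.
        elite = [s for _, s in scored[:n_elite]]
        for c in range(k):
            for j in range(dim):
                vals = [e[c][j] for e in elite]
                m = sum(vals) / len(vals)
                var = sum((v - m) ** 2 for v in vals) / len(vals)
                mu[c][j] = m
                sigma[c][j] = var ** 0.5

    return best, best_loss


if __name__ == "__main__":
    # Two well-separated groups of points in the plane (illustrative data only).
    points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    centroids, loss = ce_clustering(points, k=2, seed=1)
    print(centroids, loss)
```

The sketch returns the best centroid set seen across all iterations rather than the final distribution mean, so the reported loss never gets worse from one iteration to the next. Sampling centroids is only one possible reading of "generate random clusters"; sampling cluster assignments directly from a discrete distribution would fit the same two-step scheme.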
