Clustering by Adaptive Local Search with Multiple Search Operators

Abstract: Local Search (LS) has proven to be an efficient optimisation technique in clustering applications and in minimising the stochastic complexity of a data set. In this paper, we propose two ways of organising LS in these contexts, Multi-Operator Local Search (MOLS) and Adaptive Multi-Operator Local Search (AMOLS), and compare their performance with that of a single-operator (random swap) LS method and of repeated GLA (Generalised Lloyd Algorithm). Both proposed methods use several different LS operators to solve the problem. MOLS applies the operators cyclically in a fixed order, whereas AMOLS adapts itself to favour the operators that improve the result more frequently. As test material we use a large database of binary vectors representing strains of bacteria belonging to the family Enterobacteriaceae, and a binary image. The new techniques turn out to be very promising in these tests.
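The difference between the two operator schedules can be illustrated with a minimal sketch. It is not the paper's implementation: the 1-D real-valued data, the squared-error cost, the `nudge` operator, and the "add 1 to the weight on every improvement" adaptation rule are all assumptions chosen for brevity. Only the random swap operator and the cyclic versus adaptive selection idea come from the abstract.

```python
import random
from itertools import cycle

def cost(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min((p - c) ** 2 for c in centroids) for p in points)

def random_swap(points, centroids, rng):
    """Replace one randomly chosen centroid with a randomly chosen data point."""
    new = list(centroids)
    new[rng.randrange(len(new))] = rng.choice(points)
    return new

def nudge(points, centroids, rng):
    """Hypothetical second operator: shift one centroid by a small random offset."""
    new = list(centroids)
    new[rng.randrange(len(new))] += rng.uniform(-1.0, 1.0)
    return new

def mols(points, centroids, operators, iterations, rng):
    """Multi-operator LS: apply the operators cyclically in a fixed order."""
    best, best_cost = list(centroids), cost(points, centroids)
    for _, op in zip(range(iterations), cycle(operators)):
        cand = op(points, best, rng)
        cand_cost = cost(points, cand)
        if cand_cost < best_cost:  # accept only improving moves
            best, best_cost = cand, cand_cost
    return best, best_cost

def amols(points, centroids, operators, iterations, rng):
    """Adaptive multi-operator LS: favour operators that improve more often.

    Assumed rule: each operator starts with weight 1 and gains 1 per
    improvement; operators are drawn with probability proportional to weight.
    """
    best, best_cost = list(centroids), cost(points, centroids)
    weights = [1.0] * len(operators)
    for _ in range(iterations):
        k = rng.choices(range(len(operators)), weights=weights)[0]
        cand = operators[k](points, best, rng)
        cand_cost = cost(points, cand)
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
            weights[k] += 1.0  # reward the successful operator
    return best, best_cost, weights
```

Both functions accept any list of operators with the signature `op(points, centroids, rng)`, so further operators can be added without changing the search loops. In `amols`, the returned `weights` show how often each operator produced an improvement during the run.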
