Parallelizing Evolutionary Algorithms for Clustering Data

This paper considers the problem of using an evolutionary algorithm to partition a dataset into a known number of clusters. A novel approach to parallel computation of the fitness function, based on data decomposition, is proposed. Both the learning set and the population of the evolutionary algorithm are distributed among the processors, which form a pipeline with a ring topology. In a single step, each processor computes the local fitness of its current subpopulation while sending the previous subpopulation to its successor and receiving the next subpopulation from its predecessor. Communication and computation can thus be overlapped using non-blocking MPI routines. The approach to parallel fitness computation was applied to a differential evolution algorithm. The results of initial experiments show that, for large datasets, the algorithm is capable of achieving very good scalability.
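The ring pipeline described above can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the authors' implementation: it does not use MPI, so the compute and communication phases run one after the other instead of overlapping, and the fitness is assumed to be the sum of squared distances from each point to its nearest cluster center, a common clustering criterion that the abstract does not specify. The names `local_sse` and `ring_fitness` are hypothetical.

```python
def local_sse(centers, points):
    """Sum of squared distances from each point to its nearest center.

    Assumed fitness criterion; the abstract does not name one.
    """
    total = 0.0
    for x in points:
        total += min(sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers)
    return total


def ring_fitness(data_chunks, subpops):
    """Simulate P pipeline steps on P virtual processors.

    data_chunks[p] is the part of the learning set stored on processor p.
    subpops[s] is a list of individuals; each individual is a list of centers.
    Returns partial[s][i], the fitness of individual i of subpopulation s
    accumulated over all data chunks.
    """
    P = len(data_chunks)
    assert len(subpops) == P
    held = list(range(P))  # held[p]: subpopulation currently on processor p
    partial = [[0.0] * len(sp) for sp in subpops]
    for _ in range(P):
        # Compute phase: each processor scores its current subpopulation
        # against its own local chunk of the data.
        for p in range(P):
            s = held[p]
            for i, individual in enumerate(subpops[s]):
                partial[s][i] += local_sse(individual, data_chunks[p])
        # Communication phase: processor p receives from its predecessor p-1
        # (equivalently, sends its subpopulation to its successor p+1).
        held = [held[(p - 1) % P] for p in range(P)]
    return partial


# Toy example: 6 two-dimensional points on 3 virtual processors,
# one individual (with 2 cluster centers) per subpopulation.
data = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0),
        (11.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
chunks = [data[0:2], data[2:4], data[4:6]]
subpops = [
    [[(0.0, 0.0), (10.0, 10.0)]],
    [[(5.0, 5.0), (6.0, 6.0)]],
    [[(1.0, 1.0), (11.0, 11.0)]],
]
fitness = ring_fitness(chunks, subpops)
```

After P steps every subpopulation has visited every processor, so each accumulated value equals the fitness computed on the whole dataset. In an MPI implementation, the communication phase would presumably be issued with non-blocking calls such as `MPI_Isend` and `MPI_Irecv` before the compute phase and completed afterwards, which is what allows the two to overlap.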
