Multi-Objective Data Clustering using Variable-Length Real Jumping Genes Genetic Algorithm and Local Search Method

In this paper, we present a novel multi-objective evolutionary clustering approach based on a variable-length real jumping genes genetic algorithm (VRJGGA). The proposed algorithm, which extends the jumping genes genetic algorithm (JGGA) [1], evolves clustering solutions under multiple clustering criteria without a priori knowledge of the actual number of clusters. Local search methods, such as probabilistic cluster merging and splitting, are introduced in VRJGGA to improve the clustering. Experimental results on several artificial and real-world data sets show that VRJGGA can obtain non-dominated and near-optimal clustering solutions in terms of different cluster quality measures and classification performance.
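The ideas named in the abstract (a variable-length real-valued encoding, probabilistic merging and splitting, and non-dominated solutions) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, since the abstract does not specify the operators: a chromosome is taken to be a list of real-valued cluster centers, `merge_closest` replaces the two nearest centers by their midpoint, `split_center` replaces one center by two displaced copies, and `non_dominated` filters objective vectors that are all minimized. The function names and the parameters `p_merge`, `p_split`, and `scale` are hypothetical and are not taken from VRJGGA.

```python
import math
import random


def merge_closest(centers):
    """Merge the two closest centers into their midpoint (illustrative operator)."""
    if len(centers) < 2:
        return [list(c) for c in centers]
    # Find the pair of centers with the smallest Euclidean distance.
    pairs = [(math.dist(centers[i], centers[j]), i, j)
             for i in range(len(centers))
             for j in range(i + 1, len(centers))]
    _, i, j = min(pairs)
    mid = [(a + b) / 2 for a, b in zip(centers[i], centers[j])]
    rest = [list(c) for k, c in enumerate(centers) if k not in (i, j)]
    return rest + [mid]


def split_center(centers, index, offset):
    """Replace one center by two centers displaced by +offset and -offset."""
    c = centers[index]
    plus = [a + o for a, o in zip(c, offset)]
    minus = [a - o for a, o in zip(c, offset)]
    rest = [list(x) for k, x in enumerate(centers) if k != index]
    return rest + [plus, minus]


def local_search(centers, p_merge=0.3, p_split=0.3, scale=0.1, rng=random):
    """Apply merging and splitting with assumed probabilities p_merge and p_split."""
    out = [list(c) for c in centers]
    if len(out) >= 2 and rng.random() < p_merge:
        out = merge_closest(out)
    if out and rng.random() < p_split:
        idx = rng.randrange(len(out))
        offset = [rng.gauss(0.0, scale) for _ in out[idx]]
        out = split_center(out, idx, offset)
    return out


def dominates(a, b):
    """True if objective vector a dominates b, with every objective minimized."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep only the objective vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Because merging shortens the chromosome by one center and splitting lengthens it by one, the number of clusters can change during the search, which is the property a variable-length encoding is meant to provide.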

[1]  Roy George, et al.  A variable-length genetic algorithm for clustering and classification, 1995, Pattern Recognit. Lett.

[2]  T.M. Chan, et al.  Jumping-genes in evolutionary computing, 2004, 30th Annual Conference of IEEE Industrial Electronics Society (IECON 2004).

[3]  Kalyanmoy Deb, et al.  A fast and elitist multiobjective genetic algorithm: NSGA-II, 2002, IEEE Trans. Evol. Comput.

[4]  N. Wicker, et al.  Density of points clustering, application to transcriptomic data analysis, 2002, Nucleic Acids Research.

[5]  Joydeep Ghosh, et al.  Cluster Ensembles - A Knowledge Reuse Framework for Combining Multiple Partitions, 2002, J. Mach. Learn. Res.

[6]  Joshua D. Knowles, et al.  Exploiting the Trade-off - The Benefits of Multiple Objectives in Data Clustering, 2005, EMO.

[7]  Philip Chan, et al.  Determining the number of clusters/segments in hierarchical clustering/segmentation algorithms, 2004, 16th IEEE International Conference on Tools with Artificial Intelligence.

[8]  Kalyanmoy Deb, et al.  A combined genetic adaptive search (GeneAS) for engineering design, 1996.

[9]  Joydeep Ghosh, et al.  CLUMP: a scalable and robust framework for structure discovery, 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[10]  James C. Bezdek, et al.  Some new indexes of cluster validity, 1998, IEEE Trans. Syst. Man Cybern. Part B.

[11]  Joshua D. Knowles, et al.  Evolutionary Multiobjective Clustering, 2004, PPSN.

[12]  James C. Bezdek, et al.  Clustering with a genetically optimized approach, 1999, IEEE Trans. Evol. Comput.

[13]  Francisco Herrera, et al.  Tackling Real-Coded Genetic Algorithms: Operators and Tools for Behavioural Analysis, 1998, Artificial Intelligence Review.

[14]  Ann K. Ganesan.  Darwin in the Genome: Molecular Strategies in Biological Evolution, 2003.