A Co-Evolutionary Multi-Objective approach for a K-adaptive graph-based clustering algorithm

Clustering is a field of Data Mining that deals with the problem of extracting knowledge from data without supervision. In essence, clustering identifies similar items in a dataset and groups them into sets called clusters. Its large number of practical applications has made it a fertile research topic with several approaches. One recent method that is gaining popularity in the research community is Spectral Clustering (SC), which builds a similarity graph and applies spectral analysis to preserve data continuity within each cluster. This work presents a new algorithm inspired by SC, the Co-Evolutionary Multi-Objective Genetic Graph-based Clustering (CEMOG) algorithm, which is based on the Multi-Objective Genetic Graph-based Clustering (MOGGC) algorithm and extends it by introducing an adaptive number of clusters. CEMOG takes an island-model approach in which each island keeps a population of candidate solutions for k_i clusters. Individuals can migrate between islands to encourage genetic diversity and the propagation of individuals around promising search regions. Experiments on synthetic and real datasets show that the new approach performs competitively with several classical clustering algorithms (EM, SC and K-means).
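The island-model idea can be illustrated with a minimal sketch. It is not the CEMOG implementation: the single toy fitness (negative within-cluster squared distance), the label-vector encoding, the function names `fitness`, `make_islands` and `migrate`, and the policy of sending each island's best individual only to the island with the next larger k are all assumptions made for illustration. The abstract does not specify the encoding, the objectives, or the migration scheme.

```python
import random

def fitness(labels, points):
    """Toy objective (an assumption): negative within-cluster sum of squared distances."""
    clusters = {}
    for label, point in zip(labels, points):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(coord) / len(members) for coord in zip(*members)]
        total += sum(
            sum((a - b) ** 2 for a, b in zip(point, centroid)) for point in members
        )
    return -total

def make_islands(points, ks, pop_size, rng):
    """One island per candidate k; an individual is a label vector with values in range(k)."""
    return {
        k: [[rng.randrange(k) for _ in points] for _ in range(pop_size)]
        for k in ks
    }

def migrate(islands, points):
    """Copy each island's best individual into the island with the next larger k.

    Sending migrants only 'upwards' (an assumption) keeps every label below the
    target island's k, so no repair step is needed in this sketch.
    """
    ks = sorted(islands)
    # Pick the migrants before any island is modified.
    best = {k: max(islands[k], key=lambda ind: fitness(ind, points)) for k in ks}
    for source_k, target_k in zip(ks, ks[1:]):
        target = islands[target_k]
        worst = min(range(len(target)), key=lambda j: fitness(target[j], points))
        target[worst] = list(best[source_k])  # the migrant replaces the worst individual

# Hypothetical usage on two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
rng = random.Random(0)
islands = make_islands(points, ks=[2, 3, 4], pop_size=5, rng=rng)
best_k2 = list(max(islands[2], key=lambda ind: fitness(ind, points)))
migrate(islands, points)
```

In a full evolutionary loop, a migration step like this would be interleaved with selection, crossover and mutation inside each island; those operators are omitted here.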
