Multi-objective evolutionary clustering with complex networks

Abstract: Evolutionary clustering (EC) refers to the application of evolutionary optimization algorithms, such as genetic algorithms, to data clustering. Although multi-objective evolutionary clustering algorithms have been proposed to consider different cluster properties such as compactness and separation simultaneously, these techniques usually depend on a reasonable initial population and a pre-defined number of clusters. In addition, evolutionary operators become less effective when applied to the clustering problem. Complex networks, on the other hand, play an essential role in different fields of machine learning. In a complex network, data points are treated as nodes, and the dataset is represented as a connected weighted graph. Complex networks also tend to exhibit a modular structure. This paper applies two concepts from complex networks, node centrality and community modularity, to introduce a novel multi-objective evolutionary clustering algorithm. The proposed centrality modularity-based multi-objective evolutionary clustering (CMMOEC) takes advantage of node similarity to find the best initial population of clustering solutions and provides a new structure-based modularity measure to determine the optimal number of clusters automatically. Moreover, the proposed modularity is used to design new recombination and mutation operators that generate more diverse offspring solutions. Experiments were carried out on several artificial and real-world datasets with different structures. The performance of the proposed algorithm is evaluated using the Adjusted Rand Index (ARI). Simulation results indicate that the proposed algorithm achieves better performance than traditional methods.
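For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how the Adjusted Rand Index can be computed from two label vectors, using the standard pair-counting definition. The function name `adjusted_rand_index` and the choice to return 1.0 in degenerate cases are assumptions made for illustration; this is not taken from the paper's implementation.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting ARI between two partitions of the same points.

    Hypothetical helper for illustration only.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label vectors must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Assumption: fewer than two points counts as perfect agreement.
        return 1.0

    # Contingency table cells and its row/column marginals.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of point pairs grouped together in both / each partition.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2         # upper bound of the index

    if max_index == expected:
        # Assumption: trivial partitions (e.g. one cluster in both) score 1.0.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Same grouping under different label names: perfect agreement.
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    # One ground-truth cluster split differently in the prediction.
    print(adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1]))
```

The index is invariant to how clusters are named, equals 1 for identical partitions, and is close to 0 (possibly negative) for partitions that agree no more than chance would predict.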
