A Multi-Objective Genetic Algorithm for overlapping community detection based on edge encoding

Abstract The Community Detection Problem (CDP) in Social Networks has been widely studied in areas such as Data Mining, Graph Theory, Physics, and Social Network Analysis, among others. The problem consists of dividing a graph into groups of nodes (communities) according to the graph topology. A partition is a division of the graph in which each node belongs to exactly one community. However, a common feature of real-world networks is the existence of overlapping communities, where a given node can belong to more than one community. This paper presents a new Multi-Objective Genetic Algorithm (MOGA-OCD) designed to detect overlapping communities by using measures related to network connectivity. For this purpose, the proposed algorithm uses a phenotype-type encoding based on edge information, and a new fitness function that optimizes two classical objectives in CDP: the first maximizes the internal connectivity of the communities, whereas the second minimizes the external connections to the rest of the graph. To select the most appropriate metrics for these objectives, a comparative assessment of several connectivity metrics has been carried out on real-world networks. Finally, the algorithm has been evaluated against other well-known state-of-the-art algorithms in CDP. The experimental results show that the proposed approach overall improves on the accuracy and quality of alternative CDP methods, which supports its effectiveness for detecting structured overlapping communities.
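
The idea of an edge-based encoding with two connectivity objectives can be illustrated with a minimal Python sketch. It is not the MOGA-OCD implementation: the abstract does not state which metrics the paper finally selects, so internal density and the fraction of boundary edges are assumed here as stand-ins, and the function names `communities_from_edge_labels` and `objectives` are hypothetical. Each edge carries one community label; a node overlaps when its incident edges carry different labels.

```python
from collections import defaultdict


def communities_from_edge_labels(edges, labels):
    """Group nodes by the community label assigned to each of their edges."""
    comms = defaultdict(set)
    for (u, v), label in zip(edges, labels):
        comms[label].update((u, v))
    return dict(comms)


def objectives(edges, comms):
    """Return (mean internal density, mean external-edge ratio).

    Both metrics are assumptions chosen for illustration: the first would be
    maximized, the second minimized.
    """
    densities, ext_ratios = [], []
    for nodes in comms.values():
        n = len(nodes)
        # Edges with both endpoints inside the community.
        internal = sum(1 for u, v in edges if u in nodes and v in nodes)
        # Edges with exactly one endpoint inside the community.
        boundary = sum(1 for u, v in edges if (u in nodes) != (v in nodes))
        possible = n * (n - 1) / 2
        densities.append(internal / possible if possible else 0.0)
        total = internal + boundary
        ext_ratios.append(boundary / total if total else 0.0)
    k = len(comms)
    return sum(densities) / k, sum(ext_ratios) / k


# Two triangles that share node 2; one label per edge.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4)]
labels = [0, 0, 0, 1, 1, 1]

comms = communities_from_edge_labels(edges, labels)
internal, external = objectives(edges, comms)
print(comms)               # node 2 appears in both communities
print(internal, external)
```

Because labels are attached to edges rather than nodes, overlap needs no special handling: node 2 ends up in both communities simply because its edges belong to different groups. A multi-objective search would then look for label vectors that trade off the two returned values.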
