A memetic algorithm for community detection by maximising the connected cohesion

Community detection is an exciting field of research that has attracted the interest of many researchers during the last decade. While many algorithms and heuristics have been proposed to scale existing approaches, relatively few studies have explored different measures of the quality of the detected communities. Recently, a new score called ‘cohesion’ was introduced in the computing literature. The cohesion score is based on comparing the number of triangles in a given group of vertices to the number of triangles only partly in that group. In this contribution, we propose a memetic algorithm that aims to find a subset of the vertices of an undirected graph that maximises the cohesion score. The associated combinatorial optimisation problem is known to be NP-hard, and we also prove it to be W[1]-hard when parameterised by the score. We use a local search individual improvement heuristic to expand the putative solution. We then remove from the group all vertices that are not part of any triangle and expand the neighbourhood by adding triangles that have at least two nodes already in the group. Finally, we compute the maximum connected component of this group. We report the highest-quality solutions obtained by the memetic algorithm for four real-world network scenarios and compare them with ground-truth information about the graphs. We also compare the results to those obtained with eight other community detection algorithms via interrater agreement measures. Our results give a new lower bound on the parameterised complexity of this problem and novel insights into its potential usefulness as a new natural score for community detection.
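
The score and the repair steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the cohesion of a group S is (inner / C(|S|, 3)) * (inner / (inner + outbound)), where inner counts triangles entirely inside S and outbound counts triangles with exactly two vertices in S; the abstract only states that the score compares these two kinds of triangles. It also assumes that "part of a triangle" means a triangle inside the group, and that "maximum connected component" means the largest one. The function names (build_adjacency, triangle_counts, cohesion, refine) and the toy graph are hypothetical.

```python
from itertools import combinations
from math import comb


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_counts(adj, group):
    """Return (inner, outbound) triangle counts for a vertex group.

    inner: triangles with all three vertices in the group.
    outbound: triangles with exactly two vertices in the group (assumed reading).
    """
    group = set(group)
    inner_x3 = outbound = 0
    for u, v in combinations(list(group), 2):
        if v not in adj.get(u, set()):
            continue
        common = adj[u] & adj[v]
        inner_x3 += len(common & group)   # each inner triangle is seen once per edge
        outbound += len(common - group)   # an outbound triangle has one edge in the group
    return inner_x3 // 3, outbound


def cohesion(adj, group):
    """Assumed score: triangle density times the share of inner triangles."""
    inner, outbound = triangle_counts(adj, group)
    if len(group) < 3 or inner == 0:
        return 0.0
    return (inner / comb(len(group), 3)) * (inner / (inner + outbound))


def refine(adj, group):
    """Prune, expand, then keep the largest connected component."""
    group = set(group)
    # Step 1: keep only vertices that lie on a triangle inside the group.
    kept = set()
    for u, v in combinations(list(group), 2):
        if v in adj.get(u, set()):
            for w in adj[u] & adj[v] & group:
                kept.update((u, v, w))
    # Step 2: add outside vertices that close a triangle with two group members.
    added = set()
    for u, v in combinations(list(kept), 2):
        if v in adj[u]:
            added |= (adj[u] & adj[v]) - kept
    group = kept | added
    # Step 3: largest connected component of the induced subgraph.
    best, seen = set(), set()
    for start in group:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in adj[x] & group:
                if y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


# Toy graph (hypothetical): a 4-clique {0,1,2,3}, a triangle {3,4,5}, a pendant vertex 6.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (3, 5), (4, 5), (5, 6)]
adj = build_adjacency(edges)
print(cohesion(adj, {0, 1, 2}))            # 0.25
print(cohesion(adj, {0, 1, 2, 3}))         # 1.0
print(sorted(refine(adj, {0, 1, 2, 6})))   # [0, 1, 2, 3]
```

In the toy graph, the group {0, 1, 2} is a single triangle with three outbound triangles through vertex 3, which gives a score of 0.25 under the assumed formula; absorbing vertex 3 removes all outbound triangles and raises the score to 1.0. A memetic algorithm would wrap a step such as refine inside a population-based search; that outer loop is omitted here.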
