Size Matters: A Comparative Analysis of Community Detection Algorithms

Understanding the community structure of social media is critical because of its broad applications, such as friend recommendation, user modeling, and content personalization. Existing research uses structural metrics such as modularity and conductance, and functional metrics such as ground truth, to measure the quality of the communities discovered by various community detection algorithms, while overlooking a natural and important dimension: community size. Recently, the anthropologist Dunbar suggested that the size of a stable community in social media should be limited to 150, a limit referred to as Dunbar’s number. In this paper, we propose a systematic way of comparing algorithms that orthogonally integrates community size as a new dimension into existing structural metrics, so that community quality in the social media context can be evaluated consistently and holistically. We design a heuristic clique-based algorithm that controls the size and overlap of communities with adjustable parameters, and we evaluate it along with six state-of-the-art community detection algorithms on both Twitter and DBLP networks. Specifically, we divide the discovered communities by size into four classes, called close friend, casual friend, acquaintance, and just-a-face, and then calculate the coverage, modularity, triangle participation ratio, conductance, transitivity, and internal density of the communities in each class. We find that communities in different classes exhibit diverse structural qualities and that many existing community detection algorithms tend to output extremely large communities.
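The size-based evaluation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class boundaries (5, 15, 50, 150) are an assumption, since the abstract names the four classes and the limit of 150 but not the cut-offs between classes, and the "oversized" label, the function names, and the toy graph are hypothetical. Only three of the listed metrics are computed, using their standard definitions: internal density, conductance, and triangle participation ratio.

```python
from itertools import combinations

# ASSUMPTION: upper size limits per class. The abstract gives only the class
# names and the overall limit of 150, not these boundaries.
SIZE_CLASSES = [
    ("close friend", 5),
    ("casual friend", 15),
    ("acquaintance", 50),
    ("just-a-face", 150),
]


def size_class(n):
    """Return the size class of a community with n members."""
    for name, upper in SIZE_CLASSES:
        if n <= upper:
            return name
    return "oversized"  # hypothetical label for communities above 150


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def community_metrics(adj, community):
    """Compute three structural metrics for one community."""
    s = set(community)
    n = len(s)
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal = sum(len(adj.get(u, set()) & s) for u in s) // 2
    # Edges with exactly one endpoint inside the community.
    boundary = sum(len(adj.get(u, set()) - s) for u in s)
    possible = n * (n - 1) // 2
    volume = 2 * internal + boundary
    # A node participates if two of its in-community neighbours are adjacent.
    in_triangle = sum(
        1
        for u in s
        if any(w in adj.get(v, set())
               for v, w in combinations(adj.get(u, set()) & s, 2))
    )
    return {
        "size_class": size_class(n),
        "internal_density": internal / possible if possible else 0.0,
        "conductance": boundary / volume if volume else 0.0,
        "triangle_participation_ratio": in_triangle / n if n else 0.0,
    }


# Toy graph: a triangle a-b-c with a tail c-d-e.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
triangle = community_metrics(adj, {"a", "b", "c"})
tail = community_metrics(adj, {"d", "e"})
```

Grouping the per-community results by `size_class` and averaging each metric within a group would give one row per class, which is the kind of per-class comparison the abstract describes.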
