GLFR: A Generalized LFR Benchmark for Testing Community Detection Algorithms

Community detection methods are usually compared by how accurately they recover the built-in community structure of artificial benchmark networks. Current community detection benchmarks assign the same fraction of inter-community links, referred to as the mixing fraction, to every community in a network. We first show in this paper that variation in community mixing fractions affects the performance of different community detection methods differently, which could change the choice of detection algorithm. To compare community detection methods comprehensively, we therefore need a benchmark that generates heterogeneous community mixing fractions, which is not currently available. We address this gap by generalizing the state-of-the-art Lancichinetti-Fortunato-Radicchi (LFR) benchmark to generate networks with heterogeneous community mixing fractions. Using our new benchmark, we quantify the impact of variation in community mixing fractions on existing community detection methods and re-evaluate the performance of the detection algorithms as a function of the heterogeneity among the mixing fractions. Furthermore, we show that heterogeneous community mixing tests using our generalized benchmark better reflect the performance that would be expected on real networks than homogeneous community mixing tests using the original benchmark.
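To make the notion of a per-community mixing fraction concrete, the following minimal Python sketch measures it on a toy graph. It is an illustration only, not the GLFR generator: the function name `community_mixing_fractions`, the toy edge list, and the choice to define a community's mixing fraction as its external degree divided by its total degree are assumptions made here for the example.

```python
from collections import defaultdict


def community_mixing_fractions(edges, membership):
    """Return {community: external_degree / total_degree}.

    Assumed definition (not taken from the paper): each edge endpoint
    contributes one unit of degree to its node's community, and the
    endpoint counts as external when the edge crosses communities.
    """
    total = defaultdict(int)
    external = defaultdict(int)
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        total[cu] += 1
        total[cv] += 1
        if cu != cv:
            external[cu] += 1
            external[cv] += 1
    return {c: external[c] / total[c] for c in total}


# Hypothetical toy network: two triangles (A and B) and a pair (C).
edges = [
    (0, 1), (1, 2), (0, 2),   # inside A
    (3, 4), (4, 5), (3, 5),   # inside B
    (6, 7),                   # inside C
    (2, 3),                   # A-B
    (6, 0), (7, 1),           # C-A
]
membership = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B", 6: "C", 7: "C"}

fractions = community_mixing_fractions(edges, membership)
for community in sorted(fractions):
    print(community, round(fractions[community], 3))
```

On this toy graph the three communities have clearly different mixing fractions, which is the kind of heterogeneity that a benchmark with one fixed mixing fraction per network cannot represent.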
