Community detection algorithms: a comparative analysis: invited presentation, extended abstract

Uncovering the community structure exhibited by real networks is a crucial step toward an understanding of complex systems that goes beyond the local organization of their constituents. Many algorithms have been proposed so far, but none of them has been subjected to strict tests to evaluate its performance. Most of the sporadic tests performed so far involved small networks with known community structure and/or artificial graphs with a simplified structure, which is very uncommon in real systems. Here we test several methods against a recently introduced class of benchmark graphs, with heterogeneous distributions of degree and community size. The methods are also tested against the benchmark by Girvan and Newman [Proc. Natl. Acad. Sci. U.S.A. 99, 7821 (2002)] and on random graphs. As a result of our analysis, three recent algorithms, introduced by Rosvall and Bergstrom [Proc. Natl. Acad. Sci. U.S.A. 104, 7327 (2007); Proc. Natl. Acad. Sci. U.S.A. 105, 1118 (2008)], Blondel et al. [J. Stat. Mech.: Theory Exp. (2008), P10008], and Ronhovde and Nussinov [Phys. Rev. E 80, 016109 (2009)], perform excellently, with the additional advantage of low computational complexity, which enables one to analyze large systems.
