Fast Detection of Size-Constrained Communities in Large Networks

Community detection in networks is a prominent task in graph data mining, driven by the rapid emergence of graph data such as information networks and social networks. In this paper, we propose a new algorithm for detecting communities in networks. Our approach differs from others in its ability to constrain the size of the communities it generates, a property important for a class of applications. In addition, the algorithm is greedy in nature and belongs to a small family of community detection algorithms with pseudo-linear time complexity, which makes it applicable to large networks as well. The algorithm is able to detect small-sized clusters independently of the network size. It can be viewed as a complementary approach to methods that optimize modularity, which tend to produce larger communities as the network grows. An extensive evaluation on synthetic benchmark graphs for community detection showed that the proposed approach is very competitive with state-of-the-art methods and outperforms them in some settings.
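To make the idea of a greedy, size-constrained, pseudo-linear procedure concrete, the following is a minimal sketch and not the algorithm proposed in the paper, whose details are not given here. It assumes an undirected edge list and a hypothetical parameter `max_size`, and it merges the endpoints of each edge with a union-find structure only when the merged community would stay within the size limit. The function name `size_capped_communities` and the choice to process edges in input order are illustrative assumptions; a practical method would likely rank candidate merges by some quality score.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def size_capped_communities(
    edges: Iterable[Tuple[Hashable, Hashable]], max_size: int
) -> List[List[Hashable]]:
    """Greedily merge edge endpoints, never exceeding max_size per community.

    Illustrative only: edges are taken in input order (an assumption),
    so the result depends on that order.
    """
    parent: Dict[Hashable, Hashable] = {}
    size: Dict[Hashable, int] = {}

    def find(x: Hashable) -> Hashable:
        # Register a node the first time it is seen, as its own community.
        if x not in parent:
            parent[x] = x
            size[x] = 1
        # Path halving keeps the trees shallow (near-constant amortized cost).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # already in the same community
        if size[ru] + size[rv] > max_size:
            continue  # merging would violate the size constraint
        # Union by size: attach the smaller tree under the larger one.
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru
        size[ru] += size[rv]

    # Group nodes by the representative of their community.
    groups: Dict[Hashable, List[Hashable]] = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    return list(groups.values())


# Two triangles joined by a single bridge edge (2, 3).
example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities = size_capped_communities(example_edges, max_size=3)
```

With `max_size=3` the bridge edge is rejected and the two triangles remain separate communities; raising the limit to 6 lets them merge. Each edge is examined once and union-find operations are near-constant amortized, so the running time of this sketch is close to linear in the number of edges.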
