On community detection in real-world networks and the importance of degree assortativity

Graph clustering, often referred to as community detection, is a prominent task in graph data mining, with dozens of algorithms proposed in recent years. In this paper, we focus on several popular community detection algorithms that have low computational complexity and perform well on artificial benchmarks, and we study their behaviour on real-world networks. Motivated by the observation that there is a class of networks for which these methods fail to deliver a good community structure, we examine the assortativity coefficient of ground-truth communities and show that the assortativity of a community structure can differ substantially from the assortativity of the original network. We then examine whether this observation can be exploited by weighting the edges of a network, with the aim of improving community detection outputs for networks with assortative community structure. The evaluation shows that the proposed weighting can significantly improve the results of community detection methods on such networks.
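To make the comparison between network-level and community-level assortativity concrete, the following minimal Python sketch computes the degree assortativity coefficient as the Pearson correlation of the degrees at the two ends of each edge, first over all edges and then over intra-community edges only. It rests on assumptions that are not taken from the paper: that "assortativity of a community structure" can be approximated by restricting the computation to edges whose endpoints share a community, and that degrees are taken from the full network. The function names `degree_assortativity` and `intra_community_edges` and the toy data are hypothetical.

```python
from collections import Counter


def count_degrees(edges):
    """Return a mapping node -> degree for an undirected edge list."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def degree_assortativity(edges, degrees=None):
    """Pearson correlation of endpoint degrees over undirected edges.

    If `degrees` is None, degrees are computed from `edges` themselves.
    Returns None when the coefficient is undefined (no edges, or all
    endpoint degrees are equal).
    """
    if degrees is None:
        degrees = count_degrees(edges)
    # Count every undirected edge in both directions so that the two
    # endpoint sequences have the same mean and variance.
    xs, ys = [], []
    for u, v in edges:
        xs += [degrees[u], degrees[v]]
        ys += [degrees[v], degrees[u]]
    n = len(xs)
    if n == 0:
        return None
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return None
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    return cov / var


def intra_community_edges(edges, membership):
    """Keep only edges whose endpoints belong to the same community.

    Assumption: this restriction stands in for the community structure;
    the paper may define it differently.
    """
    return [(u, v) for u, v in edges if membership[u] == membership[v]]


# Toy data (illustrative only): a path 0-1-2-3 split into two communities.
edges = [(0, 1), (1, 2), (2, 3)]
membership = {0: "a", 1: "a", 2: "b", 3: "b"}

full_degrees = count_degrees(edges)
r_network = degree_assortativity(edges)
r_communities = degree_assortativity(
    intra_community_edges(edges, membership), degrees=full_degrees
)
print(r_network, r_communities)
```

On real data the two coefficients can be compared directly; a large gap between them is the kind of discrepancy the abstract describes. Whether to use full-network degrees or degrees within each community is a design choice that this sketch leaves open.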
