Community detection using a neighborhood strength driven Label Propagation Algorithm

Studies of community structure and evolution in large social networks require a fast and accurate algorithm for community detection. As the size of the analyzed communities grows, the complexity of the community detection algorithm needs to be kept close to linear. The Label Propagation Algorithm (LPA) offers nearly linear running time and easy implementation, so it forms a good basis for efficient community detection methods. In this paper, we propose a new update rule and a new label propagation criterion for LPA to improve both its computational efficiency and the quality of the communities that it detects. Speed is improved by avoiding unnecessary updates performed by the original algorithm. This change significantly reduces (by an order of magnitude for large networks) the number of iterations that the algorithm executes. We also evaluate our generalization of the LPA update rule, which takes into account, with varying strength, the connections to the neighborhood of a node that is considering a new label. Experiments on computer-generated networks and a wide range of social networks show that our new rule improves the quality of the detected communities compared to those found by the original LPA. The benefit of considering positive neighborhood strength is especially pronounced on real-world networks containing a sufficiently large fraction of nodes with a high clustering coefficient.
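The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the exact update rule or propagation criterion, so the sketch makes two labeled assumptions. First, "avoiding unnecessary updates" is modeled with a queue of active nodes, where a node is re-examined only after one of its neighbors changes label. Second, "neighborhood strength" is modeled by a hypothetical parameter `alpha` that adds `alpha` times the number of common neighbors to each neighbor's vote; `alpha = 0` reduces to plain majority voting as in the original LPA. The function name `label_propagation` and the guard `max_updates` are likewise illustrative.

```python
import random
from collections import defaultdict, deque


def label_propagation(adj, alpha=0.0, seed=0, max_updates=100000):
    """Asynchronous label propagation on an undirected graph.

    adj   : dict mapping each node to the set of its neighbors.
    alpha : ASSUMED neighborhood-strength weight (hypothetical form):
            a neighbor u of v votes with weight 1 + alpha * |N(v) & N(u)|.
    Returns a dict mapping each node to its community label.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}      # every node starts with a unique label

    order = list(adj)
    rng.shuffle(order)
    queue = deque(order)              # ASSUMED speed-up: only active nodes are visited
    queued = set(order)

    updates = 0
    while queue and updates < max_updates:
        v = queue.popleft()
        queued.discard(v)
        if not adj[v]:
            continue                  # isolated node keeps its own label

        # Accumulate the weighted vote for each label among v's neighbors.
        scores = defaultdict(float)
        for u in adj[v]:
            common = len(adj[v] & adj[u])
            scores[labels[u]] += 1.0 + alpha * common

        best = max(scores.values())
        if scores.get(labels[v], 0.0) == best:
            continue                  # current label is already maximal: no update

        # Break ties between the best labels at random.
        candidates = sorted(l for l, s in scores.items() if s == best)
        labels[v] = rng.choice(candidates)
        updates += 1

        # A label change can only affect v's neighbors, so reactivate them.
        for u in adj[v]:
            if u not in queued:
                queue.append(u)
                queued.add(u)
    return labels


# Two 4-cliques joined by the single edge 3-4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4)]
adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

labels = label_propagation(adj, alpha=1.0)
communities = defaultdict(set)
for node, label in labels.items():
    communities[label].add(node)
print(sorted(sorted(c) for c in communities.values()))
```

On this toy graph the bridge edge 3-4 has no common neighbors, so with `alpha = 1.0` it carries weight 1 while every edge inside a clique carries weight 3, and a label can never cross the bridge. With `alpha = 0.0` all edges weigh the same and the two cliques may or may not be merged, depending on the random update order.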
