Scalable link community detection: A local dispersion-aware approach

Real-life systems of interacting objects are typically modeled as graphs and can grow very large. Revealing the community structure of such systems is crucial to understanding their complex nature. However, the ever-increasing size of real-world graphs, together with our evolving perception of what a community is, makes community detection very challenging. One such challenge is discovering the possibly overlapping communities of a given node in a billion-node graph, a problem that is very common in large modern social networks such as Facebook and LinkedIn. In this paper, we propose a scalable local community detection approach that efficiently unfolds the communities of individual target nodes in a given network. Our goal is to reveal the groupings formed around nodes (e.g., users) by leveraging the relations among the different contexts in which the nodes participate. Our algorithm, termed Local Dispersion-aware Link Communities (LDLC), measures the similarity of pairs of links in the graph as well as the extent of their participation in multiple contexts. It then determines the order in which the links should be grouped to form communities. Our approach is free of the constraints present in previous techniques (e.g., the need for several seed nodes or the need to collapse multiple overlapping communities into one). Our experimental evaluation, using ground-truth communities for a wide range of large real-world networks, shows that LDLC significantly outperforms state-of-the-art methods in both accuracy and efficiency.
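
To make the idea of scoring pairs of links around a target node more concrete, the following minimal sketch ranks pairs of links incident to one node by a similarity score. It is not the LDLC algorithm: the abstract does not give the similarity or dispersion formulas, so the sketch substitutes the Jaccard overlap of inclusive neighborhoods, a measure commonly used in link-community methods. The function names (`build_adjacency`, `link_similarity`, `ranked_link_pairs`) and the toy graph are assumptions made purely for illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def link_similarity(adj, a, b):
    """Jaccard overlap of the inclusive neighborhoods of a and b.

    Used here to compare the links (target, a) and (target, b).
    This is a stand-in measure, not the one defined by LDLC.
    """
    na = adj[a] | {a}
    nb = adj[b] | {b}
    return len(na & nb) / len(na | nb)


def ranked_link_pairs(adj, target):
    """Score every pair of links incident to `target`, most similar first."""
    neighbors = sorted(adj[target])
    scored = [
        (link_similarity(adj, a, b), (a, b))
        for a, b in combinations(neighbors, 2)
    ]
    # Sort by descending similarity, breaking ties by the pair itself.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored


# Toy graph: node 0 belongs to two contexts, {0, 1, 2} and {0, 3, 4}.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4)]
adj = build_adjacency(edges)
ranked = ranked_link_pairs(adj, 0)
for score, (a, b) in ranked:
    print(f"links (0,{a}) and (0,{b}): similarity {score:.2f}")
```

In this toy graph, the link pairs that stay within one context of node 0 receive the highest score, while pairs that span the two contexts score lower. Grouping links in that order would separate the two overlapping communities around the target node without touching the rest of the graph, which is the local, per-node behavior the abstract describes.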
