FOX: Fast Overlapping Community Detection Algorithm in Big Weighted Networks

Community detection is a hot topic for researchers in the fields of graph theory, social networks, and biological networks. Generally speaking, a community is a group of densely linked nodes in a network. Nodes often carry more than one community label, reflecting their multiple roles or functions in the network. Unfortunately, existing solutions for overlapping community detection do not scale to large networks with millions of nodes and edges. In this article, we propose FOX, a fast overlapping community detection algorithm. In an experiment on a network with 3.9 million nodes and 20 million edges, detection finishes in 41 min and yields the highest-quality results among the compared algorithms. The second-fastest algorithm takes almost five times longer to run. On another network with 22 million nodes and 127 million edges, our algorithm is the only one that can produce an overlapping community detection result, and it takes only 533 min. Our algorithm is a typical heuristic algorithm: it measures the closeness of a node to a community by counting the number of triangles formed by the node and two other nodes in the community. We also extend the use of triangles to open triangles, which enlarges the detected communities.
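The closeness measure described above can be illustrated with a minimal Python sketch. It is not the FOX implementation: the function names, the adjacency-set representation, and the toy graph are hypothetical, edge weights are ignored, and the reading of an "open triangle" as a pair of community neighbors of the node that are not linked to each other is an assumption, since the abstract does not define the term.

```python
from itertools import combinations


def count_triangles(adj, node, community):
    """Count closed triangles formed by `node` and two members of `community`.

    `adj` maps each node to the set of its neighbors (undirected graph).
    """
    # Only neighbors of `node` that belong to the community can close a triangle.
    members = (adj[node] & community) - {node}
    return sum(1 for u, v in combinations(sorted(members), 2) if v in adj[u])


def count_open_triangles(adj, node, community):
    """Count open triangles centered at `node` (assumed definition):
    pairs of community neighbors of `node` that are NOT linked to each other.
    """
    members = (adj[node] & community) - {node}
    return sum(1 for u, v in combinations(sorted(members), 2) if v not in adj[u])


# Hypothetical toy graph, for illustration only.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}
community = {"b", "c", "d"}

closed = count_triangles(adj, "a", community)      # triangles a-b-c and a-c-d
opened = count_open_triangles(adj, "a", community)  # wedge b-a-d
print(closed, opened)
```

In this toy graph, node "a" closes two triangles with the community and forms one open triangle, so under a triangle-counting heuristic it would be considered close to the community, while node "e" (a single link into the community) would not.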
