Scalable Community Detection with the Louvain Algorithm

In this paper we present and evaluate a parallel community detection algorithm derived from the state-of-the-art Louvain modularity maximization method. Our algorithm adopts a novel graph mapping and data representation, and relies on an efficient communication runtime specifically designed for fine-grained applications executed on large-scale supercomputers. We have been able to process graphs with up to 138 billion edges in parallel on 8,192 Blue Gene/Q nodes and 1,024 P7-IH nodes. Leveraging the convergence properties of our algorithm and its efficient implementation, we can analyze the communities of large-scale graphs in just a few seconds. To the best of our knowledge, this is the first parallel implementation of the Louvain algorithm that scales to such large data sets and processor configurations.
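For context, the quantity that Louvain-style methods try to maximize is the modularity Q of a partition: the fraction of edges that fall inside communities, minus the fraction expected if edges were placed at random with the same vertex degrees. The sketch below is a minimal serial illustration only, not the parallel implementation described in the abstract. It assumes an undirected, unweighted graph without self-loops; the function name `modularity` and the example graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition (illustrative sketch).

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every vertex to a community label
    Assumes no self-loops and unit edge weights.
    """
    m = len(edges)
    intra = defaultdict(int)       # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # sum of vertex degrees in the community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    # Q = sum over communities of (intra / m) - (degree_sum / 2m)^2
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge (hypothetical test graph).
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {v: "a" for v in range(6)}

print(round(modularity(two_triangles, split), 3))   # 0.357
print(round(modularity(two_triangles, merged), 3))  # 0.0
```

Splitting the graph at the bridge gives Q = 5/14, while placing every vertex in one community gives Q = 0, which is why a modularity-maximizing method prefers the split.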
