Scalable static and dynamic community detection using Grappolo

Graph clustering, popularly known as community detection, is a fundamental kernel for several applications of relevance to the Defense Advanced Research Projects Agency's (DARPA) Hierarchical Identify Verify Exploit (HIVE) Program. Clusters, or communities, are natural divisions of a network: vertices are densely connected within a cluster and sparsely connected to the rest of the network. Clustering large-scale data requires efficient algorithms that can exploit modern, fundamentally parallel architectures. However, many current community detection algorithms are irregular and inherently sequential, which makes them challenging to parallelize. In response to the HIVE Graph Challenge, we present several parallelization heuristics for fast community detection using the Louvain method as the serial template. We implement all the heuristics in a software library called Grappolo. Using the inputs from the HIVE Challenge, we demonstrate superior performance and high-quality solutions based on four parallelization heuristics. We use Grappolo on static graphs as the first step towards community detection on streaming graphs.
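As a small illustration of "densely connected within a cluster and sparsely connected to the rest of the network", the sketch below computes modularity, the objective that the Louvain method greedily maximizes. It is a minimal serial example, not Grappolo's API or any of its parallel heuristics; the function name `modularity`, the edge-list input format, and the toy graph are assumptions made for illustration only.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected weighted graph.

    edges: iterable of (u, v, w), each undirected edge listed once (assumed format).
    community: dict mapping every vertex to a community id.
    """
    m = 0.0                      # total edge weight of the graph
    intra = defaultdict(float)   # edge weight inside each community
    degree = defaultdict(float)  # summed weighted degree of each community
    for u, v, w in edges:
        m += w
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            intra[community[u]] += w
    if m == 0:
        return 0.0
    # Q = sum over communities c of: intra_c / m - (degree_c / (2m))^2
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical toy input: two triangles joined by a single bridge edge.
two_triangles = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
                 (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
                 (2, 3, 1.0)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {v: "a" for v in range(6)}

q_split = modularity(two_triangles, split)    # 5/14, about 0.357
q_merged = modularity(two_triangles, merged)  # 0.0
```

Separating the two triangles scores higher than placing all six vertices in one community, which matches the intuition behind the definition. A Louvain-style pass repeatedly moves each vertex to the neighboring community that yields the largest gain in this quantity; those moves depend on one another, which is one reason the method is hard to parallelize.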