High-Quality Shared-Memory Graph Partitioning

Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation in processing graphs. Recently, the size, variety, and structural complexity of these networks have grown dramatically. Unfortunately, previous approaches to parallel graph partitioning have problems in this context since they often show a negative trade-off between speed and quality. We present an approach to multi-level shared-memory parallel graph partitioning that guarantees balanced solutions, shows high speedups for a variety of large graphs, and yields very good quality independently of the number of cores used. For example, our algorithm partitions a graph with 2 billion edges using 16 cores in 53 seconds, producing a solution that cuts half as many edges as that of our main competitor, which runs for 33 seconds. Important ingredients include parallel label propagation for both coarsening and improvement, parallel initial partitioning, a simple yet effective approach to parallel localized local search, and fast locality-preserving hash tables.
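
To make the objective concrete, the following minimal sketch evaluates a given partition against the two criteria named above: the number of edges running between blocks (the edge cut) and the balance of block sizes. It is an illustration only, not the algorithm of the paper. The function names, the edge-list representation, and the balance bound of (1 + eps) * ceil(n / k) vertices per block with eps = 0.03 are assumptions chosen for the example.

```python
from math import ceil


def edge_cut(edges, block):
    """Count edges whose two endpoints lie in different blocks."""
    return sum(1 for u, v in edges if block[u] != block[v])


def is_balanced(block, k, eps=0.03):
    """Check that no block holds more than (1 + eps) * ceil(n / k) vertices.

    The bound and the default eps are assumptions made for this sketch.
    """
    n = len(block)
    limit = (1 + eps) * ceil(n / k)
    sizes = [0] * k
    for b in block:
        sizes[b] += 1
    return max(sizes) <= limit


# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

good = [0, 0, 0, 1, 1, 1]    # one block per triangle
bad = [0, 1, 0, 1, 0, 1]     # balanced, but splits both triangles
skewed = [0, 0, 0, 0, 1, 1]  # block 0 holds four of the six vertices

print(edge_cut(edges, good), is_balanced(good, k=2))
print(edge_cut(edges, bad), is_balanced(bad, k=2))
print(edge_cut(edges, skewed), is_balanced(skewed, k=2))
```

Both `good` and `bad` satisfy the balance bound, yet `good` cuts only the single edge between the triangles. A partitioner aims for the smallest cut among the balanced assignments.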
