Complex Network Partitioning Using Label Propagation

We present PuLP (partitioning using label propagation), a parallel and memory-efficient graph partitioning method specifically designed to partition low-diameter networks with skewed degree distributions on shared-memory multicore platforms. Graph partitioning is an important problem in scientific computing because it impacts the execution time and energy efficiency of computations on distributed-memory platforms. Partitioning determines the in-memory layout of a graph, which affects locality, intertask load balance, communication time, and overall memory utilization. A novel feature of our PuLP method is that it optimizes for multiple objective metrics simultaneously, while satisfying multiple partitioning constraints. Using our method, we are able to partition a web crawl with billions of edges on a single compute server in under a minute. For a collection of test graphs, we show that PuLP uses up to $7.8\times$ less memory than state-of-the-art partitioners and is $5.0\times$ faster, on average, than a...
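To make "multiple objective metrics" and "multiple partitioning constraints" concrete, the sketch below scores a given partition of a small undirected graph. The abstract does not name the metrics, so this is an illustration under assumptions: the objectives are taken to be total edge cut and maximal per-part cut, and the constraints are taken to be vertex balance and edge balance. The function name `partition_metrics` and its inputs are hypothetical and are not part of PuLP.

```python
def partition_metrics(edges, part, num_parts):
    """Score a partition of an undirected graph (illustrative only).

    edges:     list of (u, v) pairs, each undirected edge listed once
    part:      dict mapping every vertex to a part id in range(num_parts)
    num_parts: number of parts
    """
    vertices_per_part = [0] * num_parts
    degree_per_part = [0] * num_parts  # sum of vertex degrees in each part
    cut_per_part = [0] * num_parts     # cut edges incident to each part
    edge_cut = 0

    for p in part.values():
        vertices_per_part[p] += 1

    for u, v in edges:
        pu, pv = part[u], part[v]
        degree_per_part[pu] += 1
        degree_per_part[pv] += 1
        if pu != pv:  # the edge crosses a part boundary
            edge_cut += 1
            cut_per_part[pu] += 1
            cut_per_part[pv] += 1

    total_degree = sum(degree_per_part)
    return {
        # Assumed objectives: smaller means less communication.
        "edge_cut": edge_cut,
        "max_part_cut": max(cut_per_part),
        # Assumed constraints: 1.0 means perfectly balanced.
        "vertex_imbalance": max(vertices_per_part) * num_parts / len(part),
        "edge_imbalance": max(degree_per_part) * num_parts / total_degree,
    }


# Two triangles joined by one bridge edge, split into two parts.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
metrics = partition_metrics(edges, part, num_parts=2)
print(metrics)
```

In this toy case only the bridge edge is cut and both parts hold three vertices, so both imbalance ratios are 1.0. A partitioner that handles several objectives and constraints at once would try to keep the two cut values low while holding both ratios under a chosen tolerance.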
