Argo: Architecture-aware graph partitioning

The increasing popularity and ubiquity of various large graph datasets has caused renewed interest for graph partitioning. Existing graph partitioners either scale poorly against large graphs or disregard the impact of the underlying hardware topology. A few solutions have shown that the nonuniform network communication costs may affect the performance greatly. However, none of them considers the impact of resource contention on the memory subsystems (e.g., LLC and Memory Controller) of modern multicore clusters. They all neglect the fact that the bandwidth of modern high-speed networks (e.g., Infiniband) has become comparable to that of the memory subsystems. In this paper, we provide an in-depth analysis, both theoretically and experimentally, on the contention issue for distributed workloads. We found that the slowdown caused by the contention can be as high as 11x. We then design an architecture-aware graph partitioner, Argo, to allow the full use of all cores of multicore machines without suffering from either the contention or the communication heterogeneity issue. Our experimental study showed (1) the effectiveness of Argo, achieving up to 12x speedups on three classic workloads: Breadth First Search, Single Source Shortest Path, and PageRank; and (2) the scalability of Argo in terms of both graph size and the number of partitions on two billion-edge real-world graphs.

[1]  Alexandros Labrinidis,et al.  Planar: Parallel lightweight architecture-aware adaptive graph repartitioning , 2016, 2016 IEEE 32nd International Conference on Data Engineering (ICDE).

[2]  Vipin Kumar,et al.  Multilevel Graph Partitioning Schemes , 1995, ICPP.

[3]  Panos Kalnis,et al.  Mizan: a system for dynamic load balancing in large-scale graph processing , 2013, EuroSys '13.

[4]  Margo I. Seltzer,et al.  A Scalable Distributed Graph Partitioner , 2015, Proc. VLDB Endow..

[5]  Yafei Dai,et al.  When computing meets heterogeneous cluster: Workload assignment in graph computation , 2015, 2015 IEEE International Conference on Big Data (Big Data).

[6]  Hamamache Kheddouci,et al.  Fractional greedy and partial restreaming partitioning: New methods for massive graph partitioning , 2014, 2014 IEEE International Conference on Big Data (Big Data).

[7]  Joel Nishimura,et al.  Restreaming graph partitioning: simple versatile algorithms for advanced balancing , 2013, KDD.

[8]  Jonathan W. Berry,et al.  Challenges in Parallel Graph Processing , 2007, Parallel Process. Lett..

[9]  Tamara G. Kolda,et al.  Graph partitioning models for parallel computing , 2000, Parallel Comput..

[10]  Fabio Petroni,et al.  HDRF: Stream-Based Partitioning for Power-Law Graphs , 2015, CIKM.

[11]  Charalampos E. Tsourakakis,et al.  FENNEL: streaming graph partitioning for massive scale graphs , 2014, WSDM.

[12]  Zhihua Zhang,et al.  Distributed Power-law Graph Computing: Theoretical and Empirical Analysis , 2014, NIPS.

[13]  Rishan Chen,et al.  Improving large graph processing on partitioned graphs in the cloud , 2012, SoCC '12.

[14]  Wilfred Ng,et al.  Blogel: A Block-Centric Framework for Distributed Computation on Real-World Graphs , 2014, Proc. VLDB Endow..

[15]  Zi Huang,et al.  Heterogeneous Environment Aware Streaming Graph Partitioning , 2015, IEEE Transactions on Knowledge and Data Engineering.

[16]  Christos Faloutsos,et al.  Power-Hop: A Pervasive Observation for Real Complex Networks , 2016, PloS one.

[17]  Joseph M. Hellerstein,et al.  GraphLab: A New Framework For Parallel Machine Learning , 2010, UAI.

[18]  Robert van Engelen,et al.  Graph Partitioning for High Performance Scienti c Simulations , 2000 .

[19]  Gabriel Kliot,et al.  Streaming graph partitioning for large distributed graphs , 2012, KDD.

[20]  Shirish Tatikonda,et al.  From "Think Like a Vertex" to "Think Like a Graph" , 2013, Proc. VLDB Endow..

[21]  Carsten Binnig,et al.  The End of Slow Networks: It's Time for a Redesign , 2015, Proc. VLDB Endow..

[22]  Guillaume Mercier,et al.  Cache-Efficient, Intranode, Large-Message MPI Communication with MPICH2-Nemesis , 2009, 2009 International Conference on Parallel Processing.

[23]  Kurt Rothermel,et al.  GrapH: Heterogeneity-Aware Graph Computation with Adaptive Partitioning , 2016, 2016 IEEE 36th International Conference on Distributed Computing Systems (ICDCS).

[24]  Sayantan Sur,et al.  RDMA read based rendezvous protocol for MPI over InfiniBand: design alternatives and benefits , 2006, PPoPP '06.

[25]  Michalis Faloutsos,et al.  On power-law relationships of the Internet topology , 1999, SIGCOMM '99.

[26]  Yi Lu,et al.  Large-Scale Distributed Graph Computing Systems: An Experimental Evaluation , 2014, Proc. VLDB Endow..

[27]  Sayantan Sur,et al.  LiMIC: support for high-performance MPI intra-node communication on Linux cluster , 2005, 2005 International Conference on Parallel Processing (ICPP'05).

[28]  Alexandros Labrinidis,et al.  PARAGON: Parallel Architecture-Aware Graph Partition Refinement Algorithm , 2016, EDBT.

[29]  Kamesh Madduri,et al.  Parallel breadth-first search on distributed memory systems , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).

[30]  Aart J. C. Bik,et al.  Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.

[31]  Joseph Gonzalez,et al.  PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs , 2012, OSDI.

[32]  Yogesh L. Simmhan,et al.  GoFFish: A Sub-graph Centric Framework for Large-Scale Graph Analytics , 2013, Euro-Par.

[33]  Rupak Biswas,et al.  Performance impact of resource contention in multicore systems , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).

[34]  Konstantin Andreev,et al.  Balanced Graph Partitioning , 2004, SPAA '04.