Towards Fast Large-scale Graph Analysis via Two-dimensional Balanced Partitioning

Distributed graph systems typically exploit a cluster of machines by partitioning a large graph into multiple smaller subgraphs, so graph partitioning has a significant impact on their performance. However, the partition schemes widely used in practical graph systems achieve good balance in only one dimension, e.g., either the number of vertices or the number of edges, and they may also incur many edge cuts. To address this problem, we develop BPart, which adopts a two-phase partition scheme to achieve two-dimensional balance over both vertices and edges. Its core idea is to first partition the original graph into more small pieces than the cluster scale, and then selectively combine the small pieces into larger subgraphs so that the resulting partition is balanced in both dimensions. We implement BPart in two open-source distributed graph systems, Gemini [58] and KnightKing [57]. Results show that BPart achieves good balance in both dimensions and also significantly reduces the number of edge cuts. As a result, BPart reduces the total running time of various graph applications by 5% to 70% compared to multiple existing partition schemes, e.g., Chunk-V, Chunk-E, Fennel, and Hash.
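The two-phase idea can be illustrated with a minimal Python sketch. It is not BPart's actual algorithm: the contiguous chunking in phase one, the greedy combination rule in phase two, and the names over_partition and combine are all assumptions made for illustration. Each piece is summarized only by its vertex count and edge count (approximated here by the sum of vertex degrees), and edge cuts are not modeled.

```python
from typing import List, Tuple

Piece = Tuple[int, int]  # (vertex_count, edge_count) of one small piece


def over_partition(degrees: List[int], num_pieces: int) -> List[Piece]:
    """Phase 1 (illustrative): split the vertex ID range into contiguous
    chunks with near-equal vertex counts and summarize each chunk."""
    n = len(degrees)
    pieces = []
    for i in range(num_pieces):
        lo = i * n // num_pieces
        hi = (i + 1) * n // num_pieces
        pieces.append((hi - lo, sum(degrees[lo:hi])))
    return pieces


def combine(pieces: List[Piece], k: int):
    """Phase 2 (illustrative): greedily merge pieces into k partitions,
    always choosing the partition whose worse normalized load
    (vertices or edges) stays smallest after adding the piece."""
    total_v = sum(v for v, _ in pieces) or 1
    total_e = sum(e for _, e in pieces) or 1
    loads = [[0, 0] for _ in range(k)]          # [vertices, edges] per partition
    assignment = [[] for _ in range(k)]         # piece indices per partition

    # Place the heaviest pieces first so that small pieces can even things out.
    order = sorted(
        range(len(pieces)),
        key=lambda i: pieces[i][0] / total_v + pieces[i][1] / total_e,
        reverse=True,
    )
    for i in order:
        v, e = pieces[i]
        best = min(
            range(k),
            key=lambda p: max((loads[p][0] + v) / total_v,
                              (loads[p][1] + e) / total_e),
        )
        loads[best][0] += v
        loads[best][1] += e
        assignment[best].append(i)
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical skewed degree sequence: a few hubs, many low-degree vertices.
    degrees = [100, 50, 20, 10, 5, 5, 3, 3, 2, 2, 1, 1, 1, 1, 1, 1]
    pieces = over_partition(degrees, num_pieces=8)   # more pieces than machines
    assignment, loads = combine(pieces, k=2)         # 2 machines
    for p, (members, load) in enumerate(zip(assignment, loads)):
        print(f"partition {p}: pieces={members} vertices={load[0]} edges={load[1]}")
```

Creating more pieces than machines is what gives the second phase room to trade vertex load against edge load; with exactly one piece per machine there would be nothing left to combine.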

[1]  Ping Lu,et al.  Graph Algorithms With Partition Transparency , 2023, IEEE Transactions on Knowledge and Data Engineering.

[2]  Andreas Haeberlen,et al.  Mycelium: Large-Scale Distributed Graph Queries with Differential Privacy , 2021, SOSP.

[3]  Ge Yu,et al.  HBP: Hotness Balanced Partition for Prioritized Iterative Graph Computations , 2020, 2020 IEEE 36th International Conference on Data Engineering (ICDE).

[4]  Xiaosong Ma,et al.  KnightKing: a fast distributed graph random walk engine , 2019, SOSP.

[5]  Wenguang Chen,et al.  LiveGraph , 2019, Proc. VLDB Endow..

[6]  Wentong Cai,et al.  Distributed Edge Partitioning for Trillion-edge Graphs , 2019, Proc. VLDB Endow..

[7]  Grigory Yaroslavtsev,et al.  Multi-Dimensional Balanced Graph Partitioning via Projected Gradient Descent , 2019, Proc. VLDB Endow..

[8]  Binyu Zang,et al.  PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs , 2019, TOPC.

[9]  Philippe Cudré-Mauroux,et al.  Are Meta-Paths Necessary?: Revisiting Heterogeneous Graph Embeddings , 2018, CIKM.

[10]  Vladimir Vlassov,et al.  Streaming Graph Partitioning: An Experimental Study , 2018, Proc. VLDB Endow..

[11]  James Cheng,et al.  G-Miner: an efficient task-oriented graph mining system , 2018, EuroSys.

[12]  Peter Sanders,et al.  High-Quality Shared-Memory Graph Partitioning , 2017, IEEE Transactions on Parallel and Distributed Systems.

[13]  Weimin Zheng,et al.  Squeezing out All the Value of Loaded Data: An Out-of-core Graph Processing System with Reduced Disk I/O , 2017, USENIX Annual Technical Conference.

[14]  H. Howie Huang,et al.  Graphene: Fine-Grained IO Management for Graph Computing , 2017, FAST.

[15]  Arijit Khan,et al.  On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage , 2016, USENIX Annual Technical Conference.

[16]  Wenguang Chen,et al.  Gemini: A Computation-Centric Distributed Graph Processing System , 2016, OSDI.

[17]  Jure Leskovec,et al.  node2vec: Scalable Feature Learning for Networks , 2016, KDD.

[18]  Rajiv Gupta,et al.  Load the Edges You Need: A Generic I/O Optimization for Disk-based Graph Processing , 2016, USENIX Annual Technical Conference.

[19]  Fabio Petroni,et al.  HDRF: Stream-Based Partitioning for Power-Law Graphs , 2015, CIKM.

[20]  Wenguang Chen,et al.  GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning , 2015, USENIX ATC.

[21]  Zhihua Zhang,et al.  Distributed Power-law Graph Computing: Theoretical and Empirical Analysis , 2014, NIPS.

[22]  Reynold Xin,et al.  GraphX: Graph Processing in a Distributed Dataflow Framework , 2014, OSDI.

[23]  Marc Lelarge,et al.  Balanced graph edge partition , 2014, KDD.

[24]  Claudio Martella,et al.  Spinner: Scalable Graph Partitioning in the Cloud , 2014, 2017 IEEE 33rd International Conference on Data Engineering (ICDE).

[25]  Steven Skiena,et al.  DeepWalk: online learning of social representations , 2014, KDD.

[26]  Hong Cheng,et al.  Random-walk domination in large graphs , 2014, 2014 IEEE 30th International Conference on Data Engineering.

[27]  Lu Wang,et al.  How to partition a billion-node graph , 2014, 2014 IEEE 30th International Conference on Data Engineering.

[28]  Charalampos E. Tsourakakis,et al.  FENNEL: streaming graph partitioning for massive scale graphs , 2014, WSDM.

[29]  Willy Zwaenepoel,et al.  X-Stream: edge-centric graph processing using streaming partitions , 2013, SOSP.

[30]  Shirish Tatikonda,et al.  From "Think Like a Vertex" to "Think Like a Graph" , 2013, Proc. VLDB Endow..

[31]  Aapo Kyrola,et al.  DrunkardMob: billions of random walks on just a PC , 2013, RecSys.

[32]  Jennifer Widom,et al.  GPS: a graph processing system , 2013, SSDBM.

[33]  Carlos Guestrin,et al.  GraphChi: Large-Scale Graph Computation on Just a PC , 2012, OSDI.

[34]  Gabriel Kliot,et al.  Streaming graph partitioning for large distributed graphs , 2012, KDD.

[35]  Peter Sanders,et al.  Engineering Multilevel Graph Partitioning Algorithms , 2010, ESA.

[36]  Aart J. C. Bik,et al.  Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.

[37]  Peter Sanders,et al.  Engineering a scalable high quality graph partitioner , 2009, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).

[38]  Andrew V. Goldberg,et al.  Quincy: fair scheduling for distributed computing clusters , 2009, SOSP '09.

[39]  Kesheng Wu,et al.  Fast connected-component labeling , 2009, Pattern Recognit..

[40]  François Pellegrini,et al.  PT-Scotch: A tool for efficient parallel graph ordering , 2008, Parallel Comput..

[41]  Injong Rhee,et al.  Z-MAC: a hybrid MAC for wireless sensor networks , 2005, SenSys '05.

[42]  Dániel Fogaras,et al.  Towards Scaling Fully Personalized PageRank: Algorithms, Lower Bounds, and Experiments , 2005, Internet Math..

[43]  Igor Jurisica,et al.  Modeling interactome: scale-free or geometric? , 2004, Bioinform..

[44]  Jennifer Widom,et al.  SimRank: a measure of structural-context similarity , 2002, KDD.

[45]  Rajeev Motwani,et al.  The PageRank Citation Ranking : Bringing Order to the Web , 1999, WWW 1999.

[46]  Vipin Kumar,et al.  Parallel Multilevel k-Way Partitioning Scheme for Irregular Graphs , 1999, SIAM Rev..

[47]  Raj Jain,et al.  A Quantitative Measure Of Fairness And Discrimination For Resource Allocation In Shared Computer Systems , 1998, ArXiv.

[48]  Paul Hudak,et al.  Memory coherence in shared virtual memory systems , 1989, TOCS.

[49]  A. Hoffman,et al.  Lower bounds for the partitioning of graphs , 1973 .

[50]  Brian W. Kernighan,et al.  An efficient heuristic procedure for partitioning graphs , 1970, Bell Syst. Tech. J..

[51]  Zhengping Qian,et al.  GraphScope: A Unified Engine For Big Graph Processing , 2021, Proc. VLDB Endow..

[52]  John C.S. Lui,et al.  GraphWalker: An I/O-Efficient and Resource-Friendly Graph Analytic System for Fast and Scalable Random Walks , 2020, USENIX Annual Technical Conference.

[53]  N. Metropolis,et al.  An Efficient Heuristic Procedure for Partitioning Graphs , 2017 .

[54]  Carlos Guestrin,et al.  PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs , 2012 .

[55]  R. M. Mattheyses,et al.  A Linear-Time Heuristic for Improving Network Partitions , 1982, 19th Design Automation Conference.