DFOGraph: an I/O- and communication-efficient system for distributed fully-out-of-core graph processing

With the magnitude of graph-structured data continually increasing, graph processing systems that can both scale out and scale up are needed to handle extreme-scale datasets. While existing distributed out-of-core solutions have made this possible, they suffer from limited performance due to excessive I/O and communication costs. We present DFOGraph, a distributed fully-out-of-core graph processing system that combines multiple techniques to enable I/O- and communication-efficient processing. DFOGraph builds upon a two-level column-oriented partition with adaptive compressed representations to allow fine-grained selective computation and communication, so that it issues only the necessary disk and network requests. Our evaluation shows that DFOGraph achieves performance comparable to GridGraph and FlashGraph (>2.52× and 1.06×, respectively) on a single machine and significantly outperforms Chaos and HybridGraph (>12.94× and >10.82×, respectively) when scaling out.
