DH-Falcon: A Language for Large-Scale Graph Processing on Distributed Heterogeneous Systems

Graph models of social information systems typically contain trillions of edges. Such big graphs cannot be processed on a single machine. The graph object must be partitioned and distributed among machines and processed in parallel on a computer cluster. Programming such systems is very challenging. In this work, we present DH-Falcon, a graph DSL (domain-specific language) which can be used to implement parallel algorithms for large-scale graphs, targeting Distributed Heterogeneous (CPU and GPU) clusters. The DH-Falcon compiler is built on top of the Falcon compiler, which targets single-node devices with a CPU and multiple GPUs. An important facility provided by DH-Falcon is that it supports mutation of graph objects, which allows programmers to write dynamic graph algorithms. Experimental evaluation shows that DH-Falcon matches or outperforms state-of-the-art frameworks and gains a speedup of up to 13×.
