Grafs: declarative graph analytics

Graph analytics elicits insights from large graphs to inform critical decisions for business, safety and security. Several large-scale graph processing frameworks feature efficient runtime systems; however, they often provide programming models that are low-level and subtly different from each other. As a result, end users can find the implementation, and especially the optimization, of graph analytics error-prone and time-consuming. This paper regards the abstract interface of graph processing frameworks as the instruction set for graph analytics, and presents Grafs, a high-level declarative specification language for graph analytics together with a synthesizer that automatically generates efficient code for five high-performance graph processing frameworks. Grafs features novel semantics-preserving fusion transformations that optimize the specifications and reduce them to three primitives: reduction over paths, mapping over vertices and reduction over vertices. Reductions over paths are commonly calculated using push or pull models that iteratively apply kernel functions at the vertices. This paper presents conditions, parametric in the kernel functions, for the correctness and termination of the iterative models, and uses these conditions as specifications to automatically synthesize the kernel functions. Experimental results show that the generated code matches or outperforms handwritten code, and that fusion accelerates execution.
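To make the pull model concrete, the following is a minimal sketch of a reduction over paths computed by iterating kernel functions at the vertices until a fixpoint is reached. It is an illustration only, not Grafs syntax or the code the synthesizer generates: the names `pull_iterate`, `init`, `propagate`, `reduce_fn` and `shortest_paths` are hypothetical, and shortest-path distance is assumed as the example reduction.

```python
import math
from typing import Callable, List, Tuple

Edge = Tuple[int, int, float]  # (source, destination, weight)


def pull_iterate(
    num_vertices: int,
    edges: List[Edge],
    init: Callable[[int], float],
    propagate: Callable[[float, float], float],
    reduce_fn: Callable[[float, float], float],
) -> List[float]:
    """Pull model: each vertex repeatedly reduces over values pulled from its
    in-neighbours until no vertex value changes (hypothetical kernel names)."""
    # Index incoming edges by destination so each vertex can pull from them.
    in_edges: List[List[Tuple[int, float]]] = [[] for _ in range(num_vertices)]
    for src, dst, weight in edges:
        in_edges[dst].append((src, weight))

    values = [init(v) for v in range(num_vertices)]
    changed = True
    while changed:
        changed = False
        new_values = list(values)
        for v in range(num_vertices):
            acc = values[v]
            for src, weight in in_edges[v]:
                # Extend the predecessor's value along the edge, then reduce.
                acc = reduce_fn(acc, propagate(values[src], weight))
            if acc != values[v]:
                new_values[v] = acc
                changed = True
        values = new_values
    return values


def shortest_paths(num_vertices: int, edges: List[Edge], source: int) -> List[float]:
    """Assumed example: shortest-path distance as a min-reduction over paths."""
    return pull_iterate(
        num_vertices,
        edges,
        init=lambda v: 0.0 if v == source else math.inf,
        propagate=lambda dist, weight: dist + weight,
        reduce_fn=min,
    )
```

In this sketch, termination depends on the kernel functions: with `min` as the reduction and non-negative edge weights, values only decrease and the loop reaches a fixpoint, whereas a negative-weight cycle would make it run forever. Conditions of this kind are what the paper states parametrically in the kernel functions.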
