Language and hardware acceleration backend for graph processing

Graphs are important in many applications; however, their analysis on conventional computer architectures is generally inefficient because traversing vertices and edges involves highly irregular memory access. For example, when finding a path from a source vertex to a target vertex, performance is typically limited by the memory bottleneck, whereas the actual computation is trivial. This paper presents a methodology for embedding graphs into silicon, where graph vertices become finite state machines that communicate via the graph edges. With this approach, many common graph analysis tasks can be performed by propagating signals through the physical graph and measuring the signal propagation time using the on-chip clock distribution network. This eliminates the memory bottleneck and allows thousands of vertices to be processed in parallel. We present a domain-specific language for graph description and transformation, and demonstrate how it can be used to translate application graphs onto an FPGA board, where they can be analysed up to 1000× faster than on a conventional computer.
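
The idea of analysing a graph by timing signal propagation can be illustrated in software. The following is a minimal sketch, not the hardware design or the domain-specific language described here: it assumes a synchronous model in which every vertex that has received the signal forwards it to all of its neighbours on the next clock tick, so the tick at which a vertex is first reached equals its unweighted distance from the source. The function name `propagate`, the adjacency-dictionary representation, and the example graph are hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, Iterable

# Assumed representation: each vertex maps to the vertices it can signal.
Graph = Dict[Hashable, Iterable[Hashable]]


def propagate(graph: Graph, source: Hashable) -> Dict[Hashable, int]:
    """Return the clock tick at which the signal first reaches each vertex.

    Vertices that the signal never reaches are absent from the result.
    """
    arrival = {source: 0}   # the source is signalled at tick 0
    frontier = {source}     # vertices that fire on the next tick
    tick = 0
    while frontier:
        tick += 1
        next_frontier = set()
        # Every vertex in the frontier fires within the same tick; this loop
        # stands in for the vertices operating in parallel in hardware.
        for v in frontier:
            for w in graph.get(v, ()):
                if w not in arrival:        # only the first arrival counts
                    arrival[w] = tick
                    next_frontier.add(w)
        frontier = next_frontier
    return arrival


# Hypothetical example graph: two routes from "a" to "d", then on to "e";
# "z" is not reachable from "a".
example = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": [], "z": ["a"]}
times = propagate(example, "a")
# times == {"a": 0, "b": 1, "c": 1, "d": 2, "e": 3}
```

In this software model the cost per tick grows with the size of the frontier, and every neighbour lookup is a memory access. In the approach described above, the frontier vertices are physical state machines that fire simultaneously, so the measured propagation time itself is the result.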
