High Performance and Scalable Graph Computation on GPUs

The high compute power of the many-threaded SIMT model of Graphics Processing Units (GPUs), together with recent advances in their programmability, has made it possible to express massively parallel computations. Graph processing is one application that exposes such parallelism, which makes GPUs attractive execution platforms for it. However, irregularities in large real-world graphs make effective and scalable use of the symmetric GPU architecture a challenging task. The degree distribution of graphs drawn from real-world sources usually follows a power law, whereas GPUs demand homogeneous computation patterns on consecutive data elements. This article summarizes recent research advances that address this challenge. We first give an overview of the main concepts in graph processing on GPUs. Then, we introduce novel graph representations that, unlike conventional storage formats, are a better match for GPUs. We then present a GPU-friendly decomposition scheme that provides balanced thread-to-task assignment and improves both scalability and execution performance. Finally, we discuss a set of techniques that allow the computation to scale efficiently over multiple GPUs.
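The mismatch between power-law degree distributions and lockstep SIMT execution can be illustrated with a small host-side sketch. It is not taken from the article: it assumes a conventional CSR (compressed sparse row) layout and a naive one-thread-per-vertex mapping, and the helper names `build_csr` and `simd_efficiency` are hypothetical. The model is deliberately crude: a group of `warp_size` consecutive vertices is assumed to take as many steps as its highest-degree vertex, so lanes assigned to low-degree vertices sit idle.

```python
def build_csr(num_vertices, edges):
    """Build a CSR layout (row offsets, column indices) from (src, dst) pairs."""
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + degree[v]
    cols = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot for each vertex
    for src, dst in edges:
        cols[cursor[src]] = dst
        cursor[src] += 1
    return offsets, cols


def simd_efficiency(offsets, warp_size=32):
    """Fraction of lane-steps doing useful work under a one-thread-per-vertex mapping.

    Assumption: each group of `warp_size` consecutive vertices runs in lockstep
    for as many steps as the largest degree in the group.
    """
    num_vertices = len(offsets) - 1
    useful = 0
    issued = 0
    for start in range(0, num_vertices, warp_size):
        group = range(start, min(start + warp_size, num_vertices))
        degrees = [offsets[v + 1] - offsets[v] for v in group]
        useful += sum(degrees)
        issued += warp_size * max(degrees)
    return useful / issued if issued else 1.0


# Star graph: vertex 0 is a hub connected to vertices 1..7 in both directions.
edges = [(0, d) for d in range(1, 8)] + [(s, 0) for s in range(1, 8)]
offsets, cols = build_csr(8, edges)
print(simd_efficiency(offsets, warp_size=4))  # 0.4375
```

In this toy graph the single hub forces the three other lanes in its group to idle for six of seven steps, so fewer than half of the issued lane-steps do useful work. Real power-law graphs contain far more extreme hubs, which is the kind of imbalance that GPU-oriented representations and decomposition schemes aim to reduce.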
