Parallel Graph Processing on Graphics Processors Made Easy

This paper demonstrates Medusa, a programming framework for parallel graph processing on graphics processors (GPUs). With Medusa, developers exploit the massive parallelism and other hardware features of GPUs by writing sequential C/C++ code for a small set of APIs, which simplifies the implementation of parallel graph processing on the GPU. The Medusa runtime system automatically executes the user-defined APIs in parallel on the GPU and applies a series of graph-centric optimizations based on the architectural features of GPUs. We will demonstrate the steps of developing GPU-based graph processing algorithms with Medusa, and the performance of Medusa on both real-world and synthetic datasets.
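
The division of labor described above (sequential user code, parallel execution by the runtime) can be illustrated with a minimal C++ sketch. It is not Medusa's actual API: `Graph`, `SendRank`, `UpdateRank`, and `run_superstep` are hypothetical names chosen for illustration, and the plain loops in `run_superstep` are a CPU stand-in for what a GPU runtime would dispatch as one thread per edge or vertex. The example performs one PageRank-style step.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical graph layout (an assumption, not Medusa's data structures):
// compressed sparse row (CSR) adjacency.
struct Graph {
    std::vector<std::size_t> row_offsets;  // size = number of vertices + 1
    std::vector<std::size_t> targets;      // destination vertex of each edge
    std::size_t num_vertices() const { return row_offsets.size() - 1; }
};

// User-defined, sequential per-edge logic: the value one edge sends
// from its source vertex to its destination vertex.
struct SendRank {
    float operator()(float src_rank, std::size_t src_out_degree) const {
        return src_rank / static_cast<float>(src_out_degree);
    }
};

// User-defined, sequential per-vertex logic: combine the summed incoming
// messages into the vertex's new rank.
struct UpdateRank {
    float damping;
    float operator()(float message_sum, std::size_t n) const {
        return (1.0f - damping) / static_cast<float>(n) + damping * message_sum;
    }
};

// Stand-in for the runtime. The user functions above contain no parallel
// code; here they are applied in ordinary loops. On a GPU, each loop
// iteration would map to a thread, and the accumulation into `inbox`
// would need atomics or a reduction to stay correct.
template <class EdgeFn, class VertexFn>
std::vector<float> run_superstep(const Graph& g,
                                 const std::vector<float>& rank,
                                 EdgeFn edge_fn, VertexFn vertex_fn) {
    const std::size_t n = g.num_vertices();
    assert(rank.size() == n);

    // Edge phase: every edge delivers a message to its destination.
    std::vector<float> inbox(n, 0.0f);
    for (std::size_t v = 0; v < n; ++v) {
        const std::size_t begin = g.row_offsets[v];
        const std::size_t end = g.row_offsets[v + 1];
        for (std::size_t e = begin; e < end; ++e) {
            inbox[g.targets[e]] += edge_fn(rank[v], end - begin);
        }
    }

    // Vertex phase: every vertex updates its own state.
    std::vector<float> next(n);
    for (std::size_t v = 0; v < n; ++v) {
        next[v] = vertex_fn(inbox[v], n);
    }
    return next;
}
```

Calling `run_superstep(g, rank, SendRank{}, UpdateRank{0.85f})` repeatedly would iterate toward a PageRank-style fixed point. The point of the sketch is only that the user-supplied functors stay sequential while the traversal strategy lives entirely in the runtime.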
