Efficient Parallel Implementation of Single Source Shortest Path Algorithm on GPU Using CUDA

Many applications, such as routing in telephone networks, traveller information systems, and robotic path selection, represent their data as a graph and run graph algorithms on it to meet their requirements. The data behind these applications grows every day, yet quick, real-time responses are still expected, so performance is a critical limiting factor. Parallel implementation is the most widely used method of improving performance for most algorithms. This paper proposes two improved, more efficient versions of a constraint-based single source shortest path (SSSP) algorithm for Graphics Processing Unit (GPU) based machines using CUDA. The first implementation creates one CUDA thread for each node of the graph, and the second creates one CUDA thread for each edge. The inconsistencies present in both implementations, and their effects, are analysed in detail. Results of the proposed implementations are compared with a previously implemented constraint-based parallel Bellman-Ford SSSP algorithm on a GPU, which shows the best results among the previous parallel GPU-based SSSP algorithms. Nvidia's Tesla C2075 and GeForce GTS 450 GPUs are used to run the parallel implementations of the evaluated algorithms. The proposed implementation obtains a 600-fold speedup over a simple parallel Bellman-Ford algorithm and a 2.6 times performance gain over the previously implemented constraint-based parallel SSSP algorithm on a GPU.
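The two thread mappings described above can be illustrated with a minimal sketch. It is a sequential Python emulation of Bellman-Ford relaxation, not the paper's CUDA code: each loop iteration stands in for one CUDA thread, the function names and the sample graph are hypothetical, and the constraint-based refinement is omitted because the abstract does not spell it out. On a real GPU, concurrent writes to the same distance entry would presumably need an atomic minimum or a similar safeguard, which may relate to the inconsistencies the abstract mentions.

```python
import math

# Hypothetical sample graph: (source_node, target_node, weight) triples.
EDGES = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1), (2, 3, 5)]


def build_adjacency(num_nodes, edges):
    """Group outgoing edges by source node for the node-per-thread mapping."""
    adjacency = [[] for _ in range(num_nodes)]
    for u, v, w in edges:
        adjacency[u].append((v, w))
    return adjacency


def sssp_edge_per_thread(num_nodes, edges, source):
    """Bellman-Ford where each emulated thread relaxes exactly one edge."""
    dist = [math.inf] * num_nodes
    dist[source] = 0
    for _ in range(num_nodes - 1):      # at most |V| - 1 relaxation rounds
        changed = False
        for u, v, w in edges:           # one iteration ~ one CUDA thread
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w   # assumption: atomic min on a GPU
                changed = True
        if not changed:                 # no update in a round: converged
            break
    return dist


def sssp_node_per_thread(num_nodes, adjacency, source):
    """Bellman-Ford where each emulated thread relaxes one node's out-edges."""
    dist = [math.inf] * num_nodes
    dist[source] = 0
    for _ in range(num_nodes - 1):
        changed = False
        for u in range(num_nodes):      # one iteration ~ one CUDA thread
            if dist[u] == math.inf:     # unreached nodes have nothing to relax
                continue
            for v, w in adjacency[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
                    changed = True
        if not changed:
            break
    return dist


if __name__ == "__main__":
    print(sssp_edge_per_thread(4, EDGES, 0))
    print(sssp_node_per_thread(4, build_adjacency(4, EDGES), 0))
```

In the node mapping, the work per thread varies with a node's out-degree, while the edge mapping gives every thread a single relaxation; which one performs better on a given graph is a question the sketch does not answer.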
