Scalable Single Source Shortest Path Algorithms for Massively Parallel Systems

In the single-source shortest path (SSSP) problem, we have to find the shortest paths from a source vertex v to all other vertices in a graph. In this paper, we introduce a novel parallel algorithm, derived from the Bellman-Ford and Delta-stepping algorithms. We employ various pruning techniques, such as edge classification and direction-optimization, to dramatically reduce inter-node communication traffic, and we propose load balancing strategies to handle higher-degree vertices. The extensive performance analysis shows that our algorithms work well on scale-free and real-world graphs. In the largest tested configuration, an R-MAT graph with 238 vertices and 242 edges on 32,768 Blue Gene/Q nodes, we have achieved a processing rate of three Trillion Edges Per Second (TTEPS), a four orders of magnitude improvement over the best published results.

[1]  Michael Gschwind,et al.  The IBM Blue Gene/Q Compute Chip , 2012, IEEE Micro.

[2]  Peter Sanders,et al.  [Delta]-stepping: a parallelizable shortest path algorithm , 2003, J. Algorithms.

[3]  Éva Tardos,et al.  Algorithm design , 2005 .

[4]  U. Brandes A faster algorithm for betweenness centrality , 2001 .

[5]  David A. Bader,et al.  An Experimental Study of A Parallel Shortest Path Algorithm for Solving Large-Scale Graph Instances , 2007, ALENEX.

[6]  Philip Heidelberger,et al.  The IBM Blue Gene/Q interconnection network and message unit , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).

[7]  Dorothea Wagner,et al.  Geometric Speed-Up Techniques for Finding Shortest Paths in Large Sparse Graphs , 2003, ESA.

[8]  Joseph Gonzalez,et al.  PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs , 2012, OSDI.

[9]  Burkhard D. Steinmacher-Burow,et al.  The IBM Blue Gene/Q Interconnection Fabric , 2012, IEEE Micro.

[10]  L. R. Ford,et al.  NETWORK FLOW THEORY , 1956 .

[11]  Robert E. Tarjan,et al.  Relaxed heaps: an alternative to Fibonacci heaps with applications to parallel computation , 1988, CACM.

[12]  Fabio Checconi,et al.  Breaking the speed and scalability Barriers for Graph exploration on distributed-memory machines , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.

[13]  Christos Faloutsos,et al.  R-MAT: A Recursive Model for Graph Mining , 2004, SDM.

[14]  Richard Bellman,et al.  ON A ROUTING PROBLEM , 1958 .

[15]  Fabio Checconi,et al.  Traversing Trillions of Edges in Real Time: Graph Exploration on Large-Scale Parallel Machines , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.

[16]  Andrew Lumsdaine,et al.  Single-Source Shortest Paths with the Parallel Boost Graph Library , 2006, The Shortest Path Problem.

[17]  P. J. Narayanan,et al.  Accelerating Large Graph Algorithms on the GPU Using CUDA , 2007, HiPC.

[18]  Clyde P. Kruskal,et al.  Parallel Algorithms for Shortest Path Problems , 1985, ICPP.

[19]  Ulrich Meyer,et al.  [Delta]-stepping: a parallelizable shortest path algorithm , 2003, J. Algorithms.

[20]  Leonard M. Freeman,et al.  A set of measures of centrality based upon betweenness , 1977 .

[21]  Andrew V. Goldberg,et al.  PHAST: Hardware-Accelerated Shortest Path Trees , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.

[22]  Steven J. Plimpton,et al.  MapReduce in MPI for Large-scale graph algorithms , 2011, Parallel Comput..

[23]  F. Glover,et al.  A computational analysis of alternative algorithms and labeling techniques for finding shortest path trees , 1979, Networks.

[24]  Edsger W. Dijkstra,et al.  A note on two problems in connexion with graphs , 1959, Numerische Mathematik.

[25]  Peter Sanders,et al.  Contraction Hierarchies: Faster and Simpler Hierarchical Routing in Road Networks , 2008, WEA.

[26]  David A. Bader,et al.  Designing Multithreaded Algorithms for Breadth-First Search and st-connectivity on the Cray MTA-2 , 2006, 2006 International Conference on Parallel Processing (ICPP'06).

[27]  Sergei Vassilvitskii,et al.  Densest Subgraph in Streaming and MapReduce , 2012, Proc. VLDB Endow..

[28]  Jesper Larsson Träff,et al.  A parallel priority data structure with applications , 1997, Proceedings 11th International Parallel Processing Symposium.

[29]  David A. Patterson,et al.  Direction-optimizing breadth-first search , 2012, HiPC 2012.

[30]  Christos Faloutsos,et al.  Kronecker Graphs: An Approach to Modeling Networks , 2008, J. Mach. Learn. Res..

[31]  Jeffrey Xu Yu,et al.  Querying k-truss community in large and dynamic graphs , 2014, SIGMOD Conference.

[32]  Mikkel Thorup,et al.  Undirected single-source shortest paths with positive integer weights in linear time , 1999, JACM.

[33]  Nectarios Koziris,et al.  Employing Transactional Memory and Helper Threads to Speedup Dijkstra's Algorithm , 2009, 2009 International Conference on Parallel Processing.

[34]  Jens Vygen,et al.  A generalization of Dijkstra's shortest path algorithm with applications to VLSI routing , 2009, J. Discrete Algorithms.

[35]  Robert E. Tarjan,et al.  Fibonacci heaps and their uses in improved network optimization algorithms , 1984, JACM.

[36]  Yijie Han,et al.  Efficient parallel algorithms for computing all pair shortest paths in directed graphs , 1992, SPAA '92.