Employing Transactional Memory and Helper Threads to Speedup Dijkstra's Algorithm

In this paper we work on the parallelization of the inherently serial Dijkstra's algorithm on modern multicore platforms. Dijkstra's algorithm is a greedy algorithm that computes Single Source Shortest Paths for graphs with non-negative edge weights and is based on the iterative extraction of nodes from a priority queue. This property limits the explicit parallelism of the algorithm, and any attempt to exploit the remaining parallelism results in significant slowdowns due to synchronization overheads. To deal with these problems, we employ the concept of Helper Threads (HT) to extract parallelism in a non-traditional fashion and Transactional Memory (TM) to efficiently orchestrate the concurrent threads' accesses to shared data structures. Results demonstrate that the proposed implementation achieves performance speedups of up to 1.84 with 14 threads, indicating that the two paradigms can be efficiently combined.
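To illustrate why the iterative extraction of nodes from a priority queue serializes the algorithm, the following is a minimal sketch of a textbook serial Dijkstra in Python. It is not the implementation evaluated in this paper, and it does not show the Helper Thread or Transactional Memory scheme; the function name `dijkstra`, the adjacency-list layout, and `example_graph` are assumptions made for illustration only.

```python
import heapq


def dijkstra(graph, source):
    """Serial single-source shortest paths (illustrative sketch).

    Assumed input layout: `graph` maps every node to a list of
    (neighbor, weight) pairs, with all weights non-negative.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    queue = [(0, source)]  # min-heap of (tentative distance, node)

    while queue:
        # Each iteration depends on the previous one: the next node to
        # settle is only known after the earlier relaxations are done.
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry; u was already settled

        # Relaxing the outgoing edges of u is the only part that could
        # run concurrently, and it updates shared state (dist, queue).
        for v, w in graph[u]:
            candidate = d + w
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(queue, (candidate, v))

    return dist


# Hypothetical four-node graph used only to exercise the sketch.
example_graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
```

Calling `dijkstra(example_graph, "a")` settles the nodes in order of increasing distance. The inner relaxation loop touches the shared distance table and queue, which is where concurrent threads would need synchronization.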
