Polynomial-Time Algorithms for Enforcing Sequential Consistency in SPMD Programs with Arrays

The simplest semantics for parallel shared-memory programs is sequential consistency, in which memory operations appear to take place in the order specified by the program. However, many compiler optimizations and hardware features explicitly reorder memory operations or overlap them, which may violate this constraint. To ensure sequential consistency while still allowing these optimizations, traditional data dependence analysis is augmented with a parallel analysis called cycle detection. In this paper, we present new algorithms to enforce sequential consistency for the special case of the Single Program Multiple Data (SPMD) model of parallelism. First, we present an algorithm for the basic cycle detection problem that lowers the running time from O(n^3) to O(n^2). Next, we present three polynomial-time methods that more accurately support programs with array accesses. These results are a step toward making sequentially consistent shared-memory programming a practical model across a wide range of languages and hardware platforms.
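To make the idea of cycle detection concrete, here is a minimal sketch in Python. It is not the algorithm from this paper, and it does not reproduce the stated running times. It assumes a simplified setting: each memory access is a named node, `program_order` holds directed edges between consecutive accesses of one thread, and `conflicts` holds undirected edges between accesses in different threads that touch the same location, at least one of them a write. The function name `delay_edges` and the access names are hypothetical. A program-order edge (u, v) is reported when some path leads from v back to u, which is a conservative approximation of the condition under which the two accesses must not be reordered.

```python
from collections import defaultdict


def delay_edges(program_order, conflicts):
    """Return program-order edges (u, v) that lie on a cycle.

    Simplified, conservative sketch (an assumption, not the paper's method):
    an edge (u, v) is reported when v can reach u in the graph made of
    directed program-order edges and undirected conflict edges.
    """
    graph = defaultdict(set)
    for u, v in program_order:
        graph[u].add(v)          # program order is directed
    for a, b in conflicts:
        graph[a].add(b)          # conflicts can be traversed either way
        graph[b].add(a)

    def reachable(src, dst):
        # Iterative depth-first search from src looking for dst.
        seen, stack = {src}, [src]
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return sorted((u, v) for u, v in program_order if reachable(v, u))


# Hypothetical two-thread example: a producer writes data and then a flag,
# a consumer reads the flag and then the data.
program_order = [("w_data", "w_flag"), ("r_flag", "r_data")]
conflicts = [("w_data", "r_data"), ("w_flag", "r_flag")]
print(delay_edges(program_order, conflicts))
```

In this example both program-order edges lie on a cycle, so neither pair of accesses may be reordered. If the flag accesses did not conflict, no cycle would exist and the sketch would report no delays. The sketch runs one graph search per program-order edge, so it is meant only to illustrate the concept.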
