Establishing Causality as a Desideratum for Memory Models and Transformations of Parallel Programs

In this paper, we establish a notion of causality that should serve as a desideratum for memory models and code transformations of parallel programs. We introduce the Causal Acyclic Consistency (CAC) model, which is weak enough to allow various useful code transformations yet strong enough to prevent any execution that exhibits the “causal cycles” that may arise under the Java Memory Model (JMM) [18]. For memory models, we introduce a graph model, called the causality graph, that can be used to analyze whether a particular program execution violates causality. Using the causality graph, we show that a popular memory model such as the JMM can lead to program executions that exhibit causality violations with respect to our notion of causality. For code transformations, we establish criteria to identify causality-preserving transformations, that is, transformations that do not result in any execution exhibiting a causality violation. We show that the CAC model allows all causality-preserving transformations. Finally, we present preliminary experimental results for a load elimination optimization to motivate the performance benefit of the CAC model relative to the Sequential Consistency (SC) model, the most basic memory model. For the benchmark program studied, using the CAC model instead of the SC model reduced the number of getfield operations performed by 37.9% and the execution time on a 16-core processor by 46.2%.
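To make the idea of checking an execution for a causal cycle concrete, here is a minimal Java sketch. It is not the paper's definition of the causality graph: it assumes, for illustration only, that operations are nodes, that a directed edge runs from a cause to its effect, and that a causality violation shows up as a cycle. The class name `CausalityGraphSketch`, its methods, and the edge labels are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: nodes are operations, edges point from cause to effect. */
public class CausalityGraphSketch {
    private final Map<String, List<String>> edges = new HashMap<>();

    /** Records that operation `cause` causally precedes operation `effect`. */
    public void addEdge(String cause, String effect) {
        edges.computeIfAbsent(cause, k -> new ArrayList<>()).add(effect);
        edges.computeIfAbsent(effect, k -> new ArrayList<>());
    }

    /** Returns true if the graph contains a directed cycle (depth-first search). */
    public boolean hasCycle() {
        Set<String> done = new HashSet<>();    // nodes fully explored
        Set<String> onPath = new HashSet<>();  // nodes on the current DFS path
        for (String node : edges.keySet()) {
            if (visit(node, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean visit(String node, Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;   // reached a node already on the current path: cycle
        }
        if (done.contains(node)) {
            return false;  // already explored, no cycle through this node
        }
        onPath.add(node);
        for (String next : edges.get(node)) {
            if (visit(next, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        // Assumed example execution (x and y initially 0):
        //   Thread 1: r1 = x; y = r1;      Thread 2: r2 = y; x = r2;
        // in which each read observes the value written by the other thread.
        CausalityGraphSketch g = new CausalityGraphSketch();
        g.addEdge("T1: r1 = x", "T1: y = r1");  // write depends on the read
        g.addEdge("T1: y = r1", "T2: r2 = y");  // read observes the write
        g.addEdge("T2: r2 = y", "T2: x = r2");  // write depends on the read
        g.addEdge("T2: x = r2", "T1: r1 = x");  // read observes the write
        System.out.println(g.hasCycle());       // prints "true"
    }
}
```

In this sketch, each operation is justified only by another operation in the same loop, which is the kind of "causal cycle" the abstract refers to. Which edges the paper's causality graph actually contains is not specified here, so the edge choices above are illustrative.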

[1]  Vivek Sarkar,et al.  Location Consistency-A New Memory Model and Cache Consistency Protocol , 2000, IEEE Trans. Computers.

[2]  Sarita V. Adve,et al.  Shared Memory Consistency Models: A Tutorial , 1996, Computer.

[3]  Gil Neiger,et al.  Causal memory: definitions, implementation, and programming , 1995, Distributed Computing.

[4]  Vivek Sarkar,et al.  Interprocedural Load Elimination for Dynamic Optimization of Parallel Programs , 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.

[5]  Mark D. Hill,et al.  A Unified Formalization of Four Shared-Memory Models , 1993, IEEE Trans. Parallel Distributed Syst.

[6]  Mark D. Hill,et al.  Weak ordering—a new definition , 1998, ISCA '98.

[7]  Vivek Sarkar,et al.  X10: an object-oriented approach to non-uniform cluster computing , 2005, OOPSLA '05.

[8]  Vivek Sarkar,et al.  On the Importance of an End-To-End View of Memory Consistency in Future Computer Systems , 1997, ISHPC.

[9]  Jeremy Manson,et al.  The Java memory model , 2005, POPL '05.

[10]  Ken Kennedy,et al.  Optimizing Compilers for Modern Architectures: A Dependence-based Approach , 2001.

[11]  Leslie Lamport,et al.  How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs , 2016, IEEE Transactions on Computers.

[12]  Hans-Juergen Boehm,et al.  Foundations of the C++ concurrency memory model , 2008, PLDI '08.

[13]  Sarita V. Adve,et al.  Weak ordering - a new definition , 1990.

[14]  Anoop Gupta,et al.  Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, Proceedings of the 17th Annual International Symposium on Computer Architecture.

[15]  Radha Jagadeesan,et al.  A theory of memory models , 2007, PPOPP.

[16]  David A. Padua,et al.  Basic compiler algorithms for parallel programs , 1999, PPoPP '99.

[17]  Bronis R. de Supinski,et al.  Complete Formal Specification of the OpenMP Memory Model , 2007, International Journal of Parallel Programming.

[18]  Daniel H. Linder,et al.  Access Graphs: A Model for Investigating Memory Consistency , 1994, IEEE Trans. Parallel Distributed Syst.

[19]  Edsger W. Dijkstra,et al.  Cooperating sequential processes , 2002.

[20]  Dennis Shasha,et al.  Efficient and correct execution of parallel programs that share memory , 1988, TOPL.

[21]  Anoop Gupta,et al.  Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, ISCA '90.