Leveraging access mode declarations in a model for memory consistency in heterogeneous systems

Abstract: On a system that exposes disjoint memory spaces to software, a program must address memory consistency issues and perform data transfers so that it always accesses valid data. Several approaches exist to ensure the consistency of the accessed memory. Here we are interested in verifying a declarative approach in which each component of a computation is annotated with an access mode declaring which part of the memory the component reads or writes. The programming framework uses these annotations to guarantee the validity of memory accesses. This is the mechanism used in VectorPU, a C++ library for programming CPU-GPU heterogeneous systems. This article proves the correctness of the software cache-coherence mechanism used in VectorPU. Beyond the scope of VectorPU, it provides a simple and effective formalization of memory consistency mechanisms based on the explicit declaration of the effect of each component on each memory space. The proposed formalism also covers arrays for which a single validity status is stored for the whole array; additional mechanisms for dealing with overlapping arrays are also studied.
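To make the declarative approach concrete, the following is a minimal sketch of a coherence mechanism driven by access mode declarations. It is an illustration only, not the VectorPU API: the names `CoherentArray`, `Space`, `Mode`, and `acquire` are hypothetical, the two memory spaces are simulated by two host-side vectors, and a vector copy stands in for a CPU-GPU transfer. As in the abstract, a single validity status is kept per memory space for the whole array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not the VectorPU API.
enum class Space { Host, Device };
enum class Mode { Read, Write, ReadWrite };

class CoherentArray {
public:
    // Initially only the host copy is considered valid.
    explicit CoherentArray(std::size_t n) : host_(n, 0), device_(n, 0) {}

    // A component declares the space it runs in and its access mode.
    // The returned copy is guaranteed valid for the declared access.
    std::vector<int>& acquire(Space s, Mode m) {
        const bool on_host = (s == Space::Host);
        std::vector<int>& here  = on_host ? host_ : device_;
        std::vector<int>& other = on_host ? device_ : host_;
        bool& valid_here  = on_host ? host_valid_ : device_valid_;
        bool& valid_other = on_host ? device_valid_ : host_valid_;

        // Reading an invalid copy requires a transfer from the other space.
        if (m != Mode::Write && !valid_here) {
            assert(valid_other);  // invariant: at least one copy is valid
            here = other;         // stands in for a host/device transfer
            ++transfers_;
            valid_here = true;
        }
        // Writing makes this copy the only valid one. Write mode assumes
        // the component overwrites the whole array.
        if (m != Mode::Read) {
            valid_here = true;
            valid_other = false;
        }
        return here;
    }

    bool valid(Space s) const {
        return s == Space::Host ? host_valid_ : device_valid_;
    }
    int transfers() const { return transfers_; }

private:
    std::vector<int> host_, device_;
    bool host_valid_ = true;
    bool device_valid_ = false;
    int transfers_ = 0;
};
```

In this sketch, a host component that acquires the array in `Write` mode triggers no transfer, a subsequent device component in `ReadWrite` mode triggers one transfer and invalidates the host copy, and a later `Read` on a copy that is already valid triggers none. The property the article is concerned with corresponds here to the guarantee that `acquire` never returns an invalid copy to a component that declared a read.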
