PipeCheck: Specifying and Verifying Microarchitectural Enforcement of Memory Consistency Models

We present PipeCheck, a methodology and automated tool for verifying that a particular microarchitecture correctly implements the consistency model required by its architectural specification. PipeCheck adapts the notion of a "happens before" graph from architecture-level analysis techniques to the microarchitecture space. Each node in the "microarchitecturally happens before" (μhb) graph represents not only a memory instruction, but also a particular location (e.g., pipeline stage) within the microarchitecture. Architectural specifications such as "preserved program order" are then treated as propositions to be verified, rather than simply as assumptions. PipeCheck allows an architect to easily and rigorously test whether a microarchitecture is stronger than, equal in strength to, or weaker than its architecturally specified consistency model. We also specify and analyze the behavior of common microarchitectural optimizations, such as speculative load reordering, which technically violate formal architecture-level definitions. We evaluate PipeCheck using a library of established litmus tests on a set of open-source pipelines. Using PipeCheck, we were able to validate the largest pipeline, the OpenSPARC T2, in just minutes. We also identified a bug in the O3 pipeline of the gem5 simulator.
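The μhb graph idea can be illustrated with a small sketch. The code below is not PipeCheck's input format or implementation; the two-stage pipeline (`Issue`, `Memory`), the instruction labels `i1` to `i4`, and the helper names `has_cycle` and `mp_edges` are all assumptions made for illustration. It encodes the message-passing litmus test (core 0 stores to x and then y; core 1 loads y = 1 and then x = 0) as a graph whose nodes are (instruction, stage) pairs, and treats a cycle as meaning that the outcome cannot be observed on that pipeline.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
        graph[dst]  # touch dst so that sink nodes are registered too

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                return True  # back edge: a cycle exists
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(graph))


def mp_edges(preserve_load_order):
    """Build hypothetical uhb edges for the message-passing litmus test.

    i1: St x=1, i2: St y=1 (core 0); i3: Ld y=1, i4: Ld x=0 (core 1).
    Each node is an (instruction, stage) pair.
    """
    edges = []
    # Every instruction passes through Issue before Memory.
    for instr in ("i1", "i2", "i3", "i4"):
        edges.append(((instr, "Issue"), (instr, "Memory")))
    # Both cores issue in program order; the stores also reach memory in order.
    edges.append((("i1", "Issue"), ("i2", "Issue")))
    edges.append((("i3", "Issue"), ("i4", "Issue")))
    edges.append((("i1", "Memory"), ("i2", "Memory")))
    # Assumption: the pipeline may or may not keep loads ordered at Memory.
    if preserve_load_order:
        edges.append((("i3", "Memory"), ("i4", "Memory")))
    # i3 reads the value written by i2 (reads-from).
    edges.append((("i2", "Memory"), ("i3", "Memory")))
    # i4 reads x=0, so it performs before i1's store (from-reads).
    edges.append((("i4", "Memory"), ("i1", "Memory")))
    return edges


in_order_forbidden = has_cycle(mp_edges(preserve_load_order=True))
relaxed_forbidden = has_cycle(mp_edges(preserve_load_order=False))
print(in_order_forbidden, relaxed_forbidden)
```

In this toy model, the pipeline that keeps loads ordered at the `Memory` stage yields a cyclic graph, so the outcome is ruled out; dropping that single edge leaves the graph acyclic, which corresponds to a microarchitecture weaker than a specification that forbids the outcome.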
