Understanding POWER multiprocessors

Exploiting today's multiprocessors requires high-performance and correct concurrent systems code (optimising compilers, language runtimes, OS kernels, etc.), which in turn requires a good understanding of the observable processor behaviour that can be relied on. Unfortunately this critical hardware/software interface is not at all clear for several current multiprocessors. In this paper we characterise the behaviour of IBM POWER multiprocessors, which have a subtle and highly relaxed memory model (ARM multiprocessors have a very similar architecture in this respect). We have conducted extensive experiments on several generations of processors: POWER G5, 5, 6, and 7. Based on these, on published details of the microarchitectures, and on discussions with IBM staff, we give an abstract-machine semantics that abstracts from most of the implementation detail but explains the behaviour of a range of subtle examples. Our semantics is explained in prose but defined in rigorous machine-processed mathematics; we also confirm that it captures the observable processor behaviour, or the architectural intent, for our examples with an executable checker. While not officially sanctioned by the vendor, we believe that this model gives a reasonable basis for reasoning about current POWER multiprocessors. Our work should bring new clarity to concurrent systems programming for these architectures, and is a necessary precondition for any analysis or verification. It should also inform the design of languages such as C and C++, where the language memory model is constrained by what can be efficiently compiled to such multiprocessors.
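As a concrete illustration of the kind of subtle example the abstract refers to, the following is a minimal sketch of the classic message-passing litmus test, written with C++11 atomics rather than POWER assembly. It is not taken from the paper: the function name `run_mp_once` and its parameters are hypothetical. One thread writes `data` and then sets `flag`; the other waits for `flag` and then reads `data`. With relaxed orderings, the C++ memory model permits the reader to observe the stale value 0, and a relaxed architecture such as POWER may exhibit that outcome. With release/acquire orderings the reader must observe 1.

```cpp
#include <atomic>
#include <thread>

// Message-passing (MP) litmus test, sketched with C++11 atomics.
// Hypothetical helper: returns the value of `data` that the reader
// observes after it has seen flag == 1.
int run_mp_once(std::memory_order store_order, std::memory_order load_order) {
    std::atomic<int> data{0};
    std::atomic<int> flag{0};
    int observed = -1;

    std::thread writer([&] {
        data.store(1, std::memory_order_relaxed);  // write the payload
        flag.store(1, store_order);                // then publish the flag
    });
    std::thread reader([&] {
        while (flag.load(load_order) != 1) {
            // spin until the flag becomes visible
        }
        observed = data.load(std::memory_order_relaxed);  // read the payload
    });

    writer.join();
    reader.join();
    return observed;
}
```

Calling `run_mp_once(std::memory_order_release, std::memory_order_acquire)` always returns 1. Calling it with `std::memory_order_relaxed` for both arguments may return 0 or 1, depending on the compiler and the hardware; on many machines the stale outcome is rare or never observed, which is why systematic litmus testing and a precise model are needed.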
