Toward Large-Scale Shared Memory Multiprocessing

We are currently investigating two approaches to scalable shared memory: Munin, a distributed shared memory (DSM) system implemented entirely in software, and Willow, a true shared memory multiprocessor with extensive hardware support for scalability. Munin allows parallel programs written for shared memory multiprocessors to be executed efficiently on distributed memory multiprocessors. Unlike existing DSM systems, which provide only a single mechanism for memory consistency, Munin provides multiple consistency protocols and matches the protocol to each data object based on the expected pattern of accesses to that object. We call this approach type-specific coherence. Munin also employs a relaxed consistency model to mask network latency and to minimize the number of messages required to keep memory consistent. Willow is intended to be a true shared memory multiprocessor with enough memory capacity and performance to support over a thousand commercial microprocessors. These processors are arranged in clusters, with multi-level cache, I/O, synchronization, and memory hierarchies. Willow is distinguished from other shared memory multiprocessors by a layered memory organization that significantly reduces the impact of inclusion on the cache hierarchy and that exploits locality gradients. Willow also supports adaptive cache coherence, an approach similar to Munin’s type-specific coherence, in which the consistency protocol used to manage each cache line is selected based on the expected or observed access behavior of the data stored in that line. Implementation of Munin is in progress; we are still designing Willow.
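The idea of selecting a consistency protocol per data object (or per cache line) from its access behavior can be illustrated with a minimal sketch. Everything named below is a hypothetical illustration, not Munin's or Willow's actual interface: the `Protocol` categories, the `AccessStats` fields, and the thresholds inside `choose_protocol` are assumptions chosen only to show the shape of the decision.

```python
from dataclasses import dataclass
from enum import Enum


class Protocol(Enum):
    """Hypothetical protocol families; not taken from the paper."""
    REPLICATE = "keep read-only copies on every reader"
    UPDATE = "push each write to the known readers"
    MIGRATE = "move the single copy to whoever accesses it"
    INVALIDATE = "invalidate remote copies on write"


@dataclass
class AccessStats:
    """Assumed per-object (or per-line) access summary."""
    reads: int = 0     # total read accesses
    writes: int = 0    # total write accesses
    readers: int = 0   # distinct nodes that read
    writers: int = 0   # distinct nodes that wrote


def choose_protocol(stats: AccessStats) -> Protocol:
    """Pick a protocol from expected or observed access behavior.

    The rules are illustrative assumptions, evaluated in order.
    """
    if stats.writes == 0:
        # Never written: replication costs no consistency traffic.
        return Protocol.REPLICATE
    if stats.writers == 1 and stats.readers >= 1:
        # One producer, some consumers: send updates to them.
        return Protocol.UPDATE
    if stats.writers > 1 and stats.reads <= stats.writes:
        # Several nodes take turns writing: migrate the copy.
        return Protocol.MIGRATE
    # Fallback for everything else.
    return Protocol.INVALIDATE


if __name__ == "__main__":
    table = AccessStats(reads=10, writes=0, readers=3, writers=0)
    print(choose_protocol(table).name)
```

In a software DSM the statistics would more plausibly come from programmer annotations describing expected behavior, while a hardware design could feed in observed counts; the sketch deliberately leaves that source open.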
