Synchronization with multiprocessor caches

Introducing private caches in bus-based shared-memory multiprocessors leads to the cache consistency problem, since there may be multiple copies of shared data. However, the ability to snoop on the bus, coupled with its fast broadcast capability, allows the design of special hardware support for synchronization. We present a new lock-based cache scheme that incorporates synchronization into the cache coherency mechanism. With this scheme, high-level synchronization primitives as well as low-level ones can be implemented without excessive overhead. Cost functions for well-known synchronization methods are derived for invalidation schemes, write-update schemes, and our lock-based scheme. To accurately predict the performance implications of the new scheme, a new simulation model is developed that embodies a widely accepted paradigm of parallel programming. The results show that our lock-based protocol outperforms existing cache protocols.
