Performance evaluation of link-based cache coherence schemes

The authors evaluate the implementation and performance tradeoffs among three directory-based cache coherence protocols. They study two link-based approaches, the tree-based and linear-list protocols, and contrast their performance and implementation cost with those of a full-map protocol. Using program-driven simulation and a set of three benchmark programs, they find that the tree-based and linear-list protocols perform almost as well as the full-map protocol at a considerably lower implementation cost. However, when the sharing set is large, linear-list schemes may suffer from high write latency, whereas tree-based protocols still perform well.
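
The tradeoff described above can be illustrated with a rough back-of-the-envelope model. The sketch below is not the authors' simulator or cost model. It assumes that a full-map directory keeps one presence bit per node and invalidates all sharers in parallel, that a linear-list protocol keeps a single head pointer at memory and forwards the invalidation from sharer to sharer, and that a tree-based protocol keeps a single root pointer and propagates invalidations down a balanced tree, collecting acknowledgements back up. The function names, the `fanout` parameter, and the unit of "message steps" are hypothetical and chosen only for illustration.

```python
def directory_bits_per_block(protocol: str, num_nodes: int) -> int:
    """Memory-side directory bits per block under the simplified assumptions above."""
    # Bits needed to name one node; at least one bit even for a single node.
    pointer_bits = max(1, (num_nodes - 1).bit_length())
    if protocol == "full-map":
        return num_nodes          # one presence bit per node
    if protocol in ("linear-list", "tree"):
        return pointer_bits       # assumed: one head/root pointer at memory
    raise ValueError(f"unknown protocol: {protocol}")


def invalidation_steps(protocol: str, sharers: int, fanout: int = 2) -> int:
    """Sequential message steps on the critical path of a write (toy model)."""
    if sharers == 0:
        return 0
    if protocol == "full-map":
        return 2                  # parallel invalidations, then parallel acks
    if protocol == "linear-list":
        return sharers + 1        # walk the list, then one ack from the tail
    if protocol == "tree":
        # Depth of the shallowest fanout-ary tree that holds all sharers.
        depth, capacity, level_size = 0, 0, 1
        while capacity < sharers:
            capacity += level_size
            level_size *= fanout
            depth += 1
        return 2 * depth          # invalidations go down, acks come back up
    raise ValueError(f"unknown protocol: {protocol}")


if __name__ == "__main__":
    for sharers in (2, 8, 32):
        steps = {p: invalidation_steps(p, sharers)
                 for p in ("full-map", "linear-list", "tree")}
        print(sharers, steps)
```

In this toy model the linear-list critical path grows linearly with the sharing set while the tree path grows logarithmically, which is consistent with the abstract's remark about large sharing sets. Real protocols differ in many details (network contention, cache-side pointer storage, replacement handling) that this sketch ignores.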
