Reducing coherence-related overhead in multiprocessor systems

The coherence problem is one of the critical issues designers face when applying caching techniques to multiprocessor systems. The copies that most affect consistency are the shared ones, i.e. copies of memory blocks accessed by concurrent processes in a multiprogramming environment. Nevertheless, when its owner process migrates, a private data block of that process may become resident in more than one cache and must then be treated as shared with respect to coherence-related operations (a useless shared copy). Such copies reduce the global performance of the system, since each write operation involves a useless, time-consuming transaction on the shared bus to maintain the consistency of all remote copies. In this paper, we introduce a hardware solution that can be successfully employed with any snooping protocol to eliminate useless shared copies. Finally, we show how this technique can be applied to a specific coherence protocol in order to improve global system performance.
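
The cost of a useless shared copy can be illustrated with a minimal sketch. It is a toy model, not the hardware solution proposed in the paper: it assumes two caches, a write-update style snooping protocol in which a write triggers a bus transaction whenever a remote cache holds the block, and it ignores miss traffic. The names `Cache`, `Bus`, `write`, `run`, and the `purge_on_migration` flag are hypothetical and chosen only for this illustration; purging the stale copy stands in for whatever mechanism removes the useless shared copy.

```python
from dataclasses import dataclass, field


@dataclass
class Bus:
    transactions: int = 0  # coherence transactions seen on the shared bus


@dataclass
class Cache:
    blocks: set = field(default_factory=set)  # block names currently resident


def write(cpu, block, caches, bus):
    """Write `block` on `cpu`; use the bus only if a remote cache holds a copy."""
    caches[cpu].blocks.add(block)
    remote_copy = any(
        block in cache.blocks for i, cache in enumerate(caches) if i != cpu
    )
    if remote_copy:
        bus.transactions += 1  # update broadcast to keep remote copies consistent


def run(purge_on_migration, writes_after_migration=100):
    """Count bus transactions caused by one private block after a migration."""
    caches = [Cache(), Cache()]
    bus = Bus()
    block = "private_block"

    # The owner process first runs on CPU 0: no remote copy, no bus traffic.
    write(0, block, caches, bus)

    # The process migrates to CPU 1. The copy left in cache 0 is now useless.
    if purge_on_migration:
        # Assumption: some mechanism removes the stale copy from cache 0.
        caches[0].blocks.discard(block)

    for _ in range(writes_after_migration):
        write(1, block, caches, bus)

    return bus.transactions


if __name__ == "__main__":
    print("bus transactions, stale copy kept:  ", run(purge_on_migration=False))
    print("bus transactions, stale copy purged:", run(purge_on_migration=True))
```

In this model every write after the migration costs one bus transaction while the stale copy remains in the old cache, and none once it has been removed, even though the block is private to a single process throughout.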
