Modeling the performance of limited pointers directories for cache coherence

Directory-based protocols have been proposed as an efficient means of implementing cache consistency in large-scale shared-memory multiprocessors. One class of these protocols utilizes a limited pointers directory, which stores the identities of a small number of caches containing a given block of data. However, the performance potential of these directories in large-scale machines has been speculative at best. In this paper we introduce an analytic model that not only explains the behavior seen in small-scale simulation studies, but also allows us to extrapolate forward to evaluate the efficiency of limited pointers directories in large-scale systems. Our model shows that miss rates inherent to invalidation-based consistency schemes are relatively high (typically 10% to 60%) for actively shared data, across a variety of workloads. We find that limited pointers schemes that resort to broadcasting invalidations when the pointers are exhausted perform very poorly in large-scale machines, even if there are sufficient pointers most of the time. On the other hand, no-broadcast strategies that limit the degree of caching to the number of pointers in an entry have only a modest impact on the cache miss rate and network traffic under a wide range of workloads, including those in which data blocks are actively accessed by a large number of processors.
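
To make the two overflow policies contrasted above concrete, the following is a minimal Python sketch of a single limited pointers directory entry. It is an illustration only, not the paper's analytic model: the class name `LimitedPointerEntry`, its fields, and the choice to evict the oldest sharer in the no-broadcast case are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class LimitedPointerEntry:
    """Directory entry for one memory block (hypothetical illustration)."""
    num_pointers: int        # pointers available in this entry
    num_caches: int          # caches in the machine
    broadcast: bool          # True: broadcast on overflow; False: no-broadcast
    sharers: list = field(default_factory=list)  # cache ids held in pointers
    overflowed: bool = False  # broadcast scheme only: untracked sharers exist

    def read_miss(self, cache_id):
        """Record a new sharer; return caches invalidated to free a pointer."""
        if cache_id in self.sharers:
            return []
        if len(self.sharers) < self.num_pointers:
            self.sharers.append(cache_id)
            return []
        if self.broadcast:
            # Pointers exhausted: the extra sharer's identity is not recorded.
            self.overflowed = True
            return []
        # No-broadcast: cap the degree of caching at num_pointers.
        # Assumption: the oldest tracked copy is the one invalidated.
        victim = self.sharers.pop(0)
        self.sharers.append(cache_id)
        return [victim]

    def write(self, cache_id):
        """Return the caches that must be sent an invalidation."""
        if self.overflowed:
            # Sharers are unknown, so every other cache gets a message.
            targets = [c for c in range(self.num_caches) if c != cache_id]
        else:
            targets = [c for c in self.sharers if c != cache_id]
        self.sharers = [cache_id]
        self.overflowed = False
        return targets
```

In this sketch, a write to an overflowed entry under the broadcast policy sends a message to every other cache in the machine, a cost that grows with machine size, whereas the no-broadcast policy never sends more invalidations than there are pointers but may force an extra miss on the evicted cache.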
