Eliminating cache conflict misses through XOR-based placement functions

This paper makes the case for the use of XOR-based placement functions for cache memories. It shows that these XOR-mapping schemes can eliminate many conflict misses for direct-mapped and victim caches and practically all of them for (pseudo) two-way associative organizations. The paper evaluates the performance of XOR-mapping schemes for a number of different cache organizations: direct-mapped, set-associative, victim, hash-rehash, column-associative and skewed-associative. It also proposes novel replacement policies for some of these cache organizations. In particular, it presents a low-cost implementation of a pure LRU replacement policy which demonstrates a significant improvement over the pseudo-LRU replacement previously proposed. The paper shows that for a 8 Kbyte data cache, XOR-mapping schemes approximately halve the miss ratio for two-way associative and column-associative organizations. Skewed-associative caches, which already make use of XOR-mapping functions, can benefit from the LRU replacement and also from the use of more sophisticated mapping functions. For two-way associative, columnassociative and two-way skewed-associative organizations, XORmapping schemes achieve a miss ratio that is not higher than 1.10 times that of a fully-associative cache. XOR mapping schemes also provide a very significant reduction in the miss ratio for the other cache organizations, including the direct-mapped cache. Ultimately, the conclusion of this study is that XOR-based placement functions unequivocally provide highly significant performance benefits to most cache organizations.

[1]  David A. Patterson,et al.  Computer Architecture: A Quantitative Approach , 1969 .

[2]  William Jalby,et al.  XOR-Schemes: A Flexible Data Organization in Parallel Memories , 1985, ICPP.

[3]  Alan Norton,et al.  A Class of Boolean Linear Transformations for Conflict-Free Power-of-Two Stride Access , 1987, ICPP.

[4]  Mark Horowitz,et al.  Cache performance of operating system and multiprogramming workloads , 1988, TOCS.

[5]  A. Argawal,et al.  Cache performance of operating systems and multiprogramming , 1988 .

[6]  D. T. Harper,et al.  A dynamic storage scheme for conflict-free vector access , 1989, ISCA '89.

[7]  Anant Agarwal,et al.  Analysis of cache performance for operating systems and multiprogramming , 1989, The Kluwer international series in engineering and computer science.

[8]  B. Ramakrishna Rau,et al.  The Cydram 5 Stride-Insensitive Memory System , 1989, ICPP.

[9]  Norman P. Jouppi,et al.  Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre , 1990, ISCA 1990.

[10]  Norman P. Jouppi,et al.  Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.

[11]  D. T. Harper Reducing memory contention in shared memory multiprocessors , 1991, ISCA '91.

[12]  B. Ramakrishna Rau,et al.  Pseudo-randomly interleaved memory , 1991, ISCA '91.

[13]  Eduard Ayguadé,et al.  Increasing the number of strides for conflict-free vector access , 1992, ISCA '92.

[14]  André Seznec,et al.  A case for two-way skewed-associative caches , 1993, ISCA '93.

[15]  Jeff Yetter,et al.  Performance features of the PA7100 microprocessor , 1993, IEEE Micro.

[16]  François Bodin,et al.  Skewed-associative Caches , 1993, PARLE.

[17]  Anant Agarwal,et al.  Column-associative caches: a technique for reducing the miss rate of direct-mapped caches , 1993, ISCA '93.

[18]  Dirk Grunwald,et al.  Predictive sequential associative cache , 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.

[19]  J. Gonz Idetifying Contributing Factors to ILP , 1996 .

[20]  K. Kavi Cache Memories Cache Memories in Uniprocessors. Reading versus Writing. Improving Performance , 2022 .