Filtering superfluous prefetches using density vectors

A previous evaluation of scheduled region prefetching showed that this technique eliminates the bulk of main-memory stall time for applications with spatial locality. The downside of that aggressive prefetching scheme is that, even when it improves performance, it greatly increases the amount of superfluous memory traffic a program generates. We measure the predictability of spatial locality using density vectors: bit vectors that track the block-level access pattern within a region of memory. We evaluate a number of policies that use density vector information to filter out prefetches that are unlikely to be useful. We show that, across our benchmarks, an average of 70% of useless prefetches can be eliminated with virtually no overall performance loss from reduced coverage. Thanks to the increase in prefetch accuracy, a few benchmarks show performance improvements as high as 35% over the base region prefetching scheme.
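To make the idea of a density vector concrete, the following is a minimal sketch, not the paper's implementation. The region and block sizes, the names `DensityTracker` and `filter_prefetches`, and the threshold-based policy are all assumptions chosen for illustration; the abstract does not specify the policies that were evaluated.

```python
REGION_BYTES = 4096  # assumed region size (not given in the abstract)
BLOCK_BYTES = 64     # assumed cache block size (not given in the abstract)
BLOCKS_PER_REGION = REGION_BYTES // BLOCK_BYTES


def popcount(vector):
    """Number of set bits in an int used as a bit vector."""
    return bin(vector).count("1")


class DensityTracker:
    """Hypothetical tracker: one bit per block for each touched region."""

    def __init__(self):
        self.vectors = {}  # region number -> int used as a bit vector

    def record_access(self, addr):
        # Set the bit of the block that contains this byte address.
        region, offset = divmod(addr, REGION_BYTES)
        block = offset // BLOCK_BYTES
        self.vectors[region] = self.vectors.get(region, 0) | (1 << block)

    def density(self, region):
        """Fraction of the region's blocks that have been accessed."""
        return popcount(self.vectors.get(region, 0)) / BLOCKS_PER_REGION


def filter_prefetches(history_vector, miss_block, min_density=0.5):
    """Illustrative filter policy (an assumption, not the paper's).

    Suppress all prefetches when the region's recorded density is below
    min_density; otherwise prefetch only the blocks whose bits are set,
    skipping the block that missed (it is fetched on demand anyway).
    """
    if popcount(history_vector) / BLOCKS_PER_REGION < min_density:
        return []
    return [
        b for b in range(BLOCKS_PER_REGION)
        if b != miss_block and (history_vector >> b) & 1
    ]
```

In this sketch a sparse region generates no prefetch traffic at all, while a dense region prefetches only blocks that were touched before. A real design would also have to decide where the vectors are stored and when they are reset.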
