Cache Prefetching

Cache prefetching is a memory latency hiding technique that attempts to bring data to the caches before the occurrence of a miss. A central aspect of all cache prefetching techniques is their ability to detect and predict particular memory reference patterns. In this paper we will introduce and compare how this is done for each of the specific memory reference patterns that have been identified. Because most applications contain many different memory reference patterns, we will also discuss how prefetching techniques can be combined into a mechanism to deal with a larger number of memory reference patterns. Finally, we will discuss how applicable the currently used prefetching techniques are for a multimedia processing system.
