A performance study of software and hardware data prefetching schemes
[1] Alexander V. Veidenbaum,et al. Compiler-directed data prefetching in multiprocessors with memory hierarchies , 1990, ICS '90.
[2] Dean M. Tullsen,et al. Limitations of cache prefetching on a bus-based multiprocessor , 1993, ISCA '93.
[3] Scott A. Mahlke,et al. Data access microarchitectures for superscalar processors with compiler-assisted data prefetching , 1991, MICRO 24.
[4] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, Proceedings of the 17th Annual International Symposium on Computer Architecture.
[5] David Kroft,et al. Lockup-free instruction fetch/prefetch cache organization , 1981, ISCA '81.
[6] Henry M. Levy,et al. An Architecture for Software-Controlled Data Prefetching , 1991, ISCA.
[7] Jean-Loup Baer,et al. An effective on-chip preloading scheme to reduce data access penalty , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[8] Tien-Fu Chen,et al. Data prefetching for high-performance processors , 1993.
[9] Paul Feautrier,et al. A New Solution to Coherence Problems in Multicache Systems , 1978, IEEE Transactions on Computers.
[10] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[11] Ken Kennedy,et al. Software methods for improvement of cache performance on supercomputer applications , 1989.
[12] Anoop Gupta,et al. Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.
[13] Susan J. Eggers,et al. Eliminating False Sharing , 1991, ICPP.
[14] Janak H. Patel,et al. Stride directed prefetching in scalar processors , 1992, MICRO.
[15] Anoop Gupta,et al. Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors , 1991, J. Parallel Distributed Comput.
[16] J.W.C. Fu,et al. Data prefetching in multiprocessor vector cache memories , 1991, Proceedings of the 18th Annual International Symposium on Computer Architecture.
[17] Anoop Gupta,et al. SPLASH: Stanford parallel applications for shared-memory , 1992, CARN.
[18] Janak H. Patel,et al. Data prefetching in multiprocessor vector cache memories , 1991, ISCA '91.