An Integrated Hardware/Software Data Prefetching Scheme for Shared-Memory Multiprocessors
[1] Scott A. Mahlke,et al. Data access microarchitectures for superscalar processors with compiler-assisted data prefetching , 1991, MICRO 24.
[2] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[3] Anoop Gupta,et al. Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.
[4] Jean-Loup Baer,et al. An effective on-chip preloading scheme to reduce data access penalty , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[5] Ken Kennedy,et al. Software prefetching , 1991, ASPLOS IV.
[6] Anoop Gupta,et al. Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors , 1991, J. Parallel Distributed Comput..
[7] T. Mowry,et al. Comparative evaluation of latency reducing and tolerating techniques , 1991, [1991] Proceedings. The 18th Annual International Symposium on Computer Architecture.
[8] Richard E. Hank,et al. An efficient architecture for loop based data preloading , 1992, MICRO 1992.
[9] Alexander V. Veidenbaum,et al. An effective write policy for software coherence schemes , 1992, Proceedings Supercomputing '92.
[10] Steven A. Moyer,et al. Access Ordering and Effective Memory Bandwidth , 1993 .
[11] William Jalby,et al. A Quantitative Algorithm for Data Locality Optimization , 1991, Code Generation.
[12] J.W.C. Fu,et al. Stride Directed Prefetching In Scalar Processors , 1992, [1992] Proceedings the 25th Annual International Symposium on Microarchitecture MICRO 25.
[13] Alan Jay Smith,et al. Cache Memories , 1982, CSUR.
[14] Yvon Jégou,et al. Using virtual lines to enhance locality exploitation , 1994, ICS '94.
[15] Tien-Fu Chen,et al. Data prefetching for high-performance processors , 1993 .
[16] H. Levy,et al. An architecture for software-controlled data prefetching , 1991, [1991] Proceedings. The 18th Annual International Symposium on Computer Architecture.
[17] Alexander V. Veidenbaum,et al. An Integrated Hardware/Software Data Prefetching Scheme for Shared-Memory Multiprocessors , 1994, 1994 International Conference on Parallel Processing Vol. 2.
[18] Alfred V. Aho,et al. Principles of Compiler Design (Addison-Wesley series in computer science and information processing) , 1977 .
[19] Ivan Sklenár. Prefetch unit for vector operations on scalar computers , 1992, ISCA.
[20] Pen-Chung Yew,et al. Data Prefetching In Shared Memory Multiprocessors , 1987, ICPP.
[21] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[22] Yung-Chin Chen,et al. Cache Design and Performance in a Large-Scale Shared-Memory Multiprocessor System , 1993 .
[23] Anoop Gupta,et al. Comparative evaluation of latency reducing and tolerating techniques , 1991, ISCA '91.
[24] David J. Lilja,et al. The Impact of Parallel Loop Scheduling Strategies on Prefetching in a Shared Memory Multiprocessor , 1994, IEEE Trans. Parallel Distributed Syst..
[25] Alfred V. Aho,et al. Principles of Compiler Design , 1977 .
[26] Chi-Hung Chi. Compiler Optimization Technique for Data Cache Prefetching Using a Small CAM Array , 1994, 1994 International Conference on Parallel Processing Vol. 1.
[27] Manuel E. Benitez,et al. Code generation for streaming: an access/execute mechanism , 1991, ASPLOS IV.
[28] Michel Dubois,et al. Fixed and Adaptive Sequential Prefetching in Shared Memory Multiprocessors , 2006, International Conference on Parallel Processing.
[29] Janak H. Patel,et al. Stride directed prefetching in scalar processors , 1992, MICRO.
[30] Yvon Jégou,et al. Speculative prefetching , 1993, ICS '93.
[31] Jean-Loup Baer,et al. A performance study of software and hardware data prefetching schemes , 1994, ISCA '94.
[32] Henry M. Levy,et al. An architecture for software-controlled data prefetching , 1991, ISCA '91.
[33] Janak H. Patel,et al. Data prefetching in multiprocessor vector cache memories , 1991, ISCA '91.
[34] Alexander V. Veidenbaum,et al. Compiler-directed data prefetching in multiprocessors with memory hierarchies , 1990, ICS '90.
[35] Dean M. Tullsen,et al. Limitations of cache prefetching on a bus-based multiprocessor , 1993, ISCA '93.
[36] Alexander V. Veidenbaum,et al. Compiler-directed data prefetching in multiprocessors with memory hierarchies , 1990 .
[37] Alexander V. Veidenbaum,et al. Comparison and analysis of software and directory coherence schemes , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[38] Dean M. Tullsen,et al. Limitations Of Cache Prefetching On A Bus-based Multiprocessor , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[39] Utpal Banerjee,et al. Dependence analysis for supercomputing , 1988, The Kluwer international series in engineering and computer science.
[40] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre , 1990, ISCA 1990.
[41] Hye-yeon Cheong. Compiler-directed cache coherence strategies for large-scale sha , 1990 .
[42] Janak H. Patel,et al. Stride directed prefetching in scalar processors , 1992, MICRO 1992.