Exploring Speculative Techniques to Improve the Memory System Performance
[1] Jenn-Yuan Tsai,et al. The superthreaded architecture: thread pipelining with run-time data dependence checking and control speculation , 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques.
[2] Jun Yang,et al. Load redundancy removal through instruction reuse , 2000, Proceedings 2000 International Conference on Parallel Processing.
[3] Glenn Reinman,et al. Fetch directed instruction prefetching , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[4] Norman P. Jouppi,et al. How useful are non-blocking loads, stream buffers and speculative execution in multiple issue processors? , 1995, Proceedings of 1995 1st IEEE Symposium on High Performance Computer Architecture.
[5] Brad Calder,et al. Instruction recycling on a multiple-path processor , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.
[6] Mikko H. Lipasti,et al. Silent stores for free , 2000, MICRO 33.
[7] Ying Chen,et al. Using incorrect speculation to prefetch data in a concurrent multithreaded processor , 2003, Proceedings International Parallel and Distributed Processing Symposium.
[9] James E. Smith,et al. A study of branch prediction strategies , 1981, ISCA '81.
[10] T. Ozawa,et al. Cache miss heuristics and preloading techniques for general-purpose programs , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[11] Mikko H. Lipasti,et al. Value locality and load value prediction , 1996, ASPLOS VII.
[12] Trevor N. Mudge,et al. Wrong-path instruction prefetching , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[13] David J. Lilja,et al. Exploiting the Prefetching Effect Provided by Executing Mispredicted Load Instructions , 2002, Euro-Par.
[14] Jun Yang,et al. Frequent value locality and value-centric data cache design , 2000, SIGP.
[15] F. Gabbay. Speculative Execution based on Value Prediction Research Proposal towards the Degree of Doctor of Sciences , 1996 .
[16] Gurindar S. Sohi,et al. Multiscalar processors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[17] David Bernstein,et al. Compiler techniques for data prefetching on the PowerPC , 1995, PACT.
[18] Theo Ungerer,et al. Multithreaded Processors , 2002, Comput. J..
[19] Trevor N. Mudge,et al. The effect of speculative execution on cache performance , 1994, Proceedings of 8th International Parallel Processing Symposium.
[20] Todd C. Mowry,et al. Compiler-based prefetching for recursive data structures , 1996, ASPLOS VII.
[21] Janak H. Patel,et al. Data prefetching in multiprocessor vector cache memories , 1991, ISCA '91.
[22] Gurindar S. Sohi,et al. Understanding the differences between value prediction and instruction reuse , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[23] Ravi Pendse,et al. Selective prefetching: prefetching when only required , 1999, 42nd Midwest Symposium on Circuits and Systems (Cat. No.99CH36356).
[24] Alexander V. Veidenbaum,et al. Compiler-directed data prefetching in multiprocessors with memory hierarchies , 1990, ICS '90.
[25] Christopher Hughes,et al. Speculative precomputation: long-range prefetching of delinquent loads , 2001, ISCA 2001.
[26] Jean-Loup Baer,et al. A performance study of software and hardware data prefetching schemes , 1994, ISCA '94.
[27] Jun Yang,et al. Frequent value compression in data caches , 2000, MICRO 33.
[28] David J. Lilja,et al. Data prefetch mechanisms , 2000, CSUR.
[29] Mikko H. Lipasti,et al. Characterization of silent stores , 2000, Proceedings 2000 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00622).
[30] A. J. KleinOsowski,et al. MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research , 2002, IEEE Computer Architecture Letters.
[31] Mikko H. Lipasti,et al. Partial resolution in branch target buffers , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[32] Dirk Grunwald,et al. Confidence estimation for speculation control , 1998, ISCA.
[33] David J. Lilja,et al. Address Correlation: Exceeding the Limits of Locality , 2003, IEEE Computer Architecture Letters.
[34] Jean-Loup Baer,et al. Effective Hardware Based Data Prefetching for High-Performance Processors , 1995, IEEE Trans. Computers.
[35] Jian Huang,et al. The Superthreaded Processor Architecture , 1999, IEEE Trans. Computers.
[36] Mikko H. Lipasti. Value locality and speculative execution , 1998 .
[37] Yale N. Patt,et al. A Comparison Of Dynamic Branch Predictors That Use Two Levels Of Branch History , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[38] J. Gregory Steffan. The Potential for Thread-Level Data Speculation in Tightly-Coupled Multiprocessors , 1997 .
[39] Margaret Martonosi,et al. Branch Prediction, Instruction-Window Size, and Cache Size: Performance Trade-Offs and Simulation Techniques , 1999, IEEE Trans. Computers.
[40] Anoop Gupta,et al. Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors , 1991, J. Parallel Distributed Comput..
[41] Rajiv Gupta,et al. Value prediction in VLIW machines , 1999, ISCA.
[42] Chi-Keung Luk,et al. Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.
[43] David A. Patterson,et al. Computer Architecture - A Quantitative Approach, 5th Edition , 1996 .
[44] Todd C. Mowry,et al. The Potential for Thread-level Data Speculation in Tightly-coupled Multiprocessors , 1997 .
[45] Rajiv Gupta,et al. Global context-based value prediction , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.
[46] Jenn-Yuan Tsai,et al. Program Optimization for Concurrent Multithreaded Architectures , 1997, LCPC.
[47] J. Liang,et al. Designing the Agassiz Compiler for Concurrent Multithreaded Architectures , 1999, LCPC.
[48] Douglas J. Joseph,et al. Prefetching Using Markov Predictors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[49] Anoop Gupta,et al. Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.
[50] Jenn-Yuan Tsai,et al. Performance study of a concurrent multithreaded processor , 1998, Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture.
[51] Glenn Reinman,et al. A scalable front-end architecture for fast instruction delivery , 1999, ISCA.
[52] Jun Yang,et al. Energy efficient Frequent Value data Cache design , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..
[53] K. Kavi. Cache Memories Cache Memories in Uniprocessors. Reading versus Writing. Improving Performance , 2022 .
[54] Michael D. Smith,et al. Improving the accuracy of static branch prediction using branch correlation , 1994, ASPLOS VI.
[55] Mikko H. Lipasti,et al. Silent Stores and Store Value Locality , 2001, IEEE Trans. Computers.
[56] Andreas Moshovos,et al. Dependence based prefetching for linked data structures , 1998, ASPLOS VIII.
[57] Doug Burger,et al. Evaluating Future Microprocessors: the SimpleScalar Tool Set , 1996 .
[58] James E. Smith,et al. Prefetching in supercomputer instruction caches , 1992, Proceedings Supercomputing '92.
[59] Jun Yang,et al. Energy-efficient load and store reuse , 2001, ISLPED '01.
[60] Alan Jay Smith,et al. Branch Prediction Strategies and Branch Target Buffer Design , 1995, Computer.
[61] Michel Dubois,et al. Fixed and Adaptive Sequential Prefetching in Shared Memory Multiprocessors , 1993, 1993 International Conference on Parallel Processing - ICPP'93.
[62] G.S. Sohi,et al. Dynamic Instruction Reuse , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[63] S. McFarling. Combining Branch Predictors , 1993 .
[64] Joseph T. Rahmeh,et al. Improving the accuracy of dynamic branch prediction using branch correlation , 1992, ASPLOS V.
[65] Mikko H. Lipasti,et al. Temporally silent stores , 2002, ASPLOS X.
[66] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[67] Joseph A. Fisher,et al. Predicting conditional branch directions from previous runs of a program , 1992, ASPLOS V.
[68] Jun Yang,et al. Frequent value locality and its applications , 2002, TECS.