Value locality and speculative execution
[1] Chau-Wen Tseng,et al. Compiler optimizations for improving data locality , 1994, ASPLOS VI.
[2] David W. Wall,et al. Link-time optimization of address calculation on a 64-bit architecture , 1994, PLDI '94.
[3] Anoop Gupta,et al. Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.
[4] Scott A. Mahlke,et al. Data access microarchitectures for superscalar processors with compiler-assisted data prefetching , 1991, MICRO 24.
[5] Mikko H. Lipasti,et al. Exceeding the dataflow limit via value prediction , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[6] Michael D. Smith,et al. A comparative analysis of schemes for correlated branch prediction , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[7] Christian Piguet,et al. Microprocessor design , 1997 .
[8] Yale N. Patt,et al. A two-level approach to making class predictions , 2003, Proceedings of the 36th Annual Hawaii International Conference on System Sciences.
[9] K. Kavi. Cache Memories in Uniprocessors. Reading versus Writing. Improving Performance , 2022 .
[10] Eric Rotenberg,et al. Trace cache: a low latency approach to high bandwidth instruction fetching , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[11] John Paul Shen,et al. Speculative disambiguation: a compilation technique for dynamic memory disambiguation , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[12] Edward M. Riseman,et al. The Inhibition of Potential Parallelism by Conditional Jumps , 1972, IEEE Transactions on Computers.
[13] Thomas Thomas,et al. The PowerPC 620 microprocessor: a high performance superscalar RISC microprocessor , 1995, Digest of Papers. COMPCON'95. Technologies for the Information Superhighway.
[14] James E. Smith,et al. Complexity-Effective Superscalar Processors , 1997, ISCA.
[15] S. Richardson. Caching Function Results: Faster Arithmetic by Avoiding Unnecessary Computation , 1992 .
[16] Samuel Pollock Harbison. A computer architecture for the dynamic optimization of high-level language programs , 1980 .
[17] P. Bannon,et al. Internal architecture of Alpha 21164 microprocessor , 1995, Digest of Papers. COMPCON'95. Technologies for the Information Superhighway.
[18] Trung A. Diep,et al. VMW: A Visualization-Based Microarchitecture Workbench , 1995, Computer.
[19] Burzin A. Patel,et al. Optimization of instruction fetch mechanisms for high issue rates , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[20] Monica S. Lam,et al. Limits of control flow on parallelism , 1992, ISCA '92.
[21] Gary S. Tyson,et al. A modified approach to data cache management , 1995, MICRO 1995.
[22] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[23] Mikko H. Lipasti,et al. Value locality and load value prediction , 1996, ASPLOS VII.
[24] F. Gabbay. Speculative Execution based on Value Prediction Research Proposal towards the Degree of Doctor of Sciences , 1996 .
[25] Mikko H. Lipasti,et al. The Performance Potential of Value and Dependence Prediction , 1997, Euro-Par.
[26] Ken Kennedy,et al. Software prefetching , 1991, ASPLOS IV.
[27] Mike Johnson,et al. Superscalar microprocessor design , 1991, Prentice Hall series in innovative technology.
[28] Kevin B. Theobald,et al. On the limits of program parallelism and its smoothability , 1992, MICRO 1992.
[29] Kunle Olukotun,et al. The case for a single-chip multiprocessor , 1996, ASPLOS VII.
[30] S. McFarling. Combining Branch Predictors , 1993 .
[31] Trung A. Diep,et al. Performance evaluation of the PowerPC 620 microarchitecture , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[32] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, ISCA 1990.
[33] Dionisios N. Pnevmatikatos,et al. Streamlining data cache access with fast address calculation , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[34] Apostolos Dollas,et al. Predicting and precluding problems with memory latency , 1994, IEEE Micro.
[35] David W. Wall,et al. Limits of instruction-level parallelism , 1991, ASPLOS IV.
[36] Samuel P. Harbison. An architectural alternative to optimizing compilers , 1982, ASPLOS I.
[37] Jean-Loup Baer,et al. A performance study of software and hardware data prefetching schemes , 1994, ISCA '94.
[38] Duncan H. Lawrie,et al. On the Performance Enhancement of Paging Systems Through Program Analysis and Transformations , 1981, IEEE Transactions on Computers.
[39] T. Ozawa,et al. Cache miss heuristics and preloading techniques for general-purpose programs , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[40] Mikko H. Lipasti,et al. Approaching 10 IPC via Superspeculation , 1997 .
[41] Kevin McGrath,et al. Eliminating operand read latency , 1996, CARN.
[42] Manoj Franklin,et al. The multiscalar architecture , 1993 .
[43] R. P. Colwell,et al. A 0.6 μm BiCMOS processor with dynamic execution , 1995, Proceedings ISSCC '95 - International Solid-State Circuits Conference.
[44] Mikko H. Lipasti,et al. Partial resolution in branch target buffers , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[45] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, Proceedings of the 17th Annual International Symposium on Computer Architecture.
[46] John Paul Shen,et al. The intrinsic bandwidth requirements of ordinary programs , 1996, ASPLOS VII.
[47] Scott A. Mahlke,et al. Dynamic memory disambiguation using the memory conflict buffer , 1994, ASPLOS VI.
[48] Norman P. Jouppi,et al. Architectural And Organizational Tradeoffs In The Design Of The Multititan CPU , 1989, The 16th Annual International Symposium on Computer Architecture.
[49] Trevor Mudge,et al. Hardware support for hiding cache latency , 1993 .