Increasing Cache Port Efficiency for Dynamic Superscalar Microprocessors
[1] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, Proceedings of the 17th Annual International Symposium on Computer Architecture.
[2] Jeff Yetter,et al. Performance features of the PA7100 microprocessor , 1993, IEEE Micro.
[3] Thomas M. Conte. Tradeoffs in processor/memory interfaces for superscalar processors , 1992, MICRO 1992.
[4] Gurindar S. Sohi,et al. High-bandwidth data memory systems for superscalar processors , 1991, ASPLOS IV.
[5] Norman P. Jouppi,et al. WRL Research Report 93/5: An Enhanced Access and Cycle Time Model for On-chip Caches , 1994 .
[6] Chung-Ho Chen,et al. A unified architectural tradeoff methodology , 1994, ISCA '94.
[7] Trevor Mudge,et al. Performance optimization of pipelined primary caches , 1992, ISCA '92.
[8] Trevor N. Mudge,et al. Resource allocation in a high clock rate microprocessor , 1994, ASPLOS VI.
[9] Jean-Loup Baer,et al. Reducing memory latency via non-blocking and prefetching caches , 1992, ASPLOS V.
[10] Ann Marie Grizzaffi Maynard,et al. Contrasting characteristics and cache performance of technical and multi-user commercial workloads , 1994, ASPLOS VI.
[11] Gary S. Tyson,et al. A study of single-chip processor/cache organizations for large numbers of transistors , 1994, ISCA '94.
[12] Mark Horowitz,et al. Performance tradeoffs in cache design , 1988, ISCA '88.
[13] Jim Gray,et al. Benchmark Handbook: For Database and Transaction Processing Systems , 1992 .
[14] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1990 .
[15] Mike Johnson,et al. Superscalar microprocessor design , 1991, Prentice Hall series in innovative technology.
[16] Anoop Gupta,et al. The Stanford FLASH Multiprocessor , 1994, ISCA.
[17] Norman P. Jouppi,et al. Complexity/performance tradeoffs with non-blocking loads , 1994, ISCA '94.
[18] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, ISCA '90.
[19] Kunle Olukotun,et al. Performance Optimization of Pipelined Primary Caches , 1992, Proceedings of the 19th Annual International Symposium on Computer Architecture.
[20] Anoop Gupta,et al. The impact of architectural trends on operating system performance , 1995, SOSP.
[21] R. M. Tomasulo. An efficient algorithm for exploiting multiple arithmetic units , 1967, IBM Journal of Research and Development.
[22] Thomas M. Conte. Tradeoffs in processor/memory interfaces for superscalar processors , 1992, MICRO.
[23] Zarka Cvetanovic,et al. Characterization of Alpha AXP performance using TP and SPEC workloads , 1994, Proceedings of the 21st Annual International Symposium on Computer Architecture.
[24] Norman P. Jouppi. Cache write policies and performance , 1993, ISCA '93.
[25] David W. Wall,et al. Limits of instruction-level parallelism , 1991, ASPLOS IV.
[26] Dionisios N. Pnevmatikatos,et al. Cache performance of the SPEC92 benchmark suite , 1993, IEEE Micro.
[27] David Kroft,et al. Lockup-free instruction fetch/prefetch cache organization , 1981, ISCA '81.
[28] Edward McLellan. The Alpha AXP architecture and 21064 processor , 1993, IEEE Micro.
[29] Mendel Rosenblum,et al. Embra: fast and flexible machine simulation , 1996, SIGMETRICS '96.
[30] Rajiv V. Joshi,et al. A 2-ns cycle, 3.8-ns access 512-kb CMOS ECL SRAM with a fully pipelined architecture , 1991 .
[31] Michael J. Flynn,et al. Performance Factors for Superscalar Processors , 1995 .
[32] Anoop Gupta,et al. Complete computer system simulation: the SimOS approach , 1995, IEEE Parallel Distributed Technol. Syst. Appl.