Low-Cost Epoch-Based Correlation Prefetching for Commercial Applications
[1] Irith Pomeranz,et al. Transient-fault recovery for chip multiprocessors , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings.
[2] Wei-Fen Lin,et al. Reducing DRAM latencies with an integrated memory hierarchy design , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[3] Sanjay J. Patel,et al. Characterizing the effects of transient faults on a high-performance processor pipeline , 2004, International Conference on Dependable Systems and Networks, 2004.
[4] Todd M. Austin,et al. DIVA: a reliable substrate for deep submicron microarchitecture design , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[5] Brian Fahs,et al. Microarchitecture optimizations for exploiting memory-level parallelism , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[6] Santosh G. Abraham,et al. Effective instruction prefetching in chip multiprocessors for modern commercial applications , 2005, 11th International Symposium on High-Performance Computer Architecture.
[7] Richard E. Kessler,et al. Evaluating stream buffers as a secondary cache replacement , 1994, Proceedings of 21 International Symposium on Computer Architecture.
[8] Haitham Akkary,et al. Checkpoint processing and recovery: towards scalable large instruction window processors , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.
[9] Thomas F. Wenisch,et al. Temporal streaming of shared memory , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[10] David A. Patterson,et al. Performance characterization of a Quad Pentium Pro SMP using OLTP workloads , 1998, ISCA.
[11] John P. Hayes,et al. High-level design verification of microprocessors via error modeling , 1998, TODE.
[12] Martin Burtscher,et al. Future execution: A prefetching mechanism that uses multiple cores to speed up single threads , 2006, TACO.
[13] Kathryn S. McKinley,et al. Guided region prefetching: a cooperative hardware/software approach , 2003, ISCA '03.
[14] James E. Smith,et al. A Performance Study of Instruction Cache Prefetching Methods , 1998, IEEE Trans. Computers.
[15] Sule Ozev,et al. A mechanism for online diagnosis of hard faults in microprocessors , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[16] Jignesh M. Patel,et al. Call graph prefetching for database applications , 2003, TOCS.
[17] Brad Calder,et al. Predictor-directed stream buffers , 2000, MICRO 33.
[18] Santosh G. Abraham,et al. Accurate modeling of aggressive speculation in modern microprocessor architectures , 2005, 13th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems.
[19] James E. Smith,et al. Data Cache Prefetching Using a Global History Buffer , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[20] Olivier Temam,et al. MicroLib: A Case for the Quantitative Comparison of Micro-Architecture Mechanisms , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[21] Janak H. Patel,et al. Stride directed prefetching in scalar processors , 1992, MICRO.
[22] Andrew F. Glew. MLP yes! ILP no , 1998, ASPLOS 1998.
[23] Thomas Alexander,et al. Distributed prefetch-buffer/cache design for high performance memory systems , 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.
[24] C. Bazeghi,et al. μComplexity: estimating processor design effort , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[25] J.F. Martinez,et al. Cherry: Checkpointed early resource recycling in out-of-order microprocessors , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.
[26] Josep Torrellas,et al. Using a user-level memory thread for correlation prefetching , 2002, ISCA.
[27] Richard L. Sites,et al. Binary translation , 1993, CACM.
[28] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[29] Pedro J. Gil,et al. Fault Injection into VHDL Models: Experimental Validation of a Fault Tolerant Microcomputer System , 1999, EDCC.
[30] Huiyang Zhou,et al. Dual-core execution: building a highly scalable single-thread instruction window , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[31] Babak Falsafi,et al. Last-Touch Correlated Data Streaming , 2007, 2007 IEEE International Symposium on Performance Analysis of Systems & Software.
[32] T. N. Vijaykumar,et al. Reducing Design Complexity of the Load/Store Queue , 2003, MICRO.
[33] Santosh G. Abraham,et al. Store memory-level parallelism optimizations for commercial applications , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[34] Susan J. Eggers,et al. An analysis of database workload performance on simultaneous multithreaded processors , 1998, ISCA.
[35] Sanjeev Kumar,et al. Exploiting spatial locality in data caches using spatial footprints , 1998, ISCA.
[36] John Paul Shen,et al. Scaling and characterizing database workloads: bridging the gap between research and practice , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.
[37] Douglas J. Joseph,et al. Prefetching Using Markov Predictors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[38] Todd M. Austin,et al. A fault tolerant approach to microprocessor design , 2001, 2001 International Conference on Dependable Systems and Networks.
[39] Eric Rotenberg,et al. AR-SMT: a microarchitectural approach to fault tolerance in microprocessors , 1999, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing (Cat. No.99CB36352).
[40] Mary K. Vernon,et al. Analytic evaluation of shared-memory systems with ILP processors , 1998, ISCA.
[41] Thomas F. Wenisch,et al. Spatial Memory Streaming , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[42] Shubhendu S. Mukherjee,et al. Measuring Architectural Vulnerability Factors , 2003, IEEE Micro.
[43] Alan Jay Smith,et al. Sequential Program Prefetching in Memory Hierarchies , 1978, Computer.
[44] Jean-Loup Baer,et al. Effective Hardware Based Data Prefetching for High-Performance Processors , 1995, IEEE Trans. Computers.
[45] Glenn Reinman,et al. Fetch directed instruction prefetching , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[46] Babak Falsafi,et al. Reunion: Complexity-Effective Multicore Redundancy , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[47] Jean-Loup Baer,et al. Dynamic Improvement of Locality in Virtual Memory Systems , 1976, IEEE Transactions on Software Engineering.
[48] Luiz André Barroso,et al. Memory system characterization of commercial workloads , 1998, ISCA.
[49] Babak Falsafi,et al. Dead-block prediction & dead-block correlating prefetchers , 2001, ISCA 2001.
[50] Daniel A. Jiménez,et al. Neural methods for dynamic branch prediction , 2002, TOCS.
[51] Brad Calder,et al. Automatically characterizing large scale program behavior , 2002, ASPLOS X.