A performance-correctness explicitly-decoupled architecture
[1] Trevor N. Mudge,et al. Author retrospective improving data cache performance by pre-executing instructions under a cache miss , 1997, International Conference on Supercomputing.
[2] Martin Burtscher,et al. On the importance of optimizing the configuration of stream prefetchers , 2005, MSP '05.
[3] Eric Rotenberg,et al. A study of slipstream processors , 2000, MICRO 33.
[4] Gurindar S. Sohi,et al. Master/slave speculative parallelization , 2002, MICRO.
[5] James E. Smith. Decoupled access/execute computer architectures , 1982 .
[6] Yale N. Patt,et al. Simultaneous subordinate microthreading (SSMT) , 1999, ISCA.
[7] Milo M. K. Martin,et al. Token Coherence: decoupling performance and correctness , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..
[8] Yuan Chou,et al. Low-Cost Epoch-Based Correlation Prefetching for Commercial Applications , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[9] Jose Renau,et al. CAVA: Hiding L2 Misses with Checkpoint-Assisted Value Prediction , 2004, IEEE Computer Architecture Letters.
[10] Mikko H. Lipasti,et al. Understanding scheduling replay schemes , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[11] Michael C. Huang,et al. A Performance-Correctness Explicitly-Decoupled Architecture: Technical Report , 2008 .
[12] K. Sundaramoorthy,et al. Slipstream processors: improving both performance and fault tolerance , 2000, SIGP.
[13] Simha Sethumadhavan,et al. Scalable Hardware Memory Disambiguation for High-ILP Processors , 2004, IEEE Micro.
[14] C. Bazeghi,et al. μComplexity: estimating processor design effort , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[15] Christopher Hughes,et al. Speculative precomputation: long-range prefetching of delinquent loads , 2001, ISCA 2001.
[16] David Blaauw,et al. Making typical silicon matter with Razor , 2004, Computer.
[17] Víctor Viñals,et al. Store buffer design in first-level multibanked data caches , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[18] Olivier Temam,et al. Dataflow analysis of branch mispredictions and its application to early resolution of branch outcomes , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.
[19] Jose Renau,et al. Effective Optimistic-Checker Tandem Core Design through Architectural Pruning , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[20] Gurindar S. Sohi,et al. Speculative data-driven multithreading , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[21] José F. Martínez,et al. Checkpointed early load retirement , 2005, 11th International Symposium on High-Performance Computer Architecture.
[22] Balaram Sinharoy,et al. POWER4 system microarchitecture , 2002, IBM J. Res. Dev..
[23] Robert Muth,et al. alto: a link‐time optimizer for the Compaq Alpha , 2001 .
[24] Onur Mutlu,et al. Runahead execution: an alternative to very large instruction windows for out-of-order processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..
[25] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[26] James Tschanz,et al. Parameter variations and impact on circuits and microarchitecture , 2003, Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451).
[27] C. R. Moore,et al. Scalable hardware memory disambiguation for high-ILP processors , 2004, IEEE Micro.
[28] Alpha 21264 / EV 6 Microprocessor Hardware Reference Manual , 2000 .
[29] Chi-Keung Luk,et al. Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.
[30] Huiyang Zhou,et al. Dual-core execution: building a highly scalable single-thread instruction window , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[31] Onur Mutlu,et al. Address-value delta (AVD) prediction: increasing the effectiveness of runahead execution by exploiting regular memory allocation patterns , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[32] Tong Li,et al. A large, fast instruction window for tolerating cache misses , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[33] Haitham Akkary,et al. Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors , 2003, MICRO.
[34] Jignesh M. Patel,et al. Data prefetching by dependence graph precomputation , 2001, ISCA 2001.
[35] Sanjay J. Patel,et al. Beating in-order stalls with "flea-flicker" two-pass pipelining , 2006, IEEE Transactions on Computers.
[36] Josep Torrellas,et al. Paceline: Improving Single-Thread Performance in Nanoscale CMPs through Core Overclocking , 2007, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007).
[37] M. Dubois,et al. Assisted Execution , 1998 .
[38] Jose Renau,et al. μComplexity: Estimating Processor Design Effort , 2005 .
[39] Craig Zilles,et al. Execution-based prediction using speculative slices , 2001, ISCA 2001.
[40] Haitham Akkary,et al. Scalable load and store processing in latency tolerant processors , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[41] Dionisios N. Pnevmatikatos,et al. Slice-processors: an implementation of operation-based prediction , 2001, ICS '01.
[42] Gurindar S. Sohi,et al. Program Demultiplexing: Data-flow based Speculative Parallelization of Methods in Sequential Programs , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[43] Todd M. Austin,et al. DIVA: a reliable substrate for deep submicron microarchitecture design , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[44] Rajeev Balasubramonian,et al. Memory hierarchy reconfiguration for energy and performance in general-purpose processor architectures , 2000, MICRO 33.