Re-NUCA: Boosting CMP Performance Through Block Replication
[1] Rohit Bhatia,et al. Montecito: a dual-core, dual-thread Itanium processor , 2005, IEEE Micro.
[2] Zeshan Chishti,et al. Optimizing replication, communication, and capacity allocation in CMPs , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[3] Jaehyuk Huh,et al. A NUCA Substrate for Flexible CMP Cache Sharing , 2007, IEEE Transactions on Parallel and Distributed Systems.
[4] Kunle Olukotun,et al. A Single-Chip Multiprocessor , 1997, Computer.
[5] Jung-Hsien Chiang,et al. Neural and Fuzzy Methods in Handwriting Recognition , 1997, Computer.
[6] Kunle Olukotun,et al. The case for a single-chip multiprocessor , 1996, ASPLOS VII.
[7] Uday Bondhugula,et al. A Compile-Time Data Locality Optimization Framework for NUCA Chip Multiprocessors , 2008 .
[8] Kunle Olukotun,et al. Niagara: a 32-way multithreaded Sparc processor , 2005, IEEE Micro.
[9] Avi Mendelson,et al. CMP Implementation in Systems Based on the Intel Core Duo Processor , 2006 .
[10] Shyamkumar Thoziyoor,et al. CACTI 5.1 , 2008 .
[11] John L. Henning. SPEC CPU2000: Measuring CPU Performance in the New Millennium , 2000, Computer.
[12] Pierfrancesco Foglia,et al. An Evaluation of Behaviors of S-NUCA CMPs Running Scientific Workload , 2009, 2009 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools.
[13] David A. Wood,et al. Managing Wire Delay in Large Chip-Multiprocessor Caches , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[14] Kai Li,et al. The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[15] William J. Dally,et al. Principles and Practices of Interconnection Networks , 2004 .
[16] Valentin Puente,et al. ESP-NUCA: A low-cost adaptive Non-Uniform Cache Architecture , 2010, HPCA-16 2010: The Sixteenth International Symposium on High-Performance Computer Architecture.
[17] Kourosh Gharachorloo,et al. Architecture and design of AlphaServer GS320 , 2000, SIGP.
[18] Pierfrancesco Foglia,et al. Analysis of Performance Dependencies in NUCA-Based CMP Systems , 2009, 2009 21st International Symposium on Computer Architecture and High Performance Computing.
[19] Sudhakar Yalamanchili,et al. Interconnection Networks: An Engineering Approach , 2002 .
[20] Krste Asanovic,et al. Victim replication: maximizing capacity while hiding wire delay in tiled chip multiprocessors , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[21] Yu (Kevin) Cao,et al. What is Predictive Technology Model (PTM)? , 2009, SIGD.
[22] Babak Falsafi,et al. Reactive NUCA: near-optimal block placement and replication in distributed caches , 2009, ISCA '09.
[23] Alessandro Bardine,et al. A power-efficient migration mechanism for D-NUCA caches , 2009, 2009 Design, Automation & Test in Europe Conference & Exhibition.
[24] Balaram Sinharoy,et al. POWER5 system microarchitecture , 2005, IBM J. Res. Dev..
[25] Anoop Gupta,et al. The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.
[26] Ken Mai,et al. The future of wires , 2001, Proc. IEEE.
[27] David A. Wood,et al. ASR: Adaptive Selective Replication for CMP Caches , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[28] Doug Burger,et al. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches , 2002, ASPLOS X.
[29] Jichuan Chang,et al. Cooperative Caching for Chip Multiprocessors , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[30] Pierfrancesco Foglia,et al. Investigating Design Trade-Off in S-NUCA Based CMP Systems , 2009 .