Victim Replication: Maximizing Capacity while Hiding Wire Delay in Tiled Chip Multiprocessors
[1] Thomas F. Wenisch,et al. SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling , 2003, ISCA '03.
[2] Eric Sprangle,et al. Increasing processor performance by implementing deeper pipelines , 2002, ISCA.
[3] David A. Wood,et al. Managing Wire Delay in Large Chip-Multiprocessor Caches , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[4] Pradip Bose,et al. Optimizing pipelines for power and performance , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.
[5] Norman P. Jouppi,et al. The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays , 2002, ISCA.
[6] Luiz André Barroso,et al. Piranha: a scalable architecture based on single-chip multiprocessing , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[7] Ken Mai,et al. The future of wires , 2001, Proc. IEEE.
[8] Vikas Agarwal,et al. Clock rate versus IPC: the end of the road for conventional microarchitectures , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[9] Henry Hoffmann,et al. Evaluation of the Raw microprocessor: an exposed-wire-delay architecture for ILP and streams , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[10] Balaram Sinharoy,et al. POWER4 system microarchitecture , 2002, IBM J. Res. Dev..
[11] Josep Torrellas,et al. Cache-Only Memory Architectures , 1999, Computer.
[12] Kunle Olukotun,et al. The case for a single-chip multiprocessor , 1996, ASPLOS VII.
[13] Alaa R. Alameldeen,et al. Addressing Workload Variability in Architectural Simulations , 2003, IEEE Micro.
[14] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre , 1990, ISCA 1990.
[15] Josep Torrellas,et al. Reducing remote conflict misses: NUMA with remote cache versus COMA , 1997, Proceedings Third International Symposium on High-Performance Computer Architecture.
[16] Krste Asanovic,et al. Accelerating Multiprocessor Simulation with a Memory Timestamp Record , 2005, IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005.
[17] N. Ranganathan,et al. Utilization of Cache Area in On-Chip Multiprocessor , 1999, ISHPC.
[18] T. N. Vijaykumar,et al. Distance associativity for high-performance energy-efficient non-uniform cache architectures , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.
[19] Thomas R. Puzak,et al. Optimum power/performance pipeline depth , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.
[20] Jaehyuk Huh,et al. Exploring the design space of future CMPs , 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[21] Doug Burger,et al. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches , 2002, ASPLOS X.