Accelerating Sequential Applications on CMPs Using Core Spilling
[1] Haitham Akkary,et al. Continual flow pipelines , 2004, ASPLOS XI.
[2] James E. Smith,et al. Complexity-Effective Superscalar Processors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[3] Gurindar S. Sohi,et al. Register traffic analysis for streamlining inter-operation communication in fine-grain parallel processors , 1992, MICRO.
[4] Onur Mutlu,et al. Runahead execution: an alternative to very large instruction windows for out-of-order processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[5] Haitham Akkary,et al. Checkpoint processing and recovery: towards scalable large instruction window processors , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.
[6] Richard E. Kessler,et al. The Alpha 21264 microprocessor architecture , 1998, Proceedings International Conference on Computer Design. VLSI in Computers and Processors (Cat. No.98CB36273).
[7] Vikas Agarwal,et al. Clock rate versus IPC: the end of the road for conventional microarchitectures , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[8] Norman P. Jouppi,et al. The multicluster architecture: reducing cycle time through partitioning , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[9] Margaret Martonosi,et al. Multipath execution: opportunities and limits , 1998, ICS '98.
[10] Saman P. Amarasinghe. Multicores from the Compiler's Perspective: A Blessing or a Curse? , 2005, CGO.
[11] David I. August,et al. Microarchitectural exploration with Liberty , 2002, MICRO 35.
[12] Jaehyuk Huh,et al. Exploring the design space of future CMPs , 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[13] M. Bohr. Interconnect scaling-the real limiter to high performance ULSI , 1995, Proceedings of International Electron Devices Meeting.
[14] Balaram Sinharoy,et al. IBM Power5 chip: a dual-core multithreaded processor , 2004, IEEE Micro.
[15] Haitham Akkary,et al. Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors , 2003, MICRO.
[16] Luiz André Barroso,et al. Piranha: a scalable architecture based on single-chip multiprocessing , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[17] Brad Calder,et al. Threaded multiple path execution , 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).
[18] Lixin Zhang,et al. Adaptive mechanisms and policies for managing cache hierarchies in chip multiprocessors , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[19] Christopher Hughes,et al. Speculative precomputation: long-range prefetching of delinquent loads , 2001, ISCA 2001.
[20] Craig Zilles,et al. Execution-based prediction using speculative slices , 2001, ISCA 2001.
[21] Krste Asanovic,et al. Reducing power density through activity migration , 2003, ISLPED '03.
[22] John Paul Shen,et al. Dynamic speculative precomputation , 2001, MICRO.
[23] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[24] Gurindar S. Sohi,et al. Master/Slave Speculative Parallelization , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.
[25] Kunle Olukotun,et al. The case for a single-chip multiprocessor , 1996, ASPLOS VII.
[26] David J. Sager,et al. The microarchitecture of the Pentium 4 processor , 2001.
[27] Dean M. Tullsen,et al. Multithreaded value prediction , 2005, 11th International Symposium on High-Performance Computer Architecture.
[28] Mateo Valero,et al. Toward kilo-instruction processors , 2004, TACO.
[29] Norman P. Jouppi,et al. Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction , 2003, MICRO.
[30] Haitham Akkary,et al. Continual flow pipelines: achieving resource-efficient latency tolerance , 2004, IEEE Micro.
[31] Yale N. Patt,et al. Select-free instruction scheduling logic , 2001, MICRO.
[32] Rajeev Balasubramonian,et al. Dynamically allocating processor resources between nearby and distant ILP , 2001, ISCA 2001.
[33] Mateo Valero,et al. Delaying physical register allocation through virtual-physical registers , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[34] Gurindar S. Sohi,et al. Speculative data-driven multithreading , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[35] Gurindar S. Sohi,et al. Multiscalar processors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[36] Dean M. Tullsen,et al. Interconnections in multi-core architectures: understanding mechanisms, overheads and scaling , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[37] Dean M. Tullsen,et al. Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[38] Kunle Olukotun,et al. Niagara: a 32-way multithreaded Sparc processor , 2005, IEEE Micro.
[39] Eric Rotenberg,et al. Slipstream execution mode for CMP-based multiprocessors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[40] Brad Calder,et al. Phase tracking and prediction , 2003, ISCA '03.
[41] Todd M. Austin,et al. Cyclone: a broadcast-free dynamic instruction scheduler with selective replay , 2003, ISCA '03.
[42] Kunle Olukotun,et al. A Single-Chip Multiprocessor , 1997, Computer.
[43] Yunheung Paek,et al. Parallel Programming with Polaris , 1996, Computer.
[44] Eric Rotenberg,et al. A large, fast instruction window for tolerating cache misses , 2002, ISCA.
[45] José E. Moreira,et al. Evaluation of a multithreaded architecture for cellular computing , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.
[46] T. N. Vijaykumar,et al. Heat-and-run: leveraging SMT and CMP to manage power density through the operating system , 2004, ASPLOS XI.
[47] Dean M. Tullsen,et al. Clustered multithreaded architectures - pursuing both IPC and cycle time , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.