Improving the performance and power efficiency of shared helpers in CMPs
[1] Yale N. Patt,et al. Reducing the performance impact of instruction cache misses by writing instructions into the reservation stations out-of-order , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[2] Chris Wilkerson,et al. Locality vs. criticality , 2001, ISCA 2001.
[3] William H. Mangione-Smith,et al. The filter cache: an energy efficient memory structure , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[4] Yale N. Patt,et al. A comprehensive instruction fetch mechanism for a processor supporting speculative execution , 1992, MICRO 1992.
[5] Ho-Seop Kim,et al. An instruction set and microarchitecture for instruction level distributed processing , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[6] John Paul Shen,et al. Helper threads via virtual multithreading , 2004, IEEE Micro.
[7] Margaret Martonosi,et al. Wattch: a framework for architectural-level power analysis and optimizations , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[8] Trung A. Diep,et al. A case for shared instruction cache on chip multiprocessors running OLTP , 2004, SIGARCH Comput. Archit. News.
[9] B. Calder,et al. A scalable front-end architecture for fast instruction delivery , 1999, Proceedings of the 26th International Symposium on Computer Architecture (Cat. No.99CB36367).
[10] Rajeev Balasubramonian,et al. Reducing the complexity of the register file in dynamic superscalar processors , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[11] André Seznec,et al. CASH: Revisiting Hardware Sharing in Single-Chip Parallel Processors , 2004, J. Instr. Level Parallelism.
[12] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[13] Kai Wang,et al. Highly accurate data value prediction using hybrid predictors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[14] M TullsenDean,et al. Symbiotic jobscheduling for a simultaneous mutlithreading processor , 2000 .
[15] Glenn Reinman,et al. An Evaluation of Deeply Decoupled Cores , 2006, J. Instr. Level Parallelism.
[16] Brad Calder,et al. Phase tracking and prediction , 2003, ISCA '03.
[17] James E. Smith,et al. Instruction Level Distributed Processing , 2000, HiPC.
[18] Norman P. Jouppi,et al. Conjoined-Core Chip Multiprocessing , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[19] Dean M. Tullsen,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000, SIGP.
[20] T. Sherwood,et al. Predictor-directed stream buffers , 2000, Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000.
[21] Norman P. Jouppi,et al. Single-ISA heterogeneous multi-core architectures: the potential for processor power reduction , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36.
[22] Kunle Olukotun,et al. A Single-Chip Multiprocessor , 1997, Computer.
[23] Norman P. Jouppi,et al. CACTI 3.0: an integrated cache timing, power, and area model , 2001 .