MLP-aware dynamic instruction window resizing for adaptively exploiting both ILP and MLP
[1] Fred J. Pollack. New microarchitecture challenges in the coming generations of CMOS process technologies (keynote address) (abstract only), 1999, MICRO.
[2] Onur Mutlu, et al. Runahead Execution: An Effective Alternative to Large Instruction Windows, 2003, IEEE Micro.
[3] Francisco J. Cazorla, et al. Kilo-instruction processors: overcoming the memory wall, 2005, IEEE Micro.
[4] Eric Rotenberg, et al. A large, fast instruction window for tolerating cache misses, 2002, ISCA.
[5] Michael C. Huang, et al. Dynamically Tuning Processor Resources with Adaptive Processing, 2003, Computer.
[6] Stefanos Kaxiras, et al. MLP-Aware Instruction Queue Resizing: The Key to Power-Efficient Performance, 2010, ARCS.
[7] Gürhan Küçük, et al. Reducing power requirements of instruction scheduling through dynamic allocation of multiple datapath resources, 2001, MICRO.
[8] Eric M. Schwarz, et al. IBM POWER6 microarchitecture, 2007, IBM J. Res. Dev.
[9] Yuan Chou, et al. Low-Cost Epoch-Based Correlation Prefetching for Commercial Applications, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[10] Haitham Akkary, et al. Continual flow pipelines, 2004, ASPLOS XI.
[11] James E. Smith, et al. Complexity-Effective Superscalar Processors, 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[12] Marc Tremblay, et al. Rock: A High-Performance Sparc CMT Processor, 2009, IEEE Micro.
[13] Chris Wilkerson, et al. Hierarchical Scheduling Windows, 2002, MICRO.
[14] Jung Ho Ahn, et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures, 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[15] R. J. Joenk, et al. IBM journal of research and development: information for authors, 1978.
[16] Onur Mutlu, et al. Techniques for efficient processing in runahead execution engines, 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[17] Jean-Loup Baer, et al. An effective on-chip preloading scheme to reduce data access penalty, 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[18] Norman P. Jouppi, et al. CACTI 6.0: A Tool to Model Large Caches, 2009.
[19] Hideki Ando, et al. Evaluation of issue queue delay: Banking tag RAM and identifying correct critical path, 2011, 2011 IEEE 29th International Conference on Computer Design (ICCD).
[20] Joseph Shor, et al. A Fully Integrated Multi-CPU, Processor Graphics, and Memory Controller 32-nm Processor, 2012, IEEE Journal of Solid-State Circuits.
[21] Norman P. Jouppi, et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers, 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.
[22] Antonio González, et al. Energy-effective issue logic, 2001, ISCA 2001.