The Good Block: Hardware/Software Design for Composable, Block-Atomic Processors
Kathryn S. McKinley | Bertrand A. Maher | Doug Burger | Katherine E. Coons
[1] Joseph A. Fisher,et al. Trace Scheduling: A Technique for Global Microcode Compaction , 1981, IEEE Transactions on Computers.
[2] Jeffrey R. Diamond,et al. An evaluation of the TRIPS computer system , 2009, ASPLOS.
[3] Y. Patt,et al. Exploiting fine-grained parallelism through a combination of hardware and software techniques , 1991, Proceedings of the 18th Annual International Symposium on Computer Architecture.
[4] Per Stenström. Transactions on High-Performance Embedded Architectures and Compilers I , 2007, Trans. HiPEAC.
[5] Karthikeyan Sankaralingam,et al. Dataflow Predication , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[6] Saman P. Amarasinghe,et al. Meta optimization: improving compiler heuristics with machine learning , 2003, PLDI '03.
[7] Scott Mahlke,et al. Effective compiler support for predicated execution using the hyperblock , 1992, MICRO 1992.
[8] Keith D. Cooper,et al. Value-driven redundancy elimination , 1996 .
[9] Aaron Smith,et al. Merging Head and Tail Duplication for Convergent Hyperblock Formation , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[10] Kathryn S. McKinley,et al. Strategies for mapping dataflow blocks to distributed hardware , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[11] Mike Dahlin,et al. Scaling to the End of Silicon with EDGE Architectures , 2004 .
[12] John R. Koza,et al. Genetic programming - on the programming of computers by means of natural selection , 1993, Complex adaptive systems.
[13] Engin Ipek,et al. Core fusion: accommodating software diversity in chip multiprocessors , 2007, ISCA '07.
[14] Doug Burger,et al. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches , 2002, ASPLOS X.
[15] Aaron Smith,et al. Compiling for EDGE architectures , 2006, International Symposium on Code Generation and Optimization (CGO'06).
[16] Scott A. Mahlke,et al. The superblock: An effective technique for VLIW and superscalar compilation , 1993, The Journal of Supercomputing.
[17] S. Winkel. Optimal versus Heuristic Global Code Scheduling , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[18] Yale N. Patt,et al. Exploiting Fine-Grained Parallelism Through a Combination of Hardware and Software Techniques , 1991, ISCA.
[19] Kathryn S. McKinley,et al. Atomic block formation for explicit data graph execution architectures , 2010 .
[20] Lizy Kurian John,et al. Scaling to the end of silicon with EDGE architectures , 2004, Computer.
[21] Yale N. Patt,et al. Hardware Support For Large Atomic Units in Dynamically Scheduled Machines , 1988, Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21.
[22] Kathryn S. McKinley,et al. Convergent Compilation Applied to Loop Unrolling , 2007, Trans. High Perform. Embed. Archit. Compil..
[23] Brad Calder,et al. Basic block distribution analysis to find periodic behavior and simulation points in applications , 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[24] Yale N. Patt,et al. Enhancing instruction scheduling with a block-structured ISA , 2007, International Journal of Parallel Programming.
[25] Milos D. Ercegovac,et al. The Art of Deception: Adaptive Precision Reduction for Area Efficient Physics Acceleration , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[26] Mark D. Hill,et al. Amdahl's Law in the Multicore Era , 2008 .