Architectural Specialization for Inter-Iteration Loop Dependence Patterns
Shreesha Srinath | Berkin Ilbeyi | Christopher Batten | Zhiru Zhang | Gai Liu | Mingxing Tan
[1] Ken Kennedy,et al. Practical dependence testing , 1991, PLDI '91.
[2] Multiscalar processors , 1995, ISCA 1995.
[3] J. Wawrzynek. Spert-II: A Vector Microprocessor System, Special Issue of Neural Computing in , 1996 .
[4] Brian Kingsbury,et al. Spert-II: A Vector Microprocessor System , 1996, Computer.
[5] Mateo Valero,et al. Vector architectures: past, present and future , 1998, ICS '98.
[6] Josep Torrellas,et al. A Chip-Multiprocessor Architecture with Speculative Multithreading , 1999, IEEE Trans. Computers.
[7] Antonia Zhai,et al. A scalable approach to thread-level speculation , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[8] C. Jesshope. Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines , 2001, Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001.
[9] Trevor Mudge,et al. MiBench: A free, commercially representative embedded benchmark suite , 2001 .
[10] Chris R. Jesshope. Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines , 2001 .
[11] Christoforos E. Kozyrakis,et al. Scalable Vector Processors for Embedded Systems , 2003, IEEE Micro.
[12] Automatic Application-Specific Instruction-Set Extensions Under Microarchitectural Constraints , 2003, International Journal of Parallel Programming.
[13] Jason Cong,et al. Application-specific instruction generation for configurable processor architectures , 2004, FPGA '04.
[14] Scott A. Mahlke,et al. Application-Specific Processing on a General-Purpose Core via Transparent Instruction Set Customization , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[15] Christopher Batten,et al. The vector-thread architecture , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[16] Christopher J. Hughes,et al. Carbon: architectural support for fine-grained parallelism on chip multiprocessors , 2007, ISCA '07.
[17] Scott A. Mahlke,et al. Uncovering hidden loop level parallelism in sequential applications , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[18] David Black-Schaffer,et al. Efficient Embedded Computing , 2008, Computer.
[19] Erik Lindholm,et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture , 2008, IEEE Micro.
[20] Norman P. Jouppi,et al. CACTI 6.0: A Tool to Model Large Caches , 2009 .
[21] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[22] James R. Larus,et al. Transactional Memory, 2nd edition , 2010, Transactional Memory.
[23] Christoforos E. Kozyrakis,et al. Flexible architectural support for fine-grain scheduling , 2010, ASPLOS XV.
[24] Bradford M. Beckmann,et al. The gem5 simulator , 2011, CARN.
[25] Steven Swanson,et al. QSCORES: Trading dark silicon for scalable energy efficiency with quasi-specific cores , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[26] Karthikeyan Sankaralingam,et al. Dynamically Specialized Datapaths for energy efficient computing , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[27] Amin Ansari,et al. Bundled execution of recurring traces for energy-efficient general purpose processing , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[28] Krste Asanović,et al. Exploring the tradeoffs between programmability and efficiency in data-parallel accelerators , 2011 .
[29] Emmett Kilgariff,et al. Fermi GF100 GPU Architecture , 2011, IEEE Micro.
[30] Karthikeyan Sankaralingam,et al. DySER: Unifying Functionality and Parallelism Specialization for Energy-Efficient Computing , 2012, IEEE Micro.
[31] Guy E. Blelloch,et al. Brief announcement: the problem based benchmark suite , 2012, SPAA '12.
[32] Karthikeyan Sankaralingam,et al. Breaking SIMD shackles with an exposed flexible microarchitecture and the access execute PDG , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[33] Gu-Yeon Wei,et al. HELIX-RC: An architecture-compiler co-design for automatic parallelization of irregular programs , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[34] Willie Anderson,et al. Hexagon DSP: An Architecture Optimized for Mobile Multimedia and Communications , 2014, IEEE Micro.
[35] Christopher Batten,et al. PyMTL: A Unified Framework for Vertically Integrated Computer Architecture Research , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.