A Survey of Coarse-Grained Reconfigurable Architecture and Design

As general-purpose processors have hit the power wall and chip fabrication costs escalate alarmingly, coarse-grained reconfigurable architectures (CGRAs) are attracting increasing interest from both academia and industry, because they offer the performance and energy efficiency of hardware with the flexibility of software. However, CGRAs are not yet mature in terms of programmability, productivity, and adaptability. This article thoroughly reviews the architecture and design of CGRAs with the aim of exploiting their full potential. First, a novel multidimensional taxonomy is proposed. Second, the major challenges and the corresponding state-of-the-art techniques are surveyed and analyzed. Finally, directions for future development are discussed.
