Reconfigurable Buffer Structures for Coarse-Grained Reconfigurable Arrays

Coarse-Grained Reconfigurable Arrays (CGRAs) have emerged as a powerful solution to speedup computationally intensive applications. Heterogeneous MPSoC architectures containing such reconfigurable accelerators have the advantage of providing high flexibility, power-efficiency, and high performance. However, CGRAs may suffer from a data access bottleneck. To mitigate this problem, we present a reconfigurable buffer architecture for CGRAs. Here, the buffers can be configured at runtime to select between different schemes for memory access, i.e., addressable RAMs or pixel buffers. We showcase the benefits of our approach by prototyping a heterogeneous MPSoC architecture containing a RISC processor and a class of CGRA called Tightly Coupled Processor Arrays (TCPAs). The architecture is prototyped in FPGA technology. For basic image processing algorithms, we demonstrate that our proposed buffer structures for system integration allow to increase the memory bandwidth utilization and allow for a performance improvement of up to 7% in comparison to state-of-the-art solutions for image processing.

[1]  Jürgen Teich,et al.  System integration of tightly-coupled processor arrays using reconfigurable buffer structures , 2013, CF '13.

[2]  Walid A. Najjar,et al.  Input data reuse in compiling window operations onto reconfigurable hardware , 2004, LCTES '04.

[3]  Frank Hannig,et al.  Invasive Tightly-Coupled Processor Arrays , 2014, ACM Trans. Embed. Comput. Syst..

[4]  Jürgen Teich,et al.  PARO: Synthesis of Hardware Accelerators for Multi-Dimensional Dataflow-Intensive Applications , 2008, ARC.

[5]  Jürgen Teich,et al.  Invasive Algorithms and Architectures Invasive Algorithmen und Architekturen , 2008, it Inf. Technol..

[6]  Jürgen Teich,et al.  Power-Efficient Reconfiguration Control in Coarse-Grained Dynamically Reconfigurable Architectures , 2009, J. Low Power Electron..

[7]  Marc Reichenbach,et al.  A Generic VHDL Template for 2D Stencil Code Applications on FPGAs , 2012, 2012 IEEE 15th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing Workshops.

[8]  Jürgen Teich,et al.  A highly parameterizable parallel processor array architecture , 2006, 2006 IEEE International Conference on Field Programmable Technology.

[9]  Vikram Bhatt,et al.  The GreenDroid Mobile Application Processor: An Architecture for Silicon's Dark Future , 2011, IEEE Micro.

[10]  Jürgen Teich,et al.  Mapping a class of dependence algorithms to coarse-grained reconfigurable arrays: architectural parameters and methodology , 2006, Int. J. Embed. Syst..

[11]  Jack Jean,et al.  Data Buffering and Allocation in Mapping Generalized Template Matching on Reconfigurable Systems , 2004, The Journal of Supercomputing.

[12]  Markus Weinhardt,et al.  PACT XPP—A Self-Reconfigurable Data Processing Architecture , 2004, The Journal of Supercomputing.

[13]  Jürgen Teich,et al.  Scheduling of partitioned regular algorithms on processor arrays with constrained resources , 1996, Proceedings of International Conference on Application Specific Systems, Architectures and Processors: ASAP '96.

[14]  Bjorn De Sutter,et al.  Architecture Enhancements for the ADRES Coarse-Grained Reconfigurable Array , 2008, HiPEAC.

[15]  Scott A. Mahlke,et al.  Efficient performance scaling of future CGRAs for mobile applications , 2012, 2012 International Conference on Field-Programmable Technology.