Design of a coarse-grained reconfigurable architecture with floating-point support and comparative study

As demand for compute-intensive applications in electronic systems has grown rapidly, researchers have focused on coarse-grained reconfigurable architectures for their high performance and flexibility. This paper presents FloRA, a coarse-grained reconfigurable architecture with floating-point support. A two-dimensional array of integer processing elements in FloRA is configured at run-time to perform floating-point operations as well as integer operations. In a chip fabricated using a 130nm process, the additional hardware for floating-point operations adds about 7.4% to the total area compared to the previous architecture, which does not support floating-point operations. The fabricated chip runs at a 125MHz clock frequency with a 1.2V power supply. Experiments show an average speedup of 11.6x over an ARM9 with a vector floating-point unit, for integer-only benchmark programs as well as programs containing floating-point operations. Compared with similar approaches, including XPP and Butter, the proposed architecture shows much higher performance for integer applications while maintaining about half the performance of Butter for floating-point applications.
