39.9 GOPs/watt multi-mode CGRA accelerator for a multi-standard basestation

This paper presents an industrial case study of a Coarse Grain Reconfigurable Architecture (CGRA) used as a multi-mode accelerator for two kernels that execute in a mutually exclusive manner: the FFT for the LTE standard and the Correlation Pool for the UMTS standard. The CGRA multi-mode accelerator achieved a computational efficiency of 39.94 GOPS/watt (where one OP is a multiply-add) and a silicon efficiency of 56.20 GOPS/mm². An in-house tool then analyzed the code, inferred which features of the fully programmable solution were unused, and automatically customized the design to run just the two kernels; this improved the two efficiency metrics to 49.05 GOPS/watt and 107.57 GOPS/mm². The corresponding numbers for the ASIC implementation are 63.84 GOPS/watt and 90.91 GOPS/mm². Though the ASIC's silicon and computational efficiency numbers are slightly better, the engineering efficiency of the pre-verified, pre-characterized CGRA solution is at least 10X better than that of the ASIC solution.
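To make the comparison among the three design points easier to follow, the quoted figures can be turned into relative ratios. The short Python sketch below is illustrative only: the dictionary keys and the `ratio` helper are hypothetical names introduced here, and the only data used are the six numbers stated in the abstract.

```python
# Efficiency figures quoted in the abstract, as
# (computational efficiency in GOPS/watt, silicon efficiency in GOPS/mm^2).
# The labels are hypothetical names chosen for this sketch.
EFFICIENCY = {
    "cgra_programmable": (39.94, 56.20),
    "cgra_customized": (49.05, 107.57),
    "asic": (63.84, 90.91),
}


def ratio(design, baseline):
    """Return (GOPS/watt ratio, GOPS/mm^2 ratio) of design over baseline."""
    design_watt, design_area = EFFICIENCY[design]
    base_watt, base_area = EFFICIENCY[baseline]
    return design_watt / base_watt, design_area / base_area


if __name__ == "__main__":
    comparisons = [
        ("cgra_customized", "cgra_programmable"),
        ("asic", "cgra_programmable"),
        ("asic", "cgra_customized"),
    ]
    for design, baseline in comparisons:
        watt_ratio, area_ratio = ratio(design, baseline)
        print(f"{design} vs {baseline}: "
              f"{watt_ratio:.2f}x GOPS/watt, {area_ratio:.2f}x GOPS/mm^2")
```

A ratio above 1 means the first design is more efficient on that metric than the second; a ratio below 1 means it is less efficient.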
