Variable pipeline structure for Coarse Grained Reconfigurable Array CMA

Cool Mega-Array (CMA) is a coarse-grained reconfigurable architecture (CGRA) that has demonstrated ultra-low-power computation. However, because CMA completely eliminates clock trees and registers, its performance improvement has been limited. In this paper, we introduce a variable pipeline structure to CMA, using only the minimum essential registers, to provide a wider trade-off between performance and energy. Compared with the baseline CMA (a non-pipelined structure), performance improved by 77% on average with a small power overhead. Moreover, the energy efficiency reached up to 1461 MOPS/mW, about 2× that of the baseline structure. The best pipeline depth for an arbitrary energy-performance trade-off can be selected with only 11% area overhead.
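To make the energy-efficiency metric and the idea of a selectable pipeline depth concrete, the following is a minimal sketch, not the selection method used in the paper. It assumes energy efficiency is throughput divided by power (MOPS/mW) and that the best depth is the most efficient one that still meets a throughput floor. The names `OperatingPoint` and `best_depth`, and every number in `POINTS`, are hypothetical illustrations rather than measured results.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class OperatingPoint:
    """One pipeline configuration (hypothetical model, not from the paper)."""
    depth: int              # number of pipeline stages; 1 = non-pipelined baseline
    throughput_mops: float  # throughput in MOPS
    power_mw: float         # power in mW

    @property
    def efficiency(self) -> float:
        # Energy efficiency in MOPS/mW, assumed to be throughput / power.
        return self.throughput_mops / self.power_mw


def best_depth(points: Iterable[OperatingPoint],
               min_throughput_mops: float = 0.0) -> OperatingPoint:
    """Return the most energy-efficient point that meets the throughput floor."""
    feasible = [p for p in points if p.throughput_mops >= min_throughput_mops]
    if not feasible:
        raise ValueError("no pipeline depth meets the throughput requirement")
    return max(feasible, key=lambda p: p.efficiency)


# Made-up operating points, for illustration only.
POINTS = [
    OperatingPoint(depth=1, throughput_mops=100.0, power_mw=0.20),
    OperatingPoint(depth=2, throughput_mops=170.0, power_mw=0.25),
    OperatingPoint(depth=4, throughput_mops=240.0, power_mw=0.40),
]

if __name__ == "__main__":
    for p in POINTS:
        print(f"depth={p.depth}: {p.efficiency:.0f} MOPS/mW")
    print("most efficient depth:", best_depth(POINTS).depth)
    print("most efficient depth with >= 200 MOPS:",
          best_depth(POINTS, min_throughput_mops=200.0).depth)
```

With these made-up numbers, the unconstrained choice and the throughput-constrained choice differ, which is the kind of trade-off a variable pipeline depth makes available.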
