Multi-granular Arithmetic in a Coarse-Grain Reconfigurable Architecture

A mismatch between operand width and hardware operation width is a source of energy inefficiency. This work proposes multi-granular arithmetic, which adapts the hardware operation width to the application and so avoids wasting energy on unused bits. In particular, multi-granular arithmetic is considered in the context of coarse-grain reconfigurable architectures for four operations: addition, accumulation, multiplication, and multiply-accumulation. Using a silicon synthesis tool flow, it is shown that the multi-granular designs can perform narrow-width operations, e.g. an 8-by-8 multiplication, much more efficiently than standard full-width circuits. For multiplication, the required energy is reduced by up to a factor of 15 under realistic conditions when compared to a full-width 32-by-32 multiplier.
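To make the idea of adapting operation width concrete, the following is a minimal software sketch, not the circuit evaluated in this work. It assumes a multiplier tiled from 8-by-8 sub-multipliers and unsigned operands; the tiling scheme and the names `split_blocks`, `multigranular_multiply`, and `active_blocks` are hypothetical and chosen only for illustration.

```python
# Illustrative model only: a multiplier assembled from 8x8 sub-multipliers.
# The tiling and all names here are assumptions, not the paper's design.

BLOCK_BITS = 8
BLOCK_MASK = (1 << BLOCK_BITS) - 1


def split_blocks(value, width):
    """Split an unsigned value into little-endian 8-bit blocks."""
    return [(value >> shift) & BLOCK_MASK
            for shift in range(0, width, BLOCK_BITS)]


def multigranular_multiply(a, b, width):
    """Multiply two unsigned `width`-bit operands from 8x8 partial products.

    Returns (product, active_blocks), where active_blocks counts the 8x8
    sub-multipliers that the selected width would enable.
    """
    if width % BLOCK_BITS != 0 or not 0 < width <= 32:
        raise ValueError("width must be 8, 16, 24, or 32")
    if a < 0 or b < 0 or a >> width or b >> width:
        raise ValueError("operands must be unsigned and fit in the width")

    product = 0
    active_blocks = 0
    for i, a_blk in enumerate(split_blocks(a, width)):
        for j, b_blk in enumerate(split_blocks(b, width)):
            # Each partial product is weighted by the position of its blocks.
            product += (a_blk * b_blk) << (BLOCK_BITS * (i + j))
            active_blocks += 1
    return product, active_blocks
```

In this model an 8-bit operation enables a single block, while a 32-bit operation enables sixteen. The block count is only a rough proxy for switching activity; it does not model the adder tree, signed operands, or leakage, and it is not meant to reproduce the energy figures reported above.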
