Fracturable DSP Block for Multi-context Reconfigurable Architectures

Multi-context architectures such as NATURE let low-power applications exploit fast context switching to improve energy efficiency and reduce area footprint. NATURE incorporates 16-bit reconfigurable DSP blocks to accelerate arithmetic computations; however, their fixed precision prevents efficient reuse in mixed-width arithmetic circuits. This paper presents an improved DSP block architecture for NATURE with native support for temporal folding and run-time fracturability. The proposed DSP block can compute multiple sub-width operations in the same clock cycle and can switch dynamically between sub-width and full-width operations from one cycle to the next. The NanoMap tool, which maps circuits onto NATURE, is extended to exploit the fracturable multiplier unit in the DSP block. We demonstrate the efficiency of the proposed dynamically fracturable DSP block by implementing logic-intensive and compute-intensive benchmark applications. Our results show that, without exploiting logic folding, the fracturable DSP block can achieve a 53.7% reduction in DSP block utilization and a 42.5% reduction in area, with a 122.5% reduction in power–delay product (P–D). For circuits that use NATURE’s temporal folding, we also observe an average P–D reduction of 6.43% compared to the existing full-precision DSP block in NATURE, leading to highly compact, energy-efficient designs.
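To make the idea of run-time fracturability concrete, the following is a minimal behavioral sketch in Python, not the circuit described in the paper. It assumes that a 16-bit unsigned multiplier is built from four 8-bit partial-product units, so that the same units can either produce two independent 8x8 products in one cycle or be combined into one 16x16 product. The 8-bit sub-width, the mode names, and the functions `mul8` and `fracturable_mul` are all hypothetical and chosen only for illustration.

```python
MASK8 = 0xFF  # assumed sub-width of 8 bits (illustrative, not from the paper)


def mul8(a: int, b: int) -> int:
    """Model one 8x8 unsigned partial-product unit (hypothetical)."""
    return (a & MASK8) * (b & MASK8)


def fracturable_mul(a: int, b: int, mode: str):
    """Behavioral model of a multiplier that can be fractured per cycle.

    mode == "split": treat each 16-bit operand as two packed 8-bit values
                     and return two independent 8x8 products (high, low).
    mode == "full":  return the single 16x16 unsigned product, assembled
                     from the same four 8x8 partial products.
    """
    a_lo, a_hi = a & MASK8, (a >> 8) & MASK8
    b_lo, b_hi = b & MASK8, (b >> 8) & MASK8

    if mode == "split":
        # Two sub-width operations computed in the same cycle.
        return mul8(a_hi, b_hi), mul8(a_lo, b_lo)

    if mode == "full":
        # Shift-and-add the four partial products into one full-width result.
        cross = mul8(a_hi, b_lo) + mul8(a_lo, b_hi)
        return (mul8(a_hi, b_hi) << 16) + (cross << 8) + mul8(a_lo, b_lo)

    raise ValueError(f"unknown mode: {mode!r}")


# The mode can change from one call (cycle) to the next.
full_product = fracturable_mul(0x1234, 0x5678, "full")
split_products = fracturable_mul(0x1234, 0x5678, "split")
```

In this sketch the `mode` argument stands in for a per-cycle configuration: the same partial-product units serve both modes, which is the kind of reuse that lets mixed-width circuits occupy fewer DSP blocks. Signed arithmetic, accumulation, and the mapping decisions made by NanoMap are deliberately left out.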
