Loop Unrolling for Energy Efficiency in Low-Cost Field-Programmable Gate Arrays

Field-programmable gate arrays (FPGAs) are used for a wide variety of computations in low-cost embedded systems. Although these systems often have modest performance constraints, their energy consumption must typically be limited. Many FPGA applications employ repetitive loops that cannot be straightforwardly split into parallel computations. Performing a loop sequentially generally requires high-speed clocks that consume considerable clock power and sometimes require clock generation using a phase-locked loop (PLL). Loop unrolling addresses the high-speed clock issue, but its use often leads to significant combinational glitch power. In this work, a computer-aided design (CAD) approach that unrolls loops for designs targeted to low-cost FPGAs is described. Our approach considers latency constraints in an effort to minimize energy consumption for loop-based computation. To reduce glitch power, a glitch-filtering approach is introduced that provides a balance between glitch reduction and design performance. Glitch-filter enable signals are generated and routed to the filters using resources best suited to the target FPGA. Our approach automatically inserts glitch filters and associated control logic into a design prior to processing with FPGA synthesis, place, and route tools. Our energy-saving loop-unrolling approach has been evaluated using five benchmarks often used in low-cost FPGAs. The energy-saving capabilities of the approach have been evaluated for an Intel Cyclone IV and a Xilinx Artix-7 FPGA using board-level power measurement. The use of unrolling and glitch filtering is shown to reduce energy by at least 65% for an Artix-7 device and 50% for a Cyclone IV device while meeting design latency constraints.
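The trade-off between unrolling and clock speed described above can be illustrated with a small arithmetic sketch. This is a toy model and not the CAD flow from the paper: the function names, the 32-iteration loop, and the 1 µs latency budget are all hypothetical values chosen for illustration. It shows only that unrolling a loop by a factor `unroll` cuts the number of clock cycles needed, and so lowers the minimum clock frequency that still meets a fixed latency constraint. It does not model glitch power, combinational path delay, or energy.

```python
import math


def cycles_needed(iterations: int, unroll: int) -> int:
    """Clock cycles to finish the loop when `unroll` iterations run per cycle."""
    if iterations <= 0 or unroll <= 0:
        raise ValueError("iterations and unroll must be positive")
    return math.ceil(iterations / unroll)


def min_clock_mhz(iterations: int, unroll: int, latency_us: float) -> float:
    """Lowest clock frequency (MHz) that completes the loop within latency_us.

    Cycles per microsecond equals MHz, so no unit conversion is needed.
    """
    if latency_us <= 0:
        raise ValueError("latency_us must be positive")
    return cycles_needed(iterations, unroll) / latency_us


# Hypothetical example: a 32-iteration loop with a 1 microsecond latency budget.
ITERATIONS = 32
LATENCY_US = 1.0

for unroll in (1, 2, 4, 8, 16, 32):
    freq = min_clock_mhz(ITERATIONS, unroll, LATENCY_US)
    print(f"unroll={unroll:2d}  cycles={cycles_needed(ITERATIONS, unroll):2d}  "
          f"min clock={freq:5.1f} MHz")
```

In this simplified picture, a fully sequential loop needs the fastest clock, while deeper unrolling relaxes the clock requirement at the cost of longer combinational chains, which is where the glitch power discussed in the abstract arises.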
