Fault-tolerance in FPGA focusing power reduction or performance enhancement

This paper presents a fault-tolerance methodology for FPGA-based designs that targets either power reduction or performance enhancement during on-field operation. The methodology is based on a new performance sensor that predictively detects errors in critical paths, allowing either the power-supply voltage (VDD) to be reduced, which lowers power, or the clock frequency (fclk) to be raised, which increases performance. The sensor is described in HDL, and its functionality is defined by the designer according to the target circuit configuration in the FPGA structure. The adaptive scheme uses an Automatic Voltage and Frequency Controller (AVFC) to modify fclk and/or VDD while still guaranteeing safe operation. The built-in sensors identify performance deviations, caused by parametric variations and/or aging, in pre-identified critical paths during circuit operation and throughout the product lifetime. The fclk increase is made possible by reducing the pessimistic safety margins that standard simulation tools define to account for variability. The sensors' delay margins are programmable, so an adequate delay margin can be chosen to guarantee safe operation. Conversely, the same performance can be achieved with a lower VDD. Simulation and experimental results with Virtex 5 and Spartan 6 FPGAs show that significant performance improvements (typically 30%) can be achieved with this methodology.
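The adaptive idea can be illustrated with a minimal software sketch. It is not the paper's AVFC or its HDL sensor: the `Sensor` class, the `tune_fclk` function, the 5 MHz step and the single-path delay model are all assumptions made for illustration. The sketch only shows how a programmable delay margin lets a controller raise fclk until a predictive warning would fire, and how a longer (aged) path delay leads to a lower safe frequency.

```python
from dataclasses import dataclass


@dataclass
class Sensor:
    """Hypothetical predictive sensor: warns before a real timing error occurs."""
    margin_ns: float  # programmable delay margin (guard band), in ns

    def warns(self, path_delay_ns: float, period_ns: float) -> bool:
        # Warn when the remaining slack is smaller than the programmed margin.
        return period_ns - path_delay_ns < self.margin_ns


def tune_fclk(sensor: Sensor, path_delay_ns: float,
              f_start_mhz: float, f_max_mhz: float,
              step_mhz: float = 5.0) -> float:
    """Raise fclk in steps until the sensor would warn.

    Assumes f_start_mhz is already a safe operating point.
    Returns the last frequency (MHz) at which no warning was raised.
    """
    f = f_start_mhz
    while f + step_mhz <= f_max_mhz:
        candidate = f + step_mhz
        period_ns = 1000.0 / candidate  # clock period for a frequency in MHz
        if sensor.warns(path_delay_ns, period_ns):
            break  # predictive warning: keep the previous safe frequency
        f = candidate
    return f


if __name__ == "__main__":
    sensor = Sensor(margin_ns=0.4)
    # Illustrative numbers only: a 4.5 ns critical path, later aged to 5.0 ns.
    print(tune_fclk(sensor, 4.5, f_start_mhz=150.0, f_max_mhz=300.0))
    print(tune_fclk(sensor, 5.0, f_start_mhz=150.0, f_max_mhz=300.0))
```

The same loop could step VDD down at a fixed fclk instead, but that would need a delay-versus-voltage model, which this sketch does not attempt.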
