Reliability assessment of backward error recovery for SRAM-based FPGAs

Reliability is a major concern for embedded systems. The semiconductor devices used to implement them can suffer from various environmental perturbations. This is especially evident for SRAM-based FPGAs, where perturbations are frequent and can limit the device's usability. In this paper, a new fault tolerance approach is presented that takes advantage of the partial dynamic reconfiguration provided by SRAM-based FPGAs. The approach relies on Backward Error Recovery (BER) to mitigate faults in the configuration layer by restoring the correct behavior of the application. Fault injection by emulation is used to evaluate the reliability of the proposed fault mitigation technique, and the results are compared with those obtained when configuration scrubbing is used. An improvement of up to 12% in the reliability and availability of the Design Under Test is observed.
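The general idea of backward error recovery can be illustrated with a small software sketch. This is a toy model, not the implementation evaluated in the paper: the function `run_with_ber`, the `faulty_steps` set standing in for a fault detector, and the dictionary used as application state are all hypothetical names introduced here for illustration. The sketch saves a checkpoint after each fault-free step and, when a fault is flagged, discards the current state and re-executes from the last checkpoint.

```python
import copy


def run_with_ber(state, steps, step_fn, faulty_steps):
    """Run step_fn `steps` times with checkpoint/rollback recovery.

    faulty_steps is a hypothetical stand-in for a fault detector: each
    index in it is flagged as faulty exactly once (a transient upset).
    """
    pending = set(faulty_steps)
    checkpoint = copy.deepcopy(state)  # last known-good state
    rollbacks = 0
    i = 0
    while i < steps:
        step_fn(state)
        if i in pending:
            # Assumption: the fault is repaired here (on an FPGA this could
            # be a partial reconfiguration of the affected region).
            pending.discard(i)
            # Backward recovery: drop the possibly corrupted state and
            # resume from the last checkpoint, re-executing step i.
            state = copy.deepcopy(checkpoint)
            rollbacks += 1
            continue
        checkpoint = copy.deepcopy(state)  # commit the fault-free step
        i += 1
    return state, rollbacks


def increment(s):
    """Example workload: count completed steps."""
    s["count"] += 1


final, rollbacks = run_with_ber({"count": 0}, 5, increment, {1, 3})
print(final, rollbacks)
```

In this sketch the result matches a fault-free run, at the cost of re-executing each flagged step once. Configuration scrubbing, by contrast, repairs the configuration memory but does not by itself restore the application state.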
