Repair of FPGA-Based Real-Time Systems With Variable Slacks

Field-programmable gate arrays (FPGAs) based on SRAM cells are an attractive alternative for real-time system designers, as they offer high density, low cost, and high performance. However, the SRAM cells in the FPGA’s configuration memory that enable these characteristics also create a reliability hazard, because they are susceptible to single-event upsets (SEUs). The usual mitigation combines double or triple redundancy with a correction mechanism, such as periodic scrubbing. Although scrubbing is an effective technique for removing SEU-induced errors, the repair of real-time systems presents specific challenges, such as avoiding failures caused by missed real-time deadlines. This article proposes a novel deadline-aware scrubbing scheme with negligible area cost that dynamically chooses the scrubbing starting position. The scheme avoids missing real-time deadlines while maximizing the repair probability within a bounded repair time. Taking into account the probability of missing deadlines due to faults, our approach reduces the failure rate by 33.39% on average, with an average area cost of 1.23%.
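
To make the idea of choosing a scrubbing starting position under a bounded repair time more concrete, the following Python sketch may help. It is not the scheme described in the article, only an illustration under simplifying assumptions: the configuration memory is modeled as a circular list of frames, each frame has a hypothetical weight proportional to the chance that the fault lies in it, scrubbing one frame takes a fixed time, and the available slack bounds how many consecutive frames can be scrubbed. The names `frames_within_slack`, `best_start`, `weights`, `slack_us`, and `frame_time_us` are all invented for this example.

```python
from typing import Sequence, Tuple


def frames_within_slack(slack_us: float, frame_time_us: float) -> int:
    """Number of whole frames that can be scrubbed before the deadline.

    Assumption: every frame takes the same time to scrub.
    """
    if frame_time_us <= 0:
        raise ValueError("frame_time_us must be positive")
    return max(0, int(slack_us // frame_time_us))


def best_start(weights: Sequence[float], frames_in_slack: int) -> Tuple[int, float]:
    """Pick the starting frame whose circular window covers the most weight.

    weights[i] is a hypothetical, non-negative likelihood that the fault
    sits in frame i. Returns (start_frame, covered_fraction), where
    covered_fraction is the share of total weight inside the window.
    """
    n = len(weights)
    if n == 0:
        raise ValueError("weights must not be empty")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    k = max(0, min(frames_in_slack, n))  # window cannot exceed the memory
    window = sum(weights[i] for i in range(k))
    best_pos, best_sum = 0, window

    # Slide the window one frame at a time around the circular memory.
    for start in range(1, n):
        window += weights[(start + k - 1) % n] - weights[start - 1]
        if window > best_sum + 1e-12:  # keep the earliest start on ties
            best_pos, best_sum = start, window

    return best_pos, best_sum / total


if __name__ == "__main__":
    # Hypothetical per-frame fault likelihoods for a six-frame memory.
    weights = [0.05, 0.05, 0.4, 0.3, 0.1, 0.1]
    k = frames_within_slack(slack_us=65.0, frame_time_us=20.0)
    start, covered = best_start(weights, k)
    print(f"scrub {k} frames starting at frame {start}, "
          f"covering {covered:.0%} of the fault likelihood")
```

In this toy model a larger slack widens the window and raises the covered fraction, while a smaller slack makes the choice of starting position matter more. A hardware implementation would have to make this choice with very little logic, which the sketch does not attempt to model.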
