Criticality-aware scrubbing mechanism for SRAM-based FPGAs

Scrubbing is considered an effective mechanism for providing fault tolerance in static RAM (SRAM)-based Field Programmable Gate Arrays (FPGAs). However, current scrubbing techniques run without considering the criticality and timing of the user tasks implemented in the FPGA. They often fail to scrub at the right instant, which lowers the probability that each task executes free of transient faults. Moreover, these solutions are not adapted to the tasks' fault-tolerance requirements: either they do not properly protect the most critical tasks in the system or, when they do, they waste resources on the less critical tasks. In this paper, a new scrubbing mechanism is proposed. The approach adapts scrubbing to the tasks' execution through proper scheduling and according to their criticality. A proposed heuristic finds a feasible scrubbing schedule for each hardware task. First, the minimum scrubbing periods are computed according to the criticality of each implemented hardware task. Second, a scrubbing schedule that follows the EDL (Earliest Deadline as Late as possible) algorithm is found, maximizing the reliability of the system. The experimental results show improvements of up to 79% in system reliability, achieved without wasting scrubbing resources.
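
The two steps of the heuristic can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no formulas, so the sketch assumes an exponential fault model (a region with upset rate `upset_rate` stays fault-free over an interval T with probability exp(-upset_rate * T)) and derives a period bound from a hypothetical per-criticality reliability target. The as-late-as-possible schedule is built in one common way, by running earliest-deadline-first on the time-reversed job set with preemption at slot boundaries. The names `scrub_period` and `edl_schedule`, and all parameters, are illustrative assumptions.

```python
import math


def scrub_period(upset_rate, target_reliability):
    """Step 1 (assumed model): longest whole-slot period T such that
    exp(-upset_rate * T) >= target_reliability."""
    return max(1, math.floor(-math.log(target_reliability) / upset_rate))


def edl_schedule(periods, costs, horizon):
    """Step 2: place every scrubbing job as late as possible.

    periods, costs: dicts mapping task name -> integer slots.
    Returns a list of length `horizon`; each entry is the name of the
    task being scrubbed in that slot, or None if the scrubber is idle.
    """
    # One job per period: [release, deadline, remaining slots, name].
    jobs = []
    for name, period in periods.items():
        for release in range(0, horizon - period + 1, period):
            jobs.append([release, release + period, costs[name], name])

    timeline = [None] * horizon
    # Walk time backwards; in reversed time a job's release acts as its
    # deadline, so the job with the latest release is served first.
    for slot in range(horizon - 1, -1, -1):
        ready = [j for j in jobs if j[2] > 0 and j[0] <= slot < j[1]]
        if ready:
            job = max(ready, key=lambda j: j[0])
            job[2] -= 1
            timeline[slot] = job[3]

    if any(j[2] > 0 for j in jobs):
        raise ValueError("no feasible scrubbing schedule for these parameters")
    return timeline


# Hypothetical example: a critical task scrubbed every 4 slots and a
# less critical one every 8 slots.
schedule = edl_schedule({"hi": 4, "lo": 8}, {"hi": 1, "lo": 2}, horizon=8)
print(schedule)
```

Under the assumed model, a stricter reliability target yields a shorter period, so the more critical task is scrubbed more often. Pushing each scrub toward its deadline leaves the scrubber idle early in each period and places the repair late in it.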
