A novel methodology to increase fault tolerance in autonomous FPGA-based systems

Field-Programmable Gate Arrays (FPGAs) are increasingly used in critical applications. In these scenarios, fault tolerance techniques are needed to increase system dependability and lifetime. This paper proposes a novel methodology to achieve autonomous fault tolerance in FPGA-based systems affected by permanent faults. A design flow is defined to help designers build a system with increased lifetime and availability. The methodology exploits Dynamic Partial Reconfiguration (DPR) to relocate faulty modules implemented on the FPGA at run time. A partitioning method is also presented that maximizes the number of permanent faults the system can tolerate. Experimental results show that the proposed methodology introduces negligible performance degradation and improves on state-of-the-art solutions.
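The relocation idea can be illustrated with a small toy model. The sketch below is not the paper's design flow or partitioning method; it only assumes that the device is split into equally sized reconfigurable regions, that each module occupies one region, and that a permanent fault makes a region unusable. The names `Fabric`, `report_fault`, and `free_regions` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Fabric:
    """Toy model of an FPGA split into equally sized reconfigurable regions."""
    num_regions: int
    placement: dict = field(default_factory=dict)  # module name -> region index
    faulty: set = field(default_factory=set)       # regions with a permanent fault

    def free_regions(self):
        """Return the healthy regions that currently host no module."""
        used = set(self.placement.values())
        return [r for r in range(self.num_regions)
                if r not in used and r not in self.faulty]

    def report_fault(self, region):
        """Mark a region as permanently faulty and relocate its module, if any.

        Returns True if every module still sits in a healthy region afterwards.
        """
        self.faulty.add(region)
        victim = next((m for m, r in self.placement.items() if r == region), None)
        if victim is None:
            return True   # the fault hit an unused region; nothing to move
        spares = self.free_regions()
        if not spares:
            return False  # no healthy spare region is left
        # In a real system this step would reconfigure the spare region
        # with the module; here we only update the bookkeeping.
        self.placement[victim] = spares[0]
        return True


if __name__ == "__main__":
    fabric = Fabric(num_regions=4, placement={"filter": 0, "fft": 1})
    print(fabric.report_fault(0), fabric.placement)  # filter moves to a spare
    print(fabric.report_fault(2), fabric.placement)  # filter moves again
    print(fabric.report_fault(1), fabric.placement)  # no spare left for fft
```

In this simplified model the number of tolerable faults is bounded by the number of spare regions, which suggests why the way the design is partitioned into regions matters for how many permanent faults a system can survive.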
