Non-intrusive fault tolerance in soft processors through circuit duplication

The flexibility that Commercial-Off-The-Shelf (COTS) SRAM-based FPGAs bring to on-board system design makes them an attractive option for military and aerospace applications. However, the move toward nanometer technologies brings with it a higher vulnerability of integrated circuits to radiation-induced perturbations. In mission-critical applications, it is therefore important to improve reliability through fault-tolerance techniques. In this work, a non-intrusive fault-tolerance technique has been developed. The proposed technique targets soft processors (e.g., LEON3), and its detection mechanism uses a Bus Monitor to compare the output data of a main soft processor with that of its redundant module. In case of a mismatch, an error signal is activated, triggering the proposed fault-tolerance strategy. The approach is shown to be more efficient than the state-of-the-art Triple Modular Redundancy (TMR) and Software Implemented Hardware Fault Tolerance (SIHFT) approaches at detecting and correcting faults on the fly, with low area overhead and no major performance penalty. The chosen case study is an On-Board Computer (OBC) system currently under development, conceived for use in future missions of the Brazilian Institute of Space Research (INPE).
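The detection step described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' hardware design: the `BusSample` fields, the `flip_bit` helper, and the `first_mismatch` function are hypothetical names introduced here, and the model compares recorded output traces cycle by cycle rather than live bus signals.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Optional


@dataclass(frozen=True)
class BusSample:
    """One bus transaction seen at a processor's outputs (hypothetical fields)."""
    address: int
    data: int
    write: bool


def flip_bit(sample: BusSample, bit: int) -> BusSample:
    """Emulate a single-bit upset by inverting one bit of the data word."""
    return replace(sample, data=sample.data ^ (1 << bit))


def first_mismatch(main: Iterable[BusSample],
                   redundant: Iterable[BusSample]) -> Optional[int]:
    """Return the first cycle at which the two outputs differ, or None.

    A non-None result plays the role of the error signal: it is the point
    at which a recovery strategy would be triggered.
    """
    for cycle, (m, r) in enumerate(zip(main, redundant)):
        if m != r:
            return cycle
    return None


# Assumed example trace: four write transactions from the main processor.
golden = [BusSample(0x4000_0000 + 4 * i, i * 3, True) for i in range(4)]

# The redundant copy suffers an emulated upset in cycle 2.
faulty = list(golden)
faulty[2] = flip_bit(faulty[2], bit=5)

error_cycle = first_mismatch(golden, faulty)
```

Comparing only the values observed at the bus is what keeps this kind of check non-intrusive: neither processor model has to be modified for the comparison to work.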
