New Techniques for Improving the Performance of the Lockstep Architecture for SEEs Mitigation in FPGA Embedded Processors

The growing availability of embedded processors inside FPGAs provides unprecedented flexibility for system designers. The use of such devices for space or mission critical applications, however, is being delayed by the lack of effective low cost techniques to mitigate radiation induced errors. In this paper a non invasive approach for the implementation of fault tolerant systems based on COTS processors embedded in FPGAs, using lockstep in conjunction with checkpoint and rollback recovery, is presented. The proposed approach does not require modifications in the processor architecture or in the application software. The experimental validation of this approach through fault injection is described, the corresponding results are discussed, and the addition of a write history table as a means to reduce the performance overhead imposed by previous implementations is proposed and evaluated.

[1]  Michael Nicolaidis,et al.  Embedded robustness IPs for transient-error-free ICs , 2002, IEEE Design & Test of Computers.

[2]  Robert Baumann,et al.  Soft errors in advanced computer systems , 2005, IEEE Design & Test of Computers.

[3]  Niraj K. Jha,et al.  Fault-tolerant computer system design , 1996, IEEE Parallel & Distributed Technology: Systems & Applications.

[4]  R. Velazco,et al.  Experimentally evaluating an automatic approach for generating safety-critical software with respect to transient errors , 2000 .

[5]  Fabian Vargas,et al.  A new hybrid fault detection technique for systems-on-a-chip , 2006, IEEE Transactions on Computers.

[6]  Edward J. McCluskey,et al.  Control-flow checking by software signatures , 2002, IEEE Trans. Reliab..

[7]  Massimo Violante,et al.  Fault Injection-based Reliability Evaluation of SoPCs , 2006, Eleventh IEEE European Test Symposium (ETS'06).

[8]  L. Sterpone,et al.  A new analytical approach to estimate the effects of SEUs in TMR architectures implemented through SRAM-based FPGAs , 2005, IEEE Transactions on Nuclear Science.

[9]  Edward J. McCluskey,et al.  Concurrent Error Detection Using Watchdog Processors - A Survey , 1988, IEEE Trans. Computers.

[10]  L. Sterpone,et al.  A New Mitigation Approach for Soft Errors in Embedded Processors , 2008, IEEE Transactions on Nuclear Science.

[11]  Tino Heijmen,et al.  Radiation-induced soft errors in digital circuits - A literature survey , 2002 .

[12]  Michel Pignol DMT and DT2: two fault-tolerant architectures developed by CNES for COTS-based spacecraft supercomputers , 2006, 12th IEEE International On-Line Testing Symposium (IOLTS'06).