Evaluation of a new low cost software level fault tolerance technique to cope with soft errors

Increasing soft error rates make the protection of combinational logic against transient faults a major issue for the fault tolerance community in future technologies. Since not every transient fault leads to an error at the application level, several authors have proposed software-level fault tolerance as a better approach. This paper proposes a new software-level technique to detect and correct errors caused by transient faults, compares it with a classic one, and discusses the detection and correction costs of both approaches.
