Going Beyond TMR for Protection Against Multiple Faults

Future technologies will produce devices so small that they will be heavily influenced by electromagnetic noise and by errors induced by single event upsets (SEUs). Because many soft errors may occur at the same time, classical fault tolerance techniques such as triple modular redundancy (TMR) will no longer provide reliable protection, making new design approaches necessary. This study shows that the TMR approach has intrinsic weaknesses that impair its effectiveness in the presence of multiple faults, and proposes a new technique that provides better protection than TMR against single as well as multiple faults. The proposed technique relies on a few analog components placed among the digital circuits. We present results for a multiplier and show that the technique scales to withstand larger numbers of simultaneous faults.
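
As a rough illustration of why majority voting struggles once more than one replica is hit, the sketch below models three copies of a multiplier result as plain integers and passes them through a bitwise 2-of-3 voter. It does not implement the technique proposed in this study; the function names (`majority`, `flip_bit`) and the choice of a bitwise voter are assumptions made only for this example.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three replica outputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(value: int, bit: int) -> int:
    """Model a soft error as a single flipped bit in a replica's output."""
    return value ^ (1 << bit)


# Fault-free result that each of the three replicas should produce.
correct = 6 * 7

# One faulty replica: the two healthy replicas outvote it.
single_fault = majority(flip_bit(correct, 3), correct, correct)

# Two replicas hit in the same bit position: the faulty value wins the vote.
double_fault_same_bit = majority(
    flip_bit(correct, 3), flip_bit(correct, 3), correct
)

# Two replicas hit in different bit positions: a bitwise voter still
# recovers, because each bit position sees only one wrong vote.
double_fault_diff_bits = majority(
    flip_bit(correct, 3), flip_bit(correct, 0), correct
)

print(single_fault == correct)            # True
print(double_fault_same_bit == correct)   # False
print(double_fault_diff_bits == correct)  # True
```

In this simplified model the voter is assumed to be fault-free; a fault in the voter itself would be a further weak point that the sketch does not capture.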
