SIED: software implemented error detection

This paper presents a new error detection technique called software implemented error detection (SIED). The proposed method is based on a new control flow checking scheme combined with software redundancy. The distinctive advantage of SIED over other fault tolerance techniques is its fault coverage: it copes with faults affecting both data and the program control flow. We apply the proposed approach to several benchmark programs and evaluate its error detection capabilities through fault injection experiments. The experimental results show very good error detection capabilities for the hardened versions of the selected benchmark programs.
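The abstract does not give the details of the SIED scheme, so the following is only a generic sketch of the two ingredients it names, not the authors' method. It assumes a simple form of software redundancy (each variable kept in two copies that are compared before use) and a simple control flow check (each block adds an identifier to a run-time signature that is compared with the expected value). The names `checked_sum`, `ErrorDetected`, `flip_bit`, and `skip_iteration` are hypothetical, and the two fault parameters only simulate faults for illustration.

```python
class ErrorDetected(Exception):
    """Raised when one of the consistency checks fails."""


# Hypothetical block identifiers used to build the run-time signature.
BLOCK_INIT, BLOCK_LOOP, BLOCK_EXIT = 1, 10, 100


def checked_sum(values, flip_bit=False, skip_iteration=None):
    """Sum `values` with duplicated data and a control flow signature."""
    # Software redundancy: two copies updated by duplicated statements.
    total_a = 0
    total_b = 0

    # Control flow check: every block adds its identifier to the signature.
    signature = BLOCK_INIT

    for index, value in enumerate(values):
        if skip_iteration == index:
            continue  # simulated erroneous jump over the loop body
        signature += BLOCK_LOOP
        total_a += value
        total_b += value

    if flip_bit:
        total_a ^= 1 << 3  # simulated bit flip in one copy only

    signature += BLOCK_EXIT
    expected = BLOCK_INIT + BLOCK_LOOP * len(values) + BLOCK_EXIT
    if signature != expected:
        raise ErrorDetected("control flow signature mismatch")
    if total_a != total_b:
        raise ErrorDetected("data copies disagree")
    return total_a
```

In this sketch a skipped loop iteration leaves both data copies equal, so only the signature check catches it, while a bit flip in one copy is caught only by the comparison of the copies. This illustrates why combining the two checks can widen fault coverage.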
