A software-based concurrent error detection technique for PowerPC processor-based embedded systems

This paper presents a behavior-based error detection technique called control flow checking using branch trace exceptions (CFCBTE) for the PowerPC processor family. The technique relies on the branch trace exception feature that PowerPC processors provide for debugging. It traces the target addresses of program branches at run time and compares them with reference target addresses to detect violations caused by transient faults. A preprocessor derives the reference target addresses from the source program. The technique is evaluated experimentally on a 32-bit PowerPC microcontroller using software-implemented fault injection (SWIFI). The results show that it detects about 91% of the injected control flow errors. The memory overhead is 39.16% on average, and the performance overhead varies between 110% and 304% depending on the workload. The technique does not modify the program source code.
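
The run-time comparison described above can be illustrated with a minimal sketch. It is a host-side simulation in Python, not the paper's implementation: on the target, the check would presumably run inside the branch trace exception handler. The table contents, the addresses, and the names `REFERENCE_TARGETS`, `check_branch`, and `first_violation` are all hypothetical, and the layout of the reference table (one set of legal next addresses per branch instruction) is an assumption made for illustration.

```python
# Hypothetical reference table: branch instruction address -> legal next addresses.
# In the paper a preprocessor derives reference targets from the source program;
# the addresses below are invented for this sketch.
REFERENCE_TARGETS = {
    0x1000: {0x1004, 0x1040},  # conditional branch: fall-through or taken target
    0x1044: {0x1000},          # loop back-edge
    0x1010: {0x2000},          # subroutine call
}


def check_branch(branch_addr, target_addr, reference=REFERENCE_TARGETS):
    """Return True if the observed target is a legal target of this branch."""
    allowed = reference.get(branch_addr)
    # A branch at an address unknown to the table is itself treated as a violation.
    return allowed is not None and target_addr in allowed


def first_violation(trace, reference=REFERENCE_TARGETS):
    """Return the index of the first illegal (branch, target) pair, or None."""
    for index, (branch_addr, target_addr) in enumerate(trace):
        if not check_branch(branch_addr, target_addr, reference):
            return index
    return None


# A fault-free trace followed by one with a corrupted target (0x1040 -> 0x1048).
good_trace = [(0x1000, 0x1040), (0x1044, 0x1000)]
bad_trace = good_trace + [(0x1000, 0x1048)]
```

In this sketch, `first_violation(good_trace)` finds nothing to report, while `first_violation(bad_trace)` flags the last entry, because 0x1048 is not among the reference targets recorded for the branch at 0x1000. Errors that redirect a branch to another legal target of the same branch would pass this check unnoticed.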
