CFCET: A hardware-based control flow checking technique in COTS processors using execution tracing

Abstract This paper presents a behavior-based error detection technique called control flow checking by execution tracing (CFCET), which improves the concurrent error detection capability of commercial off-the-shelf (COTS) processors. The technique traces the program jumps graph (PJG) at run-time and compares it with a reference jumps graph to detect violations caused by transient faults. The reference graph is derived from the source program by a preprocessor. CFCET combines an external watchdog processor (WDP) with the internal execution tracing feature available in COTS processors to externally monitor the addresses of taken branches in a program. This is done without any modification of the application programs, so the program overhead is zero. The technique is evaluated analytically under three different fault models. The results show that the error detection coverage varies between 79.74% and 96.43%, depending on the workload program. Errors are detected with near-zero latency. The external hardware overhead is about 3% on the Altera FLEX 10K30 FPGA, and the execution time overhead is between 33.26% and 140.81%, depending on the workload program. The overheads were measured experimentally by executing the workloads on a Pentium system.
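The comparison step described in the abstract can be illustrated with a minimal software sketch. It is not the paper's hardware implementation: the graph representation (a mapping from branch source address to the set of legal target addresses), the function name `first_violation`, and all addresses are hypothetical, chosen only to show how a traced taken branch might be checked against a reference jumps graph.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Assumed representation (not from the paper): each branch source address
# maps to the set of target addresses the reference jumps graph allows.
JumpsGraph = Dict[int, Set[int]]
Branch = Tuple[int, int]  # (source address, target address) of a taken branch


def first_violation(reference: JumpsGraph,
                    trace: Iterable[Branch]) -> Optional[Branch]:
    """Return the first traced branch that is absent from the reference graph.

    Returns None when every taken branch in the trace is a legal edge.
    """
    for source, target in trace:
        allowed = reference.get(source, set())
        if target not in allowed:
            # Unknown source or illegal target: report a control flow error.
            return (source, target)
    return None


# Hypothetical reference graph, as a preprocessor might derive it.
REFERENCE: JumpsGraph = {
    0x1000: {0x1010, 0x1040},
    0x1010: {0x1080},
    0x1040: {0x1000},
}

# A fault-free trace, and one where a fault corrupts a branch target.
GOOD_TRACE = [(0x1000, 0x1040), (0x1040, 0x1000), (0x1000, 0x1010), (0x1010, 0x1080)]
BAD_TRACE = [(0x1000, 0x1040), (0x1040, 0x1044)]

if __name__ == "__main__":
    print(first_violation(REFERENCE, GOOD_TRACE))
    print(first_violation(REFERENCE, BAD_TRACE))
```

In this sketch each branch is checked as it arrives in the trace, which mirrors the idea of detecting a violation at the first illegal jump rather than after the program completes.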
