Fault-Tolerance Techniques for Soft-Core Processors Using the Trace Interface

As microprocessors are increasingly used in safety-critical applications, there is a growing demand for effective fault-tolerance techniques that can mitigate the effects of soft errors while reducing intrusiveness and minimizing the impact on performance and power consumption. To this end, approaches that monitor microprocessor operation non-intrusively through an external interface have recently been proposed. In this paper, we focus on the use of the trace interface for on-line monitoring. This interface provides detailed information about the instructions executed by the processor and can be reused to support error detection and correction in several ways, including hardware redundancy with multiple processors, time redundancy, and control-flow checking.
