A model of soft error effects in generic IP processors

When designing reliability-aware digital circuits, either hardware or software techniques may be adopted to provide a certain degree of detection of, or tolerance to, failures caused by hardware faults or soft errors. These techniques are well established at low abstraction levels, whereas they are still under investigation at higher abstraction levels, where they are needed to cope with the increasing complexity of the systems being designed. This paper presents a model of soft error effects to be adopted when defining software-only techniques for fault detection. The work identifies the misbehaviors that soft errors cause on a generic IP processor, then classifies and analyzes them with respect to whether they can be detected by previously published approaches. The proposed model is validated experimentally on the Leon2 processor.