Error Flow Model: Modeling and Analysis of Software Propagating Hardware Faults

Reliability models from neither reliability engineering nor software reliability can be applied directly to describe how hardware errors propagate through programs. This paper first sets up a computational data flow model and shows how a computational data flow graph can be built for a program, using the instruction set of the unlimited register machine (URM) as an example. The error flow model is then built on top of the computational data flow model. Errors are divided into two kinds: original errors and propagated errors. From an analysis of the propagation rules for these two kinds of errors, six assumptions about error propagation are stated, from which the probability of an error at any time and at any place in a program can be calculated. Finally, a sample URM program is given to demonstrate the capability of the error flow model.
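
To make the idea concrete, the following is a minimal sketch of tracking error probabilities along the data flow of a URM program. It is not the paper's model: the abstract does not list the six assumptions, so the propagation rules here are assumptions of this sketch. Specifically, it assumes that every executed write introduces an original error with a fixed probability `p_orig`, that errors are independent, that an error in a source register always propagates to the destination, that `Z(n)` overwrites (and so masks) any earlier error in `Rn`, and that jump instructions neither introduce nor propagate errors. The function name `run_urm` and the tuple encoding of instructions are hypothetical.

```python
from collections import defaultdict


def run_urm(program, registers, p_orig=1e-3, max_steps=10_000):
    """Run a URM program and track a per-register error probability.

    program   : list of tuples ("Z", n), ("S", n), ("T", m, n), ("J", m, n, q)
    registers : dict mapping register index -> initial value
    p_orig    : assumed probability that one executed write injects an error
    Returns (final register values, error probability per register).
    """
    regs = defaultdict(int, registers)
    p_err = defaultdict(float)  # inputs are assumed error-free at the start
    pc = 1                      # URM instructions are numbered from 1
    steps = 0
    while 1 <= pc <= len(program) and steps < max_steps:
        op, *args = program[pc - 1]
        steps += 1
        if op == "Z":           # Rn := 0, old content (and its error) is discarded
            (n,) = args
            regs[n] = 0
            p_err[n] = p_orig
        elif op == "S":         # Rn := Rn + 1, old error propagates, new one may appear
            (n,) = args
            regs[n] += 1
            p_err[n] = 1 - (1 - p_err[n]) * (1 - p_orig)
        elif op == "T":         # Rn := Rm, the error of Rm propagates into Rn
            m, n = args
            regs[n] = regs[m]
            p_err[n] = 1 - (1 - p_err[m]) * (1 - p_orig)
        elif op == "J":         # if Rm == Rn, continue at instruction q
            m, n, q = args
            if regs[m] == regs[n]:
                pc = q
                continue
        else:
            raise ValueError(f"unknown instruction: {op!r}")
        pc += 1
    return dict(regs), dict(p_err)


# Addition: R1 := R1 + R2, using R3 as a counter. Jumping to 5 halts the program.
add_program = [("J", 3, 2, 5), ("S", 1), ("S", 3), ("J", 1, 1, 1)]
final_regs, final_p_err = run_urm(add_program, {1: 2, 2: 3})
print(final_regs[1], final_p_err[1])
```

Under these assumptions the error probability of a register grows with the number of writes on the data flow path that leads to it, which is the kind of quantity the error flow model is meant to compute. Treating control flow as error-free is a simplification of this sketch only.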
