Fault Injection for Dependability Validation: A Methodology and Some Applications

The authors address the problem of validating the dependability of fault-tolerant computing systems, in particular, the validation of the fault-tolerance mechanisms. The proposed approach is based on the use of fault injection at the physical level on a hardware/software prototype of the system considered. The place of this approach in a validation-directed design process and with respect to related work on fault injection is clearly identified. The major requirements and problems related to the development and application of a validation methodology based on fault injection are presented and discussed. Emphasis is put on the definition, analysis, and use of the experimental dependability measures that can be obtained. The proposed methodology has been implemented through the realization of a general pin-level fault injection tool (MESSALINE), and its usefulness is demonstrated by the application of MESSALINE to the experimental validation of two systems: a subsystem of a centralized computerized interlocking system for railway control applications and a distributed system corresponding to the current implementation of the dependable communication system of the ESPRIT Delta-4 Project.
