Fault injection for formal testing of fault tolerance

This study addresses the use of fault injection for explicitly removing design/implementation faults in complex fault-tolerance algorithms and mechanisms (FTAM), viz., fault-tolerance deficiency faults. A formalism is introduced to represent the FTAM by a set of assertions. This formalism enables an execution tree to be generated, where each path from the root to a leaf of the tree is a well-defined formula. The set of well-defined formulas constitutes a framework that fully characterizes the test sequence. The input patterns of the test sequence (fault and activation domains) are then determined to cover specific structural criteria over the execution tree (activation of proper sets of paths). This provides a framework for generating a functional deterministic test for programs that implement complex FTAM. This methodology has been used to extend a debugging tool aimed at testing fault-tolerance protocols developed by BULL France. It has been applied successfully to the injection of faults in the inter-replica protocol that supports the application-level fault-tolerance features of the architecture of the ESPRIT-funded Delta-4 project. The results of these experiments are analyzed in detail. In particular, even though the target protocol had been independently verified formally, the application of the proposed testing strategy revealed two fault-tolerance deficiency faults.
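As a rough illustration of the execution-tree idea, the sketch below enumerates every root-to-leaf path of a small tree and renders each path as a conjunction of assertions. The `Node` class, the `paths` helper, and all assertion labels are hypothetical and chosen only for this example; they are not the formalism or notation used in the study.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One node of a toy execution tree, labeled with an assertion (hypothetical)."""
    assertion: str
    children: List["Node"] = field(default_factory=list)


def paths(node: Node, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield every root-to-leaf path as a tuple of assertion labels."""
    current = prefix + (node.assertion,)
    if not node.children:
        # A leaf closes one complete path.
        yield current
    else:
        for child in node.children:
            yield from paths(child, current)


# Illustrative tree: the labels are assumptions, not taken from the paper.
tree = Node("message_sent", [
    Node("fault_injected", [
        Node("error_detected", [Node("recovery_ok")]),
        Node("error_missed"),
    ]),
    Node("no_fault", [Node("delivery_ok")]),
])

# Each path becomes one formula; a test set that activates all of them
# covers every path of this toy tree.
formulas = [" AND ".join(p) for p in paths(tree)]
for formula in formulas:
    print(formula)
```

In this toy setting, a structural criterion such as "activate every path" would require one fault/activation input pattern per formula, three in total here.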
