A Kernel-Based Communication Fault Injector for Dependability Testing of Distributed Systems

Software-implemented fault injection is a powerful strategy for testing fault-tolerant protocols in distributed environments. In this paper, we present ComFIRM, a communication fault injection tool we developed that minimizes the probe effect on the protocols under test. ComFIRM exploits the ability to insert code directly into the Linux kernel, at the lowest level of the protocol stack, by loading kernel modules. The tool injects faults directly into the message exchange subsystem, so test scenarios can be defined from a broad fault model that affects messages being sent, received, or both. We demonstrate the tool in an experiment that uses the fault injector to evaluate the behavior of a group membership service under communication faults.
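To make the idea of rule-driven communication faults concrete, the following is a minimal user-space sketch in Python. It is not ComFIRM's code and does not run inside the kernel. The names `Message`, `FaultRule`, and `inject`, as well as the three fault actions (drop, duplicate, corrupt), are assumptions chosen for illustration; the abstract does not enumerate the fault model. The sketch only shows how a test scenario might select messages by direction and by a predicate, and then apply a fault to them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    direction: str   # "send" or "recv" (hypothetical labels)
    payload: bytes


@dataclass
class FaultRule:
    direction: str                           # "send", "recv", or "both"
    matches: Callable[[Message, int], bool]  # predicate on message and its index
    action: str                              # "drop", "duplicate", or "corrupt"


def inject(messages: List[Message], rules: List[FaultRule]) -> List[Message]:
    """Apply the first matching rule to each message and return what is delivered."""
    delivered: List[Message] = []
    for index, msg in enumerate(messages):
        outcome = [msg]  # default: the message passes through unchanged
        for rule in rules:
            if rule.direction not in (msg.direction, "both"):
                continue
            if not rule.matches(msg, index):
                continue
            if rule.action == "drop":
                outcome = []
            elif rule.action == "duplicate":
                outcome = [msg, msg]
            elif rule.action == "corrupt" and msg.payload:
                # Flip every bit of the first payload byte.
                flipped = bytes([msg.payload[0] ^ 0xFF]) + msg.payload[1:]
                outcome = [Message(msg.direction, flipped)]
            break  # first matching rule wins
        delivered.extend(outcome)
    return delivered


# Example scenario: drop the first outgoing message, duplicate every incoming one.
traffic = [Message("send", b"join"), Message("recv", b"ack"), Message("send", b"ping")]
scenario = [
    FaultRule("send", lambda m, i: i == 0, "drop"),
    FaultRule("recv", lambda m, i: True, "duplicate"),
]
result = inject(traffic, scenario)
```

In this scenario the `join` message is lost, the `ack` arrives twice, and the `ping` is delivered unchanged. A kernel-level injector would apply comparable decisions to packets as they cross the protocol stack rather than to an in-memory list.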
