HSFI: Accurate Fault Injection Scalable to Large Code Bases

Software fault injection typically inserts faults at either the binary level or the source level. Binary-level injection is fast but provides poor fault accuracy, while source-level injection cannot scale to large code bases because the program must be rebuilt for each experiment. Alternatives that avoid rebuilding incur large run-time overheads because they apply fault injection decisions at run-time. HSFI, our new design, injects faults using all the context information available at the source level and applies fault injection decisions efficiently on the binary. It does so by placing markers in the original code that can be recognized after code generation. We implemented a tool according to this design and evaluated the time taken per fault injection experiment with operating systems as targets. Our tool performs experiments more quickly than other source-based approaches, achieving performance that comes close to that of binary-level fault injection while retaining the benefits of source-level fault injection.
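The marker idea can be illustrated with a toy model. The sketch below is not HSFI's actual encoding or tooling. It assumes a hypothetical byte pattern (`MARKER`), a made-up one-byte "switch" per fault site, and a fake "binary" built from filler bytes. It only shows the general principle: fault sites are prepared once at build time, located afterwards by scanning the generated image for markers, and a fault is then selected by patching a copy of the image rather than rebuilding.

```python
# Toy model only: MARKER, the record layout, and all helper names are
# hypothetical and chosen for illustration. They are not taken from HSFI.
MARKER = b"\xde\xad\xfa\x17"  # assumed marker byte pattern


def build_binary(n_sites):
    """Build a fake 'compiled' image with one marked fault site per record.

    Each record is: 8 filler bytes, the marker, a 1-byte site id,
    and a 1-byte switch (0 = pristine code path, 1 = faulty code path).
    """
    image = bytearray()
    for site_id in range(n_sites):
        image += b"\x90" * 8                      # filler standing in for code
        image += MARKER + bytes([site_id]) + b"\x00"
    return image


def find_markers(image):
    """Scan the image once and map each site id to its switch offset."""
    sites = {}
    pos = image.find(MARKER)
    while pos != -1:
        site_id = image[pos + len(MARKER)]
        sites[site_id] = pos + len(MARKER) + 1    # offset of the switch byte
        pos = image.find(MARKER, pos + 1)
    return sites


def inject(image, sites, site_id):
    """Return a patched copy with one fault enabled; no rebuild is needed."""
    patched = bytearray(image)
    patched[sites[site_id]] = 1
    return patched


image = build_binary(3)
sites = find_markers(image)
patched = inject(image, sites, 1)
```

In this model each experiment costs one byte patch on a copy of the image, which is the kind of per-experiment saving the abstract attributes to applying injection decisions on the binary.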
