Evaluating and Improving Fault Localization

Most fault localization techniques take as input a faulty program, and produce as output a ranked list of suspicious code locations at which the program may be defective. When researchers propose a new fault localization technique, they typically evaluate it on programs with known faults. The technique is scored based on where in its output list the defective code appears. This enables the comparison of multiple fault localization techniques to determine which one is better. Previous research has evaluated fault localization techniques using artificial faults, generated either by mutation tools or manually. In other words, previous research has determined which fault localization techniques are best at finding artificial faults. However, it is not known which fault localization techniques are best at finding real faults. It is not obvious that the answer is the same, given previous work showing that artificial faults have both similarities to and differences from real faults. We performed a replication study to evaluate 10 claims in the literature that compared fault localization techniques (from the spectrum-based and mutation-based families). We used 2995 artificial faults in 6 real-world programs. Our results support 7 of the previous claims as statistically significant, but only 3 as having non-negligible effect sizes. Then, we evaluated the same 10 claims, using 310 real faults from the 6 programs. Every previous result was refuted or was statistically and practically insignificant. Our experiments show that artificial faults are not useful for predicting which fault localization techniques perform best on real faults. In light of these results, we identified a design space that includes many previously-studied fault localization techniques as well as hundreds of new techniques. We experimentally determined which factors in the design space are most important, using an overall set of 395 real faults. Then, we extended this design space with new techniques. Several of our novel techniques outperform all existing techniques, notably in terms of ranking defective code in the top-5 or top-10 reports.

[1]  A.J.C. van Gemund,et al.  On the Accuracy of Spectrum-based Fault Localization , 2007, Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION (TAICPART-MUTATION 2007).

[2]  Rui Abreu,et al.  Using HTML5 visualizations in software fault localization , 2013, 2013 First IEEE Working Conference on Software Visualization (VISSOFT).

[3]  David Lo,et al.  Should I follow this fault localization tool’s output? , 2014, Empirical Software Engineering.

[4]  Phyllis G. Frankl,et al.  Empirical evaluation of the textual differencing regression testing technique , 1998, Proceedings. International Conference on Software Maintenance (Cat. No. 98CB36272).

[5]  Markus Stumptner,et al.  Model-Based Debugging or How to Diagnose Programs Automatically , 2002, IEA/AIE.

[6]  Erica Mealy,et al.  BegBunch: benchmarking for C bug detection tools , 2009, DEFECTS '09.

[7]  Martin Monperrus,et al.  Test case purification for improving fault localization , 2014, SIGSOFT FSE.

[8]  Thomas J. Ostrand,et al.  Experiments on the effectiveness of dataflow- and control-flow-based test adequacy criteria , 1994, Proceedings of 16th International Conference on Software Engineering.

[9]  David Lo,et al.  Practitioners' expectations on automated fault localization , 2016, ISSTA.

[10]  Mark David Weiser,et al.  Program slices: formal, psychological, and practical investigations of an automatic program abstraction method , 1979 .

[11]  René Just,et al.  The major mutation framework: efficient and scalable mutation analysis for Java , 2014, ISSTA 2014.

[12]  Fan Long,et al.  An Analysis of the Search Spaces for Generate and Validate Patch Generation Systems , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).

[13]  Alessandro Orso,et al.  Are automated debugging techniques actually helping programmers? , 2011, ISSTA '11.

[14]  James A. Jones,et al.  Fault density, fault types, and spectra-based fault localization , 2015, Empirical Software Engineering.

[15]  Michael D. Ernst,et al.  Defects4J: a database of existing faults to enable controlled testing studies for Java programs , 2014, ISSTA 2014.

[16]  Shriram Krishnamurthi,et al.  Automated Fault Localization Using Potential Invariants , 2003, ArXiv.

[17]  Akbar Siami Namin,et al.  The use of mutation in testing experiments and its sensitivity to external threats , 2011, ISSTA '11.

[18]  Rui Abreu,et al.  GZoltar: an eclipse plug-in for testing and debugging , 2012, 2012 Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering.

[19]  David Lo,et al.  Theory and Practice, Do They Match? A Case with Spectrum-Based Fault Localization , 2013, 2013 IEEE International Conference on Software Maintenance.

[20]  Lionel C. Briand,et al.  Is mutation an appropriate tool for testing experiments? , 2005, ICSE.

[21]  Alex Groce,et al.  Mutations: How Close are they to Real Faults? , 2014, 2014 IEEE 25th International Symposium on Software Reliability Engineering.

[22]  Peter Zoeteweij,et al.  Spectrum-Based Multiple Fault Localization , 2009, 2009 IEEE/ACM International Conference on Automated Software Engineering.

[23]  Andreas Zeller,et al.  Simplifying and Isolating Failure-Inducing Input , 2002, IEEE Trans. Software Eng..

[24]  Yuhua Qi,et al.  Using automated program repair for evaluating the effectiveness of fault localization techniques , 2013, ISSTA.

[25]  Mark Harman,et al.  An Analysis and Survey of the Development of Mutation Testing , 2011, IEEE Transactions on Software Engineering.

[26]  Shujuan Jiang,et al.  HSFal: Effective fault localization using hybrid spectrum of full slices and execution slices , 2014, J. Syst. Softw..

[27]  Lei Zhao,et al.  A Crosstab-based Statistical Method for Effective Fault Localization , 2008, 2008 1st International Conference on Software Testing, Verification, and Validation.

[28]  Rui Abreu,et al.  A Survey on Software Fault Localization , 2016, IEEE Transactions on Software Engineering.

[29]  Raúl A. Santelices,et al.  Lightweight fault-localization using multiple coverage types , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[30]  Peter Zoeteweij,et al.  A practical evaluation of spectrum-based fault localization , 2009, J. Syst. Softw..

[31]  Shriram Krishnamurthi,et al.  Automated Fault Localization Using Potential Invariants , 2003, ArXiv.

[32]  Mary Jean Harrold,et al.  Empirical evaluation of the tarantula automatic fault-localization technique , 2005, ASE.

[33]  Rui Abreu,et al.  A Low-Cost Approximate Minimal Hitting Set Algorithm and its Application to Model-Based Diagnosis , 2009, SARA.

[34]  W. Eric Wong,et al.  The DStar Method for Effective Software Fault Localization , 2014, IEEE Transactions on Reliability.

[35]  Shin Yoo,et al.  Ask the Mutants: Mutating Faulty Programs for Fault Localization , 2014, 2014 IEEE Seventh International Conference on Software Testing, Verification and Validation.

[36]  Rui Abreu,et al.  Threats to the validity and value of empirical assessments of the accuracy of coverage-based fault locators , 2013, ISSTA.

[37]  René Just,et al.  MAJOR: An efficient and extensible tool for mutation analysis in a Java compiler , 2011, 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011).

[38]  Lionel C. Briand,et al.  Using Mutation Analysis for Assessing and Comparing Testing Coverage Criteria , 2006, IEEE Transactions on Software Engineering.

[39]  Michael D. Ernst,et al.  Are mutants a valid substitute for real faults in software testing? , 2014, SIGSOFT FSE.

[40]  Andreas Zeller,et al.  Covering and Uncovering Equivalent Mutants , 2013, Softw. Test. Verification Reliab..

[41]  Lee Naish,et al.  A model for spectra-based software diagnosis , 2011, TSEM.

[42]  Chao Liu,et al.  Statistical Debugging: A Hypothesis Testing-Based Approach , 2006, IEEE Transactions on Software Engineering.

[43]  Shin Hong,et al.  Mutation-Based Fault Localization for Real-World Multilingual Programs (T) , 2015, 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[44]  Pascale Thévenod-Fosse,et al.  Software error analysis: a real case study involving real faults and mutations , 1996, ISSTA '96.

[45]  W. Eric Wong,et al.  Effect of test set minimization on fault detection effectiveness , 1998 .

[46]  Gregg Rothermel,et al.  On the Use of Mutation Faults in Empirical Assessments of Test Case Prioritization Techniques , 2006, IEEE Transactions on Software Engineering.

[47]  Michael D. Ernst,et al.  Evaluating & improving fault localization techniques , 2016 .

[48]  James H. Andrews,et al.  Evaluating the Accuracy of Fault Localization Techniques , 2009, 2009 IEEE/ACM International Conference on Automated Software Engineering.

[49]  Yves Le Traon,et al.  Metallaxis‐FL: mutation‐based fault localization , 2015, Softw. Test. Verification Reliab..

[50]  Baowen Xu,et al.  A brief survey of program slicing , 2005, SOEN.