Properties of Effective Metrics for Coverage-Based Statistical Fault Localization

In this paper, we investigate several coverage-based statistical fault localization metrics that have performed well in recent comparisons of many metrics, in order to better understand the properties of effective metrics. We first analyze the metrics algebraically and probabilistically to identify their key elements. We then report on an empirical study conducted to assess the relative importance of those elements. The results suggest that the most effective metrics contain a product of two terms: one that estimates the failure-causing effect of a program element (possibly with confounding bias) and one that weights the first term based on the evidence for the existence of faults in other program elements.
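As an illustration of the two-term structure described above, the following minimal sketch factors the well-known Ochiai metric into a product of two ratios. The abstract does not say which metrics were studied, so the choice of Ochiai, the function name `ochiai_factors`, and the reading of the two ratios as an "effect" term and a "weight" term are assumptions made here for illustration, not the paper's own analysis.

```python
import math


def ochiai_factors(ef, ep, nf):
    """Factor the Ochiai score of one program element (illustrative sketch).

    ef: failing tests that cover the element
    ep: passing tests that cover the element
    nf: failing tests that do NOT cover the element
    Returns (effect, weight, ochiai).
    """
    covered = ef + ep   # tests that execute the element
    failed = ef + nf    # all failing tests
    if covered == 0 or failed == 0:
        return 0.0, 0.0, 0.0

    # Assumed "effect" term: fraction of covering tests that fail,
    # a rough (possibly confounded) estimate of failure-causing effect.
    effect = ef / covered

    # Assumed "weight" term: fraction of failures that cover the element.
    # It shrinks when many failures occur without the element, which is
    # evidence that faults may lie in other program elements.
    weight = ef / failed

    # Ochiai = ef / sqrt((ef + nf) * (ef + ep)) = sqrt(effect * weight)
    return effect, weight, math.sqrt(effect * weight)
```

With hypothetical counts, `ochiai_factors(4, 4, 0)` gives an effect term of 0.5 and a weight term of 1.0: every failure covers the element, so the score is driven by the effect estimate alone. Raising `nf` lowers the weight and therefore the score, even though the effect term is unchanged.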
