MFL: Method-Level Fault Localization with Causal Inference

Recent studies have shown that the use of causal inference techniques for reducing confounding bias improves the effectiveness of statistical fault localization (SFL) at the level of program statements. However, with very large programs and test suites, the overhead of statement-level causal SFL may be excessive. Moreover, cost evaluations of statement-level SFL techniques are generally based on a questionable assumption: that software developers can consistently recognize faults when examining statements in isolation. To address these issues, we propose and evaluate a novel method-level SFL technique called MFL, which is based on causal inference methodology. In addition to reframing SFL at the method level, our technique incorporates a new algorithm for selecting covariates to use in adjusting for confounding bias. This algorithm attempts to ensure that such covariates satisfy the conditional exchangeability and positivity properties required for identifying causal effects with observational data. We present empirical results indicating that our approach is more effective than four method-level versions of well-known SFL techniques and that our confounder selection algorithm is superior to two alternatives.
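To make the idea of adjusting for confounding bias at the method level concrete, the following is a minimal sketch and not MFL's actual estimator or covariate selection algorithm. It assumes each test run is recorded as a set of covered methods plus a pass/fail outcome, treats coverage of a target method as the "treatment", and estimates its effect on failure by stratifying on a given set of covariate methods. Strata that lack either covered or uncovered runs are skipped, as a crude stand-in for the positivity requirement. The function name `adjusted_effect` and the data layout are hypothetical.

```python
from collections import defaultdict

def adjusted_effect(runs, method, covariates):
    """Stratified estimate of the effect of covering `method` on failure.

    runs: list of (covered_methods: set of str, failed: bool)
    covariates: list of method names to adjust for (assumed confounders)
    Returns a weighted failure-rate difference, or None if no stratum
    contains both covered and uncovered runs.
    """
    # stratum key -> [[n_uncovered, fails_uncovered], [n_covered, fails_covered]]
    strata = defaultdict(lambda: [[0, 0], [0, 0]])
    for covered, failed in runs:
        key = tuple(c in covered for c in covariates)
        cell = strata[key][int(method in covered)]
        cell[0] += 1
        cell[1] += int(failed)

    total = 0
    weighted = 0.0
    for (n0, f0), (n1, f1) in strata.values():
        if n0 == 0 or n1 == 0:
            continue  # positivity fails in this stratum; no comparison possible
        n = n0 + n1
        weighted += n * (f1 / n1 - f0 / n0)
        total += n
    return weighted / total if total else None

# Toy data: the fault is in method "a"; "b" is usually called together with "a".
runs = (
    [({"a", "b"}, True)] * 3
    + [({"a"}, True)]
    + [({"b"}, False)]
    + [(set(), False)] * 3
)

naive_b = adjusted_effect(runs, "b", [])         # no adjustment
adjusted_b = adjusted_effect(runs, "b", ["a"])   # adjust for coverage of "a"
naive_a = adjusted_effect(runs, "a", [])
```

On this toy data the unadjusted estimate makes `b` look suspicious only because it tends to execute alongside `a`; once coverage of `a` is held fixed, the estimated effect of `b` drops to zero while `a` keeps its full score.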
