Mitigating the confounding effects of program dependences for effective fault localization

Dynamic program dependences are important in software debugging because they help trigger the effects of faults and propagate those effects to a program's output. They also introduce significant confounding bias when the causal effect of a statement on the occurrence of program failures is estimated statistically, which leads to poor fault-localization results. This paper presents a novel causal-inference technique for fault localization that accounts for the effects of dynamic data and control dependences and thus significantly reduces confounding bias during fault localization. The technique combines a new dependence-based causal model with matching of test executions based on their dynamic dependences. The paper also presents empirical results indicating that the new technique performs significantly better than existing statistical fault-localization techniques, as well as our previous fault-localization technique based on causal-inference methodology.
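
The idea of matching test executions before estimating a statement's effect on failure can be illustrated with a minimal sketch. It is not the paper's model: the `Execution` record, the encoding of dynamic dependences as a set of labels, and the use of exact matching (stratification) are all assumptions made for illustration. Executions are grouped by identical dependence context, and covering and non-covering runs are compared only within a group.

```python
from collections import defaultdict
from typing import FrozenSet, Iterable, NamedTuple, Optional


class Execution(NamedTuple):
    """One test run, viewed from a single statement (hypothetical encoding)."""
    covered: bool          # did the run execute the statement under study?
    failed: bool           # did the run fail?
    deps: FrozenSet[str]   # assumed stand-in for the run's dynamic dependence context


def matched_effect(executions: Iterable[Execution]) -> Optional[float]:
    """Estimate the failure-rate difference between covering and non-covering
    runs, comparing only runs that share the same dependence context.

    Returns None when no context contains both kinds of run.
    """
    # Group failure outcomes by context: (covering runs, non-covering runs).
    strata = defaultdict(lambda: ([], []))
    for e in executions:
        strata[e.deps][0 if e.covered else 1].append(e.failed)

    weighted_sum = 0.0
    total = 0
    for treated, control in strata.values():
        if not treated or not control:
            continue  # no comparable runs in this context, so skip it
        size = len(treated) + len(control)
        diff = sum(treated) / len(treated) - sum(control) / len(control)
        weighted_sum += size * diff
        total += size
    return weighted_sum / total if total else None


# Illustrative data only: two dependence contexts, "A" and "B".
runs = [
    Execution(True, True, frozenset({"A"})),
    Execution(True, True, frozenset({"A"})),
    Execution(False, False, frozenset({"A"})),
    Execution(False, False, frozenset({"A"})),
    Execution(True, True, frozenset({"B"})),
    Execution(False, True, frozenset({"B"})),
]
estimate = matched_effect(runs)
```

In this toy data, context "B" fails whether or not the statement is covered, so it contributes no effect. Only context "A" pulls the estimate above zero. An unmatched comparison over all runs would not separate the two contexts.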
