Statistical Debugging: A Hypothesis Testing-Based Approach

Manual debugging is tedious, as well as costly. The high cost has motivated the development of fault localization techniques, which help developers search for fault locations. In this paper, we propose a new statistical method, called SOBER, which automatically localizes software faults without any prior knowledge of the program semantics. Unlike existing statistical approaches that select predicates correlated with program failures, SOBER models the predicate evaluation in both correct and incorrect executions and regards a predicate as fault-relevant if its evaluation pattern in incorrect executions significantly diverges from that in correct ones. Featuring a rationale similar to that of hypothesis testing, SOBER quantifies the fault relevance of each predicate in a principled way. We systematically evaluate SOBER under the same setting as previous studies. The result clearly demonstrates the effectiveness: SOBER could help developers locate 68 out of the 130 faults in the Siemens suite by examining no more than 10 percent of the code, whereas the cause transition approach proposed by Holger et al. [2005] and the statistical approach by Liblit et al. [2005] locate 34 and 52 faults, respectively. Moreover, the effectiveness of SOBER is also evaluated in an "imperfect world", where the test suite is either inadequate or only partially labeled. The experiments indicate that SOBER could achieve competitive quality under these harsh circumstances. Two case studies with grep 2.2 and bc 1.06 are reported, which shed light on the applicability of SOBER on reasonably large programs

[1]  Sudheendra Hangal,et al.  Tracking down software bugs using automatic anomaly detection , 2002, ICSE '02.

[2]  Bin Wang,et al.  Automated support for classifying software failure reports , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[3]  Chao Liu,et al.  SOBER: statistical model-based bug localization , 2005, ESEC/FSE-13.

[4]  Klaus Havelund,et al.  Model Checking Programs , 2004, Automated Software Engineering.

[5]  Andreas Zeller,et al.  Visualizing Memory Graphs , 2001, Software Visualization.

[6]  Anish Arora,et al.  Book Review: Verification of Sequential and Concurrent Programs by Krzysztof R. Apt and Ernst-Riidiger Olderog (Springer-Verlag New York, 1997) , 1998, SIGA.

[7]  Yannis Smaragdakis,et al.  JCrasher: an automatic robustness tester for Java , 2004, Softw. Pract. Exp..

[8]  Steven P. Reiss,et al.  Fault localization with nearest neighbor queries , 2003, 18th IEEE International Conference on Automated Software Engineering, 2003. Proceedings..

[9]  David Leon,et al.  Finding failures by cluster analysis of execution profiles , 2001, Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001.

[10]  Mary Lou Soffa,et al.  Jazz: A Tool for Demand-Driven Structural Testing , 2005, CC.

[11]  Michael I. Jordan,et al.  Scalable statistical bug isolation , 2005, PLDI '05.

[12]  Dawson R. Engler,et al.  Bugs as deviant behavior: a general approach to inferring errors in systems code , 2001, SOSP.

[13]  Michael D. Ernst,et al.  Eclat: Automatic Generation and Classification of Test Inputs , 2005, ECOOP.

[14]  Xiangyu Zhang,et al.  Locating faulty code using failure-inducing chops , 2005, ASE.

[15]  A. Zeller Isolating cause-effect chains from computer programs , 2002, SIGSOFT '02/FSE-10.

[16]  Sarfraz Khurshid,et al.  Korat: automated testing based on Java predicates , 2002, ISSTA '02.

[17]  Ernst-Rüdiger Olderog,et al.  Verification of Sequential and Concurrent Programs , 1991, Texts and Monographs in Computer Science.

[18]  Andreas Zeller,et al.  Simplifying and Isolating Failure-Inducing Input , 2002, IEEE Trans. Software Eng..

[19]  Frank Tip,et al.  A survey of program slicing techniques , 1994, J. Program. Lang..

[20]  Edmund M. Clarke,et al.  Model Checking , 1999, Handbook of Automated Reasoning.

[21]  G. Rothermel,et al.  An empirical study of fault localization for end-user programmers , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[22]  John T. Stasko,et al.  Visualization of test information to assist fault localization , 2002, ICSE '02.

[23]  Gregg Rothermel,et al.  Supporting Controlled Experimentation with Testing Techniques: An Infrastructure and its Potential Impact , 2005, Empirical Software Engineering.

[24]  Gregg Rothermel,et al.  An empirical investigation of the relationship between spectra differences and regression faults , 2000, Softw. Test. Verification Reliab..

[25]  Shriram Krishnamurthi,et al.  Automated Fault Localization Using Potential Invariants , 2003, ArXiv.

[26]  Shriram Krishnamurthi,et al.  Automated Fault Localization Using Potential Invariants 1 , 2003 .

[27]  Dawson R. Engler,et al.  Proceedings of the 5th Symposium on Operating Systems Design and Implementation Cmc: a Pragmatic Approach to Model Checking Real Code , 2022 .

[28]  Mary Jean Harrold,et al.  Empirical evaluation of the tarantula automatic fault-localization technique , 2005, ASE.

[29]  Joseph Robert Horgan,et al.  Fault localization using execution slices and dataflow tests , 1995, Proceedings of Sixth International Symposium on Software Reliability Engineering. ISSRE'95.

[30]  Michael I. Jordan,et al.  Bug isolation via remote program sampling , 2003, PLDI.

[31]  Brad A. Myers,et al.  Designing the whyline: a debugging interface for asking questions about program behavior , 2004, CHI.

[32]  Andy Chou,et al.  Bugs as Inconsistent Behavior: A General Approach to Inferring Errors in Systems Code. , 2001, SOSP 2001.

[33]  H. Cleve,et al.  Locating causes of program failures , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[34]  William G. Griswold,et al.  Dynamically discovering likely program invariants to support program evolution , 1999, Proceedings of the 1999 International Conference on Software Engineering (IEEE Cat. No.99CB37002).

[35]  Roland Mittermeir,et al.  Spreadsheet Debugging , 2003, ArXiv.

[36]  Iris Vessey,et al.  Expertise in Debugging Computer Programs: A Process Analysis , 1984, Int. J. Man Mach. Stud..

[37]  G. Casella,et al.  Statistical Inference , 2003, Encyclopedia of Social Network Analysis and Mining.

[38]  Iris Vessey,et al.  Expertise in Debugging Computer Programs , 1984 .

[39]  Yuriy Brun,et al.  Finding latent code errors via machine learning over program executions , 2004, Proceedings. 26th International Conference on Software Engineering.

[40]  Thomas J. Ostrand,et al.  Experiments on the effectiveness of dataflow- and control-flow-based test adequacy criteria , 1994, Proceedings of 16th International Conference on Software Engineering.

[41]  Gregg Rothermel,et al.  Empirical Studies of a Safe Regression Test Selection Technique , 1998, IEEE Trans. Software Eng..

[42]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .