A Crosstab-based Statistical Method for Effective Fault Localization

Fault localization is the most expensive activity in program debugging. Traditional ad-hoc methods can be time-consuming and ineffective because they rely on programmers' intuitive guesswork, which may neither be accurate nor reliable. A better solution is to utilize a systematic and statistically well-defined method to automatically identify suspicious code that should be examined for possible fault locations. We present a crosstab-based statistical method using the coverage information of each executable statement and the execution result (success or failure) with respect to each test case. A crosstab is constructed for each executable statement and a statistic is computed to determine the suspiciousness of the corresponding statement. Statements with a higher suspiciousness are more likely to contain bugs and should be examined before those with a lower suspiciousness. Three case studies using the Siemens suite, the Space program, and the Unix suite, respectively, are conducted. Our results suggest that the crosstab-based method is effective in fault localization and performs better (in terms of a smaller percentage of executable statements that have to be examined until the first statement containing the fault is reached) than other methods such as Tarantula. The difference in efficiency (computational time) between these two methods is very small.

[1]  Mary Jean Harrold,et al.  Debugging in Parallel , 2007, ISSTA '07.

[2]  Xiangyu Zhang,et al.  Locating faults through automated predicate switching , 2006, ICSE.

[3]  Eugene H. Spafford,et al.  Debugging with dynamic slicing and backtracking , 1993, Softw. Pract. Exp..

[4]  Joseph Robert Horgan,et al.  Dynamic program slicing , 1990, PLDI '90.

[5]  Michael I. Jordan,et al.  Scalable statistical bug isolation , 2005, PLDI '05.

[6]  Iris Vessey,et al.  Expertise in Debugging Computer Programs , 1984 .

[7]  Chao Liu,et al.  Failure proximity: a fault localization-based approach , 2006, SIGSOFT '06/FSE-14.

[8]  Andreas Zeller,et al.  Simplifying and Isolating Failure-Inducing Input , 2002, IEEE Trans. Software Eng..

[9]  R. Fisher The Advanced Theory of Statistics , 1943, Nature.

[10]  Iris Vessey,et al.  Expertise in Debugging Computer Programs: A Process Analysis , 1984, Int. J. Man Mach. Stud..

[11]  Michael I. Jordan,et al.  Statistical debugging: simultaneous identification of multiple bugs , 2006, ICML '06.

[12]  Yu Qi,et al.  Effective program debugging based on execution slices and inter-block data dependency , 2006, J. Syst. Softw..

[13]  James Joseph Biundo,et al.  Analysis of Contingency Tables , 1969 .

[14]  Thomas J. Ostrand,et al.  Experiments on the effectiveness of dataflow- and control-flow-based test adequacy criteria , 1994, Proceedings of 16th International Conference on Software Engineering.

[15]  A. Zeller Isolating cause-effect chains from computer programs , 2002, SIGSOFT '02/FSE-10.

[16]  Mark Weiser,et al.  Programmers use slices when debugging , 1982, CACM.

[17]  H. Cleve,et al.  Locating causes of program failures , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[18]  Yu Qi,et al.  Smart debugging software architectural design in SDL , 2005, J. Syst. Softw..

[19]  Chao Liu,et al.  Statistical Debugging: A Hypothesis Testing-Based Approach , 2006, IEEE Transactions on Software Engineering.

[20]  Steven P. Reiss,et al.  Fault localization with nearest neighbor queries , 2003, 18th IEEE International Conference on Automated Software Engineering, 2003. Proceedings..

[21]  Kai-Yuan Cai,et al.  Effective Fault Localization using Code Coverage , 2007, 31st Annual International Computer Software and Applications Conference (COMPSAC 2007).

[22]  Joseph Robert Horgan,et al.  Fault localization using execution slices and dataflow tests , 1995, Proceedings of Sixth International Symposium on Software Reliability Engineering. ISSRE'95.

[23]  Chap T. Le,et al.  Applied Categorical Data Analysis , 1998 .

[24]  Joseph Robert Horgan,et al.  Effect of Test Set Minimization on Fault Detection Effectiveness , 1995, 1995 17th International Conference on Software Engineering.

[25]  Mary Jean Harrold,et al.  Empirical evaluation of the tarantula automatic fault-localization technique , 2005, ASE.

[26]  Bin Wang,et al.  Automated support for classifying software failure reports , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..