A theoretical analysis of the risk evaluation formulas for spectrum-based fault localization

An important research area of Spectrum-Based Fault Localization (SBFL) is the effectiveness of risk evaluation formulas. Most previous studies have adopted an empirical approach, which can hardly be considered as sufficiently comprehensive because of the huge number of combinations of various factors in SBFL. Though some studies aimed at overcoming the limitations of the empirical approach, none of them has provided a completely satisfactory solution. Therefore, we provide a theoretical investigation on the effectiveness of risk evaluation formulas. We define two types of relations between formulas, namely, equivalent and better. To identify the relations between formulas, we develop an innovative framework for the theoretical investigation. Our framework is based on the concept that the determinant for the effectiveness of a formula is the number of statements with risk values higher than the risk value of the faulty statement. We group all program statements into three disjoint sets with risk values higher than, equal to, and lower than the risk value of the faulty statement, respectively. For different formulas, the sizes of their sets are compared using the notion of subset. We use this framework to identify the maximal formulas which should be the only formulas to be used in SBFL.

[1]  A. Zeller Isolating cause-effect chains from computer programs , 2002, SIGSOFT '02/FSE-10.

[2]  Sudipto Ghosh,et al.  Proximity based weighting of test cases to improve spectrum based fault localization , 2011, 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011).

[3]  Gregg Rothermel,et al.  An empirical investigation of the relationship between spectra differences and regression faults , 2000, Softw. Test. Verification Reliab..

[4]  Scott N. Woodfield,et al.  Evaluating the effectiveness of reliability-assurance techniques , 1989, J. Syst. Softw..

[5]  James R. Larus,et al.  The use of program profiling for software maintenance with applications to the year 2000 problem , 1997, ESEC '97/FSE-5.

[6]  Baowen Xu,et al.  Spectrum-Based Fault Localization: Testing Oracles are No Longer Mandatory , 2011, 2011 11th International Conference on Quality Software.

[7]  Byoungju Choi,et al.  A family of code coverage-based heuristics for effective fault localization , 2010, J. Syst. Softw..

[8]  Baowen Xu,et al.  Isolating Suspiciousness from Spectrum-Based Fault Localization Techniques , 2010, 2010 10th International Conference on Quality Software.

[9]  Lee Naish,et al.  Study of the relationship of bug consistency with respect to performance of spectra metrics , 2009, 2009 2nd IEEE International Conference on Computer Science and Information Technology.

[10]  Michael I. Jordan,et al.  Scalable statistical bug isolation , 2005, PLDI '05.

[11]  Peter Zoeteweij,et al.  An Evaluation of Similarity Coefficients for Software Fault Localization , 2006, 2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06).

[12]  Mary Jean Harrold,et al.  Empirical evaluation of the tarantula automatic fault-localization technique , 2005, ASE.

[13]  Lee Naish,et al.  The effectiveness of using non redundant test cases with program spectra for bug localization , 2009, 2009 2nd IEEE International Conference on Computer Science and Information Technology.

[14]  Lee Naish,et al.  A model for spectra-based software diagnosis , 2011, TSEM.

[15]  Lee Naish,et al.  Spectral Debugging with Weights and Incremental Ranking , 2009, 2009 16th Asia-Pacific Software Engineering Conference.

[16]  Lei Zhao,et al.  A Crosstab-based Statistical Method for Effective Fault Localization , 2008, 2008 1st International Conference on Software Testing, Verification, and Validation.

[17]  Alex Aiken,et al.  Cooperative Bug Isolation , 2007 .

[18]  Andreas Zeller,et al.  Lightweight Defect Localization for Java , 2005, ECOOP.

[19]  David Leon,et al.  Finding failures by cluster analysis of execution profiles , 2001, Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001.

[20]  Giovanni Denaro,et al.  ACM Transactions on Software Engineering and Methodology : Volume 22, Nomor 4, 2013 , 2014 .

[21]  Gregg Rothermel,et al.  An empirical investigation of program spectra , 1998, PASTE '98.

[22]  XuBaowen,et al.  A theoretical analysis of the risk evaluation formulas for spectrum-based fault localization , 2013 .

[23]  Mary Jean Harrold,et al.  An empirical study of the effects of test-suite reduction on fault localization , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[24]  Chao Liu,et al.  Failure proximity: a fault localization-based approach , 2006, SIGSOFT '06/FSE-14.

[25]  Gregg Rothermel,et al.  An empirical investigation of the relationship between spectra differences and regression faults , 2000 .

[26]  Mary Jean Harrold,et al.  Debugging in Parallel , 2007, ISSTA '07.

[27]  Alessandro Orso,et al.  Are automated debugging techniques actually helping programmers? , 2011, ISSTA '11.

[28]  Chao Liu,et al.  Statistical Debugging: A Hypothesis Testing-Based Approach , 2006, IEEE Transactions on Software Engineering.

[29]  Kai-Yuan Cai,et al.  Effective Fault Localization using Code Coverage , 2007, 31st Annual International Computer Software and Applications Conference (COMPSAC 2007).

[30]  Bin Wang,et al.  Automated support for classifying software failure reports , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[31]  John T. Stasko,et al.  Visualization of test information to assist fault localization , 2002, ICSE '02.

[32]  Yu Qi,et al.  Effective program debugging based on execution slices and inter-block data dependency , 2006, J. Syst. Softw..

[33]  Joseph Robert Horgan,et al.  Fault localization using execution slices and dataflow tests , 1995, Proceedings of Sixth International Symposium on Software Reliability Engineering. ISSRE'95.

[34]  Eric A. Brewer,et al.  Pinpoint: problem determination in large, dynamic Internet services , 2002, Proceedings International Conference on Dependable Systems and Networks.

[35]  Andy Podgurski,et al.  Causal inference for statistical fault localization , 2010, ISSTA '10.

[36]  Tsong Yueh Chen,et al.  How Well Do Test Case Prioritization Techniques Support Statistical Fault Localization , 2009, 2009 33rd Annual IEEE International Computer Software and Applications Conference.

[37]  A.J.C. van Gemund,et al.  On the Accuracy of Spectrum-based Fault Localization , 2007, Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION (TAICPART-MUTATION 2007).

[38]  Raúl A. Santelices,et al.  Lightweight fault-localization using multiple coverage types , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[39]  Peter Zoeteweij,et al.  A practical evaluation of spectrum-based fault localization , 2009, J. Syst. Softw..

[40]  Michael I. Jordan,et al.  Statistical debugging: simultaneous identification of multiple bugs , 2006, ICML.

[41]  James A. Jones,et al.  On the influence of multiple faults on coverage-based fault localization , 2011, ISSTA '11.