Extending the Theoretical Fault Localization Effectiveness Hierarchy with Empirical Results at Different Code Abstraction Levels

Spectrum-based fault localization techniques are semi-automated debugging techniques that address the bottleneck of finding suspicious program locations to diagnose. They assess the fault suspiciousness of individual program locations from the code coverage data collected by executing the program being debugged over a test suite. A program location can be viewed at different abstraction levels, such as a statement in the source code or an instruction compiled from that source code. In general, a program location at one code abstraction level can be transformed into zero or more program locations at another abstraction level. Although programmers usually debug at the source code level, the code is actually executed at a lower level. It is unclear whether the same technique achieves consistent results when applied at different code abstraction levels. In this paper, we study a suite of spectrum-based fault localization techniques at both the source code level and the instruction level, in the context of an existing theoretical hierarchy, to assess whether their effectiveness is consistent across the two levels. Our study extends the theoretical hierarchy with empirically validated relationships across the two code abstraction levels, as a step toward integrating the theory and practice of fault localization.
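As an illustration of how a spectrum-based technique turns coverage data into suspiciousness scores, the following is a minimal sketch, not the paper's implementation. It assumes the Ochiai formula, one commonly used ranking formula that the abstract itself does not name. The function names, the data layout, and the toy test suite are hypothetical. A "location" here is just a label, so the same code applies whether the labels denote source statements or compiled instructions.

```python
import math


def ochiai(ef, ep, total_failed):
    """Ochiai suspiciousness of one location.

    ef: failed tests that cover the location
    ep: passed tests that cover the location
    total_failed: all failed tests in the suite
    """
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0


def rank_locations(coverage, passed):
    """Rank locations from most to least suspicious.

    coverage: dict mapping test name -> set of covered locations (assumed layout)
    passed:   dict mapping test name -> True if the test passed
    """
    total_failed = sum(1 for ok in passed.values() if not ok)
    locations = set().union(*coverage.values())
    scores = {}
    for loc in locations:
        ef = sum(1 for t, cov in coverage.items() if loc in cov and not passed[t])
        ep = sum(1 for t, cov in coverage.items() if loc in cov and passed[t])
        scores[loc] = ochiai(ef, ep, total_failed)
    # Highest score first; ties broken by location name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy suite: three tests, one of which fails.
coverage = {
    "t1": {"s1", "s2"},
    "t2": {"s1", "s3"},
    "t3": {"s1", "s2", "s3"},
}
passed = {"t1": True, "t2": False, "t3": True}

for location, score in rank_locations(coverage, passed):
    print(location, round(score, 3))
```

In this toy suite, `s3` ranks first because it is covered by the failing test and by only one passing test, whereas `s1` is covered by every test and `s2` only by passing tests. Running the same ranking on instruction-level coverage would require only a different set of location labels, which is the kind of cross-level comparison the study examines.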
