Fusion fault localizers

Many spectrum-based fault localization techniques have been proposed to measure how likely each program element is the root cause of a program failure. For various bugs, the best technique to localize the bugs may differ due to the characteristics of the buggy programs and their program spectra. In this paper, we leverage the diversity of existing spectrum-based fault localization techniques to better localize bugs using data fusion methods. Our proposed approach consists of three steps: score normalization, technique selection, and data fusion. We investigate two score normalization methods, two technique selection methods, and five data fusion methods resulting in twenty variants of Fusion Localizer. Our approach is bug specific in which the set of techniques to be fused are adaptively selected for each buggy program based on its spectra. Also, it requires no training data, i.e., execution traces of the past buggy programs. We evaluate our approach on a common benchmark dataset and a dataset consisting of real bugs from three medium to large programs. Our evaluation demonstrates that our approach can significantly improve the effectiveness of existing state-of-the-art fault localization techniques. Compared to these state-of-the-art techniques, the best variants of Fusion Localizer can statistically significantly reduce the amount of code to be inspected to find all bugs. Our best variants can increase the proportion of bugs localized when developers only inspect the top 10% most suspicious program elements by more than 10% and increase the number of bugs that can be successfully localized when developers only inspect up to 10 program blocks by more than 20%.

[1]  Hong Cheng,et al.  Bug Signature Minimization and Fusion , 2011, 2011 IEEE 13th International Symposium on High-Assurance Systems Engineering.

[2]  H. Cleve,et al.  Locating causes of program failures , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[3]  David Lo,et al.  Extended comprehensive study of association measures for fault localization , 2014, J. Softw. Evol. Process..

[4]  Alessandro Orso,et al.  Are automated debugging techniques actually helping programmers? , 2011, ISSTA '11.

[5]  David Lo,et al.  Will Fault Localization Work for These Failures? An Automated Approach to Predict Effectiveness of Fault Localization Tools , 2013, 2013 IEEE International Conference on Software Maintenance.

[6]  John Dunnion,et al.  ProbFuse: a probabilistic approach to data fusion , 2006, SIGIR.

[7]  Steven P. Reiss,et al.  Fault localization with nearest neighbor queries , 2003, 18th IEEE International Conference on Automated Software Engineering, 2003. Proceedings..

[8]  David Lo,et al.  Cross-language bug localization , 2014, ICPC 2014.

[9]  Sarfraz Khurshid,et al.  Localization of faults in software programs using Bernoulli divergences , 2012, 2012 International Symposium on Information Theory and its Applications.

[10]  Baowen Xu,et al.  A theoretical analysis of the risk evaluation formulas for spectrum-based fault localization , 2013, TSEM.

[11]  David Lo,et al.  Compositional Vector Space Models for Improved Bug Localization , 2014, 2014 IEEE International Conference on Software Maintenance and Evolution.

[12]  Lee Naish,et al.  A model for spectra-based software diagnosis , 2011, TSEM.

[13]  Claire Le Goues,et al.  GenProg: A Generic Method for Automatic Software Repair , 2012, IEEE Transactions on Software Engineering.

[14]  Martin P. Robillard,et al.  Non-essential changes in version histories , 2011, 2011 33rd International Conference on Software Engineering (ICSE).

[15]  Rabia Nuray-Turan,et al.  Automatic ranking of information retrieval systems using data fusion , 2006, Inf. Process. Manag..

[16]  David Lo,et al.  Comprehensive evaluation of association measures for fault localization , 2010, 2010 IEEE International Conference on Software Maintenance.

[17]  Raúl A. Santelices,et al.  Lightweight fault-localization using multiple coverage types , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[18]  Andreas Zeller,et al.  Why Programs Fail, Second Edition: A Guide to Systematic Debugging , 2009 .

[19]  Michael I. Jordan,et al.  Scalable statistical bug isolation , 2005, PLDI '05.

[20]  David Lo,et al.  Interactive fault localization leveraging simple user feedback , 2012, 2012 28th IEEE International Conference on Software Maintenance (ICSM).

[21]  Mark Harman,et al.  Provably Optimal and Human-Competitive Results in SBSE for Spectrum Based Fault Localisation , 2013, SSBSE.

[22]  F. Wilcoxon Individual Comparisons by Ranking Methods , 1945 .

[23]  Avinash C. Kak,et al.  Retrieval from software libraries for bug localization: a comparative study of generic and composite text models , 2011, MSR '11.

[24]  Hong Cheng,et al.  Identifying bug signatures using discriminative graph mining , 2009, ISSTA.

[25]  Xiangyu Zhang,et al.  Locating faults through automated predicate switching , 2006, ICSE.

[26]  Peter Zoeteweij,et al.  A practical evaluation of spectrum-based fault localization , 2009, J. Syst. Softw..

[27]  Mary Jean Harrold,et al.  Empirical evaluation of the tarantula automatic fault-localization technique , 2005, ASE.

[28]  Thomas Zimmermann,et al.  Extraction of bug localization benchmarks from history , 2007, ASE.

[29]  Edward A. Fox,et al.  Combination of Multiple Searches , 1993, TREC.

[30]  Javed A. Aslam,et al.  Models for metasearch , 2001, SIGIR '01.

[31]  John T. Stasko,et al.  Visualization of test information to assist fault localization , 2002, ICSE '02.

[32]  Edward A. Fox,et al.  Combining Evidence from Multiple Searches , 1992, TREC.

[33]  Jian Zhou,et al.  Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[34]  Sarfraz Khurshid,et al.  Improving bug localization using structured information retrieval , 2013, 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[35]  Gregg Rothermel,et al.  Supporting Controlled Experimentation with Testing Techniques: An Infrastructure and its Potential Impact , 2005, Empirical Software Engineering.

[36]  Sarfraz Khurshid,et al.  A family of generalized entropies and its application to software fault localization , 2012, 2012 6th IEEE International Conference Intelligent Systems.

[37]  Chao Liu,et al.  SOBER: statistical model-based bug localization , 2005, ESEC/FSE-13.

[38]  Xiangyu Zhang,et al.  Locating faulty code using failure-inducing chops , 2005, ASE.

[39]  Andreas Zeller,et al.  Simplifying and Isolating Failure-Inducing Input , 2002, IEEE Trans. Software Eng..

[40]  Thomas J. Ostrand,et al.  Experiments on the effectiveness of dataflow- and control-flow-based test adequacy criteria , 1994, Proceedings of 16th International Conference on Software Engineering.

[41]  Andreas Zeller,et al.  Why Programs Fail: A Guide to Systematic Debugging , 2005 .

[42]  David Lo,et al.  Version history, similar report, and structure: putting them together for improved bug localization , 2014, ICPC 2014.

[43]  Shin Yoo,et al.  Evolving Human Competitive Spectra-Based Fault Localisation Techniques , 2012, SSBSE.

[44]  Alessandro Orso,et al.  Rapid: Identifying Bug Signatures to Support Debugging Activities , 2008, 2008 23rd IEEE/ACM International Conference on Automated Software Engineering.

[45]  Michael I. Jordan,et al.  Bug isolation via remote program sampling , 2003, PLDI.

[46]  Shengli Wu,et al.  Data Fusion in Information Retrieval , 2012, Adaptation, Learning, and Optimization.

[47]  David Lo,et al.  Search-based fault localization , 2011, 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011).

[48]  Trishul M. Chilimbi,et al.  HOLMES: Effective statistical debugging via efficient path profiling , 2009, 2009 IEEE 31st International Conference on Software Engineering.