IRBFL: An Information Retrieval Based Fault Localization Approach

Identifying the location of faults in real-world programs is one of the most costly processes during software debugging. To reduce debugging effort, many fault localization techniques have been proposed. One of the most widely studied is spectrum-based fault localization (SBFL), which uses the coverage information and execution results of test cases to localize faults. Most SBFL techniques consider only binary coverage information and ignore execution frequency, so their accuracy is limited, especially when faults occur in iteration entities or loop bodies. In this paper, we propose IRBFL, a novel fault localization technique that uses information retrieval to extract information from the execution frequencies of program entities. IRBFL first uses mutation analysis to filter out classes with low suspiciousness, and then applies information retrieval techniques to calculate suspiciousness values. We evaluate IRBFL on 205 real-world faults from 5 programs in the Defects4J benchmark. The experimental results show that IRBFL outperforms five state-of-the-art SBFL techniques. More specifically, for both single-fault and multi-fault programs, IRBFL identifies 2 to 3 times more faulty methods than these five SBFL techniques when only the top-ranked method is checked. Results on other metrics, including acc@3, acc@5, EXAM, MRR, and MAP, also indicate that IRBFL outperforms the five SBFL techniques.
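To make the limitation of binary coverage concrete, the following is a minimal sketch of a conventional SBFL ranking, not of IRBFL itself. It assumes the Ochiai formula as a representative SBFL technique (the abstract does not name the five baselines), and the function names `ochiai`, `rank`, `acc_at_k`, and `reciprocal_rank` are hypothetical. Each test is reduced to the set of entities it covers, so an entity executed once and an entity executed a thousand times inside a loop look identical to the formula.

```python
import math

def ochiai(coverage, failed):
    """Ochiai suspiciousness from binary coverage (assumed baseline, not IRBFL).

    coverage: list of sets, the entities covered by each test
    failed:   list of bools, True if the corresponding test failed
    """
    total_failed = sum(failed)
    scores = {}
    for entity in set().union(*coverage):
        # ef / ep: failing / passing tests that cover the entity at least once.
        ef = sum(1 for cov, f in zip(coverage, failed) if f and entity in cov)
        ep = sum(1 for cov, f in zip(coverage, failed) if not f and entity in cov)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[entity] = ef / denom if denom else 0.0
    return scores

def rank(scores):
    # Most suspicious first; ties broken by name for determinism.
    return sorted(scores, key=lambda e: (-scores[e], e))

def acc_at_k(ranking, faulty, k):
    # True if any faulty entity appears among the top k.
    return any(e in faulty for e in ranking[:k])

def reciprocal_rank(ranking, faulty):
    # 1 / position of the first faulty entity; MRR averages this over faults.
    for position, entity in enumerate(ranking, start=1):
        if entity in faulty:
            return 1.0 / position
    return 0.0

# Toy data (made up for illustration): three tests over methods a, b, c.
coverage = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
failed = [True, False, True]
ranking = rank(ochiai(coverage, failed))
print(ranking, acc_at_k(ranking, {"b"}, 1), reciprocal_rank(ranking, {"a"}))
```

Because `coverage` holds sets rather than counts, any frequency signal is discarded before scoring; this is the information that the abstract says IRBFL aims to exploit.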
