Boosting Bug-Report-Oriented Fault Localization with Segmentation and Stack-Trace Analysis

To deal with post-release bugs, many software projects maintain public bug repositories where users around the world can report the bugs they encounter. Recently, researchers have proposed various information-retrieval-based approaches to localizing faults from bug reports. These approaches process each source file as a single unit, so noise in large files may reduce the accuracy of fault localization. Furthermore, bug reports often contain stack-trace information, but existing approaches often treat this information as plain text. In this paper, we propose using segmentation and stack-trace analysis to improve the performance of bug localization. Specifically, given a bug report, we divide each source code file into a series of segments and use the segment most similar to the bug report to represent the file. We also analyze the bug report to identify possibly faulty files in a stack trace and favor these files in our retrieval. According to our empirical results, our approach significantly improves BugLocator, a representative fault localization approach, on all three software projects (i.e., Eclipse, AspectJ, and SWT) used in our empirical evaluation. Furthermore, segmentation and stack-trace analysis complement each other in boosting the performance of bug-report-oriented fault localization.
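The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the fixed segment size, the additive boost value, the plain term-frequency cosine similarity, and every function name here (`segments`, `stack_trace_files`, `rank_files`) are assumptions chosen for illustration, and the abstract does not specify the actual retrieval model or weighting.

```python
import math
import re
from collections import Counter

# Assumed shape of a Java stack frame: "at pkg.Class.method(File.java:123)".
FRAME = re.compile(r"\bat\s+[\w.$<>]+\(([\w$]+\.java):\d+\)")


def tokenize(text):
    """Lower-cased alphabetic tokens; a deliberately crude tokenizer."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def segments(tokens, size):
    """Split a token list into consecutive chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)] or [[]]


def cosine(a, b):
    """Cosine similarity between the term-frequency vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def stack_trace_files(report):
    """File names that appear in stack frames inside the bug report."""
    return set(FRAME.findall(report))


def rank_files(report, files, segment_size=50, boost=0.5):
    """Rank files by their best-matching segment, favoring stack-trace files.

    `files` maps a file name to its source text. `segment_size` and `boost`
    are hypothetical parameters, not values taken from the paper.
    """
    query = tokenize(report)
    traced = stack_trace_files(report)
    scores = {}
    for name, source in files.items():
        # The most similar segment represents the whole file.
        score = max(cosine(query, seg)
                    for seg in segments(tokenize(source), segment_size))
        if name in traced:
            score += boost  # favor files named in the stack trace
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)
```

Scoring a file by its best segment keeps a short relevant region from being diluted by the rest of a large file, while the boost treats stack-trace frames as structured evidence rather than as plain text.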
