Locus: Locating bugs from software changes

Various information retrieval (IR) based techniques have been proposed recently to locate bugs automatically at the file level. However, their usefulness is often compromised by the coarse granularity of files and the lack of contextual information. To address this, we propose to locate bugs using software changes, which offer finer granularity than files and provide important contextual clues for bug fixing. We observe that bug-inducing changes can facilitate the bug-fixing process. For example, they help triage the bug-fixing task to the developers who committed the bug-inducing changes, or enable developers to fix bugs by reverting these changes. Our study further finds that change logs and the naturally small granularity of changes can help boost the performance of IR-based bug localization. Motivated by these observations, we propose an IR-based approach, Locus, to locate bugs from software changes, and evaluate it on six large open source projects. The results show that Locus significantly outperforms existing techniques at source-file-level localization. In particular, mean average precision (MAP) and mean reciprocal rank (MRR) improve, on average, by 20.1% and 20.5%, respectively. Locus also ranks the bug-inducing changes within the top 5 for 41.0% of the bugs. The results further show that Locus can significantly reduce the number of lines that must be scanned to locate a bug compared with existing techniques.
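The abstract reports MAP and MRR without defining them. As a minimal sketch, assuming the standard IR definitions (the paper's exact evaluation procedure is not given here), the two metrics can be computed over ranked lists as follows. The function names and the sample file names are hypothetical and serve only as illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1 / rank of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average the precision at each rank where a relevant item appears.

    Assumption: the sum is divided by the total number of relevant items,
    so relevant items missing from the ranking lower the score.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def mean(values: Iterable[float]) -> float:
    collected: List[float] = list(values)
    return sum(collected) / len(collected) if collected else 0.0


def evaluate(results: Sequence[Tuple[Sequence[str], Set[str]]]) -> Tuple[float, float]:
    """Return (MAP, MRR) over (ranked list, relevant set) pairs, one per bug report."""
    map_score = mean(average_precision(r, rel) for r, rel in results)
    mrr_score = mean(reciprocal_rank(r, rel) for r, rel in results)
    return map_score, mrr_score


# Hypothetical rankings for two bug reports.
sample = [
    (["a.java", "b.java", "c.java"], {"b.java"}),
    (["x.java", "y.java", "z.java"], {"x.java", "z.java"}),
]
map_score, mrr_score = evaluate(sample)
```

Here each pair holds the ranked candidates for one bug report and the set of files (or changes) actually responsible for it. Both metrics fall in the range 0 to 1, and higher values mean the relevant items appear nearer the top of the ranking.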
