Learning to Classify Bug Reports into Components

Bug reports in widely used defect tracking systems contain standard, mandatory fields such as product name, component name, version number and operating system. These fields provide information that developers need during bug fixing. Previous research shows that bug reporters often assign incorrect values to such fields, which causes problems and delays in bug fixing. We conduct an empirical study of incorrect component assignments, or component reassignments, in bug reports. We perform a case study on the open-source Eclipse and Mozilla projects and report results on several aspects: the percentage of reassignments, the distribution of the number of assignments before a bug is closed, and the time between a bug's creation and its reassignment. We perform a series of experiments using a machine learning framework for two prediction tasks: categorizing a given bug report into a pre-defined list of components, and predicting whether a given bug report will be reassigned. Experimental results demonstrate a correlation between the terms present in bug reports (textual documents) and components, so these terms can serve as linguistic indicators for component prediction. We also study component reassignment graphs and reassignment probabilities and investigate their usefulness for component reassignment prediction.
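
To make the notion of a component reassignment graph concrete, here is a minimal sketch of one plausible construction. It is not taken from the paper: the function name `reassignment_probabilities`, the component names, and the sample histories are all hypothetical. It assumes each bug report comes with an ordered list of the components it was assigned to, and estimates the probability of each reassignment edge as its relative frequency among all reassignments leaving the source component.

```python
from collections import Counter, defaultdict


def reassignment_probabilities(histories):
    """Estimate P(next component | current component) from assignment histories.

    `histories` is a list of per-bug component sequences (an assumed input format).
    """
    counts = defaultdict(Counter)
    for history in histories:
        # Each consecutive pair (src, dst) with src != dst is one reassignment edge.
        for src, dst in zip(history, history[1:]):
            if src != dst:
                counts[src][dst] += 1

    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        # Normalize outgoing edge counts so they sum to 1 per source component.
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs


# Hypothetical assignment histories, one list per bug report.
histories = [
    ["UI", "Core"],
    ["UI", "Core", "Debug"],
    ["UI", "Text"],
    ["Core"],  # never reassigned, contributes no edge
]
graph = reassignment_probabilities(histories)
```

In this sketch a component with no outgoing reassignments simply does not appear as a key in `graph`; whether such edge probabilities are useful features for reassignment prediction is the kind of question the study investigates.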
