Improving Defect Localization by Classifying the Affected Asset Using Machine Learning

A vital part of resolving a defect is defect localization, the task of finding the exact location of the defect in the system. The defect report, and in particular its asset attribute, helps the person assigned to the problem limit the search space when investigating that location. However, research has shown that reporters often initially assign incorrect values to these attributes. In this paper, we propose and evaluate a way of automatically identifying the location of a defect by using machine learning to classify the source asset. By training a Support Vector Machine (SVM) classifier with features constructed from both categorical and textual attributes of the defect reports, we achieved an accuracy of 58.52% when predicting the source asset. However, when we trained an SVM to provide a list of recommendations rather than a single prediction, the recall increased to as much as 92.34%. Given these results, we conclude that software development teams can use these algorithms to predict up to ten potential locations, and that even three predicted locations already give useful results, with an accuracy of over 70%.
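The difference between a single prediction and a list of recommendations can be illustrated with a small sketch. It assumes the classifier exposes one score per candidate asset for each defect report, and it counts a report as a hit when the true asset appears among the k highest-scoring candidates. The function names, asset labels, and scores below are hypothetical and for illustration only; they are not taken from the paper's implementation or data.

```python
from typing import Dict, List, Sequence


def top_k_assets(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k asset labels with the highest classifier scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:k]]


def hit_rate_at_k(all_scores: Sequence[Dict[str, float]],
                  true_assets: Sequence[str],
                  k: int) -> float:
    """Fraction of reports whose true asset is among the top-k suggestions."""
    if len(all_scores) != len(true_assets):
        raise ValueError("need exactly one true asset per scored report")
    if not true_assets:
        return 0.0
    hits = sum(
        1
        for scores, truth in zip(all_scores, true_assets)
        if truth in top_k_assets(scores, k)
    )
    return hits / len(true_assets)


# Made-up per-asset scores for three defect reports (illustrative only).
example_scores = [
    {"ui": 0.9, "db": 0.3, "net": -0.2},
    {"ui": 0.1, "db": 0.4, "net": 0.6},
    {"ui": 0.5, "db": 0.2, "net": 0.45},
]
example_truth = ["ui", "db", "db"]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, hit_rate_at_k(example_scores, example_truth, k))
```

With k = 1 this measure reduces to ordinary single-prediction accuracy, and it can only stay the same or grow as k increases, which is the trade-off behind offering a short list of candidate locations instead of one.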
