Towards an Improvement of Bug Severity Classification

Past research has found that predicting bug severity improves triaging and the bug resolution process. For this reason, many classification and prediction approaches have emerged over the years to automate reasoning about severity classes. In this paper, we use text mining together with bi-grams and feature selection to improve the classification of bugs into severe and non-severe classes. We adopt the Naïve Bayes (NB) classifier and evaluate it on the Mozilla and Eclipse datasets commonly used in related work. Overall, the results show that bi-grams can slightly improve the classifier's performance, but feature selection can be more effective at determining the most informative terms and bi-grams. The results are nonetheless project-dependent: in some cases, adding bi-grams may worsen performance.
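The pipeline outlined above (terms plus bi-grams, feature selection, Naïve Bayes) can be illustrated with a minimal sketch. This is not the paper's implementation: the toy bug summaries, the function names, and the selection criterion (difference in per-class document frequency) are all assumptions made for illustration, since the abstract does not state which feature selection method or preprocessing steps are used.

```python
import math
from collections import Counter, defaultdict

def features(text, use_bigrams=True):
    """Return the terms of a bug summary, optionally followed by its bi-grams."""
    tokens = text.lower().split()
    feats = list(tokens)
    if use_bigrams:
        feats += [a + "_" + b for a, b in zip(tokens, tokens[1:])]
    return feats

def select_features(docs, labels, k):
    """Keep the k features whose document frequency differs most between
    the two classes (an assumed stand-in for the paper's feature selection)."""
    df = defaultdict(Counter)
    for text, label in zip(docs, labels):
        for f in set(features(text)):
            df[f][label] += 1
    classes = sorted(set(labels))
    score = {f: abs(c[classes[0]] - c[classes[1]]) for f, c in df.items()}
    ranked = sorted(score, key=lambda f: (-score[f], f))
    return set(ranked[:k])

def train_nb(docs, labels, vocab):
    """Count class priors and per-class feature occurrences (multinomial NB)."""
    class_counts = Counter(labels)
    feat_counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        for f in features(text):
            if f in vocab:
                feat_counts[label][f] += 1
    return class_counts, feat_counts, vocab

def predict(model, text):
    """Return the class with the highest log-posterior, using Laplace smoothing."""
    class_counts, feat_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(feat_counts[label].values()) + len(vocab)
        for f in features(text):
            if f in vocab:
                score += math.log((feat_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data; the paper uses Mozilla and Eclipse bug reports.
docs = [
    "application crash on startup",
    "crash when saving file",
    "data loss after crash",
    "typo in menu label",
    "wrong color in toolbar icon",
    "menu label misaligned",
]
labels = ["severe", "severe", "severe",
          "non-severe", "non-severe", "non-severe"]

vocab = select_features(docs, labels, k=5)
model = train_nb(docs, labels, vocab)
print(predict(model, "crash on saving"))   # severe
print(predict(model, "menu label typo"))   # non-severe
```

In this toy run the selected vocabulary contains both single terms (such as `crash`) and one bi-gram (`menu_label`), which shows how selection can retain only the features that separate the classes. On real data, the number of retained features `k` would need tuning per project.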
