Abstract The performance of a bug prediction model depends directly on how accurately bug reports are classified. Misclassified reports reduce the accuracy of the system. Resolving this issue by manually examining bug reports is time-consuming and tedious for developers and testers. In this paper, a hybrid approach that combines text mining, natural language processing, and machine learning techniques is used to classify each bug report as a bug or a non-bug. Four incorporated fields are added alongside the textual fields of the bug reports to improve the performance of the classifier. TF-IDF and bigram feature extraction methods are used with feature selection and a K-nearest neighbor (K-NN) classifier. The performance of the proposed system is evaluated on five datasets using precision, recall, and F-measure. It is observed that the performance of the K-NN classifier varies with the dataset and that adding the bigram method improves the performance of the classifier.
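The pipeline described above can be illustrated with a minimal pure-Python sketch: unigram and bigram terms are weighted with TF-IDF, a K-NN vote over cosine similarity assigns the label, and precision, recall, and F-measure are computed for the "bug" class. This is only an illustration under stated assumptions, not the paper's implementation: the function names, the toy reports, the smoothed IDF formula, the cosine similarity measure, and k=3 are all assumptions, and the feature selection step and the four incorporated fields are omitted.

```python
import math
from collections import Counter

def tokens_with_bigrams(text):
    """Lowercase, split on whitespace, return unigrams plus adjacent-word bigrams."""
    words = text.lower().split()
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams

def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in training (assumed formula)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokens_with_bigrams(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf_vector(text, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are ignored."""
    tf = Counter(t for t in tokens_with_bigrams(text) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def cosine(u, v):
    """Cosine similarity of two L2-normalised sparse vectors (a plain dot product)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def knn_predict(text, train_vecs, train_labels, idf, k=3):
    """Majority vote among the k training reports most similar to the query."""
    query = tfidf_vector(text, idf)
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query, train_vecs[i]), reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

def precision_recall_f1(y_true, y_pred, positive="bug"):
    """Precision, recall, and F-measure for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical toy reports, invented for illustration only.
train_docs = [
    "app crashes with null pointer exception on startup",
    "null pointer exception when saving file",
    "crash on startup after update",
    "please add dark mode support",
    "feature request add export to pdf",
    "update documentation for install steps",
]
train_labels = ["bug", "bug", "bug", "non-bug", "non-bug", "non-bug"]

idf = fit_idf(train_docs)
train_vecs = [tfidf_vector(doc, idf) for doc in train_docs]

pred_bug = knn_predict("null pointer exception crashes app", train_vecs, train_labels, idf)
pred_non_bug = knn_predict("add export feature request", train_vecs, train_labels, idf)
metrics = precision_recall_f1(["bug", "bug", "non-bug", "non-bug"],
                              ["bug", "non-bug", "bug", "non-bug"])
```

In this sketch the bigram terms (for example "null pointer") simply extend the unigram vocabulary, so a report that shares a two-word phrase with a training report gains extra similarity. That is one plausible way bigrams could help a K-NN classifier, although the size of any gain would depend on the dataset.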