Weight similarity measurement model-based, object-oriented approach for bug database mining to detect similar and duplicate bugs

This paper applies data mining to a bug database to discover similar and duplicate bugs. Whenever a new bug is entered into the bug database through the bug tracking system, it is matched against the existing bugs, and its duplicates and similar bugs are mined from the database. Similar bugs tend to be resolved in much the same way, so if a new bug resembles an existing bug that has already been resolved, part of the earlier analysis can be reused and the new bug takes less time to resolve. In current practice, developers must identify duplicate bug reports manually, a time-consuming process that adds to the already high cost of software maintenance. Finding similar and duplicate bugs automatically would therefore save both cost and time. Based on this idea, the paper describes an object-oriented approach, based on a weight similarity measurement model, for discovering similar and duplicate bugs in a bug database.
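
As a rough illustration of the matching step described above, the following minimal sketch scores a new bug against existing bugs with a weighted sum of per-field similarities. The abstract does not specify the model's details, so everything here is an assumption made for illustration: the `Bug` fields, the values in `WEIGHTS`, the use of token-set Jaccard similarity, and the two thresholds are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    """A bug report with a few illustrative text fields (hypothetical schema)."""
    bug_id: int
    summary: str
    component: str
    description: str


# Hypothetical per-field weights; assumptions for illustration, not values from the paper.
WEIGHTS = {"summary": 0.5, "component": 0.2, "description": 0.3}


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two strings, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty fields are treated as identical
    return len(ta & tb) / len(ta | tb)


def similarity(x: Bug, y: Bug) -> float:
    """Weighted sum of the per-field similarities of two bugs."""
    return sum(w * jaccard(getattr(x, f), getattr(y, f)) for f, w in WEIGHTS.items())


def find_matches(new_bug, existing, similar_at=0.4, duplicate_at=0.8):
    """Return (bug_id, score, label) for every existing bug scoring at least
    `similar_at`, best match first. Both thresholds are hypothetical."""
    matches = []
    for bug in existing:
        score = similarity(new_bug, bug)
        if score >= similar_at:
            label = "duplicate" if score >= duplicate_at else "similar"
            matches.append((bug.bug_id, score, label))
    return sorted(matches, key=lambda m: m[1], reverse=True)
```

In a setup like this, `find_matches` would be called each time a report is filed, and the resolution notes of any returned bugs could be shown to the developer. Suitable weights and thresholds would have to be tuned on a real bug database.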
