An Approach for Predicting Bug Triage using Data Reduction Methods

Most software companies deal with a large number of software bugs every day. Software bugs are inevitable, and fixing them is expensive. The aim of bug triage is to assign a suitably skilled developer to each new bug report. To reduce the manual effort and time this takes, text classification techniques are applied to automate bug triage, that is, to predict which developer should fix a new bug report. The proposed system combines two data reduction techniques, feature selection (FS) and instance selection (IS), to shrink the bug data set and to improve the accuracy of bug triage. A predictive model determines the order in which the reduction techniques are applied to a new bug data set, i.e., FS followed by IS or IS followed by FS. The performance of the proposed system is verified on a Mozilla bug data set. Reducing the scale of the data set is intended to lower time and labor costs and to improve the accuracy of bug triage by providing high-quality bug data for software development and maintenance.
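
To make the idea of combining the two reduction steps concrete, the following is a minimal sketch, not the system described in the abstract. It assumes bug reports are plain (text, developer) pairs, uses a chi-square score for feature selection and a condensed nearest neighbour rule with Jaccard similarity for instance selection, and exposes the order as a hypothetical `fs_first` flag. All function names, the sample reports, and the choice of these two particular algorithms are assumptions made for illustration; the abstract does not name the algorithms used, and the predictive model that chooses the order is not sketched here.

```python
from collections import Counter


def tokenize(text):
    """Split a report into a set of lowercase words (assumed preprocessing)."""
    return frozenset(text.lower().split())


def select_features(docs, k):
    """Feature selection: keep the k words with the highest chi-square score.

    docs is a list of (word_set, developer) pairs. A word's score is the
    maximum of its chi-square statistic over all developers.
    """
    n = len(docs)
    dev_count = Counter(dev for _, dev in docs)
    word_count = Counter(w for words, _ in docs for w in words)
    joint = Counter((w, dev) for words, dev in docs for w in words)
    scores = {}
    for w, n_w in word_count.items():
        best = 0.0
        for dev, n_d in dev_count.items():
            a = joint[(w, dev)]   # reports of dev that contain w
            b = n_w - a           # reports of others that contain w
            c = n_d - a           # reports of dev without w
            d = n - n_w - c       # reports of others without w
            denom = n_w * (n - n_w) * n_d * (n - n_d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[w] = best
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity between two word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_instances(docs):
    """Instance selection: condensed nearest neighbour rule.

    A report is kept only if its nearest kept report (by Jaccard
    similarity) belongs to a different developer. Passes repeat until
    no further report is added.
    """
    kept = []
    changed = True
    while changed:
        changed = False
        for doc in docs:
            if doc in kept:
                continue
            words, dev = doc
            if kept:
                nearest = max(kept, key=lambda kd: jaccard(words, kd[0]))
                if nearest[1] == dev:
                    continue  # already labelled correctly by the kept set
            kept.append(doc)
            changed = True
    return kept


def reduce_bug_data(reports, k, fs_first=True):
    """Apply FS then IS (fs_first=True) or IS then FS (fs_first=False)."""
    docs = [(tokenize(text), dev) for text, dev in reports]
    if fs_first:
        vocab = select_features(docs, k)
        docs = [(words & vocab, dev) for words, dev in docs]
        docs = select_instances(docs)
    else:
        docs = select_instances(docs)
        vocab = select_features(docs, k)
        docs = [(words & vocab, dev) for words, dev in docs]
    return vocab, docs


# Hypothetical sample reports, invented for this sketch.
sample_reports = [
    ("crash when opening bookmark menu", "alice"),
    ("bookmark menu crash on startup", "alice"),
    ("layout engine renders table wrong", "bob"),
    ("table layout broken in print preview", "bob"),
    ("crash in bookmark sync", "alice"),
]

for order in (True, False):
    vocab, reduced = reduce_bug_data(sample_reports, k=4, fs_first=order)
    print("FS->IS" if order else "IS->FS", sorted(vocab), len(reduced))
```

The two orders can yield different reduced sets because the first step changes what the second step sees: selecting words first alters the similarity between reports, while selecting reports first alters the word statistics.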
