The Way Ahead for Bug-fix time Prediction

Bug-fix time, i.e., the time taken to fix a bug after it was introduced, is an important factor in bug-related analysis, such as measuring software quality or coordinating development effort during bug triaging. Previous work has proposed many bug-fix time prediction models that use various bug attributes (number of developers who participated in fixing the bug, bug severity, bug opener's reputation, number of patches) to predict the fix time of a newly reported bug. In this paper, we investigate the associations between bug attributes and bug-fix time. We propose two approaches that apply association rule mining. In the first approach, we use the Apriori algorithm to predict the fix time of a newly reported bug based on the bug's severity, priority, summary terms, and assignee. In the second approach, we use k-means clustering to obtain groups of correlated variables and then mine association rules inside each cluster. We collected 1,695 bug reports from three products of the Mozilla open source project, namely AddOnSDK, Thunderbird, and Bugzilla, to mine association rules. The results show that, for a given set of bug attributes, we can predict the bug-fix time of newly reported bugs, which can help improve software quality. A large number of association rules with high confidence and support, with higher severity and priority as antecedents and a short bug-fix time as the consequent, show that more important bugs are fixed without delay. This information is useful in assessing software quality. We also observe that our approach can help in bug triaging by assigning a bug to the most suitable and experienced assignee, who is likely to fix it in the minimum time. In a nutshell, association rule mining based bug-fix time prediction can help managers improve the software development process.
Keywords—Bug-fix time; Apriori algorithm; Association rule mining; k-means Clustering
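
To make the notions of support and confidence used above concrete, the following is a minimal Apriori-style sketch in Python. It is an illustration under stated assumptions, not the implementation used in this work: the bug records, the attribute encoding (strings such as "severity=critical"), the function names, and the thresholds are all hypothetical. The support of an itemset is the fraction of bug reports containing it, and the confidence of a rule A -> fix_time is support(A plus fix_time) divided by support(A).

```python
from itertools import combinations

# Hypothetical toy bug reports (NOT the data set described in the abstract).
# Each report is a set of "attribute=value" items.
BUGS = [
    frozenset({"severity=critical", "priority=P1", "fix_time=short"}),
    frozenset({"severity=critical", "priority=P1", "fix_time=short"}),
    frozenset({"severity=critical", "priority=P2", "fix_time=short"}),
    frozenset({"severity=minor", "priority=P3", "fix_time=long"}),
    frozenset({"severity=minor", "priority=P3", "fix_time=long"}),
    frozenset({"severity=critical", "priority=P1", "fix_time=long"}),
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori) search; returns {itemset: support}."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_support]
    frequent = {}
    while level:
        for s in level:
            frequent[s] = support(s, transactions)
        k = len(level[0]) + 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        level = [c for c in candidates
                 if all(frozenset(sub) in frequent
                        for sub in combinations(c, k - 1))
                 and support(c, transactions) >= min_support]
    return frequent


def fix_time_rules(transactions, min_support, min_confidence):
    """Rules of the form {bug attributes} -> fix_time=<class>."""
    frequent = frequent_itemsets(transactions, min_support)
    rules = []
    for itemset, supp in frequent.items():
        consequents = [i for i in itemset if i.startswith("fix_time=")]
        if len(consequents) != 1 or len(itemset) < 2:
            continue
        antecedent = itemset - set(consequents)
        # The antecedent is a subset of a frequent itemset, so it is frequent.
        conf = supp / frequent[antecedent]
        if conf >= min_confidence:
            rules.append((antecedent, consequents[0], supp, conf))
    return rules


if __name__ == "__main__":
    for antecedent, consequent, supp, conf in fix_time_rules(BUGS, 0.3, 0.7):
        print(sorted(antecedent), "->", consequent,
              f"support={supp:.2f} confidence={conf:.2f}")
```

A new bug could then be assigned the consequent of the highest-confidence rule whose antecedent it matches. For the clustering-based variant, the same function could be run separately on the reports of each cluster; the choice of clustering features is left open here.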
