Studying re-opened bugs in open source software

Bug fixing accounts for a large share of software maintenance resources. Typically, bugs are reported, fixed, verified, and closed; in some cases, however, bugs have to be re-opened. Re-opened bugs increase maintenance costs, degrade the overall user-perceived quality of the software, and lead to unnecessary rework by busy practitioners. In this paper, we study and predict re-opened bugs through a case study on three large open source projects: Eclipse, Apache, and OpenOffice. We structure our study along four dimensions: (1) the work habits dimension (e.g., the weekday on which the bug was initially closed), (2) the bug report dimension (e.g., the component in which the bug was found), (3) the bug fix dimension (e.g., the amount of time it took to perform the initial fix), and (4) the team dimension (e.g., the experience of the bug fixer). Using these factors, we build decision trees that predict whether a bug will be re-opened, and we perform top node analysis to determine which factors are the most important indicators of re-opening. Our study shows that the comment text and the last status of the bug when it is initially closed are the most important such factors. Combining the four dimensions, we can build explainable prediction models that achieve a precision of 52.1-78.6% and a recall of 70.5-94.1% when predicting whether a bug will be re-opened. We also find that the most indicative factors vary by project: the comment text is the most important factor for Eclipse and OpenOffice, while the last status is the most important one for Apache. These factors should be examined closely in order to reduce the maintenance costs caused by re-opened bugs.
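
As a rough illustration of the pipeline the abstract describes, the sketch below trains a decision tree on a hypothetical bug feature table, reports precision and recall, and inspects the root of the tree in the spirit of top node analysis. The paper builds C4.5 trees; this sketch substitutes scikit-learn's CART implementation as a stand-in, and the feature names and data are invented placeholders, not the paper's dataset.

```python
# Minimal sketch (not the authors' code): predict re-opened bugs with a
# decision tree and inspect which factor sits at the top of the tree.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)

# Hypothetical stand-in data: one row per bug, one column per factor,
# loosely modeled on the paper's four dimensions.
feature_names = ["weekday_closed", "component", "fix_time_days", "fixer_experience"]
X = rng.random((500, 4))
y = rng.integers(0, 2, size=500)  # 1 = bug was later re-opened

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)

print("precision:", precision_score(y_test, pred))
print("recall:   ", recall_score(y_test, pred))

# Top node analysis: the factor chosen at the root is the one the tree
# finds most discriminative for whether a bug will be re-opened.
print("root factor:", feature_names[clf.tree_.feature[0]])
```

On real data, repeating this over many resampled training sets and tallying how often each factor appears at the root (rather than reading a single tree) gives a more stable ranking of the important factors.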
