Determining "Grim Reaper" Policies to Prevent Languishing Bugs

Long-lived software products commonly accumulate a large number of reported defects, some of which may not be fixed for a lengthy period of time, if ever. These so-called languishing bugs can impose various costs on project teams, such as time wasted in release planning and in defect analysis and inspection. They also give an unrealistic view of the number of bugs still to be fixed at a given time. The goal of this work is to help software practitioners mitigate the costs of languishing bugs by providing a technique to predict and pre-emptively close them. We analyze defect fix times from an ABB program and the Apache HTTP server, and find that both contain a substantial number of languishing bugs. We also train decision tree classification models to predict whether a given bug will be fixed within a desired time period. We propose that an organization could use such a model to form a "grim reaper" policy, whereby bugs that are predicted to languish are pre-emptively closed. However, initial results are mixed: models for the ABB program achieve F-scores of 63-95%, while those for the Apache HTTP server achieve F-scores of 21-59%.
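
As a rough illustration of how a "grim reaper" policy and its F-score evaluation might fit together, the following Python sketch may help. It is not the models or features used in this work: the `Bug` fields, the 180-day threshold, and the hand-written `predicts_languishing` rule are all hypothetical stand-ins for a trained decision tree.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Bug:
    bug_id: int
    severity: str        # hypothetical feature, for illustration only
    days_open: int       # hypothetical feature, for illustration only
    status: str = "open"


def f_score(actual: List[bool], predicted: List[bool]) -> float:
    """F1 score: harmonic mean of precision and recall for the positive class."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def predicts_languishing(bug: Bug) -> bool:
    # Assumption: a single hand-written split standing in for a trained
    # decision tree. The feature choice and threshold are invented.
    return bug.severity == "minor" and bug.days_open > 180


def grim_reaper(bugs: List[Bug], predictor: Callable[[Bug], bool]) -> List[int]:
    """Close every open bug the predictor flags; return the closed bug IDs."""
    closed = []
    for bug in bugs:
        if bug.status == "open" and predictor(bug):
            bug.status = "closed"
            closed.append(bug.bug_id)
    return closed


bugs = [Bug(1, "minor", 400), Bug(2, "critical", 400), Bug(3, "minor", 30)]
closed_ids = grim_reaper(bugs, predicts_languishing)
```

In practice the predictor would be a classifier trained on historical fix times, and `f_score` would be computed on held-out bugs before any policy is trusted to close reports automatically.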
