BugCache for inspections: hit or miss?

Inspection is a highly effective but costly technique for quality control. Most companies do not have the resources to inspect all the code; thus accurate defect prediction can help focus available inspection resources. BugCache is a simple, elegant, award-winning prediction scheme that "caches" files that are likely to contain defects [12]. In this paper, we evaluate the utility of BugCache as a tool for focusing inspection; we examine the assumptions underlying BugCache with the aim of improving it; and finally we compare it with a simple, standard bug-prediction technique. We find that BugCache is, in fact, useful for focusing inspection effort. Surprisingly, however, its performance, when used for inspections, is not much better than that of a naive prediction model -- viz., a model that orders files in the system by their count of closed bugs and chooses enough files to capture 20% of the lines in the system.
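The naive baseline described above can be sketched in a few lines. This is only an illustrative reading of the abstract, not the authors' implementation: the names `FileStats` and `select_for_inspection` are hypothetical, ties in bug count are assumed to keep their input order, and selection is assumed to stop at the first file that would push the total past the 20% line budget.

```python
from typing import List, NamedTuple


class FileStats(NamedTuple):
    """Hypothetical per-file record; field names are assumptions."""
    path: str
    loc: int          # lines of code in the file
    closed_bugs: int  # number of closed bugs attributed to the file


def select_for_inspection(files: List[FileStats],
                          loc_fraction: float = 0.20) -> List[FileStats]:
    """Rank files by closed-bug count and pick them until the line budget is used."""
    budget = loc_fraction * sum(f.loc for f in files)
    # sorted() is stable, so files with equal bug counts keep their input order.
    ranked = sorted(files, key=lambda f: f.closed_bugs, reverse=True)
    selected: List[FileStats] = []
    used = 0
    for f in ranked:
        # Assumption: stop at the first file that would exceed the budget.
        if used + f.loc > budget:
            break
        selected.append(f)
        used += f.loc
    return selected


example = [
    FileStats("a.c", 100, 5),
    FileStats("b.c", 50, 3),
    FileStats("c.c", 300, 1),
    FileStats("d.c", 550, 0),
]
chosen = select_for_inspection(example)
```

With the example data, the system has 1000 lines, so the budget is 200 lines; `a.c` and `b.c` fit, and `c.c` would exceed it. Other stopping rules (for instance, skipping an oversized file and continuing down the ranking) are equally plausible readings of the description.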

[1] Gordon Johnston, et al. Statistical Models and Methods for Lifetime Data, 2003, Technometrics.

[2] Zhe Wang, et al. Fix Cache Based Regression Test Selection, 2010.

[3] Banu Diri, et al. A systematic review of software fault prediction studies, 2009, Expert Syst. Appl.

[4] Jerald F. Lawless, et al. Statistical Models and Methods for Lifetime Data, 1983.

[5] Per Runeson, et al. An Empirical Evaluation of Regression Testing Based on Fix-Cache Recommendations, 2010, 2010 Third International Conference on Software Testing, Verification and Validation.

[6] Andreas Zeller, et al. When do changes induce fixes?, 2005, ACM SIGSOFT Softw. Eng. Notes.

[7] Ahmed E. Hassan, et al. Predicting faults using the complexity of code changes, 2009, 2009 IEEE 31st International Conference on Software Engineering.

[8] Abraham Bernstein, et al. Software process data quality and characteristics: a historical view on open and closed source projects, 2009, IWPSE-Evol '09.

[9] Thomas Zimmermann, et al. Automatic Identification of Bug-Introducing Changes, 2006, 21st IEEE/ACM International Conference on Automated Software Engineering (ASE'06).

[10] Premkumar T. Devanbu, et al. Fair and balanced?: bias in bug-fix datasets, 2009, ESEC/FSE '09.

[11] Lionel C. Briand, et al. Data Mining Techniques for Building Fault-proneness Models in Telecom Java Software, 2007, The 18th IEEE International Symposium on Software Reliability (ISSRE '07).

[12] Richard C. Holt, et al. The top ten list: dynamic fault prediction, 2005, 21st IEEE International Conference on Software Maintenance (ICSM'05).

[13] Adam A. Porter, et al. Empirical studies of software engineering: a roadmap, 2000, ICSE '00.

[14] Gregory Tassey, et al. Prepared for what, 2007.

[15] N. Nagappan, et al. Use of relative code churn measures to predict system defect density, 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005.

[16] Lionel C. Briand, et al. A systematic and comprehensive investigation of methods to build and evaluate fault prediction models, 2010, J. Syst. Softw.

[17] Xiaoyan Zhu, et al. An empirical analysis of the FixCache algorithm, 2011, MSR '11.

[18] Audris Mockus, et al. Identifying reasons for software changes using historic databases, 2000, Proceedings 2000 International Conference on Software Maintenance.

[19] Andreas Zeller, et al. Predicting faults from cached history, 2008, ISEC '08.

[20] Harald C. Gall, et al. Populating a Release History Database from version control and bug tracking systems, 2003, International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings.

[21] Gail C. Murphy, et al. Hipikat: recommending pertinent software development artifacts, 2003, 25th International Conference on Software Engineering, 2003. Proceedings.

[22] Laurie A. Williams, et al. Secure open source collaboration: an empirical study of Linus' law, 2009, CCS.

[23] Elaine J. Weyuker, et al. Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models, 2008, Empirical Software Engineering.

[24] Robert Feldt, et al. Dynamic Regression Test Selection Based on a File Cache: An Industrial Evaluation, 2009, 2009 International Conference on Software Testing Verification and Validation.

[25] W. Kruskal, et al. Use of Ranks in One-Criterion Variance Analysis, 1952.