Using Pre-Release Test Failures to Build Early Post-Release Defect Prediction Models

Software quality is one of the most pressing concerns for nearly all software-developing companies. At the same time, these companies seek to shorten their release cycles to meet market demands while maintaining product quality, which makes identifying problematic code areas increasingly important. Defect prediction models have become popular in recent years, and many different code and process metrics have been studied. However, little work has related test executions during development to defect likelihood. This is surprising, as test executions capture the stability and quality of a program during the development process. This paper presents an exploratory study investigating whether test execution metrics, e.g., test failure bursts, can serve as software quality indicators and be used to build pre- and post-release defect prediction models. We show that test metrics collected during Windows 8 development can be used to build pre- and post-release defect prediction models early in the development process of a software product. Test metrics outperform pre-release defect counts when predicting post-release defects.
