Techniques for improving regression testing in continuous integration development environments

In continuous integration development environments, software engineers frequently integrate new or changed code with the mainline codebase. This can reduce the amount of code rework that is needed as systems evolve and speed up development time. While continuous integration processes traditionally require that extensive testing be performed following the actual submission of code to the codebase, it is also important to ensure that enough testing is performed prior to code submission to avoid breaking builds and delaying the fast feedback that makes continuous integration desirable. In this work, we present algorithms that make continuous integration processes more cost-effective. In an initial pre-submit phase of testing, developers specify modules to be tested, and we use regression test selection techniques to select a subset of the test suites for those modules that render that phase more cost-effective. In a subsequent post-submit phase of testing, where dependent modules as well as changed modules are tested, we use test case prioritization techniques to ensure that failures are reported more quickly. In both cases, the techniques we utilize are novel, involving algorithms that are relatively inexpensive and do not rely on code coverage information -- two requirements for conducting testing cost-effectively in this context. To evaluate our approach, we conducted an empirical study on a large data set from Google that we make publicly available. The results of our study show that our selection and prioritization techniques can each lead to cost-effectiveness improvements in the continuous integration process.

[1]  Gregg Rothermel,et al.  Prioritizing test cases for regression testing , 2000, ISSTA '00.

[2]  Gregg Rothermel,et al.  Analyzing Regression Test Selection Techniques , 1996, IEEE Trans. Software Eng..

[3]  Mary Lou Soffa,et al.  Efficient time-aware prioritization with knapsack solvers , 2007, WEASELTech '07.

[4]  Amitabh Srivastava,et al.  Effectively prioritizing tests in development environment , 2002, ISSTA '02.

[5]  Mark Harman,et al.  Regression Testing Minimisation, Selection and Prioritisation - A Survey , 2009 .

[6]  Gregg Rothermel,et al.  A safe, efficient regression test selection technique , 1997, TSEM.

[7]  Gregg Rothermel,et al.  Incorporating varying test costs and fault severities into test case prioritization , 2001, Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001.

[8]  George Candea,et al.  Parallel symbolic execution for automated real-world software testing , 2011, EuroSys '11.

[9]  Adam A. Porter,et al.  A history-based test prioritization technique for regression testing in resource constrained environments , 2002, ICSE '02.

[10]  Mary Lou Soffa,et al.  TimeAware test suite prioritization , 2006, ISSTA '06.

[11]  Michael D. Ernst,et al.  An experimental evaluation of continuous testing during development , 2004, ISSTA '04.

[12]  Gregg Rothermel,et al.  Test Case Prioritization: A Family of Empirical Studies , 2002, IEEE Trans. Software Eng..

[13]  Mark Harman,et al.  Regression testing minimization, selection and prioritization: a survey , 2012, Softw. Test. Verification Reliab..

[14]  Gregg Rothermel,et al.  On the Use of Mutation Faults in Empirical Assessments of Test Case Prioritization Techniques , 2006, IEEE Transactions on Software Engineering.

[15]  Lee J. White,et al.  Industrial real-time regression testing and analysis using firewalls , 2004, 20th IEEE International Conference on Software Maintenance, 2004. Proceedings..

[16]  Gregg Rothermel,et al.  The Effects of Time Constraints on Test Case Prioritization: A Series of Controlled Experiments , 2010, IEEE Transactions on Software Engineering.

[17]  Tao Xie,et al.  Time-aware test-case prioritization using integer linear programming , 2009, ISSTA.

[18]  Andrew Glover,et al.  Continuous Integration: Improving Software Quality and Reducing Risk (The Addison-Wesley Signature Series) , 2007 .

[19]  Gregg Rothermel,et al.  The impact of software evolution on code coverage information , 2001, Proceedings IEEE International Conference on Software Maintenance. ICSM 2001.

[20]  David W. Binkley,et al.  Semantics Guided Regression Test Cost Reduction , 1997, IEEE Trans. Software Eng..

[21]  Mark Harman,et al.  Faster Fault Finding at Google Using Multi Objective Regression Test Optimisation , 2011 .

[22]  Mark Harman,et al.  Clustering test cases to achieve effective and scalable prioritisation incorporating expert knowledge , 2009, ISSTA.

[23]  Gregg Rothermel,et al.  A Scalable Distributed Concolic Testing Approach: An Empirical Evaluation , 2012, 2012 IEEE Fifth International Conference on Software Testing, Verification and Validation.

[24]  Alessandro Orso,et al.  Scaling regression testing to large software systems , 2004, SIGSOFT '04/FSE-12.

[25]  Gregg Rothermel,et al.  Oracle-Centric Test Case Prioritization , 2012, 2012 IEEE 23rd International Symposium on Software Reliability Engineering.

[26]  Arnaud Gotlieb,et al.  Test Case Prioritization for Continuous Regression Testing: An Industrial Case Study , 2013, 2013 IEEE International Conference on Software Maintenance.

[27]  Tsong Yueh Chen,et al.  How Well Do Test Case Prioritization Techniques Support Statistical Fault Localization , 2009, 2009 33rd Annual IEEE International Computer Software and Applications Conference.