Using Controlled Numbers of Real Faults and Mutants to Empirically Evaluate Coverage-Based Test Case Prioritization

Used to establish confidence in the correctness of evolving software, regression testing is an important, yet costly, task. Test case prioritization enables the rapid detection of faults during regression testing by reordering the test suite so that effective tests are run as early as possible. However, a distinct lack of information about the regression faults found in complex real-world software forced prior experimental studies of these methods to use artificial faults called mutants. Using the Defects4J database of real faults, this paper presents the results of experiments evaluating the effectiveness of four representative test prioritization techniques. Since this paper’s results show that prioritization is susceptible to high variance when only one fault is present, our experiments also control the number of real faults and mutants in the program subject to regression testing. Our overall finding is that, in comparison to mutants, real faults are harder for reordered test suites to detect quickly, suggesting that mutants are not a surrogate for real faults.
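To make the idea of coverage-based reordering concrete, the following is a minimal sketch of one common greedy strategy, often called "additional" coverage prioritization: at each step it picks the test that covers the most not-yet-covered code. The abstract does not name the four techniques evaluated, so this is an illustration only and not necessarily one of them. The function name `prioritize_additional`, the test names, and the coverage data are all hypothetical, and the sketch omits the coverage reset that some formulations apply once everything has been covered.

```python
from typing import Dict, List, Set


def prioritize_additional(coverage: Dict[str, Set[str]]) -> List[str]:
    """Order tests greedily by how many not-yet-covered units each one adds."""
    remaining = dict(coverage)      # tests not yet placed in the order
    covered: Set[str] = set()       # units covered by the tests placed so far
    order: List[str] = []
    while remaining:
        # Pick the test with the largest additional coverage;
        # ties are broken by test name so the result is deterministic.
        best = min(remaining, key=lambda t: (-len(remaining[t] - covered), t))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical coverage data: test name -> IDs of the statements it executes.
coverage = {
    "test_a": {"s1", "s2"},
    "test_b": {"s1", "s2", "s3", "s4"},
    "test_c": {"s5"},
    "test_d": {"s3", "s4"},
}

print(prioritize_additional(coverage))
# ['test_b', 'test_c', 'test_a', 'test_d']
```

Here `test_b` runs first because it covers the most statements, and `test_c` follows because it is the only remaining test that adds new coverage. Whether such an ordering also detects faults early is the empirical question the paper studies with real faults and mutants.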
