Cause reduction: delta debugging, even without bugs

What is a test case for? Sometimes, to expose a fault. Tests can also exercise code, use memory or time, or produce desired output. Given a desired effect, a test case can be seen as a cause, and its components divided into essential (required for the effect) and accidental (not required). Delta debugging removes accidental components from failing test cases, producing smaller test cases that are easier to understand. This paper extends delta debugging to simplify test cases with respect to arbitrary effects, a generalization called cause reduction. Suites produced by cause reduction provide effective quick tests for real‐world programs. For Mozilla's JavaScript engine, the reduced suite is possibly more effective for finding faults. The effectiveness of reduction‐based suites persists through changes to the software, improving coverage by over 500 branches for versions up to 4 months later. Cause reduction has other applications, including improving seeded symbolic execution, where using reduced tests can often double the number of additional branches explored. Copyright © 2015 John Wiley & Sons, Ltd.
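
The idea of reducing a test with respect to an arbitrary effect can be illustrated with a minimal sketch. This is not the paper's implementation: it is a simplified, complement-only variant of delta debugging in which the usual "test still fails" check is replaced by a caller-supplied predicate. The names `cause_reduce`, `has_effect`, and `branches_covered`, as well as the toy stack program, are hypothetical and chosen only for illustration. Here the effect to preserve is "covers the same branches as the original test".

```python
from typing import Callable, List, Sequence, Set


def cause_reduce(test: Sequence[str],
                 has_effect: Callable[[List[str]], bool]) -> List[str]:
    """Shrink `test` while `has_effect` stays true (ddmin-style sketch).

    Only complements (the test minus one chunk) are tried, and the
    empty test is never tried. The result is 1-minimal: removing any
    single remaining step loses the effect.
    """
    current = list(test)
    if not has_effect(current):
        raise ValueError("the original test does not show the effect")
    granularity = 2
    while len(current) >= 2:
        size = -(-len(current) // granularity)  # ceiling division
        reduced = False
        for start in range(0, len(current), size):
            # Candidate: the current test with one chunk removed.
            candidate = current[:start] + current[start + size:]
            if candidate and has_effect(candidate):
                current = candidate
                granularity = max(granularity - 1, 2)
                reduced = True
                break
        if not reduced:
            if granularity >= len(current):
                break  # every single-step removal was tried and failed
            granularity = min(granularity * 2, len(current))
    return current


def branches_covered(steps: Sequence[str]) -> Set[str]:
    """Toy program under test (hypothetical): a stack driven by a
    sequence of 'push' and 'pop' steps. Returns the branches hit."""
    covered: Set[str] = set()
    depth = 0
    for op in steps:
        if op == "push":
            depth += 1
            covered.add("push")
        elif op == "pop":
            if depth > 0:
                depth -= 1
                covered.add("pop-nonempty")
            else:
                covered.add("pop-empty")
    return covered


# Usage: keep the branch coverage of the original test, drop the rest.
original = ["push", "push", "pop", "push", "pop", "pop", "pop", "push"]
target = branches_covered(original)
reduced = cause_reduce(original, lambda t: branches_covered(t) == target)
print(len(original), "steps reduced to", len(reduced), ":", reduced)
```

Swapping the predicate changes what counts as essential: a check for a failing assertion gives ordinary delta debugging, while a check on coverage, runtime, or output gives reduction with respect to those effects instead.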
