Evolutionary Generation of Whole Test Suites

Recent advances in software testing allow automatic derivation of tests that reach almost any desired point in the source code. There is, however, a fundamental problem with the general idea of targeting one distinct test coverage goal at a time: coverage goals are neither independent of each other, nor is test generation for any particular coverage goal guaranteed to succeed. We present EvoSuite, a search-based approach that optimizes whole test suites towards satisfying a coverage criterion, rather than generating distinct test cases directed towards distinct coverage goals. In an evaluation on five open source libraries and an industrial case study, EvoSuite achieved up to 18 times the coverage of a traditional approach targeting single branches, with up to 44% smaller test suites.
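
The key ingredient of whole-suite optimization is a fitness function defined over an entire test suite rather than over a single test targeting a single branch. The sketch below illustrates one plausible way such a branch-coverage fitness could be computed, assuming the suite has already been executed and, for every branch whose predicate was reached, the minimal branch distance and the number of predicate executions were recorded. The class and field names (WholeSuiteFitness, BranchResult) are hypothetical illustrations, not EvoSuite's actual API.

```java
import java.util.Map;

/**
 * Minimal sketch of a whole-suite fitness for branch coverage (to be
 * minimized by a genetic algorithm). Not EvoSuite's implementation.
 */
public class WholeSuiteFitness {

    /** Normalizes a raw branch distance into the range [0, 1). */
    private static double normalize(double distance) {
        return distance / (distance + 1.0);
    }

    /**
     * Fitness of a whole suite: number of methods never executed plus the
     * sum of normalized distances over all branches of the class under test.
     *
     * @param totalMethods    methods in the class under test
     * @param executedMethods methods executed at least once by the suite
     * @param branchResults   per-branch data for branches whose predicate was reached
     * @param totalBranches   total number of branch goals in the class under test
     */
    public static double fitness(int totalMethods,
                                 int executedMethods,
                                 Map<String, BranchResult> branchResults,
                                 int totalBranches) {
        double sum = totalMethods - executedMethods;
        int reached = 0;
        for (BranchResult r : branchResults.values()) {
            reached++;
            if (r.minDistance == 0.0) {
                continue;                       // branch covered: contributes 0
            } else if (r.executionCount >= 2) {
                sum += normalize(r.minDistance); // reward partial progress
            } else {
                sum += 1.0;                      // predicate executed only once
            }
        }
        sum += totalBranches - reached;          // branches never reached count as 1 each
        return sum;
    }

    /** Execution data collected for a single branch goal (hypothetical record). */
    public static class BranchResult {
        final double minDistance;   // minimal branch distance observed over the suite
        final int executionCount;   // how often the guarding predicate was executed
        BranchResult(double minDistance, int executionCount) {
            this.minDistance = minDistance;
            this.executionCount = executionCount;
        }
    }
}
```

The executionCount >= 2 guard reflects a common precaution in branch-distance fitness functions: if a predicate was executed only once, rewarding progress towards one of its outcomes could simply trade coverage of that outcome for coverage of the opposite one, so such a branch keeps the maximal distance of 1 until its predicate is exercised more than once.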
