Code coverage for suite evaluation by developers

One of the key challenges developers face when testing code is determining a test suite's quality: its ability to find faults. The most common approach is to use code coverage as a proxy for test suite quality, and to use diminishing returns in coverage or high absolute coverage as a stopping rule. In testing research, suite quality is often evaluated by a suite's ability to kill mutants (artificially seeded potential faults). Determining which coverage criteria best predict mutation kills is therefore critical to practical estimation of test suite quality. Previous work has used only small sets of programs and has usually compared multiple suites for a single program. Practitioners, however, seldom compare suites; they evaluate a single suite. Evaluating suites (both manually written and automatically generated) from a large set of real-world open-source projects shows that the results for this single-suite setting differ from those for suite comparison: statement coverage (not block, branch, or path coverage) predicts mutation kills best.
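To make the evaluation setup concrete, here is a minimal sketch of how one might measure how well a coverage criterion predicts mutation kills. The data points and the choice of Kendall's tau are illustrative assumptions for this sketch, not the paper's actual dataset or analysis pipeline:

```python
# Illustrative sketch: correlating a coverage criterion with mutation
# score across test suites. The (coverage, kill-rate) pairs below are
# hypothetical; in a real study they would come from measuring actual
# suites on real projects with coverage and mutation tools.
from scipy.stats import kendalltau

# (statement_coverage, mutation_score) for several suites
suites = [
    (0.42, 0.31),
    (0.58, 0.44),
    (0.71, 0.60),
    (0.85, 0.72),
    (0.93, 0.81),
]

coverage = [c for c, _ in suites]
kills = [k for _, k in suites]

# Kendall's tau measures how well ranking suites by coverage agrees
# with ranking them by mutation score; a value near 1.0 means the
# criterion is a good proxy for fault-detection ability.
tau, p_value = kendalltau(coverage, kills)
print(f"Kendall tau = {tau:.2f} (p = {p_value:.3f})")
```

The same computation could be repeated with block, branch, or path coverage in place of statement coverage; the criterion whose ranking best tracks mutation kills is the most useful quality proxy in practice.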
