Does choice of mutation tool matter?

Though mutation analysis is the primary means of evaluating the quality of test suites, it suffers from inadequate standardization. Mutation analysis tools vary by language, by the phase of compilation at which mutants are generated, and by target audience. Mutation tools rarely implement the complete set of operators proposed in the literature, and most implement at least a few domain-specific mutation operators. Thus different tools may not always agree on the mutant kills of a test suite. Few criteria exist to guide a practitioner in choosing the right tool for either evaluating the effectiveness of a test suite or comparing different testing techniques. We investigate an ensemble of measures for evaluating the efficacy of mutants produced by different tools. These include the traditional difficulty of detection, the strength of minimal sets, and the diversity of mutants, as well as the information carried by the mutants produced. We find that mutation tools rarely agree. The disagreement between scores can be large, and the variation due to characteristics of the project, even after accounting for differences due to test suites, is a significant factor. However, the mean difference between tools is very small, indicating that no single tool consistently skews mutation scores high or low for all projects. These results suggest that experiments yielding small differences in mutation score, especially those using a single tool or a small number of projects, may not be reliable. There is a clear need for greater standardization of mutation analysis. We propose one approach for such a standardization.
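The distinction between a small mean difference and a large per-project disagreement can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the project names, the tool names `tool_x` and `tool_y`, and the kill counts are invented toy values, not data from the study, and the helpers `mutation_score` and `score_differences` are hypothetical. The sketch only shows that signed differences between two tools can cancel across projects while the absolute differences stay large.

```python
from statistics import mean


def mutation_score(killed: int, total: int) -> float:
    """Fraction of generated mutants that the test suite kills."""
    if total <= 0:
        raise ValueError("total must be positive")
    return killed / total


# Invented (killed, total) counts per project and tool; NOT data from the paper.
results = {
    "project_a": {"tool_x": (60, 100), "tool_y": (80, 100)},
    "project_b": {"tool_x": (90, 100), "tool_y": (70, 100)},
    "project_c": {"tool_x": (50, 100), "tool_y": (50, 100)},
}


def score_differences(per_project: dict) -> list:
    """Signed score difference (tool_x minus tool_y) for each project."""
    return [
        mutation_score(*tools["tool_x"]) - mutation_score(*tools["tool_y"])
        for tools in per_project.values()
    ]


diffs = score_differences(results)

# Signed differences cancel out, so neither tool looks biased on average.
mean_signed = mean(diffs)

# Absolute differences do not cancel, so per-project disagreement stays visible.
mean_abs = mean(abs(d) for d in diffs)

print(f"mean signed difference:   {mean_signed:+.3f}")
print(f"mean absolute difference: {mean_abs:.3f}")
```

With these toy numbers the tools differ by 20 percentage points on two of the three projects, yet the signed mean is essentially zero. A comparison that reports only the average difference between tools would therefore hide the disagreement that matters for a single project.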
