An Industrial Application of Mutation Testing: Lessons, Challenges, and Research Directions

Mutation analysis evaluates a testing or debugging technique by measuring how well it detects mutants, which are systematically seeded, artificial faults. Mutation analysis is inherently expensive due to the large number of mutants it generates and due to the fact that many of these generated mutants are not effective; they are redundant, equivalent, or simply uninteresting and waste computational resources. A large body of research has focused on improving the scalability of mutation analysis and proposed numerous optimizations to, e.g., select effective mutants or efficiently execute a large number of tests against a large number of mutants. However, comparatively little research has focused on the costs and benefits of mutation testing, in which mutants are presented as testing goals to a developer, in the context of an industrial-scale software development process. This paper draws on an industrial application of mutation testing, involving 30,000+ developers and 1.9 million change sets, written in 4 programming languages. It shows that mutation testing with productive mutants does not add a significant overhead to the software development process and reports on mutation testing benefits perceived by developers. This paper also quantifies the costs of unproductive mutants, and the results suggest that achieving mutation adequacy is neither practical nor desirable. Finally, this paper describes lessons learned from these studies, highlights the current challenges of efficiently and effectively applying mutation testing in an industrial-scale software development process, and outlines research directions.

[1]  Alex Groce,et al.  Applying Mutation Analysis on Kernel Test Suites: An Experience Report , 2017, 2017 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW).

[2]  Michael D. Ernst,et al.  Defects4J: a database of existing faults to enable controlled testing studies for Java programs , 2014, ISSTA 2014.

[3]  Gordon Fraser,et al.  EvoSuite: automatic test suite generation for object-oriented software , 2011, ESEC/FSE '11.

[4]  Thomas W. Reps,et al.  The care and feeding of wild-caught mutants , 2017, ESEC/SIGSOFT FSE.

[5]  Richard J. Lipton,et al.  Hints on Test Data Selection: Help for the Practicing Programmer , 1978, Computer.

[6]  Mark Harman,et al.  A study of equivalent and stubborn mutation operators using human analysis of equivalence , 2014, ICSE.

[7]  René Just,et al.  Inferring mutant utility from program context , 2017, ISSTA.

[8]  Lionel C. Briand,et al.  A Hitchhiker's guide to statistical tests for assessing randomized algorithms in software engineering , 2014, Softw. Test. Verification Reliab..

[9]  Josh Levenberg,et al.  Why Google stores billions of lines of code in a single repository , 2016, Commun. ACM.

[10]  Gregg Rothermel,et al.  An experimental determination of sufficient mutant operators , 1996, TSEM.

[11]  Andreas Zeller,et al.  The Impact of Equivalent Mutants , 2009, 2009 International Conference on Software Testing, Verification, and Validation Workshops.

[12]  Goran Petrovic,et al.  State of Mutation Testing at Google , 2017, 2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP).

[13]  A. Jefferson Offutt,et al.  Introduction to Software Testing , 2008 .

[14]  Alex Groce,et al.  On The Limits of Mutation Reduction Strategies , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).

[15]  René Just,et al.  Higher accuracy and lower run time: efficient mutation analysis using non‐redundant mutation operators , 2015, Softw. Test. Verification Reliab..

[16]  Brian Marick How to Misuse Code Coverage , 1999 .

[17]  René Just,et al.  The major mutation framework: efficient and scalable mutation analysis for Java , 2014, ISSTA 2014.

[18]  Robert W. Bowdidge,et al.  Why don't software developers use static analysis tools to find bugs? , 2013, 2013 35th International Conference on Software Engineering (ICSE).

[19]  René Just,et al.  Tailored Mutants Fit Bugs Better , 2016, ArXiv.

[20]  Phil McMinn,et al.  Automatic Detection and Removal of Ineffective Mutants for the Mutation Analysis of Relational Database Schemas , 2019, IEEE Transactions on Software Engineering.

[21]  Lionel C. Briand,et al.  Is mutation an appropriate tool for testing experiments? , 2005, ICSE.

[22]  Andreas Zeller,et al.  (Un-)Covering Equivalent Mutants , 2010, 2010 Third International Conference on Software Testing, Verification and Validation.

[23]  Sarfraz Khurshid,et al.  Operator-based and random mutant selection: Better together , 2013, 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[24]  Michael D. Ernst,et al.  Are mutants a valid substitute for real faults in software testing? , 2014, SIGSOFT FSE.

[25]  A. Jefferson Offutt,et al.  Establishing Theoretical Minimal Sets of Mutants , 2014, 2014 IEEE Seventh International Conference on Software Testing, Verification and Validation.

[26]  Willem Visser,et al.  What makes killing a mutant hard , 2016, 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE).

[27]  Michael D. Ernst,et al.  Evaluating and Improving Fault Localization , 2017, 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE).

[28]  A. Jefferson Offutt,et al.  Analyzing the validity of selective mutation with dominator mutants , 2016, SIGSOFT FSE.

[29]  Mark Harman,et al.  Predictive Mutation Testing , 2019, IEEE Transactions on Software Engineering.

[30]  Mark Harman,et al.  An Analysis and Survey of the Development of Mutation Testing , 2011, IEEE Transactions on Software Engineering.

[31]  André L. M. Santos,et al.  Avoiding useless mutants , 2017, GPCE.