Evaluating the Effects of Compiler Optimizations on Mutation Testing at the Compiler IR Level

Software testing is one of the most widely used approaches for improving software reliability. The effectiveness of testing depends to a large extent on the quality of test suites. Researchers have developed various techniques to evaluate the quality of test suites. Of these techniques, mutation testing is generally considered the most advanced but also the most expensive. A key result of applying mutation testing to a given test suite is the mutation score, which represents the percentage of mutants killed by the test suite. Ideally, the mutation score is computed ignoring the mutants that are semantically equivalent to the original code under test or to one another. In this paper, we investigate a new perspective on mutation testing: evaluating how standard compiler optimizations affect the cost and results of mutation testing performed at the level of the compiler intermediate representation (IR). Our study targets LLVM, a popular compiler infrastructure that supports multiple source and target languages. Our evaluation on 18 Coreutils programs reveals several interesting relations between the numbers of mutants (including the numbers of equivalent and duplicated mutants) and the mutation scores on unoptimized and optimized programs.
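
To make the mutation score described above concrete, the following is a minimal sketch of the arithmetic, not the paper's tooling. The function name `mutation_score` and its parameters are hypothetical, and it assumes that `duplicated` counts only the redundant copies, so that one representative of each group of mutually equivalent mutants stays in the denominator.

```python
def mutation_score(killed, total, equivalent=0, duplicated=0):
    """Return the percentage of killed mutants among the mutants that count.

    Assumption (not from the paper): `equivalent` is the number of mutants
    equivalent to the original program, and `duplicated` is the number of
    redundant copies of other mutants; both are excluded from the denominator.
    """
    considered = total - equivalent - duplicated
    if considered <= 0:
        raise ValueError("no mutants left to score")
    if killed > considered:
        raise ValueError("more mutants killed than mutants considered")
    return 100.0 * killed / considered


# Hypothetical numbers: 100 mutants, 15 equivalent, 5 duplicated, 60 killed.
print(mutation_score(killed=60, total=100, equivalent=15, duplicated=5))  # 75.0
```

With the same hypothetical numbers, leaving the equivalent and duplicated mutants in the denominator would give 60%, which shows how failing to filter them can distort the score.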
