Mutant census: an empirical examination of the competent programmer hypothesis

Mutation analysis is often used to compare the effectiveness of different test suites or testing techniques. One of the main assumptions underlying this technique is the Competent Programmer Hypothesis, which proposes that programs are very close to a correct version, or that the difference between current and correct code for each fault is very small. On the basis of this hypothesis, testers have generally assumed that mutation analysis with single-token changes produces mutations that are similar to real faults. While some evidence supports this assumption, the studies providing it analyze a limited and potentially non-representative set of programs and are hence not conclusive. In this paper, we investigate the Competent Programmer Hypothesis by analyzing changes (and bug-fixes in particular) in a very large set of randomly selected projects written in four different programming languages. Our analysis suggests that a typical fault involves about three to four tokens, and is seldom equivalent to any traditional mutation operator. We also identify the most frequently occurring syntactic patterns and the factors that affect the distribution of real bug-fix changes. Our analysis suggests that different languages have different distributions, which in turn suggests that operators optimal in one language may not be optimal for others. Moreover, our results suggest that mutation analysis needs better empirical support for the connection between mutant detection and detection of actual program faults in a larger body of real programs.
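
To make the notion of a fault "involving" a number of tokens concrete, the following is a minimal sketch of one way to count the tokens that differ between a faulty line and its fix. It is an illustration only, not the measurement procedure used in the paper: the regular-expression tokenizer, the function names `tokenize` and `changed_tokens`, and the choice to count a replaced span by its longer side are all assumptions made for this example.

```python
import difflib
import re

# Hypothetical, deliberately simple tokenizer (an assumption for this sketch):
# identifiers, numbers, a few two-character operators, then any other symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|[<>=!+\-*/&|]=|&&|\|\||\S")


def tokenize(code):
    """Split a code fragment into a flat list of tokens."""
    return TOKEN_RE.findall(code)


def changed_tokens(before, after):
    """Count tokens touched when turning `before` into `after`.

    Each non-matching span of the token-level diff contributes the length
    of its longer side, so a one-for-one replacement counts as one token.
    """
    a, b = tokenize(before), tokenize(after)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += max(i2 - i1, j2 - j1)
    return changed


if __name__ == "__main__":
    # A fix that a single-token mutation operator could reproduce.
    print(changed_tokens("if (i < n)", "if (i <= n)"))
    # A fix that touches several tokens at once.
    print(changed_tokens("x = foo(a);", "x = bar(a, b);"))
```

Under this counting, the first pair differs in a single token, the kind of change a traditional relational-operator mutation produces, while the second pair differs in three tokens and has no single-token equivalent.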
