Code coverage differences of Java bytecode and source code instrumentation tools

Many areas of software testing, such as white-box testing, test case generation, test prioritization, and fault localization, depend on code coverage measurement. When coverage is used as an overall completeness measure, minor inaccuracies in the data reported by a tool matter little; in other situations, however, they can cause serious confusion. For example, a code element that is falsely reported as covered can create false confidence in the tests. This work investigates code coverage measurement issues for the Java programming language. For Java, the prevalent approach to code coverage measurement is bytecode instrumentation, owing to its various benefits over source code instrumentation. In our experience, coverage tools based on bytecode instrumentation produce different results from those based on source code instrumentation in terms of which items are reported as covered. We report on an empirical study that compares, at the method level, the coverage results provided by Java tools using the two instrumentation types. In particular, we want to find out how inaccurate a bytecode instrumentation approach is compared to a source code instrumentation approach. The differences are investigated systematically both quantitatively (how much the outputs differ) and qualitatively (what causes the differences). In addition, the impact on test prioritization and test suite reduction, possible applications of coverage measurement, is investigated in more detail.
