How good are my tests?

Background: Test quality is a prerequisite for achieving production system quality. While the concept of quality is multidimensional, most of the effort in the testing context has been channelled towards measuring test effectiveness. Objective: While the effectiveness of tests is certainly important, we aim to identify a core list of testing principles that also address other quality facets of testing, and to discuss how they can be quantified as indicators of test quality. Method: We conducted a two-day workshop with our industry partners to come up with a list of relevant principles and best practices expected to result in high-quality tests. We then used our academic and industrial training materials, together with recommendations in practitioner-oriented testing books, to refine the list. Finally, we surveyed the existing literature for potential metrics to quantify the identified principles. Results: We identified a list of 15 testing principles that capture the essence of testing goals and best practices from a quality perspective. Eight principles do not map to existing test smells, and we propose metrics for six of those. Further, we identified additional potential metrics for the seven principles that partially map to test smells. Conclusion: We provide a core list of testing principles along with a discussion of possible ways to quantify them for assessing the goodness of tests. We believe that our work will be useful for practitioners in assessing the quality of their tests from multiple perspectives, including but not limited to maintainability, comprehension and simplicity.
