An Initiative to Improve Reproducibility and Empirical Evaluation of Software Testing Techniques

Current concern about the quality of evaluation in existing studies reveals the need for methods and tools that assist in defining and executing empirical studies and experiments. However, when general methods from empirical software engineering are applied to specific fields, such as the evaluation of software testing techniques, new obstacles and threats to validity appear, hindering researchers' use of empirical methods. This paper discusses the issues specific to the evaluation of software testing techniques (STT) and proposes an initiative for a collaborative effort to encourage reproducibility of experiments evaluating STT. We also propose the development of a tool that enables automatic execution and analysis of experiments, producing reproducible research compendia as output that are, in turn, shared among researchers. This endeavour has many expected benefits, such as providing a foundation for the evaluation of existing and upcoming STT and allowing researchers to devise and publish better experiments.
