The role of non-exact replications in software engineering experiments

In no science or engineering discipline does it make sense to speak of isolated experiments: the results of a single experiment cannot be taken as representative of the underlying reality. Experiment replication is the repetition of an experiment to corroborate its results; multiple replications of an experiment increase the confidence in its results. Software engineering has attempted the identical (exact) replication of experiments in the manner of the natural sciences (physics, chemistry, etc.). Despite numerous attempts over the years, no exact replications have yet been achieved, apart from experiments replicated by the same researchers at the same site. One key reason for this is the complexity of the software development setting, which prevents the many experimental conditions from being reproduced identically. This paper reports research into whether non-exact replications can nevertheless be useful. We propose a process aimed at researchers running non-exact replications; by enacting it, researchers can identify new variables that may be influencing experiment results. The process consists of four phases: replication definition and planning, replication operation and analysis, replication interpretation, and analysis of the replication's contribution. To test the effectiveness of the proposed process, we conducted a multiple-case study that reveals the variables learned from two different replications of an experiment.