Replication of Software Engineering Experiments

Experimentation has played a major role in scientific advancement, and replication is one of the essentials of the experimental method. In a replication, an experiment is repeated in order to check its results. Successful replication increases the validity and reliability of the outcomes observed in an experiment. There is debate about the best way of running replications of Software Engineering (SE) experiments. Some of the questions that have cropped up in this debate are: Should replicators reuse the baseline experiment materials? What kind of communication, if any, is appropriate between experimenters and replicators? Which elements of the experimental structure can be changed while the study still counts as a replication rather than a new experiment? A deeper understanding of the concept of replication should help to clarify these issues and to increase and improve replication in SE experimental practice. In this chapter, we study the concept of replication in order to gain that insight. The chapter starts with an introduction to the importance of replication and the state of replication in Experimental Software Engineering (ESE). Then we discuss replication from both the statistical and the scientific viewpoint. Based on a review of the diverse types of replication used in other scientific disciplines, we identify the types of replication that are feasible to run in our discipline. Finally, we present the different purposes that replication can serve in ESE.
