A survey of controlled experiments in software engineering

The classical method for identifying cause-effect relationships is to conduct controlled experiments. This paper reports on the present state of how controlled experiments in software engineering are conducted and the extent to which relevant information is reported. Of the 5,453 scientific articles published in 12 leading software engineering journals and conferences in the decade from 1993 to 2002, 103 (1.9 percent) reported controlled experiments in which individuals or teams performed one or more software engineering tasks. This survey quantitatively characterizes the topics of the experiments and their subjects (number of subjects, students versus professionals, recruitment, and rewards for participation), tasks (type of task, duration, and type and size of application), and environments (location, development tools). Furthermore, the survey reports on how internal and external validity are addressed and the extent to which experiments are replicated. The gathered data reflect the relevance of software engineering experiments to industrial practice and the scientific maturity of software engineering research.
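
As a quick arithmetic check of the stated proportion (a verification of the figures quoted above, not part of the original survey), the 1.9 percent follows directly from the two counts:

$$\frac{103}{5453} \approx 0.0189 \approx 1.9\%$$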
