A Replicated Experiment on the Effectiveness of Test-First Development

Background: Test-first development (TF) is regarded as a development practice that can improve both the quality of software products and developer productivity. Because unit tests are implemented before the corresponding production code, the tests themselves are held to be the main driver of such improvements. The role of tests in the effectiveness of TF was studied in a controlled experiment by Erdogmus et al. (hereafter, the original study). Aim: Our goal is to examine the impact of TF on product quality and developer productivity, and specifically the role that tests play in it. Method: We replicated the original study's controlled experiment by comparing an experimental group applying TF to a control group applying a test-last approach. We then carried out a correlation study to understand whether the number of tests is a good predictor of external quality and/or productivity. Results: Mann-Whitney tests did not show any significant difference between the two groups in terms of the number of tests written (W=114.5, p=0.38), developers' productivity (W=90, p=0.82), or external quality (W=81.55, p=0.53). In addition, while a significant correlation exists between the number of tests and productivity (Spearman's ρ = 0.57, p<0.001), none was found for external quality (Spearman's ρ = 0.17, p=0.18). Conclusions: We conclude that TF neither improves nor deteriorates external quality or productivity when compared to the test-last approach, leaving room for other variables to influence the effects of TF. This replication has partially confirmed the findings of the original study.
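The two statistics named in the Method and Results can be illustrated with a minimal pure-Python sketch. It is not the study's analysis script: the function names and all sample values are hypothetical, p-values are not computed, and W is assumed to follow the common convention of the first group's rank sum minus n(n+1)/2.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_w(x, y):
    """W for group x: rank sum of x in the pooled sample minus n_x(n_x+1)/2."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


def spearman_rho(a, b):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((p - ma) * (q - mb) for p, q in zip(ra, rb))
    var_a = sum((p - ma) ** 2 for p in ra)
    var_b = sum((q - mb) ** 2 for q in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical data, for illustration only (not taken from the experiment).
tf_tests = [5, 8, 7, 10]        # tests written per subject, TF group
tl_tests = [4, 6, 9]            # tests written per subject, test-last group
num_tests = [1, 2, 3, 4, 5]     # tests per subject
productivity = [2, 1, 4, 3, 5]  # productivity score per subject

w_stat = mann_whitney_w(tf_tests, tl_tests)
rho = spearman_rho(num_tests, productivity)
print(w_stat, rho)
```

With equal-sized hypothetical groups, W near n_x·n_y/2 would indicate no tendency for either group to rank higher, which is the pattern the reported non-significant comparisons correspond to.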
