Towards an operationalization of test-driven development skills: An industrial empirical study

Context: The majority of empirical studies on test-driven development (TDD) are concerned with verifying or refuting the effectiveness of the technique over a traditional approach. They tend to neglect whether the subjects possess the skills needed to apply TDD, even though they argue that such skills are necessary.

Objective: We evaluate a set of minimal a priori and in-process skills necessary to apply TDD. We determine whether variations in external quality (i.e., number of defects) and productivity (i.e., number of features implemented) can be associated with different clusters of the TDD skills' set.

Method: We executed a quasi-experiment involving 30 practitioners from industry. We first grouped the participants according to their TDD skills' set (consisting of a priori experience in programming and testing as well as in-process TDD conformance) into three levels (Low-Medium-High) using k-means clustering. We then applied ANOVA to compare the clusters in terms of external quality and productivity, and conducted post-hoc pairwise analysis.

Results: We did not observe a statistically significant difference between the clusters for either external software quality (F(2, 27) = 1.44, p = .260) or productivity (F(2, 27) = 3.02, p = .065). However, the analysis of the effect sizes and their confidence intervals shows that the TDD skills' set is a factor that could account for up to 28% of the variation in external quality and 38% of the variation in productivity.

Conclusion: We have reason to conclude that improving the TDD skills' set investigated in this study could help software developers improve their baseline productivity and the external quality of the code they produce. However, replications are needed to overcome the issues related to the statistical power of this study. We suggest practical insights for future work to investigate the phenomenon further.
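To make the reported statistics easier to read, the following is a minimal sketch of a one-way ANOVA and of the eta-squared effect size it implies. It is not the authors' analysis script: the function names and the three small score lists are hypothetical, and the abstract does not state which effect-size measure the study used. Eta squared is assumed here because it is the usual "proportion of variation explained" measure for a one-way design.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared


def eta_squared_from_f(f_stat, df_between, df_within):
    """Recover eta squared from a reported one-way F statistic."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)


# Hypothetical scores for three skill clusters (illustration only, not study data).
low, medium, high = [1, 2, 3], [2, 3, 4], [3, 4, 5]
f_stat, df_b, df_w, eta_sq = one_way_anova([low, medium, high])

# Point estimates implied by the F values reported in the abstract.
eta_quality = eta_squared_from_f(1.44, 2, 27)
eta_productivity = eta_squared_from_f(3.02, 2, 27)
```

Under this assumption, the reported F values correspond to point estimates of roughly 0.10 for external quality and 0.18 for productivity. Both lie below the 28% and 38% figures, which is consistent with reading those figures as upper bounds of the confidence intervals rather than as point estimates.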
