Analyzing the effects of test driven development in GitHub

Testing is an integral part of the software development lifecycle, approached with varying degrees of rigor by different process models. Agile process models recommend Test Driven Development (TDD) as a key practice for reducing costs and improving code quality. The objective of this work is to perform a cost-benefit analysis of this practice. To that end, we conducted a comparative analysis of GitHub repositories that adopt TDD to a lesser or greater extent, in order to determine how TDD affects software development productivity and software quality. We classified GitHub repositories archived in 2015 according to how rigorously they practiced TDD, thus creating a TDD spectrum. We then matched and compared various subsets of the repositories on this TDD spectrum with control sets of equal size. The control sets were samples from all GitHub repositories that matched certain characteristics and that contained at least one test file. We compared how the TDD sets differed from the control sets on the following characteristics: number of test files, average commit velocity, number of bug-referencing commits, number of issues recorded, usage of continuous integration, number of pull requests, and distribution of commits per author. We found that Java TDD projects were relatively rare. In addition, there were very few significant differences in any of the metrics we used to compare TDD-like and non-TDD projects; therefore, our results do not reveal any observable benefits from using TDD.
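The comparison of a TDD set against an equal-size control set can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the helper names `draw_control` and `permutation_p_value`, the dictionary fields, the toy numbers, and the choice of a permutation test on the difference in means. The abstract does not state which statistical test was used, so this should not be read as the study's actual procedure.

```python
import random
from statistics import mean


def draw_control(pool, size, seed=0):
    """Sample `size` repositories from those with at least one test file.

    Hypothetical helper: the real study also matched on other
    characteristics that the abstract does not enumerate.
    """
    eligible = [repo for repo in pool if repo["test_files"] >= 1]
    return random.Random(seed).sample(eligible, size)


def permutation_p_value(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up toy data, NOT taken from the study.
tdd_set = [
    {"name": f"tdd-{i}", "test_files": 5 + i, "commits_per_week": 3.0 + 0.1 * i}
    for i in range(8)
]
pool = [
    {"name": f"repo-{i}", "test_files": i % 4, "commits_per_week": 2.5 + 0.2 * (i % 7)}
    for i in range(40)
]

control_set = draw_control(pool, size=len(tdd_set))
p_value = permutation_p_value(
    [repo["commits_per_week"] for repo in tdd_set],
    [repo["commits_per_week"] for repo in control_set],
)
print(f"control size = {len(control_set)}, p = {p_value:.3f}")
```

Because the comparison covers several metrics (test files, commit velocity, issues, pull requests, and so on), the resulting p-values would normally be adjusted for multiple testing before any single difference is called significant.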
