Considering rigor and relevance when evaluating test driven development: A systematic review

Context: Test driven development (TDD) has been extensively researched and compared to traditional approaches (test last development, TLD). Existing literature reviews show varying results for TDD. Objective: This study investigates how the conclusions of existing literature reviews change when two study quality dimensions, rigor and relevance, are taken into account. Method: A systematic literature review was conducted, and the results of the identified primary studies were analyzed with respect to rigor and relevance scores using the assessment rubric proposed by Ivarsson and Gorschek (2011). Rigor and relevance are rated on a scale, which is explained in this paper. Four categories of studies were defined based on high/low rigor and relevance. Results: We found that studies in the four categories come to different conclusions. In particular, studies with high rigor and relevance scores show clear results for improvement in external quality, which seems to come with a loss of productivity. At the same time, high rigor and relevance studies investigate only a small set of variables. The other categories contain many studies showing no difference, hence biasing the results negatively for the overall set of primary studies. Given the classification, differences from previous literature reviews could be highlighted. Conclusion: Strong indications are obtained that external quality is positively influenced, which has to be further substantiated by industry experiments and longitudinal case studies. Future studies in the high rigor and relevance category would contribute substantially by focusing on a wider set of outcome variables (e.g. internal code quality). We also conclude that considering rigor and relevance in TDD evaluation is important, given the differences in results between categories and in comparison to previous reviews.
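
The four-way split by high/low rigor and relevance described in the Method can be illustrated with a minimal sketch. It assumes each primary study already has one aggregated rigor score and one aggregated relevance score. The names `Study`, `categorize`, and `group_by_category`, the cutoff parameters, and the sample scores are hypothetical and are not taken from the paper, which defines its own scale and thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    name: str         # hypothetical study label
    rigor: float      # aggregated rigor score (assumed to be precomputed)
    relevance: float  # aggregated relevance score (assumed to be precomputed)


def categorize(study: Study, rigor_cutoff: float, relevance_cutoff: float) -> str:
    """Place a study in one of the four high/low rigor and relevance categories.

    The cutoffs are caller-supplied assumptions, not values from the paper.
    """
    rigor_level = "high" if study.rigor >= rigor_cutoff else "low"
    relevance_level = "high" if study.relevance >= relevance_cutoff else "low"
    return f"{rigor_level} rigor, {relevance_level} relevance"


def group_by_category(studies, rigor_cutoff, relevance_cutoff):
    """Group study names by category so results can be compared per category."""
    groups = {}
    for study in studies:
        key = categorize(study, rigor_cutoff, relevance_cutoff)
        groups.setdefault(key, []).append(study.name)
    return groups


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    sample = [
        Study("S1", rigor=2.5, relevance=3.0),
        Study("S2", rigor=1.0, relevance=4.0),
        Study("S3", rigor=2.0, relevance=1.0),
        Study("S4", rigor=0.5, relevance=0.0),
    ]
    for category, names in group_by_category(sample, 2.0, 3.0).items():
        print(category, names)
```

Grouping first and then summarizing outcomes per group is what allows conclusions to be compared between categories rather than only across the pooled set of studies.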

[1]  Magnus C. Ohlsson,et al.  Experimentation in Software Engineering , 2000, The Kluwer International Series in Software Engineering.

[2]  Per Runeson,et al.  Guidelines for conducting and reporting case study research in software engineering , 2009, Empirical Software Engineering.

[3]  T. Dybå,et al.  Applying Systematic Reviews to Diverse Study Types: An Experience Report , 2007, First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007).

[4]  Pearl Brereton,et al.  Systematic literature reviews in software engineering - A systematic literature review , 2009, Inf. Softw. Technol..

[5]  David S. Janzen,et al.  Implications of integrating test-driven development into CS1/CS2 curricula , 2009, SIGCSE '09.

[6]  Claes Wohlin,et al.  Systematic literature studies: Database searches vs. backward snowballing , 2012, Proceedings of the 2012 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement.

[7]  Pekka Abrahamsson,et al.  A Comparative Case Study on the Impact of Test-Driven Development on Program Design and Test Coverage , 2007, First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007).

[8]  Mojca Ciglaric,et al.  Impact of test-driven development on productivity, code and tests: A controlled experiment , 2011, Inf. Softw. Technol..

[9]  T. Vidmar,et al.  Towards empirical evaluation of test-driven development in a university environment , 2003, The IEEE Region 8 EUROCON 2003. Computer as a Tool..

[10]  Lei Zhang,et al.  Comparison Between Test Driven Development and Waterfall Development in a Small-Scale Project , 2006, XP.

[11]  Lech Madeyski,et al.  Test-Driven Development - An Empirical Evaluation of Agile Practice , 2009 .

[12]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.

[13]  Lech Madeyski.  Preliminary Analysis of the Effects of Pair Programming and Test-Driven Development on the External Code Quality , 2005, Software Engineering: Evolution and Emerging Technologies.

[14]  Marco Torchiano,et al.  On the effectiveness of the test-first approach to programming , 2005, IEEE Transactions on Software Engineering.

[15]  Lech Madeyski,et al.  The Impact of Test-Driven Development on Software Development Productivity - An Empirical Study , 2007, EuroSPI.

[16]  Claes Wohlin,et al.  Systematic literature reviews in software engineering , 2013, Inf. Softw. Technol..

[17]  Hossein Saiedian,et al.  A Leveled Examination of Test-Driven Development Acceptance , 2007, 29th International Conference on Software Engineering (ICSE'07).

[18]  Atul Gupta,et al.  An Experimental Evaluation of the Effectiveness and Efficiency of the Test Driven Development , 2007, First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007).

[19]  Liang Huang,et al.  Empirical Assessment of Test-First Approach , 2006, Testing: Academic & Industrial Conference - Practice And Research Techniques (TAIC PART'06).

[20]  Lisa Crispin,et al.  Driving Software Quality: How Test-Driven Development Impacts Software Quality , 2006, IEEE Software.

[21]  James Miller,et al.  A prototype empirical evaluation of test driven development , 2004, 10th International Symposium on Software Metrics, 2004. Proceedings..

[22]  Lars Lundberg,et al.  Results from introducing component-level test automation and Test-Driven Development , 2006, J. Syst. Softw..

[23]  Tore Dybå,et al.  The Future of Empirical Methods in Software Engineering Research , 2007, Future of Software Engineering (FOSE '07).

[24]  Forrest Shull,et al.  What Do We Know about Test-Driven Development? , 2010, IEEE Software.

[25]  Kai Petersen,et al.  Identifying Strategies for Study Selection in Systematic Reviews and Maps , 2011, 2011 International Symposium on Empirical Software Engineering and Measurement.

[26]  Sumanth Yenduri,et al.  Impact of Using Test-Driven Development: A Case Study , 2006, Software Engineering Research and Practice.

[27]  David S. Janzen,et al.  Test-driven learning in early programming courses , 2008, SIGCSE '08.

[28]  Amiram Yehudai,et al.  Regression Test Selection Techniques for Test-Driven Development , 2011, 2011 IEEE Fourth International Conference on Software Testing, Verification and Validation Workshops.

[29]  S.M. Rahman.  Applying the TBC method in introductory programming courses , 2007, 2007 37th Annual Frontiers In Education Conference - Global Engineering: Knowledge Without Borders, Opportunities Without Passports.

[30]  Tony Gorschek,et al.  A method for evaluating rigor and industrial relevance of technology evaluations , 2011, Empirical Software Engineering.

[31]  Amela Karahasanovic,et al.  A survey of controlled experiments in software engineering , 2005, IEEE Transactions on Software Engineering.

[32]  David S. Janzen,et al.  On the Influence of Test-Driven Development on Software Design , 2006, 19th Conference on Software Engineering Education & Training (CSEET'06).

[33]  Kevin McDaid,et al.  Test-driven development: can it work for spreadsheets? , 2008, WEUSE '08.

[34]  D. Budgen,et al.  Mapping study completeness and reliability - a case study , 2012, EASE.

[35]  Forrest Shull,et al.  How Effective Is Test-Driven Development? , 2011, Making Software.

[36]  Lech Madeyski,et al.  The impact of Test-First programming on branch coverage and mutation score indicator of unit tests: An experiment , 2010, Inf. Softw. Technol..

[37]  Nachiappan Nagappan,et al.  Evaluating the efficacy of test-driven development: industrial case studies , 2006, ISESE '06.

[38]  Stefano Russo,et al.  Bug Localization in Test-Driven Development , 2011, Adv. Softw. Eng..

[39]  Vojislav B. Misic,et al.  The Effects of Test-Driven Development on External Quality and Productivity: A Meta-Analysis , 2013, IEEE Transactions on Software Engineering.

[40]  Mauricio Finavaro Aniche,et al.  Most Common Mistakes in Test-Driven Development Practice: Results from an Online Survey with Developers , 2010, 2010 Third International Conference on Software Testing, Verification, and Validation Workshops.

[41]  Kai Petersen,et al.  Improving Students With Rubric-Based Self-Assessment and Oral Feedback , 2012, IEEE Transactions on Education.

[42]  Alan R. Hevner,et al.  Conflict in collaborative software development , 2003, SIGMIS CPR '03.

[43]  Sami Kollanus,et al.  Critical Issues on Test-Driven Development , 2011, PROFES.

[44]  H. Kundel,et al.  Measurement of observer agreement. , 2003, Radiology.

[45]  Laurie A. Williams,et al.  Test-driven development as a defect-reduction practice , 2003, 14th International Symposium on Software Reliability Engineering, 2003. ISSRE 2003..

[46]  Nuno Laranjeiro,et al.  Extending Test-Driven Development for Robust Web Services , 2009, 2009 Second International Conference on Dependability.

[47]  T. Greenhalgh,et al.  Effectiveness and efficiency of search methods in systematic reviews of complex evidence: audit of primary sources , 2005, BMJ : British Medical Journal.

[48]  Anders Jonsson,et al.  The use of scoring rubrics: Reliability, validity, and educational consequences , 2007 .

[49]  G. Lip.  How to Read a Paper: The Basics of Evidence Based Medicine , 1998, Journal of Human Hypertension.

[50]  Alan R. Hevner,et al.  Controlled experimentation on adaptations of pair programming , 2007, Inf. Technol. Manag..

[51]  E. Michael Maximilien,et al.  A Longitudinal Study of the Use of a Test-Driven Development Practice in Industry , 2007 .

[52]  Ioannis Stamelos,et al.  Empirical Studies on Quality in Agile Practices: A Systematic Literature Review , 2010, 2010 Seventh International Conference on the Quality of Information and Communications Technology.

[53]  Kent L. Beck,et al.  Extreme programming explained - embrace change , 1990 .

[54]  David Batic,et al.  The effectiveness of test-driven development: an industrial case study , 2011, Software Quality Journal.

[55]  Pekka Abrahamsson,et al.  Improving Business Agility Through Technical Solutions: A Case Study on Test-Driven Development in Mobile Software Development , 2005, Business Agility and Information Technology Diffusion.

[56]  K. Petersen,et al.  Context in industrial software engineering research , 2009, 2009 3rd International Symposium on Empirical Software Engineering and Measurement.

[57]  Marvin V. Zelkowitz,et al.  Culture Conflicts in Software Engineering Technology Transfer , 1998 .

[58]  David S. Janzen,et al.  A survey of evidence for test-driven development in academia , 2008, SGCS.

[59]  Stephen H. Edwards.  Using Test-Driven Development in the Classroom: Providing Students with Automatic, Concrete Feedback on Performance , 2003 .

[60]  Sami Kollanus,et al.  Test-Driven Development - Still a Promising Approach? , 2010, 2010 Seventh International Conference on the Quality of Information and Communications Technology.

[61]  Pekka Abrahamsson,et al.  Does Test-Driven Development Improve the Program Code? Alarming Results from a Comparative Case Study , 2008, CEE-SET.

[62]  Shari Lawrence Pfleeger,et al.  Software Metrics : A Rigorous and Practical Approach , 1998 .

[63]  Grigori Melnik,et al.  Guest Editors' Introduction: TDD--The Art of Fearless Programming , 2007, IEEE Software.

[64]  Matthias M. Müller,et al.  Experiment about test-first programming , 2002, IEE Proc. Softw..

[65]  Randy A. Ynchausti.  Integrating Unit Testing Into A Software Development Team’s Process , 2001 .

[66]  Laurie A. Williams,et al.  Realizing quality improvement through test driven development: results and experiences of four industrial teams , 2008, Empirical Software Engineering.

[67]  John Huan Vu,et al.  Evaluating Test-Driven Development in an Industry-Sponsored Capstone Project , 2009, 2009 Sixth International Conference on Information Technology: New Generations.

[68]  Julia Eichmann,et al.  Making Software - What Really Works, and Why We Believe It , 2011, Making Software.

[69]  Laurie A. Smith King,et al.  Grading essays in computer ethics: rubrics considered helpful , 2002, SIGCSE '02.

[70]  Ville Isomöttönen,et al.  Test-driven development in education: experiences with critical viewpoints , 2008, ITiCSE.

[71]  Liang Huang,et al.  Empirical investigation towards the effectiveness of Test First programming , 2009, Inf. Softw. Technol..

[72]  Andrew P. Martin,et al.  A multiple comparative study of test-with development product changes and their effects on team speed and product quality , 2011, Empirical Software Engineering.

[73]  Reidar Conradi,et al.  The Impact of Test Driven Development on the Evolution of a Reusable Framework of Components – An Industrial Case Study , 2008, 2008 The Third International Conference on Software Engineering Advances.

[74]  Laurie A. Williams,et al.  On the Sustained Use of a Test-Driven Development Practice at IBM , 2007, Agile 2007 (AGILE 2007).

[75]  David S. Janzen,et al.  Does Test-Driven Development Really Improve Software Design Quality? , 2008, IEEE Software.

[76]  Kai Petersen,et al.  Worldviews, Research Methods, and their Relationship to Validity in Empirical Software Engineering Research , 2013, 2013 Joint Conference of the 23rd International Workshop on Software Measurement and the 8th International Conference on Software Process and Product Measurement.

[77]  Thomas Flohr,et al.  An XP Experiment with Students - Setup and Problems , 2005, PROFES.

[78]  Tore Dybå,et al.  The effectiveness of pair programming: A meta-analysis , 2009, Inf. Softw. Technol..

[79]  Boby George,et al.  A structured experiment of test-driven development , 2004, Inf. Softw. Technol..

[80]  Thomas Flohr,et al.  Lessons Learned from an XP Experiment with Students: Test-First Needs More Teachings , 2006, PROFES.

[81]  Kent Beck,et al.  Extreme Programming Explained: Embrace Change (2nd Edition) , 2004 .

[82]  Laurie A. Williams,et al.  Assessing test-driven development at IBM , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[83]  Kai Petersen,et al.  Measuring and predicting software productivity: A systematic map and review , 2011, Inf. Softw. Technol..

[84]  David S. Janzen,et al.  Implications of test-driven development: a pilot study , 2003, OOPSLA '03.

[85]  Jay F. Nunamaker,et al.  Comparing the Defect Reduction Benefits of Code Inspection and Test-Driven Development , 2012, IEEE Transactions on Software Engineering.

[86]  Boby George,et al.  An initial investigation of test driven development in industry , 2003, SAC '03.

[87]  G. Melnik,et al.  A cross-program investigation of students' perceptions of agile methods , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[88]  Claes Wohlin,et al.  On the reliability of mapping studies in software engineering , 2013, J. Syst. Softw..

[89]  Mario Piattini,et al.  Evaluating advantages of test driven development: a controlled experiment with professionals , 2006, ISESE '06.

[90]  Tong Li,et al.  Evaluation of Test-Driven Development: An Academic Case Study , 2009, SERA.

[91]  Daniel Sundmark,et al.  Factors Limiting Industrial Adoption of Test Driven Development: A Systematic Review , 2011, 2011 Fourth IEEE International Conference on Software Testing, Verification and Validation.