A Large-Scale Empirical Study on Self-Admitted Technical Debt

Technical debt is a metaphor introduced by Cunningham to indicate "not quite right code which we postpone making it right". Examples of technical debt are code smells and bug hazards. Several techniques have been proposed to detect different types of technical debt. Among those, Potdar and Shihab defined heuristics to detect instances of self-admitted technical debt in code comments, and used them to perform an empirical study on five software systems to investigate the phenomenon. Still, very little is known about the diffusion and evolution of technical debt in software projects. This paper presents a differentiated replication of the work by Potdar and Shihab. We run a study across 159 software projects to investigate the diffusion and evolution of self-admitted technical debt and its relationship with software quality. The study required the mining of over 600K commits and 2 billion comments, as well as a qualitative analysis performed via open coding. Our main findings show that self-admitted technical debt (i) is diffused, with an average of 51 instances per system, (ii) is mostly represented by code debt (30%), followed by defect and requirement debt (20% each), (iii) increases over time due to the introduction of new instances that are not fixed by developers, and (iv) even when fixed, survives for a long time in the system (over 1,000 commits on average).
