What Types of Defects Are Really Discovered in Code Reviews?

Research on code reviews has often focused on defect counts instead of defect types, which offers an imperfect view of code review benefits. In this paper, we classified the defects found in nine industrial (C/C++) and 23 student (Java) code reviews, which detected 388 and 371 defects, respectively. First, we discovered that 75 percent of the defects found during the reviews did not affect the visible functionality of the software. Instead, correcting these defects improved software evolvability by making the code easier to understand and modify. Second, we created a defect classification consisting of functional and evolvability defects. The evolvability defect classification is based on the defect types found in this study; for the functional defects, we studied and compared existing functional defect classifications. The classification can be useful for assigning code review roles, creating checklists, assessing software evolvability, and building software engineering tools. We conclude that, in addition to functional defects, code reviews find many evolvability defects and thus offer additional benefits over execution-based quality assurance methods, which cannot detect evolvability defects. We suggest that code reviews may be most valuable for software products with long life cycles, as the value of discovering evolvability defects is greater in them than in short life cycle systems.