When Does a Refactoring Induce Bugs? An Empirical Study

Refactorings are, as defined by Fowler, behavior-preserving source code transformations. Their main purpose is to improve maintainability or comprehensibility, or to reduce the code footprint if needed. In principle, refactorings are defined as simple operations so that they are "unlikely to go wrong" and introduce faults. In practice, refactoring activities carry risks, like any other change. This paper reports an empirical study carried out on three Java software systems, namely Apache Ant, Xerces, and ArgoUML, aimed at investigating to what extent refactoring activities induce faults. Specifically, we automatically detect (and then manually validate) 15,008 refactoring operations (of 52 different kinds) using an existing tool (Ref-Finder). Then, we use the SZZ algorithm to determine whether it is likely that refactorings induced a fault. Results indicate that, while some kinds of refactorings are unlikely to be harmful, others, such as refactorings involving hierarchies (e.g., pull up method), tend to induce faults very frequently. This suggests that more accurate code inspection or testing activities are needed when such specific refactorings are performed.
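To make the linking step concrete, the following is a minimal sketch of the SZZ idea, not the study's actual pipeline. It assumes that each bug fix is known by the lines it changed, that a blame map gives the commit that last touched each of those lines before the fix, and that refactorings are matched to fix-inducing changes at commit granularity. All names (Fix, blame, fix_inducing_commits, refactorings_inducing_fixes) and the toy data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    """A bug-fixing commit and the (file, line) pairs it modified or deleted."""
    commit: str
    touched_lines: frozenset


def fix_inducing_commits(fix, blame):
    """Core SZZ step: blame the lines changed by a fix.

    `blame` maps (file, line) -> id of the commit that last modified that
    line before the fix (assumed to be precomputed, e.g. from VCS history).
    """
    return {blame[loc] for loc in fix.touched_lines if loc in blame}


def refactorings_inducing_fixes(refactorings, fixes, blame):
    """Return {refactoring kind: (fix-inducing count, total count)}.

    `refactorings` is an iterable of (kind, commit) pairs. Matching is done
    per commit here, which is a simplifying assumption of this sketch.
    """
    inducing = set()
    for fix in fixes:
        inducing |= fix_inducing_commits(fix, blame)

    stats = {}
    for kind, commit in refactorings:
        hits, total = stats.get(kind, (0, 0))
        stats[kind] = (hits + (commit in inducing), total + 1)
    return stats


# Toy data, purely illustrative.
blame = {("A.java", 10): "c1", ("A.java", 11): "c2", ("B.java", 5): "c3"}
fixes = [Fix("f1", frozenset({("A.java", 10), ("B.java", 5)}))]
refactorings = [
    ("pull up method", "c1"),
    ("rename method", "c2"),
    ("pull up method", "c4"),
]

stats = refactorings_inducing_fixes(refactorings, fixes, blame)
for kind, (hits, total) in sorted(stats.items()):
    print(f"{kind}: {hits}/{total} fix-inducing")
```

A real setup would likely need to ignore whitespace and comment changes when blaming, and to check that the refactored code elements actually overlap the blamed lines rather than merely sharing a commit.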
