An empirical investigation into the role of API-level refactorings during software evolution

It is widely believed that refactoring improves software quality and programmer productivity by making it easier to maintain and understand software systems. However, the role of refactorings has not been systematically investigated using fine-grained evolution history. We quantitatively and qualitatively studied API-level refactorings and bug fixes in three large open source projects, totaling 26523 revisions of evolution. The study found several surprising results: One, there is an increase in the number of bug fixes after API-level refactorings. Two, the time taken to fix bugs is shorter after API-level refactorings than before. Three, a large number of refactoring revisions include bug fixes at the same time or are related to later bug fix revisions. Four, API-level refactorings occur more frequently before than after major software releases. These results call for re-thinking refactoring's true benefits. Furthermore, frequent floss refactoring mistakes observed in this study call for new software engineering tools to support safe application of refactoring and behavior modifying edits together.

[1]  Meir M. Lehman,et al.  On understanding laws, evolution, and conservation in the large-program life cycle , 1984, J. Syst. Softw..

[2]  Ward Cunningham,et al.  The WyCash portfolio management system , 1992, OOPSLA '92.

[3]  William G. Griswold Program restructuring as an aid to software maintenance , 1992 .

[4]  Ralph E. Johnson,et al.  A Refactoring Tool for Smalltalk , 1997, Theory Pract. Object Syst..

[5]  Harald C. Gall,et al.  Populating a Release History Database from version control and bug tracking systems , 2003, International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings..

[6]  Thomas D. Sandry,et al.  Introductory Statistics With R , 2003, Technometrics.

[7]  Gail C. Murphy,et al.  Hipikat: recommending pertinent software development artifacts , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[8]  Tom Mens,et al.  A survey of software refactoring , 2004, IEEE Transactions on Software Engineering.

[9]  Carsten Görg,et al.  Error detection by refactoring reconstruction , 2005, MSR '05.

[10]  Andreas Zeller,et al.  When do changes induce fixes? , 2005, ACM SIGSOFT Softw. Eng. Notes.

[11]  N. Nagappan,et al.  Use of relative code churn measures to predict system defect density , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[12]  Eleni Stroulia,et al.  UMLDiff: an algorithm for object-oriented design differencing , 2005, ASE.

[13]  Ralph E. Johnson,et al.  The role of refactorings in API evolution , 2005, 21st IEEE International Conference on Software Maintenance (ICSM'05).

[14]  Zhenchang Xing,et al.  Refactoring Practice: How it is and How it Should be Supported - An Eclipse Case Study , 2006, 2006 22nd IEEE International Conference on Software Maintenance.

[15]  Ralph E. Johnson,et al.  Automated Detection of Refactorings in Evolving Components , 2006, ECOOP.

[16]  Stephan Diehl,et al.  Are refactorings less error-prone than other changes? , 2006, MSR '06.

[17]  Harald C. Gall,et al.  Change Distilling:Tree Differencing for Fine-Grained Source Code Change Extraction , 2007, IEEE Transactions on Software Engineering.

[18]  Darko Marinov,et al.  Automated testing of refactoring engines , 2007, ESEC-FSE '07.

[19]  Andrew P. Black,et al.  Why Don't People Use Refactoring Tools? , 2007, WRT.

[20]  Miryung Kim,et al.  Automatic Inference of Structural Changes for Matching across Program Versions , 2007, 29th International Conference on Software Engineering (ICSE'07).

[21]  Daniel M. Germán,et al.  What do large commits tell us?: a taxonomical study of large commits , 2008, MSR '08.

[22]  Harald C. Gall,et al.  On the relation of refactorings and software defect prediction , 2008, MSR '08.

[23]  Andreas Zeller,et al.  Predicting faults from cached history , 2008, ISEC '08.

[24]  Andrew P. Black,et al.  How we refactor, and how we know it , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[25]  Miryung Kim,et al.  Discovering and representing systematic code changes , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[26]  Amer Diwan,et al.  Program Metamorphosis , 2009, ECOOP.

[27]  Premkumar T. Devanbu,et al.  Fair and balanced?: bias in bug-fix datasets , 2009, ESEC/FSE '09.

[28]  Gina Venolia,et al.  The secret life of bugs: Going past the errors and omissions in software repositories , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[29]  Rick Kazman,et al.  A cost-benefit framework for making architectural decisions in a business context , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[30]  Wei Wu,et al.  AURA: a hybrid approach to identify framework evolution , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[31]  Miryung Kim,et al.  Template-based reconstruction of complex refactorings , 2010, 2010 IEEE International Conference on Software Maintenance.

[32]  Miryung Kim,et al.  A graph-based approach to API usage adaptation , 2010, OOPSLA.

[33]  James D. Herbsleb,et al.  Factors leading to integration failures in global feature-oriented development: an empirical analysis , 2011, 2011 33rd International Conference on Software Engineering (ICSE).