Clone refactoring inspection by summarizing clone refactorings and detecting inconsistent changes during software evolution

It has been broadly assumed that removing code clones through refactoring solves the problems of code duplication. However, despite recent empirical studies on the benefits of refactoring, contradicting evidence shows that it is often difficult or impossible to remove clones with standard refactoring techniques. In evolving systems, developers cannot easily determine which of the clones scattered throughout a large code base can be refactored, or how they should be maintained. We propose pattern-based clone refactoring inspection (PRI), a technique for managing clone refactorings. PRI summarizes refactorings of clones and detects clones that are not consistently refactored. To help developers handle these anomalies, PRI also visualizes clone evolution and refactorings, and it fixes refactoring anomalies so that a clone group is not left in an inconsistent state. We evaluated PRI on 6 open-source projects and showed that, by tracking clone change histories, it identifies clone refactorings with 94.1% accuracy and detects inconsistent refactorings with 98.4% accuracy. In a study with 10 student developers, participants reported that PRI's flexible summarization and detection features can help novice developers learn about clone refactorings. These results indicate that PRI should improve developer productivity in inspecting clone refactorings distributed across multiple files in evolving systems.
