Tracking Method-Level Clones and a Case Study

Analyzing the histories of code clones is important for understanding how they affect software development and developers, and many studies have therefore been devoted to tracking code clones. However, to the best of our knowledge, no existing study has attempted to track code clones in long-term, fine-grained change histories. In this paper, we report on an analysis of the histories of method-level code clones stored in a fine-grained version control system called historage, which allowed us to track source code entities across commits. We tracked and analyzed method-level code clones in 10 open source software projects and found that (1) in many projects, method-level code clones are removed regardless of whether they were changed or how frequently they were changed, and (2) groups of method-level code clones created at the same time tend to survive longer than clones created individually. We believe that these findings will provide useful insights for future research on code clones, such as determining priorities for code clone management.
