An Empirical Study on Inconsistent Changes to Code Clones at Release Level

Current research on code clones tries to address the question whether or not code clones are harmful for the quality of software. As most of these studies are based on the fine-grained analysis of inconsistent changes at the revision level, they capture much of the chaotic and experimental nature inherent to any ongoing software development process. Conclusions drawn from the inspection of highly fluctuating and short-lived clones are likely to exaggerate the ill effects of inconsistent changes. To gain a broader perspective, we perform an empirical study on the effect of inconsistent changes on software quality at the release level. Based on a case study on two open source software systems, we observe that only 1% to 3% of inconsistent changes to clones introduce software defects, as opposed to substantially higher percentages reported by other studies. Our findings suggest that developers are able to effectively manage and control the evolution of cloned code at the release level.

[1]  Jens Krinke,et al.  Identifying similar code with program dependence graphs , 2001, Proceedings Eighth Working Conference on Reverse Engineering.

[2]  Martin P. Robillard,et al.  Clonetracker: tool support for code clone management , 2008, ICSE '08.

[3]  Rainer Koschke Identifying and Removing Software Clones , 2008, Software Evolution.

[4]  Romain Robbes,et al.  Recovering inter-project dependencies in software ecosystems , 2010, ASE.

[5]  Andrian Marcus,et al.  Identification of high-level concept clones in source code , 2001, Proceedings 16th Annual International Conference on Automated Software Engineering (ASE 2001).

[6]  Michel Wermelinger,et al.  Assessing the effect of clones on changeability , 2008, 2008 IEEE International Conference on Software Maintenance.

[7]  Walter F. Tichy,et al.  Proceedings 25th International Conference on Software Engineering , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[8]  Brenda S. Baker,et al.  On finding duplication and near-duplication in large software systems , 1995, Proceedings of 2nd Working Conference on Reverse Engineering.

[9]  Chanchal Kumar Roy,et al.  A Mutation/Injection-Based Automatic Framework for Evaluating Code Clone Detection Tools , 2009, 2009 International Conference on Software Testing, Verification, and Validation Workshops.

[10]  Paolo Nesi,et al.  Proceedings of the Third European Conference on Software Maintenance and Reengineering, Cahapel of St. Agnes, University of Amsterdam, the Netherlands, March 3-5, 1999 , 1999 .

[11]  Martin P. Robillard,et al.  Tracking Code Clones in Evolving Software , 2007, 29th International Conference on Software Engineering (ICSE'07).

[12]  Michael W. Godfrey,et al.  Supporting the analysis of clones in software systems: Research Articles , 2006 .

[13]  Peter Kampstra,et al.  Beanplot: A Boxplot Alternative for Visual Comparison of Distributions , 2008 .

[14]  Jens Krinke,et al.  A Study of Consistent and Inconsistent Changes to Code Clones , 2007, 14th Working Conference on Reverse Engineering (WCRE 2007).

[15]  Chanchal Kumar Roy,et al.  Comparison and evaluation of code clone detection techniques and tools: A qualitative approach , 2009, Sci. Comput. Program..

[16]  Giuliano Antoniol,et al.  Comparison and Evaluation of Clone Detection Tools , 2007, IEEE Transactions on Software Engineering.

[17]  Hoan Anh Nguyen,et al.  Clone-Aware Configuration Management , 2009, 2009 IEEE/ACM International Conference on Automated Software Engineering.

[18]  Shinji Kusumoto,et al.  CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code , 2002, IEEE Trans. Software Eng..

[19]  Michael W. Godfrey,et al.  Subjectivity in Clone Judgment: Can We Ever Agree? , 2006, Duplication, Redundancy, and Similarity in Software.

[20]  Chanchal K. Roy,et al.  A Survey on Software Clone Detection Research , 2007 .

[21]  Rainer Koschke,et al.  Clone Detection Using Abstract Syntax Suffix Trees , 2006, 2006 13th Working Conference on Reverse Engineering.

[22]  Nils Göde,et al.  Evolution of Type-1 Clones , 2009, 2009 Ninth IEEE International Working Conference on Source Code Analysis and Manipulation.

[23]  Chanchal K. Roy,et al.  Recommending change clusters to support software investigation: an empirical study , 2010 .

[24]  Michael W. Godfrey,et al.  "Cloning Considered Harmful" Considered Harmful , 2006, 2006 13th Working Conference on Reverse Engineering.

[25]  Elmar Jürgens,et al.  Do code clones matter? , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[26]  Lerina Aversano,et al.  How Clones are Maintained: An Empirical Study , 2007, 11th European Conference on Software Maintenance and Reengineering (CSMR'07).

[27]  Premkumar T. Devanbu,et al.  Clones: What is that smell? , 2010, MSR.

[28]  Chanchal Kumar Roy,et al.  Scenario-Based Comparison of Clone Detection Techniques , 2008, 2008 16th IEEE International Conference on Program Comprehension.

[29]  Rainer Koschke,et al.  Survey of Research on Software Clones , 2006, Duplication, Redundancy, and Similarity in Software.

[30]  Ahmed E. Hassan,et al.  An experience report on scaling tools for mining software repositories using MapReduce , 2010, ASE '10.

[31]  Arie van Deursen,et al.  Managing code clones using dynamic change tracking and resolution , 2009, 2009 IEEE International Conference on Software Maintenance.

[32]  J. Howard Johnson,et al.  Substring matching for clone detection and change tracking , 1994, Proceedings 1994 International Conference on Software Maintenance.

[33]  Nils Göde,et al.  Modeling Clone Evolution , 2009 .

[34]  Renato De Mori,et al.  Pattern matching for clone and concept detection , 2004, Automated Software Engineering.

[35]  Jens Krinke,et al.  Is Cloned Code More Stable than Non-cloned Code? , 2008, 2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation.

[36]  Stéphane Ducasse,et al.  A language independent approach for detecting duplicated code , 1999, Proceedings IEEE International Conference on Software Maintenance - 1999 (ICSM'99). 'Software Maintenance for Business Change' (Cat. No.99CB36360).

[37]  Rainer Koschke,et al.  Incremental Clone Detection , 2009, 2009 13th European Conference on Software Maintenance and Reengineering.

[38]  Michael W. Godfrey,et al.  Supporting the analysis of clones in software systems , 2006, J. Softw. Maintenance Res. Pract..

[39]  Tibor Gyimóthy,et al.  Clone Smells in Software Evolution , 2007, 2007 IEEE International Conference on Software Maintenance.

[40]  Giuliano Antoniol,et al.  Analyzing cloning evolution in the Linux kernel , 2002, Inf. Softw. Technol..

[41]  Harald C. Gall,et al.  Relation of Code Clones and Change Couplings , 2006, FASE.

[42]  Serge Demeyer,et al.  Software Evolution , 2010 .

[43]  Philip S. Yu,et al.  GPLAG: detection of software plagiarism by program dependence graph analysis , 2006, KDD '06.

[44]  Zhendong Su,et al.  DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones , 2007, 29th International Conference on Software Engineering (ICSE'07).

[45]  Miryung Kim,et al.  An empirical study of code clone genealogies , 2005, ESEC/FSE-13.

[46]  Zhendong Su,et al.  Scalable detection of semantic clones , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[47]  Lerina Aversano,et al.  An empirical study on the maintenance of source code clones , 2010, Empirical Software Engineering.

[48]  Ettore Merlo,et al.  Assessing the benefits of incorporating function clone detection in a development process , 1997, 1997 Proceedings International Conference on Software Maintenance.

[49]  J. Sim,et al.  The kappa statistic in reliability studies: use, interpretation, and sample size requirements. , 2005, Physical therapy.

[50]  Ying Zou,et al.  A Technique for Just-in-Time Clone Detection in Large Scale Systems , 2010, 2010 IEEE 18th International Conference on Program Comprehension.