An empirical study on clone stability

Code cloning is a controversial software engineering practice due to contradictory claims regarding its effect on software maintenance. Code stability is a recently introduced measurement technique that has been used to determine the impact of code cloning by quantifying the changeability of a code region. Although most existing stability analysis studies agree that cloned code is more stable than non-cloned code, the studies have two major flaws: (i) each study only considered a single stability measurement (e.g., lines of code changed, frequency of change, age of change); and, (ii) only a small number of subject systems were analyzed and these were of limited variety. In this paper, we present a comprehensive empirical study on code stability using four different stability measuring methods. We use a recently introduced hybrid clone detection tool, NiCAD, to detect the clones and analyze their stability in different dimensions: by clone type, by measuring method, by programming language, and by system size and age. Our in-depth investigation on 12 diverse subject systems written in three programming languages considering three types of clones reveals that: (i) cloned code is generally less stable than non-cloned code, and more specifically both Type-1 and Type-2 clones show higher instability than Type-3 clones; (ii) clones in both Java and C systems exhibit higher instability compared to the clones in C# systems; (iii) a system's development strategy might play a key role in defining its comparative code stability scenario; and, (iv) cloned and non-cloned regions of a subject system do not follow any consistent change pattern.

[1]  Chanchal Kumar Roy,et al.  A Mutation/Injection-Based Automatic Framework for Evaluating Code Clone Detection Tools , 2009, 2009 International Conference on Software Testing, Verification, and Validation Workshops.

[2]  Michel Wermelinger,et al.  Tracking clones' imprint , 2010, IWSC '10.

[3]  Shinji Kusumoto,et al.  Is duplicate code more frequently modified than non-duplicate code in software evolution?: an empirical study on open source software , 2010, IWPSE-EVOL '10.

[4]  Chanchal K. Roy,et al.  Analyzing and Forecasting Near-Miss Clones in Evolving Software: An Empirical Study , 2011, 2011 16th IEEE International Conference on Engineering of Complex Computer Systems.

[5]  Manishankar Mondal,et al.  Comparative stability of cloned and non-cloned code: an empirical study , 2012, SAC '12.

[6]  Chanchal Kumar Roy,et al.  NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization , 2008, 2008 16th IEEE International Conference on Program Comprehension.

[7]  Chanchal Kumar Roy,et al.  The NiCad Clone Detector , 2011, 2011 IEEE 19th International Conference on Program Comprehension.

[8]  Elmar Jürgens,et al.  Do code clones matter? , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[9]  Chanchal Kumar Roy,et al.  Scenario-Based Comparison of Clone Detection Techniques , 2008, 2008 16th IEEE International Conference on Program Comprehension.

[10]  Chanchal Kumar Roy,et al.  Evaluating Code Clone Genealogies at Release Level: An Empirical Study , 2010, 2010 10th IEEE Working Conference on Source Code Analysis and Manipulation.

[11]  Lerina Aversano,et al.  How Clones are Maintained: An Empirical Study , 2007, 11th European Conference on Software Maintenance and Reengineering (CSMR'07).

[12]  Miryung Kim,et al.  An empirical study of code clone genealogies , 2005, ESEC/FSE-13.

[13]  Lerina Aversano,et al.  An empirical study on the maintenance of source code clones , 2010, Empirical Software Engineering.

[14]  Manishankar Mondal,et al.  Dispersion of changes in cloned and non-cloned code , 2012, 2012 6th International Workshop on Software Clones (IWSC).

[15]  Jens Krinke,et al.  A Study of Consistent and Inconsistent Changes to Code Clones , 2007, 14th Working Conference on Reverse Engineering (WCRE 2007).

[16]  Chanchal Kumar Roy,et al.  Comparison and evaluation of code clone detection techniques and tools: A qualitative approach , 2009, Sci. Comput. Program..

[17]  Chanchal Kumar Roy,et al.  Near-miss function clones in open source software : an empirical study , 2009 .

[18]  Chanchal Kumar Roy,et al.  Tuning research tools for scalability and performance: The NiCad experience , 2014, Sci. Comput. Program..

[19]  Chanchal Kumar Roy,et al.  An automatic framework for extracting and classifying near-miss clone genealogies , 2011, 2011 27th IEEE International Conference on Software Maintenance (ICSM).

[20]  Chanchal Kumar Roy,et al.  An Empirical Study of Function Clones in Open Source Software , 2008, 2008 15th Working Conference on Reverse Engineering.

[21]  Michel Wermelinger,et al.  Assessing the effect of clones on changeability , 2008, 2008 IEEE International Conference on Software Maintenance.

[22]  Nils Göde,et al.  Clone Stability , 2011, 2011 15th European Conference on Software Maintenance and Reengineering.

[23]  Bashar Nuseibeh,et al.  Evaluating the Harmfulness of Cloning: A Change Based Experiment , 2007, Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007).

[24]  Michael W. Godfrey,et al.  “Cloning considered harmful” considered harmful: patterns of cloning in software , 2008, Empirical Software Engineering.

[25]  Jens Krinke,et al.  Is cloned code older than non-cloned code? , 2011, IWSC '11.

[26]  Jens Krinke,et al.  Is Cloned Code More Stable than Non-cloned Code? , 2008, 2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation.