Relation of Code Clones and Change Couplings

Code clones have long been recognized as bad smells in software systems and are considered to cause maintenance problems during evolution. It is broadly assumed that the more clones two files share, the more often they have to be changed together. This relation between clones and change couplings has been postulated but neither demonstrated nor quantified yet. However, given such a relation it would simplify the identification of restructuring candidates and reduce change couplings. In this paper, we examine this relation and discuss if a correlation between code clones and change couplings can be verified. For that, we propose a framework to examine code clones and relate them to change couplings taken from release history analysis. We validated our framework with the open source project Mozilla and the results of the validation show that although the relation is statistically unverifiable it derives a reasonable amount of cases where the relation exists. Therefore, to discover clone candidates for restructuring we additionally propose a set of metrics and a visualization technique. This allows one to spot where a correlation between cloning and change coupling exists and, as a result, which files should be restructured to ease further evolution.

[1]  Harald C. Gall,et al.  Detection of logical coupling based on product release history , 1998, Proceedings. International Conference on Software Maintenance (Cat. No. 98CB36272).

[2]  Ettore Merlo,et al.  Assessing the benefits of incorporating function clone detection in a development process , 1997, 1997 Proceedings International Conference on Software Maintenance.

[3]  Miryung Kim,et al.  An ethnographic study of copy and paste programming practices in OOPL , 2004, Proceedings. 2004 International Symposium on Empirical Software Engineering, 2004. ISESE '04..

[4]  Shinji Kusumoto,et al.  CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code , 2002, IEEE Trans. Software Eng..

[5]  Stéphane Ducasse,et al.  Visual Detection of Duplicated Code , 1998, ECOOP Workshops.

[6]  Brenda S. Baker,et al.  A Program for Identifying Duplicated Code , 1992 .

[7]  Elizabeth Burd,et al.  Evaluating clone detection tools for use during preventative maintenance , 2002, Proceedings. Second IEEE International Workshop on Source Code Analysis and Manipulation.

[8]  Harald C. Gall,et al.  Fine-grained analysis of change couplings , 2005, Fifth IEEE International Workshop on Source Code Analysis and Manipulation (SCAM'05).

[9]  M. Di Penta,et al.  Identifying clones in the Linux kernel , 2001, Proceedings First IEEE International Workshop on Source Code Analysis and Manipulation.

[10]  Stéphane Ducasse,et al.  Polymetric Views - A Lightweight Visual Approach to Reverse Engineering , 2003, IEEE Trans. Software Eng..

[11]  Harald C. Gall,et al.  Populating a Release History Database from version control and bug tracking systems , 2003, International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings..

[12]  Ettore Merlo,et al.  Experiment on the automatic detection of function clones in a software system using metrics , 1996, 1996 Proceedings of International Conference on Software Maintenance.

[13]  Meir M. Lehman,et al.  Program evolution: processes of software change , 1985 .

[14]  Jens Krinke,et al.  Identifying similar code with program dependence graphs , 2001, Proceedings Eighth Working Conference on Reverse Engineering.

[15]  Penny Grubb,et al.  Software Maintenance: Concepts and Practice , 2003 .

[16]  Shinji Kusumoto,et al.  Gemini: maintenance support environment based on code clone analysis , 2002, Proceedings Eighth IEEE Symposium on Software Metrics.

[17]  Oscar Nierstrasz,et al.  Object-oriented reengineering patterns , 2004, Proceedings. 26th International Conference on Software Engineering.

[18]  Stéphane Ducasse,et al.  A language independent approach for detecting duplicated code , 1999, Proceedings IEEE International Conference on Software Maintenance - 1999 (ICSM'99). 'Software Maintenance for Business Change' (Cat. No.99CB36360).

[19]  Harald C. Gall,et al.  CVS release history data for detecting logical couplings , 2003, Sixth International Workshop on Principles of Software Evolution, 2003. Proceedings..

[20]  Miryung Kim,et al.  An empirical study of code clone genealogies , 2005, ESEC/FSE-13.

[21]  Michael W. Godfrey,et al.  Aiding comprehension of cloning through categorization , 2004 .