A Survey on Code Clone, Its Behavior and Applications

Code Clones are separate fragments of code that closely resemble another piece of code in content or in functionality. Cloning is a type of Bad Smell that increases project size and maintenance cost. Existing research describes several detection techniques, but the data available from that research is still insufficient to reach a conclusion. The aim of this survey is to investigate all detection techniques and to analyze Code Clone behavior and the motivation behind cloning. In this paper, 16 techniques for detecting clones are summarized, and 76 research papers are analyzed in detail. We find that various tools are available for detecting Code Clones. We also investigate the approaches these tools follow and summarize the Code Clone patterns used for qualitative analysis. Overall, our findings indicate that clone management should begin as early as possible.
