An Assessment of Type-3 Clones as Detected by State-of-the-Art Tools

Code reuse through copying and pasting leads to so-called software clones. These clones can be roughly categorized into identical fragments (type-1 clones), fragments with parameter substitution (type-2 clones), and similar fragments that differ through modified, deleted, or added statements (type-3 clones). Although clone detection has been researched extensively, detecting type-3 clones remains an open research issue because of the inherent vagueness in their definition. In this paper, we analyze type-3 clones detected by state-of-the-art tools and investigate them in terms of their syntactic differences. We then derive their underlying semantic abstractions from these syntactic differences. Finally, we investigate whether any additional code characteristics indicate that a tool-suggested clone candidate is a real type-3 clone from a human's perspective. Our findings can help developers of clone detectors improve their tools.
