Aiding comprehension of cloning through categorization

Managing duplicated code in software systems is important for ensuring their graceful evolution. Clone detection tools commonly return large numbers of detected clones with little or no information about them, making clone management impractical and unscalable. We have used a taxonomy of clones to augment current clone detection tools in order to improve user comprehension of code duplication within software systems and to filter false positives from the clone set. We support our arguments with two case studies, in which we found that as many as 53% of clones can be grouped to form function clones or partial function clones, and we were able to filter out as many as 65% of clones as false positives from the reported clone pairs.
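To illustrate what grouping clone pairs into function clones and partial function clones might look like, the following is a minimal sketch, not the method used in the case studies. It assumes a detector reports each clone pair as two line ranges and that function boundaries are known. The names `Region`, `classify_region`, and `classify_pair`, and the definitions of the categories by line-range containment, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """A contiguous range of lines in a file (both ends inclusive)."""
    file: str
    start: int
    end: int


def classify_region(region: Region, functions: List[Region]) -> str:
    """Relate one side of a clone pair to the known function extents.

    Assumption: a region that covers a whole function counts as "function",
    a region that lies strictly inside one function counts as "partial",
    and anything else is "other".
    """
    same_file = [fn for fn in functions if fn.file == region.file]
    for fn in same_file:
        if region.start <= fn.start and region.end >= fn.end:
            return "function"
    for fn in same_file:
        if fn.start <= region.start and region.end <= fn.end:
            return "partial"
    return "other"


def classify_pair(a: Region, b: Region, functions: List[Region]) -> str:
    """Assign a clone pair to a coarse category based on both of its sides."""
    kinds = {classify_region(a, functions), classify_region(b, functions)}
    if kinds == {"function"}:
        return "function clone"
    if kinds <= {"function", "partial"}:
        return "partial function clone"
    return "other"


# Toy data, invented for this example only.
functions = [Region("a.c", 10, 30), Region("b.c", 5, 25)]
whole = classify_pair(Region("a.c", 10, 30), Region("b.c", 5, 25), functions)
inner = classify_pair(Region("a.c", 12, 20), Region("b.c", 7, 15), functions)
stray = classify_pair(Region("a.c", 1, 8), Region("b.c", 7, 15), functions)
print(whole, "|", inner, "|", stray)
```

Under these assumptions, pairs that fall into the `other` category would be the natural candidates for closer inspection as possible false positives, although the actual filtering criteria are not specified in this summary.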
