A Technique for Just-in-Time Clone Detection in Large Scale Systems

Existing clone tracking tools offer limited support for sharing clone information among the developers of a large-scale system. Developers are not notified when other developers introduce new clones or modify existing ones. We propose a client-server architecture that centrally detects and maintains clone information for an entire software system stored in a version control system. Clients retrieve from the server a list of the clones relevant to the code they are working on. Whenever an update is committed to the version control system, the server runs clone detection on the change and incrementally updates the clone information. We also propose techniques to speed up this incremental clone detection. To reduce the number of comparisons that clone detection requires, we select representative clones from the existing clone list. We then use a string-based technique to compare the newly committed code with the representative clones and to update the clone list. In a case study, we show that our approach significantly reduces clone detection time while still supporting clone detection across the entire software system.
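
The incremental step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names CloneGroup, CloneServer, on_commit, clones_for, and the min_lines threshold are hypothetical, and the comparison used here (longest run of identical whitespace-normalized lines) is only a simple stand-in for the paper's string-based technique. The point it shows is that newly committed code is compared against one representative per clone group rather than against every known clone.

```python
from dataclasses import dataclass, field
from typing import List


def normalize(code: str) -> List[str]:
    """Collapse whitespace and drop blank lines so layout changes are ignored."""
    return [" ".join(line.split()) for line in code.splitlines() if line.strip()]


def longest_common_run(a: List[str], b: List[str]) -> int:
    """Length of the longest run of consecutive identical lines shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for line_a in a:
        curr = [0] * (len(b) + 1)
        for j, line_b in enumerate(b, start=1):
            if line_a == line_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


@dataclass
class CloneGroup:
    # Hypothetical structure: one representative stands in for the whole group.
    representative: List[str]
    locations: List[str] = field(default_factory=list)


class CloneServer:
    """Hypothetical central store of clone groups, updated on each commit."""

    def __init__(self, min_lines: int = 3):
        self.min_lines = min_lines  # assumed threshold, not taken from the paper
        self.groups: List[CloneGroup] = []

    def add_group(self, code: str, locations: List[str]) -> None:
        """Register an already-known clone group with its member locations."""
        self.groups.append(CloneGroup(normalize(code), list(locations)))

    def on_commit(self, location: str, code: str) -> List[CloneGroup]:
        """Compare committed code with each representative and update the groups."""
        new_lines = normalize(code)
        matched = []
        for group in self.groups:
            run = longest_common_run(group.representative, new_lines)
            if run >= self.min_lines:
                if location not in group.locations:
                    group.locations.append(location)
                matched.append(group)
        return matched

    def clones_for(self, location: str) -> List[CloneGroup]:
        """What a client would ask for: the groups that include a given location."""
        return [g for g in self.groups if location in g.locations]
```

In this sketch, a server seeded with one group through add_group and then given a commit through on_commit performs one comparison per group, regardless of how many members each group has. A client working on a given location would call clones_for to get the groups it belongs to.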
