Clone Detection vs. Pattern Mining: The Battle

In this paper we compare two approaches to discovering recurrent fragments in source code: clone detection and frequent subtree mining. We apply both approaches to a medium-sized Java case and compare their results qualitatively and quantitatively in terms of the types of code fragments detected, as well as their size, relevance, coverage, and level of detail. We conclude that the two approaches are complementary, and that the overlap between their results may be used to cross-validate them.
