SPAPE: A semantic-preserving amorphous procedure extraction method for near-miss clones

Cloned code, also known as duplicated code, is among the bad "code smells". Procedure extraction can remove clones and make a software system more maintainable. While existing procedure extraction techniques handle automatic extraction of exact clones effectively, they fail for near-miss clones, which are code fragments that are similar but not identical. To address this gap, we developed SPAPE, a novel semantic-preserving amorphous procedure extraction method for near-miss clones. SPAPE relaxes the constraint that clone fragments must share the same syntax and instead uses structural semantic information. We evaluated the performance, effectiveness, and benefits of SPAPE. Our results show that, on ten open-source software products, SPAPE extracts more near-miss clones than the best applicable method, and does so efficiently and effectively. We conclude that SPAPE can be a useful addition to the toolsets of software managers and developers, helping them improve code structure and reduce software maintenance and overall project costs.
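To make the notion of a near-miss clone concrete, the following is a minimal hand-written sketch, not SPAPE's algorithm or output. All function names (`total_price`, `total_weight`, `accumulate`) are hypothetical. The two original functions share a structure but differ in a comparison and an expression, so an extractor that requires identical syntax would not merge them; the extracted procedure turns the differences into parameters while preserving behavior.

```python
# Two hypothetical near-miss clones: same control structure,
# but a different filter condition and a different accumulated expression.
def total_price(prices):
    total = 0
    for p in prices:
        if p > 0:
            total += p
    return total


def total_weight(weights):
    total = 0
    for w in weights:
        if w >= 0:          # differs from total_price: comparison operator
            total += w * 2  # differs from total_price: accumulated expression
    return total


# Hand-extracted shared procedure: the differing parts become parameters.
def accumulate(values, keep, transform):
    total = 0
    for v in values:
        if keep(v):
            total += transform(v)
    return total


# The original functions rewritten as calls to the extracted procedure.
def total_price_refactored(prices):
    return accumulate(prices, lambda p: p > 0, lambda p: p)


def total_weight_refactored(weights):
    return accumulate(weights, lambda w: w >= 0, lambda w: w * 2)
```

An extraction is semantics-preserving in this sense when each refactored function returns the same result as its original for every input, which is the property an automated method would have to guarantee.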
