Eliminating duplication in source code via procedure extraction

Duplication in source code is a widespread phenomenon that increases program size and complexity, and makes program maintenance more difficult. A solution to this problem is to detect clones (instances of copied code) and to eliminate them. Elimination works by extracting the cloned code into a separate new procedure, and replacing each clone by a call to this procedure. Several automatic approaches to detecting clones have been reported in the literature. In this paper we address the issue of automatically extracting a previously detected group of clones into a separate procedure. We present an algorithm that can extract “difficult” groups of clones, and a study that shows that difficult clone groups arise frequently in practice, and that our algorithm handles them well.
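As a rough illustration of the elimination step described above, the following sketch shows two clones and the result of extracting them into one procedure. All function names and the surrounding arithmetic are hypothetical and are not taken from the paper. The clones here are identical and contiguous, which is the easy case; this is not an example of the "difficult" clone groups that the algorithm targets.

```python
# Before extraction: two hypothetical functions that contain the same
# cloned loop (sum of the positive entries), differing only in variable names.
def total_price_before(prices):
    total = 0
    for p in prices:
        if p > 0:
            total += p
    return total * 1.2


def total_weight_before(weights):
    total = 0
    for w in weights:
        if w > 0:
            total += w
    return total + 5


# After extraction: the cloned code lives in one new procedure, and each
# clone is replaced by a call to it.
def sum_positive(values):
    total = 0
    for v in values:
        if v > 0:
            total += v
    return total


def total_price_after(prices):
    return sum_positive(prices) * 1.2


def total_weight_after(weights):
    return sum_positive(weights) + 5
```

The extraction preserves behaviour only because the cloned loop reads one input and produces one result; clones whose statements are interleaved with unrelated code, or that exit through jumps, would need more than this simple cut-and-replace.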
