Analyzing the Effect of Preprocessor Annotations on Code Clones

The C preprocessor cpp is a powerful and language-independent tool, widely used to implement variable software in different programming languages (C, C++) by means of conditional compilation. Preprocessor annotations can be used at different levels of granularity, such as functions or statements. In this paper, we investigate whether there is a relation between code clones and preprocessor annotations. Specifically, we address the question of whether the discipline of annotation has an effect on code clones. To this end, we perform a case study on fifteen different C programs and analyze them with regard to code clones and #ifdef occurrences. We found only minor effects of annotations on code clones, but we did find a relationship between code clones and annotations that align with the code structure. With this work, we provide new insights into why code clones occur in C programs. Furthermore, the results can support the decision of whether or not it is beneficial to remove clones.
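
To make the notions of granularity and discipline concrete, the following minimal C sketch contrasts annotations that align with the code structure (wrapping a whole function or a whole statement) with one that does not (wrapping only part of an expression). The feature names FEATURE_LOGGING and FEATURE_OFFSET and all function names are hypothetical and chosen only for illustration; they are not taken from the programs studied.

```c
/* Hypothetical feature flag, defined here only for illustration. */
#define FEATURE_LOGGING 1

static int log_calls = 0;

/* Function-level annotation: the #ifdef encloses entire function
 * definitions, so it aligns with the code structure. */
#ifdef FEATURE_LOGGING
static void log_call(void) { log_calls++; }
#else
static void log_call(void) { }
#endif

/* Statement-level annotation: the #ifdef encloses one complete
 * statement, so it still aligns with the code structure. */
static int scale(int x) {
    int result = x * 2;
#ifdef FEATURE_LOGGING
    log_call();
#endif
    return result;
}

/* Annotation that does not align with the code structure: the #ifdef
 * encloses only a fragment of an expression. FEATURE_OFFSET is
 * deliberately left undefined, so the "+ 1" is compiled out. */
static int scale_fragment(int x) {
    return x * 2
#ifdef FEATURE_OFFSET
        + 1
#endif
        ;
}
```

With FEATURE_LOGGING defined and FEATURE_OFFSET undefined, both scale(3) and scale_fragment(3) evaluate to 6, but only the first two annotations could be removed or moved as whole syntactic units.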
