An exploratory study of identifier renamings

Identifiers play an important role in source code understandability, maintainability, and fault-proneness. This paper reports a study of identifier renamings in software systems, examining how terms (the atomic components of identifiers) change in source code identifiers. Specifically, the paper (i) proposes a term renaming taxonomy, (ii) presents an approximate, lightweight code analysis approach that automatically detects term renamings and classifies them along the taxonomy dimensions, and (iii) reports an exploratory study of term renamings in two open-source systems, Eclipse-JDT and Tomcat. The study provides evidence that renamings involve not only synonyms but also, in a small fraction of cases, more unexpected changes: we detected hypernym renamings (a more abstract term, e.g., size vs. length), hyponym renamings (a more concrete term, e.g., restriction vs. rule), and antonym renamings (a term replaced with one having the opposite meaning, e.g., closing vs. opening). Although they account for only a fraction of all renamings, synonym, hyponym, hypernym, and antonym renamings may hint at program understanding issues and could thus be used in a renaming recommendation system to improve code quality.
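
To make the idea of a term renaming concrete, the following is a minimal sketch, not the paper's actual approach: it splits two versions of an identifier into terms, pairs the terms that were replaced, and labels each pair with a taxonomy category. The function names, the tiny hand-written lexicon, and the direction convention (the label describes the new term relative to the old one) are all assumptions made for illustration; only the closing/opening antonym pair comes from the abstract, and a real analysis would presumably consult a lexical database instead.

```python
import re
from difflib import SequenceMatcher

# Hypothetical toy lexicon; entries other than closing/opening are invented examples.
SYNONYMS = {frozenset({"begin", "start"})}
ANTONYMS = {frozenset({"closing", "opening"})}
HYPERNYM_OF = {"list": "collection"}  # term -> a more abstract term


def split_terms(identifier):
    """Split an identifier into lowercase terms on underscores and camelCase boundaries."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def renamed_terms(old_id, new_id):
    """Return (old_term, new_term) pairs for terms replaced between two identifier versions."""
    old, new = split_terms(old_id), split_terms(new_id)
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only pair terms when a run was replaced by a run of equal length.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            pairs.extend(zip(old[i1:i2], new[j1:j2]))
    return pairs


def classify(old_term, new_term):
    """Label a term renaming; the label describes new_term relative to old_term."""
    pair = frozenset({old_term, new_term})
    if pair in SYNONYMS:
        return "synonym"
    if pair in ANTONYMS:
        return "antonym"
    if HYPERNYM_OF.get(old_term) == new_term:
        return "hypernym"  # new term is more abstract
    if HYPERNYM_OF.get(new_term) == old_term:
        return "hyponym"   # new term is more concrete
    return "other"


for old_id, new_id in [("closingTag", "openingTag"), ("itemList", "itemCollection")]:
    for old_term, new_term in renamed_terms(old_id, new_id):
        print(old_id, "->", new_id, ":", classify(old_term, new_term))
```

The sketch aligns terms with a plain sequence diff, so it only handles one-for-one term replacements; insertions, deletions, and reorderings of terms would need additional handling.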
