Predicting Fault-Prone Modules Using the Length of Identifiers

Identifiers such as variable and function names are essential for understanding source code. Because identifier naming affects code understandability, we expect it to affect software quality as well. In this study, we examine the relationship between the length of identifiers and the existence of software faults in a software module. The results of an experiment using the random forest technique show a positive relationship between identifier length and the existence of software faults.
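
As a rough illustration of the kind of measurement involved, the sketch below computes identifier-length statistics for one module. It is not the study's implementation: the abstract does not say which language was analyzed or which length metrics were used, so the choice of Python source, the function names `identifier_lengths` and `length_features`, and the three features (count, mean, maximum) are all assumptions made for this example.

```python
import io
import keyword
import statistics
import tokenize


def identifier_lengths(source: str) -> list[int]:
    """Return the length of every identifier occurrence in Python source."""
    lengths = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # NAME tokens cover both identifiers and keywords; keep identifiers only.
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            lengths.append(len(tok.string))
    return lengths


def length_features(source: str) -> dict[str, float]:
    """Summarize identifier lengths for one module (hypothetical feature set)."""
    lengths = identifier_lengths(source)
    if not lengths:
        return {"count": 0, "mean_length": 0.0, "max_length": 0}
    return {
        "count": len(lengths),
        "mean_length": statistics.mean(lengths),
        "max_length": max(lengths),
    }


# A tiny stand-in for a real module; identifiers are total_price, items, sum, items.
example_module = "def total_price(items):\n    return sum(items)\n"
features = length_features(example_module)
print(features)
```

Feature rows like this, one per module and paired with a faulty/non-faulty label, are the sort of input a random forest classifier could be trained on; that training step is omitted here.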
