Identifying class name inconsistency in hierarchy: a first simple heuristic

Giving good class names is an important task. Good programmers often report that they take several attempts to find an adequate one. Often programmers do not name consistently classes within a package, project or hierarchy. This is a problem because it hampers understanding the systems. In this article we present a simple heuristic (a distribution) to characterise class naming. We combine such a heuristic with structural information to identify inconsistent class names. In addition, we use this simple heuristic to give packages a shape. We applied such heuristic to 285 packages in Pharo to identify misnamed classes. Some of these misnamed classes are reported and discussed here.

[1]  Andrian Marcus,et al.  Recovering documentation-to-source-code traceability links using latent semantic indexing , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[2]  Bruce McMillin,et al.  Software engineering: What is it? , 2018, 2018 IEEE Aerospace Conference.

[3]  Norman Wilde,et al.  The role of concepts in program comprehension , 2002, Proceedings 10th International Workshop on Program Comprehension.

[4]  Nicolas Anquetil,et al.  Legacy Software Restructuring: Analyzing a Concrete Case , 2011, 2011 15th European Conference on Software Maintenance and Reengineering.

[5]  Stéphane Ducasse,et al.  Semantic clustering: Identifying topics in source code , 2007, Inf. Softw. Technol..

[6]  Andrian Marcus,et al.  Identification of high-level concept clones in source code , 2001, Proceedings 16th Annual International Conference on Automated Software Engineering (ASE 2001).

[7]  Andrian Marcus,et al.  On the Use of Domain Terms in Source Code , 2008, 2008 16th IEEE International Conference on Program Comprehension.

[8]  Ahmed E. Hassan,et al.  A survey on the use of topic models when mining software repositories , 2015, Empirical Software Engineering.

[9]  Stéphane Ducasse,et al.  Enriching reverse engineering with semantic clustering , 2005, 12th Working Conference on Reverse Engineering (WCRE'05).

[10]  Stéphane Ducasse,et al.  A categorization of classes based on the visualization of their internal structure: the class blueprint , 2001, OOPSLA '01.

[11]  Einar W. Høst,et al.  Debugging Method Names , 2009, ECOOP.

[12]  David W. Binkley,et al.  What’s in a Name? A Study of Identifiers , 2006, 14th IEEE International Conference on Program Comprehension (ICPC'06).

[13]  Nicolas Anquetil,et al.  Assessing the relevance of identifier names in a legacy software system , 1998, CASCON.

[14]  Bogdan Dit,et al.  Feature location in source code: a taxonomy and survey , 2013, J. Softw. Evol. Process..