A Framework for Examining Topical Locality in Object-Oriented Software

The software entities of an object-oriented system should be organized in such a way that "spatial relatedness entails semantic relatedness". We refer to this as the tenet of "topical locality" and argue that it is fundamental to the navigability of a code base. In this paper, we propose a novel experimental framework to test this key tenet and use large-scale open-source projects to assess three relationships. In particular, we find that: (1) a class name, together with its header comments, conveys the topic of the class body; (2) a line of code is indicative of its surroundings; and (3) a contiguous code fragment may serve as a snapshot of the entire class. Our work not only shows the foundations necessary for the success of many code navigation approaches, but also opens avenues for further tool enhancements.
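To make relationship (2) concrete, the following is a minimal sketch of how the similarity between a code line and its surroundings might be quantified. It assumes a simple bag-of-words cosine similarity over identifier tokens; the abstract does not state which measure the framework uses, so this choice, along with the helper names `tokens`, `cosine`, and `neighborhood_similarity` and the `window` parameter, is hypothetical and for illustration only.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split code or comment text into lowercase word tokens.

    Handles snake_case (split on non-letters) and camelCase
    (split on case boundaries, keeping acronyms such as HTTP whole).
    """
    parts = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", word))
    return [p.lower() for p in parts]


def cosine(a, b):
    """Cosine similarity between two token lists (assumed measure)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def neighborhood_similarity(lines, i, window=2):
    """Similarity between line i and the lines within `window` of it."""
    context = lines[max(0, i - window):i] + lines[i + 1:i + 1 + window]
    return cosine(tokens(lines[i]), tokens(" ".join(context)))


# Hypothetical class body: two unrelated fragments placed back to back.
body = [
    "balance = balance + amount",
    "balance = balance - amount",
    "return balance",
    "socket.connect(host, port)",
    "socket.send(payload)",
]

near = neighborhood_similarity(body, 1, window=1)
far = cosine(tokens(body[1]), tokens(" ".join(body[3:])))
print(near > far)
```

Under this sketch, topical locality would predict that a line scores higher against its immediate neighbors than against a distant fragment, as the toy comparison of `near` and `far` illustrates.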
