Facilitating Scenario-Based Program Comprehension with Topic Models

Researchers and practitioners have long sought automatic and semi-automatic approaches to support program comprehension. However, little attention has been paid to identifying program comprehension scenarios and to exploring support tailored to them. In this paper, we examine program comprehension from the perspective of developers, analyze their demands, and refine two program comprehension scenarios (the Program Users Scenario and the Program Owners Scenario), focusing mainly on the former. In the Program Users Scenario, where developers need help to understand a program quickly so that they can start using it, we found that topic modeling offers a promising way to facilitate program comprehension. With topic modeling, features and structures can be discovered automatically from textual software assets. We also developed JSEA, a tool that provides semi-automatic program comprehension assistance. JSEA uses essential information automatically generated from Java projects to construct a project overview and to give developers search capability. Experiments with 12 volunteers on two open source Java projects suggest that JSEA can support Java developers in comprehending programs in the Program Users Scenario.
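To make the idea of discovering topics from textual software assets more concrete, the following is a minimal sketch of the kind of preprocessing that usually precedes topic modeling on source code: identifiers are extracted from Java source text, split into terms, and counted into a bag of words that a topic model could consume. This is an illustration only, not JSEA's actual pipeline; the function names, the stop-word list, and the minimum term length are all assumptions introduced here.

```python
import re
from collections import Counter

# Hypothetical stop list (an assumption): a few Java keywords and noise terms.
STOP_WORDS = {
    "public", "private", "protected", "class", "void", "int", "return",
    "new", "static", "final", "string", "get", "set", "import", "package",
}


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase terms."""
    # Acronym runs (e.g. "HTTP") stay together; "Response" is kept as one word.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", identifier)
    return [part.lower() for part in parts]


def bag_of_words(java_source):
    """Count identifier terms in Java source text (assumed preprocessing)."""
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", java_source)
    terms = []
    for identifier in identifiers:
        for term in split_identifier(identifier):
            # Drop stop words and very short terms (threshold is an assumption).
            if term not in STOP_WORDS and len(term) > 2:
                terms.append(term)
    return Counter(terms)
```

One such bag of words per source file gives the document-term counts that an LDA implementation typically takes as input; the choice of stop words and of the number of topics would need tuning for each project.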
