On the use of relevance feedback in IR-based concept location

Concept location is a critical activity during software evolution as it produces the location where a change is to start in response to a modification request, such as, a bug report or a new feature request. Lexical-based concept location techniques rely on matching the text embedded in the source code to queries formulated by the developers. The efficiency of such techniques is strongly dependent on the ability of the developer to write good queries. We propose an approach to augment information retrieval (IR) based concept location via an explicit relevance feedback (RF) mechanism. RF is a two-part process in which the developer judges existing results returned by a search and the IR system uses this information to perform a new search, returning more relevant information to the user. A set of case studies performed on open source software systems reveals the impact of RF on IR based concept location.

[1]  Giuliano Antoniol,et al.  Identifying the starting impact set of a maintenance request: a case study , 2000, Proceedings of the Fourth European Conference on Software Maintenance and Reengineering.

[2]  Martin P. Robillard,et al.  Topology analysis of software dependencies , 2008, TSEM.

[3]  Genny Tortora,et al.  Recovering traceability links in software artifact management systems using information retrieval methods , 2007, TSEM.

[4]  Andrian Marcus,et al.  An information retrieval approach to concept location in source code , 2004, 11th Working Conference on Reverse Engineering.

[5]  Ted J. Biggerstaff,et al.  The concept assignment problem in program understanding , 1993, [1993] Proceedings Working Conference on Reverse Engineering.

[6]  Andrian Marcus,et al.  Static techniques for concept location in object-oriented code , 2005, 13th International Workshop on Program Comprehension (IWPC'05).

[7]  Janice Singer,et al.  Hipikat: a project memory for software development , 2005, IEEE Transactions on Software Engineering.

[8]  Alfred V. Aho,et al.  CERBERUS: Tracing Requirements to Source Code Using Information Retrieval, Dynamic Analysis, and Program Analysis , 2008, 2008 16th IEEE International Conference on Program Comprehension.

[9]  Václav Rajlich,et al.  Incremental change in object-oriented programming , 2004, IEEE Software.

[10]  A. Dekhtyar,et al.  Make the Most of Your Time: How Should the Analyst Work with Automated Traceability Tools? , 2007, Third International Workshop on Predictor Models in Software Engineering (PROMISE'07: ICSE Workshops 2007).

[11]  Jane Huffman Hayes,et al.  Advancing candidate link generation for requirements tracing: the study of methods , 2006, IEEE Transactions on Software Engineering.

[12]  Andrea De Lucia,et al.  Incremental Approach and User Feedbacks: a Silver Bullet for Traceability Recovery , 2006, 2006 22nd IEEE International Conference on Software Maintenance.

[13]  Letha H. Etzkorn,et al.  Source Code Retrieval for Bug Localization Using Latent Dirichlet Allocation , 2008, 2008 15th Working Conference on Reverse Engineering.

[14]  Denys Poshyvanyk,et al.  Source Code Exploration with Google , 2006, 2006 22nd IEEE International Conference on Software Maintenance.

[15]  Arie van Deursen,et al.  Can LSI help reconstructing requirements traceability in design and test? , 2006, Conference on Software Maintenance and Reengineering (CSMR'06).

[16]  Hinrich Schütze,et al.  Introduction to information retrieval , 2008 .

[17]  Andrian Marcus,et al.  Recovery of Traceability Links between Software Documentation and Source Code , 2005, Int. J. Softw. Eng. Knowl. Eng..

[18]  Gerardo Canfora,et al.  Impact analysis by mining software and change request repositories , 2005, 11th IEEE International Software Metrics Symposium (METRICS'05).

[19]  J.A. Gomez,et al.  Locating user functionality in old code , 1992, Proceedings Conference on Software Maintenance 1992.

[20]  Gerard Salton,et al.  The SMART Retrieval System—Experiments in Automatic Document Processing , 1971 .

[21]  Denys Poshyvanyk,et al.  Combining Formal Concept Analysis with Information Retrieval for Concept Location in Source Code , 2007, 15th IEEE International Conference on Program Comprehension (ICPC '07).

[22]  Wei Zhao,et al.  SNIAFL: towards a static non-interactive approach to feature location , 2004, Proceedings. 26th International Conference on Software Engineering.

[23]  Giuliano Antoniol,et al.  Recovering Traceability Links between Code and Documentation , 2002, IEEE Trans. Software Eng..

[24]  J. J. Rocchio,et al.  Relevance feedback in information retrieval , 1971 .

[25]  Genny Tortora,et al.  Can Information Retrieval Techniques Effectively Support Traceability Link Recovery? , 2006, 14th IEEE International Conference on Program Comprehension (ICPC'06).

[26]  Thorsten Joachims,et al.  A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization , 1997, ICML.

[27]  Yann-Gaël Guéhéneuc,et al.  Feature Location Using Probabilistic Ranking of Methods Based on Execution Scenarios and Information Retrieval , 2007, IEEE Transactions on Software Engineering.

[28]  Denys Poshyvanyk,et al.  Feature location via information retrieval based filtering of a single scenario execution trace , 2007, ASE.

[29]  Peter W. Foltz,et al.  An introduction to latent semantic analysis , 1998 .

[30]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .