Information Retrieval Applications in Software Maintenance and Evolution

There is a growing interest in creating tools that can assist engineers in all phases of the software life cycle. This assistance requires techniques that go beyond traditional static and dynamic analysis. An example of such a technique applies information retrieval (IR), which exploits information found in a project’s natural language. Such information can be extracted from the source code’s identifiers and comments and in artifacts associated with the project, such as the requirements. The techniques described pertain to the maintenance and evolution phase of the software life cycle and focus on such problems as feature location and impact analysis. These techniques highlight the bright future that IR brings to addressing software engineering problems.

[1]  Mariam Kamkar An overview and comparative classification of program slicing techniques , 1995, J. Syst. Softw..

[2]  Letha H. Etzkorn,et al.  An approach to program understanding by natural language understanding , 1999, Natural Language Engineering.

[3]  Giuliano Antoniol,et al.  Recovering code to documentation links in OO systems , 1999, Sixth Working Conference on Reverse Engineering (Cat. No.PR00303).

[4]  Giuliano Antoniol,et al.  Identifying the starting impact set of a maintenance request: a case study , 2000, Proceedings of the Fourth European Conference on Software Maintenance and Reengineering.

[5]  Andrian Marcus,et al.  Supporting program comprehension using semantic and structural information , 2001, Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001.

[6]  Andrian Marcus,et al.  Recovering documentation-to-source-code traceability links using latent semantic indexing , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[7]  Wei Zhao,et al.  SNIAFL: towards a static non-interactive approach to feature location , 2004, Proceedings. 26th International Conference on Software Engineering.

[8]  Giuliano Antoniol,et al.  An automatic approach to identify class evolution discontinuities , 2004 .

[9]  Andrian Marcus,et al.  An information retrieval approach to concept location in source code , 2004, 11th Working Conference on Reverse Engineering.

[10]  Gerardo Canfora,et al.  Impact analysis by mining software and change request repositories , 2005, 11th IEEE International Software Metrics Symposium (METRICS'05).

[11]  Tibor Gyimóthy,et al.  Empirical validation of object-oriented metrics on open source software for fault prediction , 2005, IEEE Transactions on Software Engineering.

[12]  Katsuhiko Gondow,et al.  Toward mining "concept keywords" from identifiers in large software projects , 2005, MSR.

[13]  Jane Huffman Hayes,et al.  A Framework for Comparing Requirements Tracing Experiments , 2005, Int. J. Softw. Eng. Knowl. Eng..

[14]  Genny Tortora,et al.  Can Information Retrieval Techniques Effectively Support Traceability Link Recovery? , 2006, 14th IEEE International Conference on Program Comprehension (ICPC'06).

[15]  Dawn J Lawrie,et al.  AN EMPIRICAL COMPARISON OF TECHNIQUES FOR EXTRACTING CONCEPT ABBREVIATIONS FROM IDENTIFIERS , 2006 .

[16]  Gerardo Canfora,et al.  Supporting change request assignment in open source development , 2006, SAC.

[17]  W. Bruce Croft,et al.  LDA-based document models for ad-hoc retrieval , 2006, SIGIR.

[18]  Gerardo Canfora,et al.  Fine grained indexing of software repositories to support impact analysis , 2006, MSR '06.

[19]  Gerardo Canfora,et al.  Identifying Changed Source Code Lines from Version Repositories , 2007, Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007).

[20]  Sushil Krishna Bajracharya,et al.  Mining Eclipse Developer Contributions via Author-Topic Models , 2007, Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007).

[21]  D. Binkley,et al.  Software Fault Prediction using Language Processing , 2007, Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION (TAICPART-MUTATION 2007).

[22]  Lerina Aversano,et al.  Learning from bug-introducing changes to prevent fault prone code , 2007, IWPSE '07.

[23]  Denys Poshyvanyk,et al.  Combining Formal Concept Analysis with Information Retrieval for Concept Location in Source Code , 2007, 15th IEEE International Conference on Program Comprehension (ICPC '07).

[24]  Yann-Gaël Guéhéneuc,et al.  Feature Location Using Probabilistic Ranking of Methods Based on Execution Scenarios and Information Retrieval , 2007, IEEE Transactions on Software Engineering.

[25]  Genny Tortora,et al.  Recovering traceability links in software artifact management systems using information retrieval methods , 2007, TSEM.

[26]  Stéphane Ducasse,et al.  Semantic clustering: Identifying topics in source code , 2007, Inf. Softw. Technol..

[27]  Osamu Mizuno,et al.  Spam Filter Based Approach for Finding Fault-Prone Software Modules , 2007, Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007).

[28]  Emily Hill,et al.  Analysing source code: looking for useful verbdirect object pairs in all the right places , 2008, IET Softw..

[29]  Santonu Sarkar,et al.  Mining business topics in source code using latent dirichlet allocation , 2008, ISEC '08.