Information retrieval models for recovering traceability links between code and documentation

The research described in this paper applies information retrieval to software maintenance, specifically to the problem of recovering traceability links between the source code of a system and its free-text documentation. We introduce a method based on the general idea of vector space information retrieval and apply it in two case studies: tracing C++ source code onto manual pages and Java code onto functional requirements. The case studies replicate those presented by G. Antoniol et al. (1999; 2000), respectively, in which a probabilistic information retrieval model was applied. We compare the results of the vector space and probabilistic models and formulate hypotheses to explain the differences.
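To make the general idea of vector space retrieval concrete, the sketch below ranks documentation pages against a source code component by cosine similarity of tf-idf vectors. It is only an illustration under stated assumptions: the abstract does not specify the weighting scheme or preprocessing, so the smoothed idf, the identifier splitting, and the names `tokenize`, `cosine` and `rank_documents` are hypothetical choices, not the method used in the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words, breaking camelCase identifiers apart."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(code_text, docs):
    """Rank documents (name -> text) by similarity to a code component.

    Assumption: weights are tf * (log(N / df) + 1); the +1 smoothing keeps
    terms that occur in every document from vanishing entirely.
    """
    doc_counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for counts in doc_counts.values() for term in counts)
    idf = {term: math.log(n_docs / d) + 1.0 for term, d in df.items()}

    def weigh(counts):
        # Terms outside the documentation vocabulary are ignored.
        return {t: tf * idf[t] for t, tf in counts.items() if t in idf}

    query = weigh(Counter(tokenize(code_text)))
    scores = [(name, cosine(query, weigh(c))) for name, c in doc_counts.items()]
    # Highest score first; ties broken by document name for determinism.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical manual pages and a hypothetical C++ class declaration.
    pages = {
        "list.3": "insert an element into the linked list",
        "file.3": "open a file and read its contents",
    }
    code = "class LinkedList { void insertElement(Node n); }"
    for name, score in rank_documents(code, pages):
        print(name, round(score, 3))
```

In this sketch the code component plays the role of the query and each documentation page is a candidate link target, so the top of the ranked list gives the candidate traceability links for that component.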

[1] Thomas M. Cover, et al. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing), 2006.

[2] Michael L. Begeman, et al. gIBIS: a hypertext tool for exploratory policy discussion, 1988, CSCW '88.

[3] Giuliano Antoniol, et al. Recovering code to documentation links in OO systems, 1999, Sixth Working Conference on Reverse Engineering (Cat. No.PR00303).

[4] Thomas M. Cover, et al. Elements of Information Theory, 2005.

[5] William B. Frakes, et al. Software reuse through information retrieval, 1986, SIGF.

[6] Renato De Mori, et al. Spoken Dialogues with Computers, 1998.

[7] Ricardo Baeza-Yates, et al. Information Retrieval: Data Structures and Algorithms, 1992.

[8] Donna K. Harman, et al. Ranking Algorithms, 1992, Information Retrieval: Data Structures & Algorithms.

[9] D. Austin. An Information Retrieval Language for MARC, 1970.

[10] Giuliano Antoniol, et al. Identifying the starting impact set of a maintenance request: a case study, 2000, Proceedings of the Fourth European Conference on Software Maintenance and Reengineering.

[11] Malcolm Munro, et al. An early impact analysis technique for software maintenance, 1994, J. Softw. Maintenance Res. Pract.

[12] Susan P. Arnold, et al. The Reuse System: Cataloging and Retrieval of Reusable Software, 1988, IEEE Computer Society International Conference.

[13] Joseph A. Goguen, et al. An Object-Oriented Tool for Tracing Requirements, 1996, IEEE Softw.

[14] Gerard Salton, et al. Term-Weighting Approaches in Automatic Text Retrieval, 1988, Inf. Process. Manag.

[15] Bruce A. Burton, et al. The Reusable Software Library, 1987, IEEE Software.

[16] Gail E. Kaiser, et al. An Information Retrieval Approach For Automatically Constructing Software Libraries, 1991, IEEE Trans. Software Eng.

[17] Giuliano Antoniol, et al. Tracing object-oriented code into functional requirements, 2000, Proceedings IWPC 2000. 8th International Workshop on Program Comprehension.

[18] Vasant Dhar, et al. Supporting Systems Development by Capturing Deliberations During Requirements Engineering, 1992, IEEE Trans. Software Eng.