Analysis and Improvement on Retrieval Methods for Traceability Links between Source Code and Documentation
Software documentation is usually written in natural language as free text, and it captures a large amount of useful information. Establishing traceability links between documentation and source code can help with software engineering management. Currently, the recovery of traceability links is mostly based on information retrieval techniques, e.g., the probabilistic model, the vector space model, and Latent Semantic Indexing (LSI). However, previous work treats documentation and source code as plain text files and does not consider their software engineering features. Four enhancement strategies are proposed to improve the traditional LSI method based on the features of software documentation and source code, namely source code clustering, identifier classification, a similarity thesaurus, and hierarchical structure enhancement. Experimental results show that the four enhancement strategies can increase precision by about 15%. The special characteristics of documentation and source code should therefore be considered carefully when recovering traceability links between them.
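To make the retrieval setting concrete, the following is a minimal sketch of the plain vector space model baseline mentioned above. It is not the enhanced LSI method of the paper, and it implements none of the four enhancement strategies. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_links`), the smoothed IDF weighting, and the toy documentation sections and Java snippets are all assumptions made for illustration. The only source-code-specific step is splitting camelCase identifiers into words before indexing.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase terms, breaking camelCase identifiers apart."""
    # Insert a space at lower-to-upper boundaries: "checkPassword" -> "check Password"
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict: term -> weight) per input text."""
    counts_per_text = [Counter(tokenize(t)) for t in texts]
    n = len(texts)
    # Document frequency: number of texts in which each term occurs
    df = Counter(term for counts in counts_per_text for term in counts)
    # Smoothed IDF (an assumed weighting choice, not taken from the paper)
    idf = {term: math.log((1 + n) / (1 + df[term])) + 1.0 for term in df}
    return [{t: c * idf[t] for t, c in counts.items()} for counts in counts_per_text]


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_links(doc_sections, code_units):
    """For each documentation section, list code units from most to least similar."""
    section_names = list(doc_sections)
    unit_names = list(code_units)
    vectors = tfidf_vectors(
        [doc_sections[s] for s in section_names]
        + [code_units[u] for u in unit_names]
    )
    doc_vecs = vectors[: len(section_names)]
    code_vecs = vectors[len(section_names):]
    ranking = {}
    for name, doc_vec in zip(section_names, doc_vecs):
        scored = [(cosine(doc_vec, cv), u) for u, cv in zip(unit_names, code_vecs)]
        ranking[name] = [u for _, u in sorted(scored, key=lambda p: -p[0])]
    return ranking


# Hypothetical toy data, invented for this sketch
docs = {
    "login": "The user enters a password to log in to the account.",
    "report": "The system prints a monthly sales report.",
}
code = {
    "LoginController.java": (
        "class LoginController { boolean checkPassword(User user, String password) "
        "{ return user.getAccount().matches(password); } }"
    ),
    "ReportPrinter.java": (
        "class ReportPrinter { void printMonthlyReport(SalesData sales) { } }"
    ),
}
links = rank_links(docs, code)
```

In this sketch, `links` maps each documentation section to the code units ordered by similarity, and the top-ranked unit serves as the candidate traceability link. An LSI-based variant would additionally project the term-document matrix onto a low-rank space through a truncated singular value decomposition before computing the cosine similarities.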