Comment on "an evaluation of query expansion by the addition of clustered terms for a document retrieval system"
I appreciate the editor's invitation to supply comments concerning the interesting paper by Minker, Wilson and Zimmerman. To prevent a possible misinterpretation of the results contained in this paper, I would merely emphasize that one cannot conclude from these experiments that term clusters (or equivalently, keyword classifications or thesauruses) are not useful in retrieval. The authors have used a quite specific process: to the original set of query terms they added new terms, related to the old ones and obtained from an automatically produced term classification. The set of terms attached to the documents was left unchanged. The main effect was then a lengthening of the query vectors. Since the SMART system is used for search and retrieval purposes, a cosine function was probably used to obtain the similarity coefficient for each query-document pair. With longer query vectors, the magnitude of the cosine coefficient is likely to go down, everything else being equal, and improvements in retrieval performance are obtained only if the number of term matches between query terms and relevant document terms increases drastically, to compensate for the increased query length. Alternative term cluster applications might be tested in the future, and more encouraging results might be obtained using one or more of the following strategies [1, 2]:
[1] Karen Sparck Jones. Automatic Keyword Classification for Information Retrieval, 1971.
[2] Gerard Salton, et al. Experiments in Automatic Thesaurus Construction for Information Retrieval, 1971, IFIP Congress.
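
The cosine argument made in the comment can be illustrated with a small numerical sketch. Everything in it is an assumption chosen for illustration: the toy vocabulary, the binary term weights, and the helper name `cosine` are hypothetical, and nothing here reproduces the weighting actually used in the experiments under discussion. The sketch only shows that added query terms which match nothing in a document lower the cosine coefficient, while added terms that do match raise it.

```python
import math


def cosine(query, document):
    """Cosine coefficient between two sparse term-weight dictionaries."""
    dot = sum(w * document.get(term, 0.0) for term, w in query.items())
    query_norm = math.sqrt(sum(w * w for w in query.values()))
    doc_norm = math.sqrt(sum(w * w for w in document.values()))
    if query_norm == 0.0 or doc_norm == 0.0:
        return 0.0
    return dot / (query_norm * doc_norm)


# Hypothetical relevant document with four index terms (binary weights).
document = {"retrieval": 1.0, "cluster": 1.0, "thesaurus": 1.0, "query": 1.0}

# Original two-term query; both terms occur in the document.
original_query = {"retrieval": 1.0, "cluster": 1.0}

# Expansion with two cluster terms that do NOT occur in the document:
# the query vector gets longer, the number of matches stays the same.
expanded_no_match = {**original_query, "taxonomy": 1.0, "lexicon": 1.0}

# Expansion with two cluster terms that DO occur in the document:
# the extra matches compensate for the longer query vector.
expanded_with_matches = {**original_query, "thesaurus": 1.0, "query": 1.0}

base = cosine(original_query, document)
no_match = cosine(expanded_no_match, document)
with_matches = cosine(expanded_with_matches, document)

print(f"original query:          {base:.3f}")
print(f"expanded, no new match:  {no_match:.3f}")
print(f"expanded, new matches:   {with_matches:.3f}")
```

Under these assumptions the coefficient for the relevant document falls when the expansion terms match nothing and rises only when they add matches, which is the trade-off the comment describes.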