Paper information - Finding content-bearing terms using term similarities

Finding content-bearing terms using term similarities

This paper explores the issue of using different co-occurrence similarities between terms for separating query terms that are useful for retrieval from those that are harmful. The hypothesis under examination is that useful terms tend to be more similar to each other than to other query terms. Preliminary experiments with similarities computed using first-order and second-order co-occurrence seem to confirm the hypothesis. Term similarities could then be used for determining which query terms are useful and best reflect the user's information need. A possible application would be to use this source of evidence for tuning the weights of the query terms.
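The hypothesis above can be illustrated with a small sketch. The abstract does not say which similarity measures the paper uses, so everything here is an assumption: first-order similarity is taken as the cosine between binary document-incidence vectors, second-order similarity as the cosine between vectors of co-occurring terms, and each query term is scored by its mean similarity to the other query terms. The function names and the toy corpus are hypothetical and are not taken from the paper.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def first_order_vectors(docs):
    """Assumed first-order view: term -> {doc_id: 1.0} (document incidence)."""
    vecs = {}
    for doc_id, doc in enumerate(docs):
        for term in set(doc):
            vecs.setdefault(term, {})[doc_id] = 1.0
    return vecs


def second_order_vectors(docs):
    """Assumed second-order view: term -> counts of terms sharing a document with it."""
    vecs = {}
    for doc in docs:
        terms = set(doc)
        for t in terms:
            row = vecs.setdefault(t, {})
            for other in terms:
                if other != t:
                    row[other] = row.get(other, 0.0) + 1.0
    return vecs


def mean_similarity_to_query(query, vecs):
    """Score each query term by its mean similarity to the other query terms."""
    scores = {}
    for t in query:
        others = [o for o in query if o != t]
        sims = [cosine(vecs.get(t, {}), vecs.get(o, {})) for o in others]
        scores[t] = sum(sims) / len(sims) if sims else 0.0
    return scores


# Toy corpus and query (hypothetical data, for illustration only).
docs = [
    ["solar", "energy", "panel"],
    ["solar", "energy", "cost"],
    ["energy", "panel", "cost"],
    ["report", "discuss"],
    ["discuss", "weather"],
]
query = ["solar", "energy", "discuss"]

first = mean_similarity_to_query(query, first_order_vectors(docs))
second = mean_similarity_to_query(query, second_order_vectors(docs))
print(first)
print(second)
```

In this toy setting, "discuss" never shares a document or a co-occurring term with the other two query terms, so it scores lowest under both views. Scores like these could serve as one source of evidence when tuning query term weights, as the abstract suggests; how the paper actually combines them is not stated here.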

Justin Picard
