Relevance Feedback for Best Match Term Weighting Algorithms in Information Retrieval

Personalisation in full text retrieval or full text filtering implies reweighting of the query terms based on some explicit or implicit feedback from the user. Relevance feedback inputs the user's judgements on previously retrieved documents to construct a personalised query or user profile. This paper studies relevance feedback within two probabilistic models of information retrieval: the first based on statistical language models and the second based on the binary independence probabilistic model. The paper shows the resemblance of the approaches to relevance feedback of these models, introduces new approaches to relevance feedback for both models, and evaluates the new relevance feedback algorithms on the TREC collection. The paper shows that there are no significant differences between simple and sophisticated approaches to relevance feedback.

[1]  J. J. Rocchio,et al.  Relevance feedback in information retrieval , 1971 .

[2]  Kenney Ng A Maximum Likelihood Ratio Information Retrieval Model , 1999, TREC.

[3]  Ellen M. Voorhees,et al.  Overview of the Seventh Text REtrieval Conference , 1998 .

[4]  Richard M. Schwartz,et al.  A hidden Markov model information retrieval system , 1999, SIGIR '99.

[5]  Djoerd Hiemstra,et al.  A probabilistic justification for using tf×idf term weighting in information retrieval , 2000, International Journal on Digital Libraries.

[6]  Stephen P. Harter,et al.  A probabilistic approach to automatic keyword indexing. Part II. An algorithm for probabilistic indexing , 1975, J. Am. Soc. Inf. Sci..

[7]  Stephen E. Robertson,et al.  Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval , 1994, SIGIR '94.

[8]  S. Robertson The probability ranking principle in IR , 1997 .

[9]  Stephen E. Robertson,et al.  The TREC-8 Filtering Track Final Report , 1999, TREC.

[10]  Stephen E. Robertson,et al.  A probabilistic model of information retrieval: development and comparative experiments - Part 2 , 2000, Inf. Process. Manag..

[11]  Djoerd Hiemstra,et al.  A Linguistically Motivated Probabilistic Model of Information Retrieval , 1998, ECDL.

[12]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[13]  Gerard Salton,et al.  The SMART Retrieval System—Experiments in Automatic Document Processing , 1971 .

[14]  Stephen E. Robertson,et al.  Relevance weighting of search terms , 1976, J. Am. Soc. Inf. Sci..

[15]  Djoerd Hiemstra,et al.  Twenty-One at TREC-8: using Language Technology for Information Retrieval , 1999, TREC.

[16]  Djoerd Hiemstra,et al.  Using language models for information retrieval , 2001 .