Paper information - A simple probabilistic model for the relevance assessment of documents

A simple probabilistic model for the relevance assessment of documents

Abstract: When assessing the relevance of documents, different jurors usually do not completely agree. A simple model is set up to take this fact into account by assuming that the relevance assigned by the juror is a random variable. It leads to some interesting conclusions: The worst possible method to assess the relevance is a mere bisection into relevant and irrelevant. Even an ideal system cannot consistently find all relevant documents and only those, which is empirically well known. The retrieval system should also assign a measure of relevance rather than divide the set of all documents only into those found and those not found; in particular, Boolean operations should be supplemented by a ranking algorithm.
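The abstract's reasoning can be illustrated with a minimal sketch. It is not the paper's own formulation, since the abstract gives no equations. It assumes, hypothetically, that each document i has a probability p_i that a randomly chosen juror judges it relevant, and that judgments are independent across documents. The function names and the example probabilities are illustrative only.

```python
from typing import List, Tuple


def expected_precision_recall(probs: List[float], threshold: float) -> Tuple[float, float]:
    """Ratio-of-expectations precision and recall for a system that
    retrieves every document whose (assumed) relevance probability
    is at least `threshold`."""
    retrieved = [p for p in probs if p >= threshold]
    expected_hits = sum(retrieved)      # expected number of retrieved documents judged relevant
    expected_relevant = sum(probs)      # expected number of documents judged relevant overall
    precision = expected_hits / len(retrieved) if retrieved else 0.0
    recall = expected_hits / expected_relevant if expected_relevant else 0.0
    return precision, recall


def rank_by_probability(probs: List[float]) -> List[int]:
    """Document indices ordered from most to least likely relevant,
    i.e. a ranking instead of a found / not-found split."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)


# Jurors disagree: probabilities lie strictly between 0 and 1.
uncertain = [0.9, 0.8, 0.5, 0.2, 0.1]
precision, recall = expected_precision_recall(uncertain, threshold=0.5)

# Jurors always agree: every probability is 0 or 1.
unanimous = [1.0, 1.0, 0.0, 0.0]
ideal_precision, ideal_recall = expected_precision_recall(unanimous, threshold=0.5)
```

Under these assumptions, even a system that knows every p_i exactly stays below 1.0 on both measures for the `uncertain` list, while the `unanimous` list reaches 1.0 on both. Returning the ranking from `rank_by_probability` keeps the graded information that a single cutoff discards.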

Friedrich Gebhardt | F. Gebhardt
