Paper information - Relevance judgments between TREC and Non-TREC assessors

This paper investigates the agreement between official TREC relevance judgments and those produced by users in an interactive IR experiment. Results show that 63% of the documents judged relevant by our users matched the official TREC judgments. Several factors contributed to differences in agreement: the number of relevant documents retrieved, the number of relevant documents judged, system effectiveness per topic, and the ranking of relevant documents.

Mark Sanderson | Paul D. Clough | Azzah Al-Maskari

[1] Pertti Vakkari, et al. The influence of relevance levels on the effectiveness of interactive information retrieval, 2004, J. Assoc. Inf. Sci. Technol.

[2] Ellen M. Voorhees, et al. Variations in relevance judgments and the measurement of retrieval effectiveness, 1998, SIGIR '98.

[3] Falk Scholer, et al. User performance versus precision measures for simple search tasks, 2006, SIGIR '06.

[4] Jaana Kekäläinen, et al. IR evaluation methods for retrieving highly relevant documents, 2000, SIGIR '00.

[5] Ellen M. Voorhees. Variations in relevance judgments and the measurement of retrieval effectiveness, 2000, Inf. Process. Manag.

[6] Eero Sormunen, et al. Liberal relevance criteria of TREC: counting on negligible documents?, 2002, SIGIR '02.