The Cranfield II Relevance Assessments: A Critical Evaluation

The relevance assessments belonging to the Cranfield II document/query collection are shown to be faulty, in the sense that "many" relevant documents were not identified as such by the Cranfield judges. The implications of these omissions for the evaluation of information retrieval experiments based on the Cranfield collection are examined in detail. It is shown that numerical measures of retrieval effectiveness may be greatly altered when the "missing" relevant documents are taken into account, and that the ranking of retrieval methods by performance may change as well.
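
The effect described above can be illustrated with a small, purely hypothetical example; it is a sketch of the general mechanism, not a reproduction of the paper's analysis or data. The document identifiers, the two retrieval methods "A" and "B", and the helper names `precision_recall` and `rank_by_recall` are all assumptions introduced for illustration. Precision and recall are computed in the usual set-based way, first against the original judgments and then against judgments augmented with the "missing" relevant documents.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved list against a set of relevant ids."""
    retrieved = set(retrieved)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def rank_by_recall(runs, relevant):
    """Order method names from highest to lowest recall under the given judgments."""
    return sorted(
        runs,
        key=lambda name: precision_recall(runs[name], relevant)[1],
        reverse=True,
    )


# Hypothetical judgments: ids 1-4 were marked relevant by the judges,
# ids 5-8 are relevant documents the judges failed to identify.
JUDGED_RELEVANT = {1, 2, 3, 4}
MISSING_RELEVANT = {5, 6, 7, 8}
CORRECTED_RELEVANT = JUDGED_RELEVANT | MISSING_RELEVANT

# Hypothetical output of two retrieval methods for a single query.
RUNS = {
    "A": [1, 2, 3, 9, 10],  # finds mostly documents the judges marked relevant
    "B": [1, 2, 5, 6, 7],   # finds several of the overlooked relevant documents
}

if __name__ == "__main__":
    for label, relevant in (("original", JUDGED_RELEVANT),
                            ("corrected", CORRECTED_RELEVANT)):
        for name, retrieved in RUNS.items():
            p, r = precision_recall(retrieved, relevant)
            print(f"{label:9s} method {name}: precision={p:.3f} recall={r:.3f}")
        print(f"{label:9s} ranking by recall: {rank_by_recall(RUNS, relevant)}")
```

In this constructed case method A outperforms method B under the original judgments, while B outperforms A once the overlooked documents are counted as relevant. Both the numerical values and the ordering of the methods therefore depend on the completeness of the assessments.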