A Practical Sampling Strategy for Efficient Retrieval Evaluation

We consider the problem of large-scale retrieval evaluation, focusing on the considerable effort required to judge tens of thousands of documents under traditional test collection construction methodologies. Two methods based on random sampling have recently been proposed to alleviate this burden: the first, proposed by Aslam et al., is accurate and efficient but quite complex; the second, proposed by Yilmaz et al., is relatively simple but significantly less accurate and efficient. In this work, we propose a new sampling-based method for large-scale retrieval evaluation that combines the strengths of both: it retains the simplicity of the Yilmaz et al. method while matching the performance of the Aslam et al. method. Furthermore, we demonstrate that this new sampling method can be adapted to incorporate both randomly sampled and fixed relevance judgments, such as those available in the most recent TREC Terabyte track.
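The flavor of such sampling-based evaluation can be illustrated with a small sketch. The code below is not the authors' method; it is a minimal illustration, under stated assumptions, of the two shared ingredients: documents are selected for judging with known, nonuniform inclusion probabilities (e.g., proportional to a prior estimate of relevance, in the spirit of sampling with probability proportional to size), and a Horvitz-Thompson style estimator then up-weights each sampled judgment by the reciprocal of its inclusion probability to recover unbiased estimates of quantities such as the number of relevant documents or precision at a cutoff. The Poisson sampling design and all function names here are illustrative assumptions.

```python
import random

def poisson_sample(prior, budget, rng=random):
    """Include each document independently with probability pi_d
    proportional to its prior relevance weight (capped at 1), so the
    expected number of judged documents is roughly `budget`.
    Returns a dict mapping each sampled doc to its inclusion
    probability pi_d, which the estimators below require."""
    total = sum(prior.values())
    pi = {d: min(1.0, budget * w / total) for d, w in prior.items()}
    return {d: p for d, p in pi.items() if rng.random() < p}

def estimate_num_relevant(sample_pi, qrels):
    """Horvitz-Thompson estimate of R, the total number of relevant
    documents: each sampled relevant document counts 1/pi_d."""
    return sum(1.0 / p for d, p in sample_pi.items() if qrels.get(d, 0) > 0)

def estimate_precision_at_k(ranking, k, sample_pi, qrels):
    """Estimate a system's precision@k from the judged sample alone:
    unsampled documents contribute nothing, and each sampled relevant
    document in the top k is up-weighted by 1/pi_d to compensate."""
    weighted_rel = sum(1.0 / sample_pi[d]
                       for d in ranking[:k]
                       if d in sample_pi and qrels.get(d, 0) > 0)
    return weighted_rel / k

if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical data: a rank-biased relevance prior over 100 docs,
    # and ground-truth judgments for every third document.
    prior = {f"d{i}": 1.0 / (i + 1) for i in range(100)}
    qrels = {f"d{i}": 1 for i in range(0, 100, 3)}
    sample = poisson_sample(prior, budget=30, rng=rng)
    print(estimate_num_relevant(sample, qrels))        # true value: 34
    ranking = [f"d{i}" for i in range(100)]
    print(estimate_precision_at_k(ranking, 10, sample, qrels))
```

Poisson sampling is chosen here only because it makes the inclusion probabilities exactly known and the estimators trivially unbiased; the methods discussed above use more refined designs (e.g., stratified sampling without replacement) precisely to reduce the variance that a naive design like this one incurs.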

[1] W. L. Stevens, et al. Sampling Without Replacement with Probability Proportional to Size, 1958.

[2] J. Rice. Mathematical Statistics and Data Analysis, 1988.

[3] Donna K. Harman, et al. Overview of the Third Text REtrieval Conference (TREC-3), 1995, TREC.

[4] Cyril Cleverdon. The Cranfield tests on index language devices, 1997.

[5] Charles L. A. Clarke, et al. Efficient construction of large test collections, 1998, SIGIR '98.

[6] Donna K. Harman, et al. Overview of the Eighth Text REtrieval Conference (TREC-8), 1999, TREC.

[7] Ian Soboroff, et al. Ranking retrieval systems without relevance judgments, 2001, SIGIR '01.

[8] Susan T. Dumais, et al. Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, 2004, SIGIR '04.

[9] Javed A. Aslam, et al. A unified model for metasearch, pooling, and system evaluation, 2003, CIKM '03.

[10] N. Butt. Sampling with Unequal Probabilities, 2003.

[11] Ellen M. Voorhees, et al. Retrieval evaluation with incomplete information, 2004, SIGIR '04.

[12] Charles L. A. Clarke, et al. The TREC 2005 Terabyte Track, 2005, TREC.

[13] Emine Yilmaz, et al. Measure-based metasearch, 2005, SIGIR '05.

[14] Charles L. A. Clarke, et al. The TREC terabyte retrieval track, 2005, SIGIR Forum.

[15] James Allan, et al. Minimal test collections for retrieval evaluation, 2006, SIGIR '06.

[16] Emine Yilmaz, et al. Estimating average precision with incomplete and imperfect judgments, 2006, CIKM '06.

[17] Emine Yilmaz, et al. A statistical method for system evaluation using incomplete judgments, 2006, SIGIR '06.

[18] José Luis Vicedo González, et al. TREC: Experiment and evaluation in information retrieval, 2007, J. Assoc. Inf. Sci. Technol.

[19] Javed A. Aslam, et al. Query Hardness Estimation Using Jensen-Shannon Divergence Among Multiple Scoring Functions, 2007, ECIR.

[20] Chris P. Tsokos, et al. Mathematical Statistics with Applications, 2009.