A new statistical strategy for pooling: ELI
[1] James Allan, et al. Minimal test collections for retrieval evaluation, 2006, SIGIR.
[2] James Allan, et al. Evaluation over thousands of queries, 2008, SIGIR '08.
[3] Emine Yilmaz, et al. A simple and efficient sampling method for estimating AP and NDCG, 2008, SIGIR '08.
[4] Justin Zobel, et al. How reliable are the results of large-scale information retrieval experiments?, 1998, SIGIR '98.
[5] Pu Li, et al. Test theory for assessing IR test collections, 2007, SIGIR.
[6] Tetsuya Sakai, et al. Alternatives to Bpref, 2007, SIGIR.
[7] Ellen M. Voorhees, et al. Retrieval evaluation with incomplete information, 2004, SIGIR '04.
[8] Ian Soboroff, et al. Ranking retrieval systems without relevance judgments, 2001, SIGIR '01.
[9] Emine Yilmaz, et al. A statistical method for system evaluation using incomplete judgments, 2006, SIGIR.
[10] Javed A. Aslam, et al. A unified model for metasearch, pooling, and system evaluation, 2003, CIKM '03.
[11] Ben Carterette, et al. Million Query Track 2007 Overview, 2008, TREC.
[12] Charles L. A. Clarke, et al. Efficient construction of large test collections, 1998, SIGIR '98.
[13] Mark Sanderson, et al. Forming test collections with no system pooling, 2004, SIGIR '04.
[14] Cyril W. Cleverdon, et al. The significance of the Cranfield tests on index languages, 1991, SIGIR '91.
[15] Alistair Moffat, et al. Strategic system comparisons via targeted relevance judgments, 2007, SIGIR.
[16] Ingemar J. Cox, et al. Prioritizing relevance judgments to improve the construction of IR test collections, 2011, CIKM '11.
[17] C. J. van Rijsbergen, et al. Report on the need for and provision of an 'ideal' information retrieval test collection, 1975.
[18] Ellen M. Voorhees, et al. Variations in relevance judgments and the measurement of retrieval effectiveness, 1998, SIGIR '98.