A simple and efficient sampling method for estimating AP and NDCG
[1] James Allan,et al. Minimal test collections for retrieval evaluation , 2006, SIGIR.
[2] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[3] Ian Soboroff,et al. A comparison of pooled and sampled relevance judgments , 2007, EVIA@NTCIR.
[4] Jaana Kekäläinen,et al. Cumulated gain-based evaluation of IR techniques , 2002, TOIS.
[5] Rajesh Shenoy,et al. On the robustness of relevance measures with incomplete judgments , 2007, SIGIR.
[6] D. K. Harman,et al. Overview of the Third Text Retrieval Conference (TREC-3) , 1996 .
[7] Emine Yilmaz,et al. A statistical method for system evaluation using incomplete judgments , 2006, SIGIR.
[8] Tetsuya Sakai,et al. Alternatives to Bpref , 2007, SIGIR.
[9] Alistair Moffat,et al. Strategic system comparisons via targeted relevance judgments , 2007, SIGIR.
[10] B. E. Eckbo,et al. Appendix , 1826, Epilepsy Research.
[11] Paul Over,et al. The TREC VIdeo Retrieval Evaluation (TRECVID): A Case Study and Status Report , 2004, RIAO.
[12] Cyril Cleverdon,et al. The Cranfield tests on index language devices , 1997 .
[13] Charles L. A. Clarke,et al. The TREC 2006 Terabyte Track , 2006, TREC.
[14] Ellen M. Voorhees,et al. Retrieval evaluation with incomplete information , 2004, SIGIR '04.
[15] Emine Yilmaz,et al. Estimating average precision with incomplete and imperfect judgments , 2006, CIKM '06.