Using Replicates in Information Retrieval Evaluation
[1] Leonid Boytsov, et al. Deciding on an adjustment for multiplicity in IR experiments, 2013, SIGIR.
[2] Ellen M. Voorhees, et al. Retrieval System Evaluation, 2005.
[3] Cyril Cleverdon, et al. The Cranfield tests on index language devices, 1997.
[4] Ben Carterette, et al. Multiple testing in statistical analysis of systems-based information retrieval experiments, 2012, TOIS.
[5] Alistair Moffat, et al. EvaluatIR: an online tool for evaluating and comparing IR systems, 2009, SIGIR.
[6] Ying Zhang, et al. Differences in effectiveness across sub-collections, 2012, CIKM.
[7] Robert Tibshirani, et al. An Introduction to the Bootstrap, 1994.
[8] Ellen M. Voorhees, et al. The Philosophy of Information Retrieval Evaluation, 2001, CLEF.
[9] Alistair Moffat, et al. Statistical power in retrieval experimentation, 2008, CIKM '08.
[10] Mark Sanderson, et al. Test Collection Based Evaluation of Information Retrieval Systems, 2010, Found. Trends Inf. Retr.
[11] Peter Bailey, et al. User Variability and IR System Evaluation, 2015, SIGIR.
[12] Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing, 1995.
[13] Alistair Moffat, et al. Score standardization for inter-collection comparison of retrieval systems, 2008, SIGIR '08.
[14] Gordon V. Cormack, et al. Statistical precision of information retrieval evaluation, 2006, SIGIR.
[15] Alistair Moffat, et al. Rank-biased precision for measurement of retrieval effectiveness, 2008, TOIS.
[16] David A. Hull. Using statistical testing in the evaluation of retrieval experiments, 1993, SIGIR.
[17] Tetsuya Sakai, et al. A Simple and Effective Approach to Score Standardisation, 2016, ICTIR.
[18] Ellen M. Voorhees, et al. Evaluating evaluation measure stability, 2000, SIGIR.
[19] Paul Over, et al. Blind Men and Elephants: Six Approaches to TREC data, 1999, Information Retrieval.
[20] Tetsuya Sakai, et al. Statistical reform in information retrieval?, 2014, SIGIR Forum.
[21] K. Sparck Jones, et al. Information Retrieval Test Collections, 1976.
[22] Noriko Kando, et al. On information retrieval metrics designed for evaluation with incomplete relevance assessments, 2008, Information Retrieval.
[23] Ben Carterette, et al. System effectiveness, user models, and user utility: a conceptual framework for investigation, 2011, SIGIR.
[24] Jacques Savoy, et al. Statistical inference in retrieval effectiveness evaluation, 1997, Inf. Process. Manag.
[25] J. Neter, et al. Applied Linear Regression Models, 1983.
[26] Ben Carterette. Bayesian Inference for Information Retrieval Evaluation, 2015, ICTIR.
[27] Emine Yilmaz, et al. A simple and efficient sampling method for estimating AP and NDCG, 2008, SIGIR '08.
[28] Mark Sanderson, et al. Information retrieval system evaluation: effort, sensitivity, and reliability, 2005, SIGIR '05.
[29] Alistair Moffat, et al. Users versus models: what observation tells us about effectiveness metrics, 2013, CIKM.
[30] Jaana Kekäläinen, et al. Cumulated gain-based evaluation of IR techniques, 2002, TOIS.
[31] Ben Carterette, et al. Simulating simple user behavior for system effectiveness evaluation, 2011, CIKM '11.
[32] Stephen Robertson, et al. On Document Populations and Measures of IR Effectiveness, 2007.
[33] Stephen E. Robertson, et al. On per-topic variance in IR evaluation, 2012, SIGIR '12.
[34] James Allan, et al. A comparison of statistical significance tests for information retrieval evaluation, 2007, CIKM '07.