Offline Evaluation without Gain
[1] Xueqi Cheng, et al. Top-k learning to rank: labeling, ranking and evaluation, 2012, SIGIR '12.
[2] Ben Carterette, et al. System effectiveness, user models, and user utility: a conceptual framework for investigation, 2011, SIGIR.
[3] Alistair Moffat, et al. Pairwise Crowd Judgments: Preference, Absolute, and Ratio, 2018, ADCS.
[4] Peter Bailey, et al. Relevance assessment: are judges exchangeable and does it matter, 2008, SIGIR '08.
[5] Sergei Vassilvitskii, et al. Generalized distances between rankings, 2010, WWW '10.
[6] Olivier Chapelle, et al. Expected reciprocal rank for graded relevance, 2009, CIKM.
[7] Mark Sanderson, et al. Do user preferences and evaluation measures line up?, 2010, SIGIR.
[8] Dietmar Jannach, et al. Are we really making much progress? A worrying analysis of recent neural recommendation approaches, 2019, RecSys.
[9] Jimmy J. Lin, et al. The Neural Hype and Comparisons Against Weak Baselines, 2019, SIGIR Forum.
[10] Nir Ailon, et al. Ranking from pairs and triplets: information quality, evaluation methods and query complexity, 2011, WSDM '11.
[11] Charles L. A. Clarke, et al. A Family of Rank Similarity Measures Based on Maximized Effectiveness Difference, 2015, IEEE Transactions on Knowledge and Data Engineering.
[12] Ellen M. Voorhees, et al. Variations in relevance judgments and the measurement of retrieval effectiveness, 1998, SIGIR '98.
[13] Charles L. A. Clarke, et al. Offline Evaluation by Maximum Similarity to an Ideal Ranking, 2020, CIKM.
[14] Massimo Melucci, et al. Weighted Rank Correlation in Information Retrieval Evaluation, 2009, AIRS.
[15] Falk Scholer, et al. On Crowdsourcing Relevance Magnitudes for Information Retrieval Evaluation, 2017, ACM Trans. Inf. Syst.
[16] Alistair Moffat, et al. A similarity measure for indefinite rankings, 2010, TOIS.
[17] Chris Buckley, et al. Topic prediction based on comparative retrieval rankings, 2004, SIGIR '04.
[18] Tetsuya Sakai, et al. Alternatives to Bpref, 2007, SIGIR.
[19] J. Shane Culpepper, et al. Query Driven Algorithm Selection in Early Stage Retrieval, 2018, WSDM.
[20] Mingxuan Sun, et al. Visualizing differences in web search algorithms using the expected weighted Hoeffding distance, 2010, WWW '10.
[21] Peter Schäuble, et al. Determining the effectiveness of retrieval algorithms, 1991, Inf. Process. Manag.
[22] Milad Shokouhi, et al. Expected browsing utility for web search evaluation, 2010, CIKM.
[23] Mark E. Rorvig, et al. The Simple Scalability of Documents, 1990.
[24] Alistair Moffat, et al. Rank-biased precision for measurement of retrieval effectiveness, 2008, TOIS.
[25] Paul N. Bennett, et al. Pairwise ranking aggregation in a crowdsourced setting, 2013, WSDM.
[26] Charles L. A. Clarke, et al. Assessing Top-k Preferences, 2020, ACM Trans. Inf. Syst.