Good Evaluation Measures based on Document Preferences
[1] Peter Bailey, et al. Incorporating User Expectations and Behavior into the Measurement of Search Effectiveness, 2017, ACM Trans. Inf. Syst.
[2] Tetsuya Sakai, et al. Randomised vs. Prioritised Pools for Relevance Assessments: Sample Size Considerations, 2019, AIRS.
[3] Andrew Turpin, et al. Do batch and user evaluations give the same results?, 2000, SIGIR '00.
[4] Tetsuya Sakai, et al. Alternatives to Bpref, 2007, SIGIR.
[5] Fernando Diaz, et al. Contextual and dimensional relevance judgments for reusable SERP-level evaluation, 2014, WWW.
[6] Peter Schäuble, et al. Determining the effectiveness of retrieval algorithms, 1991, Inf. Process. Manag.
[7] Tetsuya Sakai, et al. Evaluating diversified search results using per-intent graded relevance, 2011, SIGIR.
[8] Alan Halverson, et al. Generating labels from clicks, 2009, WSDM '09.
[9] Gabriella Kazai, et al. User intent and assessor disagreement in web search evaluation, 2013, CIKM.
[10] Filip Radlinski, et al. Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search, 2007, TOIS.
[11] Tetsuya Sakai. Evaluation with informational and navigational intents, 2012, WWW.
[12] Tetsuya Sakai, et al. Diversified search evaluation: lessons from the NTCIR-9 INTENT task, 2012, Information Retrieval.
[13] Tetsuya Sakai, et al. Ranking Rich Mobile Verticals based on Clicks and Abandonment, 2017, CIKM.
[14] Olivier Chapelle, et al. Expected reciprocal rank for graded relevance, 2009, CIKM.
[15] Falk Scholer, et al. User performance versus precision measures for simple search tasks, 2006, SIGIR.
[16] Alistair Moffat, et al. Rank-biased precision for measurement of retrieval effectiveness, 2008, TOIS.
[17] Dirk Lewandowski, et al. What Users See - Structures in Search Engine Results Pages, 2009, Inf. Sci.
[18] Tetsuya Sakai, et al. Metrics, Statistics, Tests, 2013, PROMISE Winter School.
[19] Falk Scholer, et al. Metric and Relevance Mismatch in Retrieval Evaluation, 2009, AIRS.
[20] Norman Cliff, et al. Confidence intervals for Kendall's tau, 1997.
[21] Jaana Kekäläinen, et al. Cumulated gain-based evaluation of IR techniques, 2002, TOIS.
[22] Xiaojie Yuan, et al. Are click-through data adequate for learning web search rankings?, 2008, CIKM '08.
[23] Ben Carterette, et al. Using preference judgments for novel document retrieval, 2012, SIGIR '12.
[24] Mark Sanderson, et al. Do user preferences and evaluation measures line up?, 2010, SIGIR.
[25] Yong Yu, et al. Select-the-Best-Ones: A new way to judge relative relevance, 2011, Inf. Process. Manag.
[26] Yiqun Liu, et al. When does Relevance Mean Usefulness and User Satisfaction in Web Search?, 2016, SIGIR.
[27] Ben Carterette, et al. Preference based evaluation measures for novelty and diversity, 2013, SIGIR.