Exploiting user disagreement for web search evaluation: an experimental approach
[1] Charles L. A. Clarke, et al. Modeling user variance in time-biased gain, 2012, HCIR '12.
[2] Djoerd Hiemstra, et al. Federated search in the wild: the combined power of over a hundred search engines, 2012, CIKM '12.
[3] Dong Nguyen, et al. Overview of the TREC 2013 Federated Web Search Track (draft), 2013.
[4] Djoerd Hiemstra, et al. Overview of the TREC 2014 Federated Web Search Track, 2013, TREC.
[5] Sreenivas Gollapudi, et al. Diversifying search results, 2009, WSDM '09.
[6] Charles L. A. Clarke, et al. Overview of the TREC 2010 Web Track, 2010, TREC.
[7] Milad Shokouhi, et al. Expected browsing utility for web search evaluation, 2010, CIKM.
[8] C. Buckley, et al. Overview of the TREC 2010 Relevance Feedback Track (Notebook), 2010.
[9] Jaana Kekäläinen, et al. Using graded relevance assessments in IR evaluation, 2002, J. Assoc. Inf. Sci. Technol.
[10] Ben Carterette, et al. The effect of assessor error on IR system evaluation, 2010, SIGIR.
[11] Efthimis N. Efthimiadis, et al. Analyzing and evaluating query reformulation strategies in web search logs, 2009, CIKM.
[12] Evangelos Kanoulas, et al. Empirical justification of the gain and discount function for nDCG, 2009, CIKM.
[13] Peter Bailey, et al. Relevance assessment: are judges exchangeable and does it matter, 2008, SIGIR '08.
[14] Ben Carterette, et al. Million Query Track 2007 Overview, 2008, TREC.
[15] Ellen M. Voorhees. Variations in relevance judgments and the measurement of retrieval effectiveness, 2000, Inf. Process. Manag.
[16] Charles L. A. Clarke, et al. An Effectiveness Measure for Ambiguous and Underspecified Queries, 2009, ICTIR.
[17] Robert Burgin. Variations in Relevance Judgments and the Evaluation of Retrieval Performance, 1992, Inf. Process. Manag.
[18] Ellen M. Voorhees, et al. Variations in relevance judgments and the measurement of retrieval effectiveness, 1998, SIGIR '98.
[19] J. Shane Culpepper, et al. Including summaries in system evaluation, 2009, SIGIR.
[20] Stephen E. Robertson, et al. A new rank correlation coefficient for information retrieval, 2008, SIGIR '08.
[21] Stephen E. Robertson, et al. Extending average precision to graded relevance judgments, 2010, SIGIR.
[22] Mark Sanderson, et al. Quantifying test collection quality based on the consistency of relevance judgements, 2011, SIGIR.
[23] Ingemar J. Cox, et al. On Aggregating Labels from Multiple Crowd Workers to Infer Relevance of Documents, 2012, ECIR.
[24] Thomas Demeester, et al. What Snippets Say about Pages in Federated Web Search, 2012, AIRS.
[25] Jaana Kekäläinen, et al. Cumulated gain-based evaluation of IR techniques, 2002, TOIS.
[26] Alistair Moffat, et al. Rank-biased precision for measurement of retrieval effectiveness, 2008, TOIS.