IR Evaluation: Modeling User Behavior for Measuring Effectiveness
[1] Tetsuya Sakai, et al. Summaries, ranked retrieval and sessions: a unified framework for information access evaluation, 2013, SIGIR.
[2] K. Sparck Jones, et al. Information Retrieval Test Collections, 1976.
[3] Jean Tague-Sutcliffe, et al. The Pragmatics of Information Retrieval Experimentation Revisited, 1997, Inf. Process. Manag.
[4] Charles L. A. Clarke, et al. Reliable information retrieval evaluation with incomplete and biased judgements, 2007, SIGIR.
[5] Ben Carterette, et al. System effectiveness, user models, and user utility: a conceptual framework for investigation, 2011, SIGIR.
[6] Milad Shokouhi, et al. Expected browsing utility for web search evaluation, 2010, CIKM.
[7] Alistair Moffat, et al. Statistical power in retrieval experimentation, 2008, CIKM.
[8] Alistair Moffat, et al. Click-based evidence for decaying weight distributions in search effectiveness metrics, 2010, Information Retrieval.
[9] Jean Tague-Sutcliffe. The pragmatics of information retrieval experimentation, revisited, 1992.
[10] Stephen E. Robertson, et al. Extending average precision to graded relevance judgments, 2010, SIGIR.
[11] José Luis Vicedo González, et al. TREC: Experiment and evaluation in information retrieval, 2007, J. Assoc. Inf. Sci. Technol.
[12] Benjamin Piwowarski, et al. Web Search Engine Evaluation Using Clickthrough Data and a User Model, 2007.
[13] Mark Sanderson, et al. Forming test collections with no system pooling, 2004, SIGIR.
[14] Olivier Chapelle, et al. Expected reciprocal rank for graded relevance, 2009, CIKM.
[15] Cyril Cleverdon, et al. The Cranfield tests on index language devices, 1997.
[16] Emine Yilmaz, et al. The maximum entropy method for analyzing retrieval measures, 2005, SIGIR.
[17] Mounia Lalmas, et al. Absence time and user engagement: evaluating ranking functions, 2013, WSDM.
[18] Ben Carterette, et al. Evaluating multi-query sessions, 2011, SIGIR.
[19] Filip Radlinski, et al. Optimized interleaving for online retrieval evaluation, 2013, WSDM.
[20] Andrew Trotman, et al. Sound and complete relevance assessment for XML retrieval, 2008, TOIS.
[21] Yiqun Liu, et al. Automatic search engine performance evaluation with click-through data analysis, 2007, WWW.
[22] Charles L. A. Clarke, et al. Time-based calibration of effectiveness measures, 2012, SIGIR.
[23] Charles L. A. Clarke, et al. Efficient construction of large test collections, 1998, SIGIR.
[24] Gabriella Kazai, et al. User intent and assessor disagreement in web search evaluation, 2013, CIKM.
[25] Mark Sanderson, et al. The relationship between IR effectiveness measures and user satisfaction, 2007, SIGIR.
[26] Ben Carterette, et al. Incorporating variability in user behavior into systems based evaluation, 2012, CIKM.
[27] Filip Radlinski, et al. Relevance and Effort: An Analysis of Document Utility, 2014, CIKM.
[28] Emine Yilmaz, et al. A simple and efficient sampling method for estimating AP and NDCG, 2008, SIGIR.
[29] James Allan, et al. A comparison of statistical significance tests for information retrieval evaluation, 2007, CIKM.
[30] Stephen E. Robertson, et al. On GMAP: and other transformations, 2006, CIKM.
[31] Stephen E. Robertson, et al. A new interpretation of average precision, 2008, SIGIR.
[32] Charles L. A. Clarke, et al. Novelty and diversity in information retrieval evaluation, 2008, SIGIR.
[33] Mark Sanderson, et al. Information retrieval system evaluation: effort, sensitivity, and reliability, 2005, SIGIR.
[34] Ben Carterette, et al. Robust test collections for retrieval evaluation, 2007, SIGIR.
[35] Ellen M. Voorhees, et al. TREC: Experiment and Evaluation in Information Retrieval (Digital Libraries and Electronic Publishing), 2005.
[36] Alistair Moffat, et al. Rank-biased precision for measurement of retrieval effectiveness, 2008, TOIS.
[37] Alistair Moffat, et al. Strategic system comparisons via targeted relevance judgments, 2007, SIGIR.
[38] Mark Sanderson, et al. Do user preferences and evaluation measures line up?, 2010, SIGIR.
[39] Thorsten Joachims, et al. Accurately interpreting clickthrough data as implicit feedback, 2005, SIGIR.
[40] James Allan, et al. Minimal test collections for retrieval evaluation, 2006, SIGIR.
[41] Charles L. A. Clarke, et al. Stochastic simulation of time-biased gain, 2012, CIKM.
[42] Gabriella Kazai, et al. eXtended cumulated gain measures for the evaluation of content-oriented XML retrieval, 2006, TOIS.
[43] Charles L. A. Clarke, et al. A comparative analysis of cascade measures for novelty and diversity, 2011, WSDM.
[44] Emine Yilmaz, et al. Estimating average precision when judgments are incomplete, 2007, Knowledge and Information Systems.
[45] Ian Soboroff, et al. Ranking retrieval systems without relevance judgments, 2001, SIGIR.
[46] Charles L. A. Clarke, et al. Modeling user variance in time-biased gain, 2012, HCIR.
[47] Ellen M. Voorhees, et al. Variations in relevance judgments and the measurement of retrieval effectiveness, 1998, SIGIR.
[48] Cassidy R. Sugimoto, et al. A systematic review of interactive information retrieval evaluation studies, 1967-2006, 2013, J. Assoc. Inf. Sci. Technol.
[49] Gabriella Kazai, et al. Relevance dimensions in preference-based IR evaluation, 2013, SIGIR.
[50] Tetsuya Sakai, et al. Evaluating evaluation metrics based on the bootstrap, 2006, SIGIR.
[51] Jaana Kekäläinen, et al. Cumulated gain-based evaluation of IR techniques, 2002, TOIS.
[52] Emine Yilmaz, et al. A statistical method for system evaluation using incomplete judgments, 2006, SIGIR.
[53] Ben Carterette, et al. Simulating simple user behavior for system effectiveness evaluation, 2011, CIKM.