The twist measure for IR evaluation: Taking user's effort into account
Nicola Ferro | Gianmaria Silvello | Kalervo Järvelin | Ari Pirkola | Heikki Keskustalo