Better than Their Reputation? On the Reliability of Relevance Assessments with Students
[1] Hinrich Schütze,et al. Introduction to information retrieval, 2008.
[2] Philipp Mayr,et al. Implications of Inter-Rater Agreement on a Student Information Retrieval Evaluation , 2010, LWA.
[3] Klaus Krippendorff,et al. Computing Krippendorff's Alpha-Reliability, 2011.
[4] J. R. Landis,et al. The measurement of observer agreement for categorical data, 1977, Biometrics.
[5] Omar Alonso,et al. Crowdsourcing Assessments for XML Ranked Retrieval , 2010, ECIR.
[6] J. Fleiss. Measuring nominal scale agreement among many raters, 1971.
[7] Fernando Diaz,et al. A Methodology for Evaluating Aggregated Search Results , 2011, ECIR.
[8] Andrew Trotman,et al. Sound and complete relevance assessment for XML retrieval , 2008, TOIS.
[9] York Sure-Vetter,et al. Applying Science Models for Search , 2011, ISI.
[10] Ellen M. Voorhees. Variations in relevance judgments and the measurement of retrieval effectiveness, 2000, Inf. Process. Manag.
[11] Yong Yu,et al. Select-the-Best-Ones: A new way to judge relative relevance, 2011, Inf. Process. Manag.
[12] K. Krippendorff. Reliability in Content Analysis: Some Common Misconceptions and Recommendations, 2004.
[13] R Core Team,et al. R: A language and environment for statistical computing, 2014.
[14] Peter Ingwersen,et al. Developing a Test Collection for the Evaluation of Integrated Search , 2010, ECIR.
[15] Alan F. Smeaton,et al. A study of inter-annotator agreement for opinion retrieval , 2009, SIGIR.
[16] Pia Borlund,et al. The concept of relevance in IR, 2003, J. Assoc. Inf. Sci. Technol.
[17] York Sure-Vetter,et al. Science models as value-added services for scholarly information systems , 2011, Scientometrics.
[18] D. Wentura,et al. Wissenschaftliche Beobachtung: eine Einführung, 1997.
[19] Ron Artstein,et al. Survey Article: Inter-Coder Agreement for Computational Linguistics , 2008, CL.
[20] Peter Bailey,et al. Relevance assessment: are judges exchangeable and does it matter?, 2008, SIGIR '08.
[21] Ellen M. Voorhees,et al. Topic set size redux , 2009, SIGIR.