Obtaining High-Quality Relevance Judgments Using Crowdsourcing