Quality Management in Crowdsourcing using Gold Judges Behavior
[1] J. Friedman. Special Invited Paper. Additive logistic regression: A statistical view of boosting, 2000.
[2] J. Friedman. Greedy function approximation: A gradient boosting machine, 2001.
[3] Brendan T. O'Connor, et al. Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks, 2008, EMNLP.
[4] Aniket Kittur, et al. Crowdsourcing user studies with Mechanical Turk, 2008, CHI.
[5] Jeff Howe, et al. Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business, 2008, Human Resource Management International Digest.
[6] Peter Bailey, et al. Relevance assessment: are judges exchangeable and does it matter, 2008, SIGIR '08.
[7] John Le, et al. Ensuring quality in crowdsourced search relevance evaluation: The effects of training question distribution, 2010.
[8] Stefanie Nowak, et al. How reliable are annotations via crowdsourcing: a study about inter-annotator agreement for multi-label image annotation, 2010, MIR '10.
[9] Gjergji Kasneci, et al. Bayesian Knowledge Corroboration with Logical Rules and User Feedback, 2010, ECML/PKDD.
[10] Ben Carterette, et al. An Analysis of Assessor Behavior in Crowdsourced Preference Judgments, 2010.
[11] Dana Chandler, et al. Preventing Satisficing in Online Surveys: A "Kapcha" to Ensure Higher Quality Data, 2010.
[12] Panagiotis G. Ipeirotis, et al. Quality management on Amazon Mechanical Turk, 2010, HCOMP '10.
[13] Lorrie Faith Cranor, et al. Are your participants gaming the system? Screening Mechanical Turk workers, 2010, CHI.
[14] Matthew Lease, et al. Crowdsourcing Document Relevance Assessment with Mechanical Turk, 2010, Mturk@HLT-NAACL.
[15] Roi Blanco, et al. Repeatable and reliable search system evaluation using crowdsourcing, 2011, SIGIR.
[16] Aaron D. Shaw, et al. Designing incentives for inexpert human raters, 2011, CSCW.
[17] Gabriella Kazai, et al. In Search of Quality in Crowdsourcing for Search Engine Evaluation, 2011, ECIR.
[18] Shipeng Yu, et al. An Entropic Score to Rank Annotators for Crowdsourced Labeling Tasks, 2011, Third National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics.
[19] Shipeng Yu, et al. Ranking annotators for crowdsourced labeling tasks, 2011, NIPS.
[20] Benjamin B. Bederson, et al. Human computation: a survey and taxonomy of a growing field, 2011, CHI.
[21] Mark Sanderson, et al. Quantifying test collection quality based on the consistency of relevance judgements, 2011, SIGIR.
[22] Eric Horvitz, et al. Combining human and machine intelligence in large-scale crowdsourcing, 2012, AAMAS.
[23] Gabriella Kazai, et al. An analysis of human factors and label accuracy in crowdsourcing relevance judgments, 2013, Information Retrieval.
[24] Tom Minka, et al. How To Grade a Test Without Knowing the Answers - A Bayesian Graphical Model for Adaptive Crowdsourcing and Aptitude Testing, 2012, ICML.
[25] Omar Alonso, et al. Implementing crowdsourcing-based relevance experimentation: an industrial perspective, 2013, Information Retrieval.
[26] Aniket Kittur, et al. CrowdScape: interactively visualizing user behavior and output, 2012, UIST.
[27] Dirk Hovy, et al. Learning Whom to Trust with MACE, 2013, NAACL.
[28] Michael S. Bernstein, et al. The future of crowd work, 2013, CSCW.
[29] Nicholas R. Jennings, et al. Efficient budget allocation with accuracy guarantees for crowdsourcing classification tasks, 2013, AAMAS.
[30] Phuoc Tran-Gia, et al. Predicting result quality in Crowdsourcing using application layer monitoring, 2014, IEEE Fifth International Conference on Communications and Electronics (ICCE).