Paper information - Evaluating the crowd with confidence

Worker quality control is a crucial aspect of crowdsourcing systems, typically consuming a large fraction of the time and money invested in crowdsourcing. In this work, we devise techniques to generate confidence intervals for worker error rate estimates, thereby enabling a better evaluation of worker quality. We show that our techniques generate correct confidence intervals on a range of real-world datasets, and demonstrate their wide applicability by using them to evict poorly performing workers and to provide confidence intervals on the accuracy of the answers.
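The abstract does not describe the technique itself, so the following is only a rough illustration of what a confidence interval on a worker's error rate looks like and how it could drive an eviction decision. It is not the paper's method: it assumes gold-standard answers are available for some tasks, and it substitutes the standard Wilson score interval for a binomial proportion. The function names `wilson_interval` and `should_evict` and the 0.3 threshold are hypothetical choices made for this sketch.

```python
import math
from typing import Tuple


def wilson_interval(errors: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Here the proportion is a worker's error rate, estimated from
    `errors` wrong answers out of `trials` gold-standard tasks
    (an assumption of this sketch). z = 1.96 gives roughly 95% coverage.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= errors <= trials:
        raise ValueError("errors must lie between 0 and trials")

    p_hat = errors / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials))
    ) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


def should_evict(errors: int, trials: int,
                 max_error_rate: float = 0.3, z: float = 1.96) -> bool:
    """Evict only when even the lower bound of the interval exceeds the
    tolerated error rate (hypothetical policy, threshold chosen arbitrarily)."""
    lower, _ = wilson_interval(errors, trials, z)
    return lower > max_error_rate


if __name__ == "__main__":
    # Same observed error rate (50%), different amounts of evidence.
    for errors, trials in [(5, 10), (50, 100)]:
        low, high = wilson_interval(errors, trials)
        print(f"{errors}/{trials}: [{low:.3f}, {high:.3f}] "
              f"evict={should_evict(errors, trials)}")
```

Under this policy, two workers with the same observed error rate can be treated differently: the one with few answered tasks has a wide interval and is kept, while the one with many answered tasks has a narrow interval and may be evicted. That is the practical benefit of attaching an interval to the estimate instead of relying on the point estimate alone.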

Aditya G. Parameswaran | Hector Garcia-Molina | Manas R. Joglekar
