Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing

Crowdsourcing has gained immense popularity in machine learning applications as a way to obtain large amounts of labeled data. It is cheap and fast, but the data it produces is often of low quality. To address this fundamental challenge, we propose a simple payment mechanism that incentivizes workers to answer only the questions they are sure of and to skip the rest. We show that, surprisingly, under a mild and natural "no-free-lunch" requirement, this mechanism is the one and only incentive-compatible payment mechanism possible. We also show that among all possible incentive-compatible mechanisms (which may or may not satisfy no-free-lunch), our mechanism makes the smallest possible payment to spammers. Interestingly, this unique mechanism takes a "multiplicative" form, and its simplicity is an added benefit. In preliminary experiments involving several hundred workers, we observe a significant reduction in error rates under our mechanism for the same or lower monetary expenditure.
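The abstract does not spell out the payment rule, so the following is only a minimal sketch of what a "multiplicative" skip-based payment could look like, in the spirit of the title. Everything specific here is an assumption made for illustration: the function name `multiplicative_payment`, the encoding of evaluations as +1/0/-1, the `max_payment` budget, the confidence `threshold`, and the rule itself (any wrong answer on a checked question zeroes the payment, and each skipped question multiplies it by `threshold`).

```python
from typing import Sequence

# Hypothetical encoding of a worker's result on each checked question.
CORRECT, SKIPPED, WRONG = 1, 0, -1


def multiplicative_payment(
    evaluations: Sequence[int],
    max_payment: float = 1.0,
    threshold: float = 0.5,
) -> float:
    """Illustrative multiplicative payment (an assumed rule, not quoted from the paper).

    - Any wrong answer wipes out the payment ("nothing").
    - Each skipped question scales the payment by `threshold`.
    - Answering everything correctly earns `max_payment`.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if any(e == WRONG for e in evaluations):
        return 0.0
    skipped = sum(1 for e in evaluations if e == SKIPPED)
    return max_payment * threshold ** skipped


def should_answer(confidence: float, threshold: float = 0.5) -> bool:
    """Under the assumed rule, answering keeps the payment with probability
    `confidence`, while skipping keeps a `threshold` fraction for sure, so a
    worker maximizing expected payment answers only when confident enough."""
    return confidence > threshold


if __name__ == "__main__":
    print(multiplicative_payment([CORRECT, CORRECT, CORRECT]))  # all correct
    print(multiplicative_payment([CORRECT, SKIPPED, CORRECT]))  # one skip
    print(multiplicative_payment([CORRECT, WRONG, CORRECT]))    # one mistake
    print(multiplicative_payment([SKIPPED, SKIPPED, SKIPPED]))  # skips everything
```

With `threshold = 0.5`, each checked question a worker answers correctly instead of skipping doubles the payment, and a single mistake loses it all. A worker who skips everything is left with only `threshold ** n` of the budget, which shrinks quickly as the number of checked questions `n` grows.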
