Defeating Tyranny of the Masses in Crowdsourcing: Accounting for Low-Skilled and Adversarial Workers

Crowdsourcing has emerged as a useful learning paradigm that allows us to rapidly recruit workers on the web to solve large-scale problems, such as quick annotation of image, web page, or document databases. Automated inference engines that fuse the answers or opinions of the crowd to make critical decisions are susceptible to unreliable, low-skilled, and malicious workers, who tend to mislead the system towards inaccurate inferences. We present a probabilistic generative framework that models worker responses for multicategory crowdsourcing tasks, based on two novel paradigms. First, we decompose worker reliability into skill level and intention. Second, we introduce a stochastic model for answer generation that plausibly captures the interplay between worker skills, intentions, and task difficulties. This framework allows us to model and estimate a broad range of worker "types". We present a generalized Expectation Maximization algorithm that jointly estimates the unknown ground-truth answers along with the worker and task parameters. As our experiments show, the proposed scheme de-emphasizes answers from low-skilled workers and leverages answers from malicious workers to improve crowd aggregation. Moreover, our approach is especially advantageous when low-skilled and/or malicious workers form an (a priori unknown) majority of the crowd.
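To make the idea of jointly estimating answers and worker parameters concrete, here is a minimal sketch of EM-based aggregation. It is not the framework described above: it assumes a much simpler symmetric model in which each worker has a single accuracy parameter and wrong answers are spread uniformly over the other categories, and it ignores task difficulty and the skill/intention decomposition. The function name `em_aggregate`, the smoothing constants, and the toy data are all hypothetical and chosen for illustration only.

```python
from math import exp, log


def em_aggregate(answers, num_classes, iters=50):
    """Toy EM aggregation; answers[w][t] is worker w's label for task t.

    Assumed model (a simplification, not the paper's): worker w answers
    correctly with probability acc[w], otherwise picks one of the other
    num_classes - 1 labels uniformly at random.
    """
    num_workers, num_tasks, K = len(answers), len(answers[0]), num_classes

    # Initialize the posterior over each task's true label from vote shares.
    post = []
    for t in range(num_tasks):
        counts = [0.0] * K
        for w in range(num_workers):
            counts[answers[w][t]] += 1.0
        post.append([c / num_workers for c in counts])

    acc = [0.5] * num_workers
    for _ in range(iters):
        # M-step: accuracy = smoothed expected fraction of correct answers.
        # The +1 / +2 smoothing (an assumption) keeps acc strictly in (0, 1).
        acc = [
            (sum(post[t][answers[w][t]] for t in range(num_tasks)) + 1.0)
            / (num_tasks + 2.0)
            for w in range(num_workers)
        ]
        # E-step: recompute the posterior over each task's true label.
        for t in range(num_tasks):
            logp = []
            for k in range(K):
                total = 0.0
                for w in range(num_workers):
                    if answers[w][t] == k:
                        total += log(acc[w])
                    else:
                        total += log((1.0 - acc[w]) / (K - 1))
                logp.append(total)
            peak = max(logp)
            unnorm = [exp(v - peak) for v in logp]
            norm = sum(unnorm)
            post[t] = [u / norm for u in unnorm]

    labels = [max(range(K), key=lambda k: post[t][k]) for t in range(num_tasks)]
    return labels, acc


# Hypothetical toy crowd: the (hidden) true labels are [0, 1, 2, 0, 1, 2].
answers = [
    [0, 1, 2, 0, 1, 2],  # reliable worker
    [0, 1, 2, 0, 1, 2],  # reliable worker
    [0, 1, 2, 0, 1, 1],  # mostly reliable, one mistake
    [1, 2, 0, 1, 2, 0],  # adversarial worker, always wrong
]
labels, accuracy = em_aggregate(answers, num_classes=3)
```

In this toy run the adversarial worker's estimated accuracy falls below the chance level of 1/3, so each of its answers counts as evidence against the label it names rather than being ignored. Because the sketch initializes from vote shares, it relies on the honest workers outnumbering the adversary; it makes no attempt to handle a majority of low-skilled or malicious workers.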
