Achieving budget-optimality with adaptive schemes in crowdsourcing

Crowdsourcing platforms provide marketplaces where task requesters pay to get labels on their data. Such markets have recently emerged as popular venues for collecting the annotations that are crucial for training machine learning models in various applications. However, because the jobs are tedious and the payments are low, errors are common in crowdsourced labels. A common strategy for overcoming this noise is to add redundancy: collect multiple answers for each task and aggregate them with a method such as majority voting. For such a system, there is a fundamental question of interest: how can we maximize accuracy given a fixed budget on the number of responses we can collect? We characterize this fundamental trade-off between the budget (the total number of answers the requester can collect) and the accuracy of the estimated labels. In particular, we ask whether adaptive task assignment schemes lead to a more efficient trade-off between accuracy and budget. Adaptive schemes, in which tasks are assigned based on the data collected thus far, are widely used in practical crowdsourcing systems to make efficient use of a fixed budget. However, existing theoretical analyses of crowdsourcing systems suggest that the gain from adaptive task assignment is minimal. To bridge this gap between practice and theory, we investigate the question under a strictly more general probabilistic model that was recently introduced to model practical crowdsourced annotations. Under this generalized Dawid-Skene model, we characterize the fundamental trade-off between budget and accuracy and introduce a novel adaptive scheme that matches this fundamental limit. We further quantify the gap between adaptive and non-adaptive schemes by comparing this trade-off with the one for non-adaptive schemes. Our analyses confirm that the gap is significant.
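The redundancy-and-aggregation baseline described above can be illustrated with a minimal sketch. It is not the adaptive scheme introduced in the paper, and it does not attempt to reproduce the generalized Dawid-Skene model. It assumes binary labels in {-1, +1} and a single hypothetical parameter `worker_accuracy`, the probability that any one answer is correct; the function names `majority_vote` and `simulate_uniform` are likewise made up for illustration. The budget is split evenly across tasks (a non-adaptive assignment) and each task is decided by majority vote.

```python
import random


def majority_vote(answers):
    """Return +1 or -1 by majority; ties (including no answers) go to +1."""
    return 1 if sum(answers) >= 0 else -1


def simulate_uniform(num_tasks, budget, worker_accuracy, seed=0):
    """Spend `budget` answers evenly across tasks and aggregate by majority vote.

    Assumption for illustration only: every answer is correct independently
    with probability `worker_accuracy`, regardless of task or worker.
    Returns the fraction of tasks whose aggregated label matches the truth.
    """
    rng = random.Random(seed)
    per_task = budget // num_tasks  # non-adaptive: same redundancy for every task
    truth = [rng.choice([-1, 1]) for _ in range(num_tasks)]
    correct = 0
    for label in truth:
        # Each simulated answer equals the true label with prob. worker_accuracy.
        answers = [label if rng.random() < worker_accuracy else -label
                   for _ in range(per_task)]
        correct += majority_vote(answers) == label
    return correct / num_tasks


if __name__ == "__main__":
    # Same number of tasks, growing budget: accuracy as a function of redundancy.
    for budget in (100, 300, 900):
        print(budget, simulate_uniform(num_tasks=100, budget=budget,
                                       worker_accuracy=0.7))
```

An adaptive scheme would replace the fixed `per_task` allocation with one that depends on the answers collected so far, for example by spending more of the budget on tasks whose votes are close to tied.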
