Error Rate Bounds and Iterative Weighted Majority Voting for Crowdsourcing

Crowdsourcing has become an effective and popular tool for human-powered computation to label large datasets. Since the workers can be unreliable, it is common in crowdsourcing to assign multiple workers to one task and to aggregate their labels in order to obtain results of high quality. In this paper, we provide finite-sample exponential bounds on the error rate (in probability and in expectation) of general aggregation rules under the Dawid-Skene crowdsourcing model. The bounds are derived for multi-class labeling and can be used to analyze many aggregation methods, including majority voting, weighted majority voting, and the oracle Maximum A Posteriori (MAP) rule. We show that the oracle MAP rule approximately optimizes our upper bound on the mean error rate of weighted majority voting in certain settings. We propose an iterative weighted majority voting (IWMV) method that optimizes the error rate bound and approximates the oracle MAP rule. Its one-step version has a provable theoretical guarantee on the error rate. The IWMV method is intuitive and computationally simple. Experimental results on simulated and real data show that IWMV performs at least on par with the state-of-the-art methods, at a much lower computational cost (around one hundred times faster).
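The iterate-and-reweight idea behind IWMV can be illustrated with a minimal sketch: start from plain majority voting, estimate each worker's accuracy against the current consensus, reweight workers accordingly, and repeat. The code below assumes binary labels and uses the simple weight `2p - 1` (zero weight at chance level); the paper's exact update rule and the multi-class case are not reproduced here, so treat this as an illustration of the scheme rather than the authors' implementation.

```python
import numpy as np

def iwmv(labels, n_iter=20):
    """Iterative weighted majority voting sketch for binary crowd labels.

    labels: (workers, tasks) array with entries in {0, 1}, or -1 where
    a worker did not label a task.
    """
    observed = labels >= 0

    # Step 0: plain majority vote as the initial estimate.
    votes = np.where(observed, labels, 0).sum(axis=0)
    counts = observed.sum(axis=0)
    est = (votes * 2 >= counts).astype(int)

    for _ in range(n_iter):
        # Estimate each worker's accuracy against the current consensus
        # (with add-one smoothing to avoid degenerate 0/1 estimates).
        agree = (labels == est) & observed
        acc = (agree.sum(axis=1) + 1) / (observed.sum(axis=1) + 2)

        # Weight workers by 2p - 1: chance-level workers get zero weight,
        # adversarial workers get negative weight (their votes flip).
        w = 2.0 * acc - 1.0

        # Weighted majority vote per task on +/-1 scores.
        score = np.where(observed, np.where(labels == 1, 1.0, -1.0), 0.0)
        new_est = ((w @ score) >= 0).astype(int)

        if np.array_equal(new_est, est):
            break  # consensus has converged
        est = new_est
    return est
```

On a toy instance with two mostly reliable workers and one adversarial worker, the reweighting step down-weights the adversary and the consensus matches the reliable majority.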
