OnTac: Online task assignment for crowdsourcing

How to integrate labels from multiple labelers to obtain an accurate estimate of the ground truth is a central problem in crowdsourcing. One challenging issue is that labelers' abilities may vary significantly and tasks differ in difficulty. Moreover, in a crowdsourcing system, task distributors do not know in advance how many labels will suffice for each task. Consequently, an online task assignment mechanism that accounts for labeler expertise and question heterogeneity becomes necessary. In this paper, we present such an online task assignment algorithm based on a probabilistic model of both labeler abilities and question difficulties. We apply the online EM (Expectation Maximization) algorithm to estimate the system parameters online, and assign tasks adaptively based on these estimates. Simulation results show that the proposed scheme outperforms the conventional EM algorithm in both efficiency and accuracy.
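To make the approach concrete, the following is a minimal sketch (not the paper's exact algorithm) of online EM for a GLAD-style model: each labeler i has an ability alpha[i], each binary question j has a log inverse difficulty beta[j], and the probability that labeler i answers question j correctly is sigmoid(alpha[i] * exp(beta[j])). Each incoming label triggers an online E-step (a Bayesian posterior update over the true label) and a stochastic M-step (one gradient step on the parameters). The names alpha, beta, ETA, observe, and next_question are illustrative assumptions.

```python
import math
from collections import defaultdict

ETA = 0.1  # assumed learning rate for the stochastic M-step

alpha = defaultdict(lambda: 1.0)  # labeler abilities
beta = defaultdict(lambda: 0.0)   # question log inverse difficulties
post = defaultdict(lambda: 0.5)   # posterior P(true label = 1) per question

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def observe(labeler, question, label):
    """Process one incoming binary label (0 or 1) online."""
    a, b = alpha[labeler], beta[question]
    p_correct = sigmoid(a * math.exp(b))
    # E-step: Bayes update of the posterior over the true label,
    # treating the current posterior as the prior.
    prior = post[question]
    like1 = p_correct if label == 1 else 1.0 - p_correct  # P(label | z = 1)
    like0 = 1.0 - p_correct if label == 1 else p_correct  # P(label | z = 0)
    post[question] = prior * like1 / (prior * like1 + (1 - prior) * like0)
    # M-step: one stochastic-gradient step on the expected log-likelihood.
    q = post[question]
    p_agree = q if label == 1 else 1.0 - q  # P(this label is correct)
    grad = p_agree - p_correct              # d E[loglik] / d (a * exp(b))
    alpha[labeler] += ETA * grad * math.exp(b)
    beta[question] += ETA * grad * a * math.exp(b)

def next_question(questions):
    """Adaptive assignment sketch: route the next labeler to the question
    whose posterior is most uncertain (closest to 0.5)."""
    return min(questions, key=lambda j: abs(post[j] - 0.5))
```

Under these assumptions, a task distributor would feed each incoming label to observe and choose the next assignment with next_question; one plausible online stopping rule is to retire a question once its posterior leaves an uncertainty band (e.g. above 0.95 or below 0.05), which addresses the question of how many labels are enough without fixing the number in advance.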
