The importance of being earnest in crowdsourcing systems

This paper presents the first systematic investigation of the performance gains that crowdsourcing systems can obtain when the requester has information about the earnestness (reputation) of individual workers. We first formalize the optimal task assignment problem, when estimates of workers' reputation are available, as the maximization of a monotone (submodular) function subject to matroid constraints. Since the optimal problem is NP-hard, we then propose a simple but efficient greedy heuristic for task allocation. We also propose a simple "maximum a-posteriori" decision rule. Finally, we test and compare different solutions, showing that system performance can benefit greatly from information about workers' reputation. Our main findings are: i) even largely inaccurate estimates of workers' reputation can be effectively exploited in the task assignment to greatly improve system performance; ii) the performance of the maximum a-posteriori decision rule degrades quickly as worker reputation estimates become inaccurate; iii) when workers' reputation estimates are significantly inaccurate, the best performance is obtained by combining our proposed task assignment algorithm with the LRA decision rule introduced in the literature.
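To illustrate how reputation estimates can enter a decision rule, the following is a minimal sketch of a maximum a-posteriori decision for a single binary task. It is not taken from the paper: it assumes answers in {+1, -1}, a uniform prior on the true answer, independent workers, and a known error probability per worker (the names `map_decision`, `majority_decision`, `answers` and `error_probs` are hypothetical). Under those assumptions the MAP decision reduces to a weighted vote in which each worker's weight is the log-odds of answering correctly.

```python
import math


def map_decision(answers, error_probs):
    """MAP estimate of a binary answer under the assumptions stated above.

    answers: list of +1 / -1 responses, one per worker.
    error_probs: assumed probability that each worker answers wrongly,
        strictly between 0 and 1 (hypothetical reputation estimates).
    """
    # Each worker votes with weight log((1 - p) / p): reliable workers
    # (small p) count more, and a worker with p = 0.5 counts for nothing.
    score = sum(
        a * math.log((1.0 - p) / p) for a, p in zip(answers, error_probs)
    )
    # Ties are broken in favour of +1 (an arbitrary choice).
    return 1 if score >= 0 else -1


def majority_decision(answers):
    """Plain majority vote, ignoring reputation (ties go to +1)."""
    return 1 if sum(answers) >= 0 else -1


if __name__ == "__main__":
    answers = [1, -1, -1]
    error_probs = [0.05, 0.4, 0.4]  # one reliable worker, two noisy ones
    print(map_decision(answers, error_probs))
    print(majority_decision(answers))
```

In this example the single reliable worker outweighs the two noisy ones, so the weighted rule and plain majority voting disagree. Because the weights depend directly on the assumed error probabilities, the sketch also suggests why such a rule can be sensitive to inaccurate reputation estimates.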
