Tuning Crowdsourced Human Computation

As crowdsourcing is increasingly investigated and applied to real-world problems, performance optimization becomes essential. By analogy with computer systems built on CPUs, treating each worker as an HPU (Human Processing Unit [1]) and studying performance optimization on top of HPUs is a promising perspective for addressing crowdsourcing issues. However, when we characterize HPUs in detail for this purpose, we find significant differences between CPUs and HPUs, which call for completely new optimization algorithms. In this paper, we study the specific optimization problem of completing a crowdsourced job as fast as possible under a fixed total budget. In crowdsourcing, jobs are usually broken down into sets of small tasks, which are assigned to workers one at a time. We consider three scenarios of increasing complexity: Identical Round Homogeneous Tasks, Multiplex Round Homogeneous Tasks, and Multiple Round Heterogeneous Tasks. For each scenario, we analyze the stochastic behavior of the HPU clock rate as a function of the remuneration offered. We then develop an optimal Budget Allocation Strategy that minimizes the latency of job completion. We validate our results through extensive simulations and experiments on Amazon Mechanical Turk.
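To make the flavor of this optimization concrete, the sketch below works through a toy instance. Everything in it is our assumption, not the paper's model: the hypothetical hpu_rate function supposes that a worker's "clock rate" grows logarithmically with the offered price, each posted task is taken by an independent HPU with exponentially distributed completion time, and a job consists of two sequential rounds whose budget split we grid-search to minimize expected latency.

```python
import numpy as np

# Hypothetical model (an assumption for illustration, not the paper's):
# an HPU's "clock rate" -- tasks completed per hour -- grows with the
# payment p, with diminishing returns.
def hpu_rate(p, alpha=2.0):
    """Tasks per hour completed by one worker at price p (illustrative)."""
    return alpha * np.log1p(p)

def expected_round_latency(n_tasks, price):
    """Expected makespan of one round: n_tasks are posted in parallel and
    each is picked up by an independent HPU whose completion time is
    Exponential(hpu_rate(price)); E[max of n i.i.d. Exp(lam)] = H_n / lam."""
    harmonic_n = np.sum(1.0 / np.arange(1, n_tasks + 1))
    return harmonic_n / hpu_rate(price)

def best_two_round_split(budget, n1, n2, grid=200):
    """Grid-search the budget split between two sequential rounds so that
    the sum of the expected round latencies is minimized."""
    best = (np.inf, None)
    for b1 in np.linspace(0.05 * budget, 0.95 * budget, grid):
        p1, p2 = b1 / n1, (budget - b1) / n2
        latency = expected_round_latency(n1, p1) + expected_round_latency(n2, p2)
        if latency < best[0]:
            best = (latency, (p1, p2))
    return best

latency, (p1, p2) = best_two_round_split(budget=50.0, n1=100, n2=20)
print(f"expected latency {latency:.2f} h at prices p1={p1:.2f}, p2={p2:.2f}")
```

Even in this toy setting the tension the abstract describes is visible: paying more per task speeds up each round, but under a fixed budget every dollar spent in one round slows the other, so the optimal split depends on the round sizes and on the shape of the price-to-rate curve.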

[1] Eytan Adar. Why I Hate Mechanical Turk Research (and Workshops), 2011.

[2] Patrick Minder, et al. CrowdManager - Combinatorial Allocation and Pricing of Crowdsourcing Tasks with Time Constraints, 2012, EC 2012.

[3] Aditya G. Parameswaran, et al. Finish Them!: Pricing Algorithms for Human Computation, 2014, Proc. VLDB Endow.

[4] Tova Milo, et al. CrowdPlanr: Planning made easy with crowd, 2013, 2013 IEEE 29th International Conference on Data Engineering (ICDE).

[5] Panagiotis G. Ipeirotis, et al. Estimating the Completion Time of Crowdsourced Tasks Using Survival Analysis Models, 2011.

[6] Vikas Kumar, et al. CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones, 2010, MobiSys '10.

[7] Tim Kraska, et al. CrowdDB: answering queries with crowdsourcing, 2011, SIGMOD '11.

[8] Hector Garcia-Molina, et al. tDP: An Optimal-Latency Budget Allocation Strategy for Crowdsourced MAXIMUM Operations, 2015, SIGMOD Conference.

[9] Panagiotis G. Ipeirotis, et al. Quality management on Amazon Mechanical Turk, 2010, HCOMP '10.

[10] Sanjeev Khanna, et al. Using the crowd for top-k and group-by queries, 2013, ICDT '13.

[11] Björn Hartmann, et al. What's the Right Price? Pricing Tasks for Finishing on Time, 2011, Human Computation.

[12] Bill Tomlinson, et al. Who are the crowdworkers?: shifting demographics in mechanical turk, 2010, CHI Extended Abstracts.

[13] Guoliang Li, et al. Crowdsourced Data Management: A Survey, 2016, IEEE Transactions on Knowledge and Data Engineering.

[14] Ohad Greenshpan, et al. Asking the Right Questions in Crowd Data Sourcing, 2012, 2012 IEEE 28th International Conference on Data Engineering.

[15] Lei Chen, et al. Reducing Uncertainty of Schema Matching via Crowdsourcing, 2013, Proc. VLDB Endow.

[16] Michael S. Bernstein, et al. Analytic Methods for Optimizing Realtime Crowdsourcing, 2012, ArXiv.

[17] Xi Fang, et al. Crowdsourcing to smartphones: incentive mechanism design for mobile phone sensing, 2012, Mobicom '12.

[18] Rob Miller, et al. VizWiz: nearly real-time answers to visual questions, 2010, UIST.

[19] François Bry, et al. Human computation, 2018, it Inf. Technol.

[20] Alex Pentland, et al. Time-Critical Social Mobilization, 2010, Science.

[21] Michael S. Bernstein, et al. Crowds in two seconds: enabling realtime crowd-powered interfaces, 2011, UIST.

[22] Tim Kraska, et al. Leveraging transitive relations for crowdsourced joins, 2013, SIGMOD '13.

[23] Duncan J. Watts, et al. Financial incentives and the "performance of crowds", 2009, HCOMP '09.

[24] David Alan Grier, et al. The Math Tables Project of the Work Projects Administration: The Reluctant Start of the Computing Era, 1998, IEEE Ann. Hist. Comput.

[25] Reynold Cheng, et al. On incentive-based tagging, 2013, 2013 IEEE 29th International Conference on Data Engineering (ICDE).

[26] Aditya G. Parameswaran, et al. So who won?: dynamic max discovery with the crowd, 2012, SIGMOD Conference.

[27] Neoklis Polyzotis, et al. Max algorithms in crowdsourcing environments, 2012, WWW.

[28] Ming-Hsuan Yang, et al. The HPU, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[29] David R. Karger, et al. Human-powered Sorts and Joins, 2011, Proc. VLDB Endow.

[30] Aditya G. Parameswaran, et al. Crowd-powered find algorithms, 2014, 2014 IEEE 30th International Conference on Data Engineering.

[31] Beng Chin Ooi, et al. CDAS: A Crowdsourcing Data Analytics System, 2012, Proc. VLDB Endow.

[32] Lei Chen, et al. Whom to Ask? Jury Selection for Decision Making Tasks on Micro-blog Services, 2012, Proc. VLDB Endow.

[33] Pierre Senellart, et al. Crowd mining, 2013, SIGMOD '13.

[34] Jennifer Widom, et al. CrowdScreen: algorithms for filtering data with humans, 2012, SIGMOD Conference.

[35] Aditya G. Parameswaran, et al. Answering Queries using Humans, Algorithms and Databases, 2011, CIDR.