Incentivizing High Quality Crowdwork

We study the causal effects of financial incentives on the quality of crowdwork. We focus on performance-based payments (PBPs), bonus payments awarded to workers for producing high quality work. We design and run randomized behavioral experiments on the popular crowdsourcing platform Amazon Mechanical Turk with the goal of understanding when, where, and why PBPs help, and of identifying the properties of the payment, the payment structure, and the task itself that make them most effective. We provide examples of tasks for which PBPs do improve quality. For such tasks, the effectiveness of PBPs is not too sensitive to the quality threshold required to receive the bonus, but the magnitude of the bonus must be large enough to make the reward salient. We also present examples of tasks for which PBPs do not improve quality. Our results suggest that for PBPs to improve quality, the task must be effort-responsive: workers must be able to produce higher quality work by exerting more effort. We also give a simple method for determining, a priori, whether a task is effort-responsive. Furthermore, our experiments suggest that all payments on Mechanical Turk are, to some degree, implicitly performance-based, in that workers believe their work may be rejected if their performance is sufficiently poor. In the full version of this paper, we propose a new model of worker behavior that extends the standard principal-agent model from economics to include a worker's subjective beliefs about his likelihood of being paid, and we show that the predictions of this model are in line with our experimental findings. This model may be useful as a foundation for theoretical studies of incentives in crowdsourcing markets.
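To make the modeling claim concrete, the following is a minimal illustrative sketch, not the model from the full paper, of how a worker's subjective beliefs about payment might enter a principal-agent-style objective; all symbols ($e$, $p$, $b$, $\tau$, $\hat{\Pr}$, $c$) are hypothetical notation introduced here for illustration only.

\[
U(e) \;=\; p \,\hat{\Pr}\!\left[\text{base payment accepted} \mid e\right]
  \;+\; b \,\hat{\Pr}\!\left[\text{quality} \ge \tau \mid e\right]
  \;-\; c(e),
\]

where $e$ is the worker's effort, $p$ the base payment, $b$ the PBP bonus, $\tau$ the quality threshold for receiving the bonus, $\hat{\Pr}[\cdot]$ the worker's subjective beliefs about being paid, and $c(e)$ an increasing cost of effort. Under a sketch of this form, a task is effort-responsive when additional effort raises the probability of clearing the threshold, so the bonus term gives the worker a reason to choose a higher $e$; when quality does not respond to effort, the bonus term is flat in $e$ and the PBP changes nothing.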
