How One Microtask Affects Another

Microtask platforms are becoming commonplace tools for conducting research with human subjects, producing gold-standard data, and annotating large datasets. These platforms connect requesters (researchers or companies) with large populations (crowds) of workers, who perform small tasks, typically taking less than five minutes each. A topic of ongoing research concerns the design of tasks that elicit high-quality annotations. Here we identify a seemingly banal feature of nearly all crowdsourcing workflows that profoundly impacts workers' responses. Microtask assignments typically consist of a sequence of tasks sharing a common format (e.g., circle galaxies in an image). Using image labeling, a canonical microtask format, we show that earlier tasks can have a strong influence on responses to later tasks, shifting the distribution of subsequent responses by 30-50% (total variation distance). Specifically, prior tasks influence the content that workers focus on, as well as the richness and specialization of their responses. We call this phenomenon intertask effects. We compare intertask effects to framing, induced by stating the requester's research interest, and find that intertask effects are on par with or stronger than framing. If uncontrolled, intertask effects could be a source of systematic bias, but our results suggest that, with appropriate task design, they might be leveraged to hone worker focus and acuity, helping to elicit reproducible, expert-level judgments. Intertask effects are a crucial aspect of human computation that should be considered in the design of any crowdsourced study.
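
For concreteness, the distributional shift reported above is measured in total variation distance: half the L1 distance between two discrete response distributions, ranging from 0 (identical) to 1 (disjoint). The sketch below is an illustrative computation only, not code from the paper, and the label sets are hypothetical; it shows how such a shift could be estimated from the labels two worker groups assign to the same image after different prior tasks.

```python
from collections import Counter

def total_variation_distance(labels_a, labels_b):
    """Total variation distance between two empirical label distributions.

    TVD = 0.5 * sum_x |P(x) - Q(x)|, where P and Q are the empirical
    frequencies of each label in the two groups.
    """
    p, q = Counter(labels_a), Counter(labels_b)
    n_a, n_b = len(labels_a), len(labels_b)
    support = set(p) | set(q)  # union of labels seen in either group
    return 0.5 * sum(abs(p[x] / n_a - q[x] / n_b) for x in support)

# Hypothetical example: labels for the same image from two groups of
# workers whose earlier tasks differed.
group_1 = ["dog", "dog", "animal", "pet", "dog", "animal"]
group_2 = ["grass", "dog", "outdoors", "animal", "grass", "dog"]
print(f"TVD = {total_variation_distance(group_1, group_2):.2f}")  # TVD = 0.50
```

In practice the empirical estimate is computed per image and averaged across images; a TVD of 0.3-0.5 means that 30-50% of the probability mass of the response distribution moved as a result of the earlier tasks.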
