The Practice of Crowdsourcing

Abstract: Many data-intensive applications that use machine learning or artificial intelligence techniques depend on humans providing the initial dataset, enabling algorithms to process the rest or ...
