Speaking Outside the Box: Exploring the Benefits of Unconstrained Input in Crowdsourcing and Citizen Science Platforms

Crowdsourcing approaches pose a difficult design challenge for developers. There is a trade-off between the efficiency of the task to be completed and the reward given to the user for participating, whether it be altruism, social enhancement, entertainment, or money. This paper explores how crowdsourcing and citizen science systems collect data and complete tasks, illustrated by a case study of the online language game-with-a-purpose Phrase Detectives. The game was originally developed with a constrained interface to prevent player collusion, but it subsequently benefited from post-hoc analysis of over 76,000 unconstrained inputs from users. Understanding interface design and task deconstruction is critical to enabling users to participate in such systems. The paper concludes by discussing the idea that social networks can be viewed as a form of citizen science platform, with both constrained and unconstrained inputs making for a highly complex dataset.
