Reaching Consensus in Crowdsourced Transcription of Biocollections Information

Crowdsourcing can be a cost-effective way to tackle the problem of digitizing historical biocollections data, and a number of crowdsourcing platforms have been developed to facilitate interaction with the public and to design simple "Human Intelligence Tasks". However, reaching consensus on the crowd's responses remains challenging for tasks where a simple majority vote is inadequate. This paper (a) describes the challenges faced when trying to reach consensus on data transcribed by different workers, (b) presents consensus algorithms for textual data and a consensus-based controller that assigns a dynamic number of workers per task, and (c) proposes enhancements to future crowdsourcing tasks that minimize the need for complex consensus algorithms. Experiments using the proposed algorithms show up to a 45-fold increase in the ability to reach consensus compared to majority voting with exact string matching. In addition, the controller decreases the crowdsourcing cost by 55% compared to a strategy that uses a fixed number of workers.
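
Since the abstract only summarizes the approach, the following is a minimal illustrative sketch, not the paper's actual algorithm, of how text consensus beyond exact-match majority voting and a dynamic worker controller might work: transcriptions are compared with a normalized edit distance, a consensus is accepted once a quorum of approximately matching responses exists, and additional workers are requested only while no consensus has been reached. The similarity threshold, quorum, and worker budget below are hypothetical parameters.

```python
# Illustrative sketch only: approximate-match consensus on crowdsourced
# transcriptions, plus a controller that adds workers until consensus is
# reached or a budget is exhausted. Thresholds and budgets are hypothetical.

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))

def find_consensus(responses, sim_threshold=0.9, quorum=0.5):
    """Return a representative transcription if more than `quorum` of the
    responses approximately agree with some candidate, else None."""
    for candidate in responses:
        agreeing = [r for r in responses
                    if similarity(candidate, r) >= sim_threshold]
        if len(agreeing) / len(responses) > quorum:
            # Pick the most frequent exact string within the agreeing cluster.
            return max(set(agreeing), key=agreeing.count)
    return None

def controller(request_worker, min_workers=3, max_workers=7):
    """Collect an initial batch of transcriptions, then request one worker
    at a time until consensus is reached or the worker budget runs out."""
    responses = [request_worker() for _ in range(min_workers)]
    result = find_consensus(responses)
    while result is None and len(responses) < max_workers:
        responses.append(request_worker())
        result = find_consensus(responses)
    return result, responses
```

In this sketch, the controller trades transcription cost against confidence: it starts with a small worker pool and pays for additional responses only while approximate agreement has not yet been reached, which is the intuition behind the reported 55% cost reduction relative to a fixed number of workers per task.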
