Learning from Measurements in Crowdsourcing Models: Inferring Ground Truth from Diverse Annotation Types

Annotated corpora enable supervised machine learning and data analysis. To reduce the cost of manual annotation, tasks are often assigned to internet workers whose judgments are reconciled by crowdsourcing models. We approach the problem of crowdsourcing using a framework for learning from rich prior knowledge, and we identify a family of crowdsourcing models with the novel ability to combine annotations with differing structures: e.g., document labels and word labels. Annotator judgments are expressed as the expected values of measurement functions computed over annotations and the data, unifying annotation models under a common representation. Our model, a specific instance of this framework, compares favorably with previous work. Furthermore, it enables active sample selection that jointly chooses the annotator, the data item, and the annotation structure to reduce annotation effort.

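To make the measurement-function idea concrete, the following is a minimal sketch, not the paper's implementation: it assumes an illustrative two-label task and hypothetical function names (doc_label_measurement, word_label_measurement), and shows how a document-level annotation and a word-level annotation can both be expressed as expected values of measurement functions computed under the model's current posterior, so that structurally different judgments live in the same space.

```python
# Minimal illustrative sketch (assumptions, not the authors' code): two
# annotation structures -- document labels and word labels -- encoded as
# expected values of measurement functions over model posteriors.
import numpy as np

LABELS = ["sports", "politics"]  # illustrative label set

def doc_label_measurement(doc_posterior, label):
    """Expected number of documents carrying `label`.

    doc_posterior: array of shape (n_docs, n_labels), each row a
    per-document label distribution under the current model.
    """
    return doc_posterior[:, LABELS.index(label)].sum()

def word_label_measurement(token_posterior, label):
    """Expected number of word tokens carrying `label`.

    token_posterior: array of shape (n_tokens, n_labels), each row a
    per-token label distribution under the current model.
    """
    return token_posterior[:, LABELS.index(label)].sum()

# An annotator's judgment is treated as a noisy observation of such an
# expectation; both annotation types are measurements of the same model.
doc_posterior = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
token_posterior = np.array([[0.7, 0.3], [0.1, 0.9]])

print(doc_label_measurement(doc_posterior, "sports"))       # 1.7 expected documents
print(word_label_measurement(token_posterior, "politics"))  # 1.2 expected tokens
```

In this reading, aggregating crowd judgments amounts to fitting the model so that its expected measurements agree, up to annotator noise, with the reported values, regardless of whether those values came from document-level or word-level annotations.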