Learning from Crowds under Experts' Supervision

Crowdsourcing services have proven efficient for collecting large amounts of labeled data for supervised learning, but the low cost of crowd workers often leads to unreliable labels. Although various methods have been proposed to infer the ground truth or to learn directly from crowd data, there is no guarantee that these methods work well when crowd labels are highly biased or noisy. Motivated by this limitation of crowd data, we propose to improve the performance of crowdsourcing learning tasks with some additional expert labels, treating each labeler as a personal classifier and combining all labelers' opinions from a model combination perspective. Experiments show that our method significantly improves learning quality compared with methods that use crowd labels alone.
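The abstract leaves the combination scheme unspecified, so the following is a minimal sketch of the stated idea ("each labeler as a personal classifier, combined from a model combination perspective"), not the authors' algorithm: it trains one scikit-learn logistic regression per crowd labeler and weights each by its accuracy on the small expert-labeled set. The helper combine_labelers and every modeling choice in it are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

def combine_labelers(X_crowd, crowd_labels, X_expert, y_expert, X_test):
    """Hypothetical combination of per-labeler classifiers.

    crowd_labels: list of (indices, labels) pairs, one per labeler, giving
    which rows of X_crowd that labeler annotated and the labels assigned.
    Expert labels (X_expert, y_expert) are used only to weight each
    personal classifier by its estimated reliability.
    Assumes every labeler annotated examples of every class, so that all
    predict_proba outputs share the same column layout.
    """
    classifiers, weights = [], []
    for idx, labels in crowd_labels:
        # One personal classifier per crowd labeler, fit on that
        # labeler's own (possibly noisy) annotations.
        clf = LogisticRegression().fit(X_crowd[idx], labels)
        # Reliability estimate: accuracy on the expert-labeled set.
        weights.append(clf.score(X_expert, y_expert))
        classifiers.append(clf)
    weights = np.asarray(weights)
    weights = weights / weights.sum()
    # Model combination: reliability-weighted average of class probabilities.
    probs = sum(w * clf.predict_proba(X_test)
                for w, clf in zip(weights, classifiers))
    return probs.argmax(axis=1)

Accuracy weighting on the expert set is only one plausible combination rule; stacking or Bayesian model averaging would fit the same framework.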
