Variational Inference for Crowdsourcing

Crowdsourcing has become a popular paradigm for labeling large datasets. However, it has given rise to the computational task of aggregating the labels provided by a collection of unreliable annotators. We approach this problem by transforming it into a standard inference problem in graphical models and applying approximate variational methods, including belief propagation (BP) and mean field (MF). We show that our BP algorithm generalizes both majority voting and a recent algorithm by Karger et al. [22], while our MF method is closely related to a commonly used EM algorithm. In both cases, we find that the performance of the algorithms depends critically on the choice of prior distribution on the workers' reliability; with a properly chosen prior, BP, MF, and EM all perform surprisingly well on both simulated and real-world datasets, and are competitive with state-of-the-art algorithms based on more complicated modeling assumptions.
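To make the aggregation task concrete, the following is a minimal sketch rather than the authors' BP or MF implementation. It assumes binary labels in {+1, -1}, a "one-coin" worker model in which worker j answers correctly with probability q_j, and a Beta(alpha, beta) prior on q_j. It contrasts majority voting with a MAP-style EM loop. The function names `majority_vote` and `em_one_coin`, the default prior, and the clipping constants are illustrative assumptions and do not come from the abstract.

```python
import math


def majority_vote(labels):
    """labels: dict mapping (task, worker) -> +1 or -1."""
    totals = {}
    for (task, _worker), y in labels.items():
        totals[task] = totals.get(task, 0) + y
    # Ties are broken toward +1; an arbitrary choice for this sketch.
    return {task: (1 if s >= 0 else -1) for task, s in totals.items()}


def em_one_coin(labels, alpha=2.0, beta=1.0, iters=50):
    """EM under an assumed one-coin model with a Beta(alpha, beta) prior.

    Returns (estimated labels, estimated worker reliabilities).
    Assumes alpha >= 1 and beta >= 1 so the MAP update is well defined.
    """
    tasks = sorted({t for t, _ in labels})
    workers = sorted({w for _, w in labels})

    # Initialise P(z_t = +1) with the fraction of +1 votes on each task.
    pos = {t: 0 for t in tasks}
    cnt = {t: 0 for t in tasks}
    for (t, _w), y in labels.items():
        cnt[t] += 1
        pos[t] += 1 if y == 1 else 0
    mu = {t: pos[t] / cnt[t] for t in tasks}

    q = {}
    for _ in range(iters):
        # M-step: MAP estimate of each worker's reliability.
        agree = {w: 0.0 for w in workers}
        n = {w: 0 for w in workers}
        for (t, w), y in labels.items():
            agree[w] += mu[t] if y == 1 else 1.0 - mu[t]
            n[w] += 1
        for w in workers:
            est = (agree[w] + alpha - 1.0) / (n[w] + alpha + beta - 2.0)
            q[w] = min(max(est, 1e-6), 1.0 - 1e-6)  # keep the log finite

        # E-step: posterior of each true label given the reliabilities.
        log_odds = {t: 0.0 for t in tasks}
        for (t, w), y in labels.items():
            log_odds[t] += y * math.log(q[w] / (1.0 - q[w]))
        mu = {}
        for t in tasks:
            lo = min(max(log_odds[t], -30.0), 30.0)  # avoid overflow in exp
            mu[t] = 1.0 / (1.0 + math.exp(-lo))

    estimates = {t: (1 if mu[t] >= 0.5 else -1) for t in tasks}
    return estimates, q
```

In this sketch the prior enters only through the M-step, which is where a choice such as Beta(2, 1) encodes the belief that workers are more likely to be reliable than adversarial.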

[1] A. Zellner. An Introduction to Bayesian Inference in Econometrics, 1971.

[2] Miss A.O. Penney (b), 1974, The New Yale Book of Quotations.

[3] A. P. Dawid, et al. Maximum Likelihood Estimation of Observer Error‐Rates Using the EM Algorithm, 1979.

[4] Béla Bollobás, et al. Random Graphs, 1985.

[5] Pietro Perona, et al. Inferring Ground Truth from Subjective Labelling of Venus Images, 1994, NIPS.

[6] L. Wasserman, et al. The Selection of Prior Distributions by Formal Rules, 1996.

[7] Geoffrey E. Hinton, et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants, 1998, Learning in Graphical Models.

[8] Michael I. Jordan. Graphical Models, 2003.

[9] William T. Freeman, et al. On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs, 2001, IEEE Trans. Inf. Theory.

[10] X. Jin. Factor graphs and the Sum-Product Algorithm, 2002.

[11] Brendan T. O'Connor, et al. Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks, 2008, EMNLP.

[12] Inc. Alias-i. Multilevel Bayesian Models of Categorical Data Annotation, 2008.

[13] Michael I. Jordan, et al. Graphical Models, Exponential Families, and Variational Inference, 2008, Found. Trends Mach. Learn.

[14] K. Mengersen, et al. A Comparison of Bayes–Laplace, Jeffreys, and Other Priors, 2008.

[15] Javier R. Movellan, et al. Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise, 2009, NIPS.

[16] Nir Friedman, et al. Probabilistic Graphical Models - Principles and Techniques, 2009.

[17] Richard S. Zemel, et al. HOP-MAP: Efficient Message Passing with High Order Potentials, 2010, AISTATS.

[18] Pietro Perona, et al. The Multidimensional Wisdom of Crowds, 2010, NIPS.

[19] Gerardo Hermosillo, et al. Learning From Crowds, 2010, J. Mach. Learn. Res.

[20] A. Asuncion. Approximate Mean Field for Dirichlet-Based Models, 2010.

[21] Michael I. Jordan, et al. Bayesian Bias Mitigation for Crowdsourcing, 2011, NIPS.

[22] Devavrat Shah, et al. Iterative Learning for Reliable Crowdsourcing Systems, 2011, NIPS.

[23] Shipeng Yu, et al. Eliminating Spammers and Ranking Annotators for Crowdsourced Labeling Tasks, 2012, J. Mach. Learn. Res.