Efficient Learning for Crowdsourced Regression

Crowdsourcing platforms have emerged as popular venues for purchasing human intelligence at low cost for large volumes of tasks. As many low-paid workers are prone to give noisy answers, one of the fundamental questions is how to identify the more reliable workers and exploit this heterogeneity to infer the true answers. Despite significant research efforts on classification tasks with discrete answers, little attention has been paid to regression tasks, where the answers take continuous values. We consider the task of recovering the positions of target objects and introduce a new probabilistic model that captures the heterogeneity of the workers. We propose a belief propagation (BP) algorithm for inferring the positions and prove that it achieves the optimal mean squared error by comparing its performance to that of an oracle estimator. Our experimental results on synthetic datasets confirm our theoretical predictions. We further emulate a crowdsourcing system using the PASCAL Visual Object Classes datasets and show that de-noising the crowdsourced data using BP can significantly improve performance on the downstream vision task.
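To make the role of worker heterogeneity and of an oracle benchmark concrete, the following minimal sketch simulates a toy crowdsourced regression problem. It is an illustration under stated assumptions, not the model or the BP algorithm of the paper: it assumes one-dimensional positions, Gaussian answer noise, two worker types with hypothetical noise levels `good_std` and `bad_std`, and every worker answering every task. The "oracle" here is simply an estimator that is told each worker's noise level and applies inverse-variance weighting; all function and parameter names are hypothetical.

```python
import random


def simulate(num_tasks=500, num_workers=20, good_std=0.1, bad_std=1.0, seed=0):
    """Toy data: each worker reports the true position plus Gaussian noise.

    Assumption (not from the paper): the first half of the workers are
    reliable (small noise) and the second half are unreliable (large noise).
    """
    rng = random.Random(seed)
    worker_std = [good_std if w < num_workers // 2 else bad_std
                  for w in range(num_workers)]
    truth = [rng.uniform(0.0, 1.0) for _ in range(num_tasks)]
    answers = [[t + rng.gauss(0.0, s) for s in worker_std] for t in truth]
    return truth, answers, worker_std


def average_estimate(answers):
    """Baseline: plain average of all answers, ignoring worker reliability."""
    return [sum(row) / len(row) for row in answers]


def oracle_estimate(answers, worker_std):
    """Oracle-style benchmark: inverse-variance weighting with known noise levels."""
    weights = [1.0 / s ** 2 for s in worker_std]
    total = sum(weights)
    return [sum(w * a for w, a in zip(weights, row)) / total for row in answers]


def mse(estimate, truth):
    """Mean squared error between estimated and true positions."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)


truth, answers, worker_std = simulate()
mse_avg = mse(average_estimate(answers), truth)
mse_oracle = mse(oracle_estimate(answers, worker_std), truth)
print(f"plain average MSE: {mse_avg:.5f}")
print(f"oracle-weighted MSE: {mse_oracle:.5f}")
```

In this toy setting the plain average is dominated by the noisy workers, while the oracle-weighted estimate relies almost entirely on the reliable ones. An inference algorithm that does not know the worker types in advance can be judged by how closely its error approaches such a benchmark.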
