Multi-Label Answer Aggregation for Crowdsourcing

Crowdsourcing has been widely established as a means to enable human computation at large scale, in particular for tasks that require manual labelling of large sets of data items. Answers obtained from heterogeneous crowd workers are aggregated to obtain a robust result. However, existing methods for answer aggregation assume that answers are given as a single label per item. Hence, these methods are ineffective for common multi-labelling problems such as image tagging and document annotation, where items are assigned sets of labels. In this paper, we propose a novel Bayesian nonparametric model for multi-label answer aggregation. It enables us to predict labels for non-grounded items, while taking into account dependencies between the labels in different answer sets. We also show how this model is instantiated for incremental learning, incorporating new answers from crowd workers as they arrive. An evaluation of our method on several large-scale, real-world crowdsourcing datasets shows that it consistently outperforms state-of-the-art answer aggregation methods in terms of precision, recall, and robustness against faulty workers and data sparsity.
