Truth Inference in Crowdsourcing: Is the Problem Solved?

Crowdsourcing has emerged as a novel problem-solving paradigm that makes it feasible to address problems that are hard for computers, e.g., entity resolution and sentiment analysis. However, because crowdsourcing platforms are open, workers may provide low-quality answers. To cope with this, a redundancy-based method is widely employed: each task is first assigned to multiple workers, and the correct answer (called the truth) for the task is then inferred from the answers of the assigned workers. A fundamental problem in this method is Truth Inference, which decides how to infer the truth effectively. Recently, the database and data mining communities have independently studied this problem and proposed various algorithms. However, these algorithms have not been compared extensively under the same framework, and it is hard for practitioners to select appropriate ones. To alleviate this problem, we provide a detailed survey of 17 existing algorithms and perform a comprehensive evaluation on 5 real datasets. We make all code and datasets public for future research. Through the experiments we find that existing algorithms are not stable across different datasets and that no algorithm consistently outperforms the others. We believe that the truth inference problem is not fully solved; we identify the limitations of existing algorithms and point out promising research directions.
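To make the redundancy-based setting concrete, the sketch below shows the general shape of truth inference on a toy, hypothetical answer matrix: a majority-voting baseline, followed by a simplified iterative scheme that alternates between estimating worker accuracy and re-inferring truths with accuracy-weighted votes. It is an illustrative sketch of the idea only, not the implementation of any specific algorithm surveyed in the paper.

```python
# Minimal sketch of redundancy-based truth inference on toy binary tasks.
# The data and the weighting scheme are illustrative assumptions.
from collections import Counter, defaultdict

# answers[task_id] = {worker_id: label}
answers = {
    "t1": {"w1": "yes", "w2": "yes", "w3": "no"},
    "t2": {"w1": "no",  "w2": "no",  "w3": "no"},
    "t3": {"w1": "yes", "w2": "no",  "w3": "no"},
}

def majority_vote(answers):
    """Baseline: the truth of each task is its most frequent answer."""
    return {t: Counter(labels.values()).most_common(1)[0][0]
            for t, labels in answers.items()}

def weighted_inference(answers, iterations=10):
    """Alternately estimate worker accuracy from the current truths and
    re-infer each task's truth with accuracy-weighted voting."""
    truths = majority_vote(answers)            # initialize with majority voting
    workers = {w for labels in answers.values() for w in labels}
    accuracy = {w: 0.8 for w in workers}       # assumed prior worker quality
    for _ in range(iterations):
        # Re-infer truths: weighted vote per task.
        for t, labels in answers.items():
            scores = defaultdict(float)
            for w, label in labels.items():
                scores[label] += accuracy[w]
            truths[t] = max(scores, key=scores.get)
        # Re-estimate each worker's accuracy against the current truths.
        for w in workers:
            judged = [(t, l) for t, labels in answers.items()
                      for ww, l in labels.items() if ww == w]
            correct = sum(1 for t, l in judged if l == truths[t])
            accuracy[w] = (correct + 1) / (len(judged) + 2)  # Laplace smoothing
    return truths, accuracy

print(majority_vote(answers))
print(weighted_inference(answers))
```

The surveyed algorithms differ mainly in how they model worker quality (e.g., a single accuracy, a confusion matrix, or topic-specific expertise) and in the inference machinery used (e.g., EM or other probabilistic inference) rather than in this overall assign-then-infer structure.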
