OpinionRank: Extracting Ground Truth Labels from Unreliable Expert Opinions with Graph-Based Spectral Ranking

As larger and more comprehensive datasets become standard in contemporary machine learning, it becomes increasingly difficult to obtain reliable, trustworthy label information with which to train sophisticated models. To address this problem, crowdsourcing has emerged as a popular, inexpensive, and efficient data mining solution for distributed label collection. However, crowdsourced annotations are inherently untrustworthy, as the labels are provided by anonymous volunteers whose expertise may be varied and unreliable. Worse yet, some participants on commonly used platforms such as Amazon Mechanical Turk may be adversarial and provide intentionally incorrect label information without the end user's knowledge. We discuss three conventional models of the label generation process, describing their parameterizations and the model-based approaches used to solve them. We then propose OpinionRank, a model-free, interpretable, graph-based spectral algorithm for integrating crowdsourced annotations into reliable labels for supervised or semi-supervised learning. Our experiments show that OpinionRank performs favorably when compared against more highly parameterized algorithms. We also show that OpinionRank scales to very large datasets and large numbers of label sources, and requires considerably fewer computational resources than previous approaches.
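The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea behind graph-based spectral ranking for label aggregation, not the published OpinionRank procedure. It assumes every annotator labels every item, builds a hypothetical annotator agreement graph, scores annotators by the principal eigenvector of that graph (computed by power iteration), and then takes a score-weighted vote. The function names `agreement_graph`, `spectral_scores`, and `weighted_vote`, and the toy data, are invented for this example.

```python
import numpy as np


def agreement_graph(labels):
    """Pairwise agreement rates between annotators.

    labels: (n_items, n_annotators) integer array of class labels.
    Returns an (n_annotators, n_annotators) nonnegative matrix whose
    entry (i, j) is the fraction of items on which i and j agree.
    The diagonal is zeroed so that no annotator endorses themselves.
    """
    agree = (labels[:, :, None] == labels[:, None, :]).mean(axis=0)
    np.fill_diagonal(agree, 0.0)
    return agree


def spectral_scores(graph, n_iter=200, tol=1e-10):
    """Principal eigenvector of a nonnegative matrix via power iteration.

    The result is normalized to sum to 1 (assumption: the graph is
    connected enough for the iteration to converge).
    """
    v = np.full(graph.shape[0], 1.0 / graph.shape[0])
    for _ in range(n_iter):
        w = graph @ v
        total = w.sum()
        if total == 0.0:  # degenerate graph: no agreement at all
            return v
        w = w / total
        converged = np.abs(w - v).sum() < tol
        v = w
        if converged:
            break
    return v


def weighted_vote(labels, weights, n_classes):
    """Aggregate labels by a vote in which annotator j counts weights[j]."""
    n_items, n_annotators = labels.shape
    votes = np.zeros((n_items, n_classes))
    rows = np.arange(n_items)
    for j in range(n_annotators):
        votes[rows, labels[:, j]] += weights[j]
    return votes.argmax(axis=1)


# Toy data (hypothetical): two reliable annotators, one who makes a
# single mistake, and one adversary who flips every binary label.
truth = np.array([0, 1, 0, 1, 1, 0])
labels = np.stack(
    [truth, truth, np.array([0, 1, 0, 1, 1, 1]), 1 - truth], axis=1
)

scores = spectral_scores(agreement_graph(labels))
aggregated = weighted_vote(labels, scores, n_classes=2)
```

In this toy setup the adversarial annotator agrees with almost no one and so receives the smallest score, which limits their influence on the weighted vote. A sketch like this only down-weights adversaries; it makes no attempt to detect and invert their labels, and it would need a masked agreement computation if annotators labeled different subsets of items.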
