Judgment analysis of crowdsourced opinions using biclustering

The problem of deriving a final judgment from crowdsourced opinions is addressed with an unsupervised approach. Biclustering is shown to be useful for identifying the annotators crucial for a judgment. We establish that a suitable fraction of the entire dataset is sufficient for appropriate judgment analysis. Because the proposed method does not operate over the entire dataset, it is also useful for big data analysis.

Annotation by crowd workers serving online has gained attention in recent years across diverse fields due to its distributed power of problem solving. Distributing a labeling task among a large set of workers (experts or non-experts) and obtaining a final consensus is a popular way of performing large-scale annotation in limited time. Collecting multiple annotations can be effective for annotating large-scale datasets in applications such as natural language processing and image processing. However, since crowd workers are not necessarily experts, their opinions may not be accurate enough, which makes it difficult to derive the final aggregated judgment. Moreover, majority voting (MV) is not well suited to such problems because the number of annotators is limited and each question offers multiple options to choose from, which can cause excessive conflict among the opinions provided. Additionally, some annotators may annotate (provide spam opinions for) too many questions at random in order to maximize their payment, introducing noise into the final judgment. In this paper, we address the problem of crowd judgment analysis in an unsupervised way and propose a biclustering-based approach to obtain the judgments appropriately. The effectiveness of this approach is demonstrated on four publicly available small-scale Amazon Mechanical Turk datasets, along with a large-scale CrowdFlower dataset. We also compare the algorithm with MV and several other existing algorithms. In most cases the proposed approach is competitively better than the others; most importantly, it does not use the entire dataset for deriving the judgment.
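To make the setting concrete, the sketch below works on a toy annotator-by-question opinion matrix. It contrasts plain majority voting over all annotators with voting over a coherent subset of annotators selected by pairwise agreement. This is only a rough stand-in for the idea of restricting judgment to a relevant block of the data; the matrix, labels, the `coherent_subset` helper, and its `threshold` parameter are all hypothetical and do not reproduce the biclustering algorithm proposed in the paper.

```python
import numpy as np
from collections import Counter

# Toy annotator-by-question opinion matrix (rows: annotators, columns: questions).
# Labels are hypothetical multiple-choice options; this example is illustrative only.
opinions = np.array([
    ["A", "B", "A", "C", "B"],
    ["A", "B", "A", "C", "B"],
    ["C", "A", "B", "A", "C"],   # an annotator answering more or less at random
    ["A", "B", "A", "C", "A"],
    ["B", "C", "A", "B", "B"],
])

def majority_vote(matrix):
    """Plain majority voting per question; ties are broken arbitrarily."""
    return [Counter(matrix[:, j]).most_common(1)[0][0] for j in range(matrix.shape[1])]

def agreement(a, b):
    """Fraction of questions on which two annotators give the same label."""
    return float(np.mean(a == b))

def coherent_subset(matrix, threshold=0.6):
    """Greedy stand-in for a bicluster of reliable annotators: keep those whose
    mean pairwise agreement with the rest exceeds `threshold` (an assumed value)."""
    n = matrix.shape[0]
    mean_agree = [
        np.mean([agreement(matrix[i], matrix[k]) for k in range(n) if k != i])
        for i in range(n)
    ]
    keep = [i for i, s in enumerate(mean_agree) if s >= threshold]
    return keep or list(range(n))  # fall back to all annotators if nothing passes

print("MV on all annotators:      ", majority_vote(opinions))
kept = coherent_subset(opinions)
print("Annotators kept:           ", kept)
print("MV on the coherent subset: ", majority_vote(opinions[kept]))
```

With few annotators and many answer options, the full-matrix vote is easily disturbed by random or spam responses, whereas voting within the agreeing subset uses only part of the data, which is the intuition the abstract appeals to when it notes that the proposed method does not need the entire dataset.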
