Learning from Inconsistent and Unreliable Annotators by a Gaussian Mixture Model and Bayesian Information Criterion

Supervised learning from multiple annotators is an increasingly important problem in machine leaning and data mining. This paper develops a probabilistic approach to this problem when annotators are not only unreliable, but also have varying performance depending on the data. The proposed approach uses a Gaussian mixture model (GMM) and Bayesian information criterion (BIC) to find the fittest model to approximate the distribution of the instances. Then the maximum a posterior (MAP) estimation of the hidden true labels and the maximum-likelihood (ML) estimation of quality of multiple annotators are provided alternately. Experiments on emotional speech classification and CASP9 protein disorder prediction tasks show performance improvement of the proposed approach as compared to the majority voting baseline and a previous data-independent approach. Moreover, the approach also provides more accurate estimates of individual annotators performance for each Gaussian component, thus paving the way for understanding the behaviors of each annotator.

[1]  Koby Crammer,et al.  Learning from Multiple Sources , 2006, NIPS.

[2]  Shrikanth S. Narayanan,et al.  An articulatory study of emotional speech production , 2005, INTERSPEECH.

[3]  Jaime G. Carbonell,et al.  Proactive learning: cost-sensitive active learning with multiple imperfect oracles , 2008, CIKM '08.

[4]  Avrim Blum,et al.  Veritas: Combining Expert Opinions without Labeled Data , 2008, 2008 20th IEEE International Conference on Tools with Artificial Intelligence.

[5]  Brendan T. O'Connor,et al.  Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks , 2008, EMNLP.

[6]  Panagiotis G. Ipeirotis,et al.  Get another label? improving data quality and data mining using multiple, noisy labelers , 2008, KDD.

[7]  Zoran Obradovic,et al.  Unsupervised integration of multiple protein disorder predictors , 2010, 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM).

[8]  Pietro Perona,et al.  Inferring Ground Truth from Subjective Labelling of Venus Images , 1994, NIPS.

[9]  Angel R. Martinez,et al.  : Exploratory data analysis with MATLAB ® , 2007 .

[10]  Rong Jin,et al.  Learning with Multiple Labels , 2002, NIPS.

[11]  A. Raftery,et al.  Model-based Gaussian and non-Gaussian clustering , 1993 .

[12]  Jaime G. Carbonell,et al.  Efficiently learning the accuracy of labeling sources for selective sampling , 2009, KDD.

[13]  Mark W. Schmidt,et al.  Modeling annotator expertise: Learning when everybody knows a bit of something , 2010, AISTATS.

[14]  Shrikanth S. Narayanan,et al.  Data-dependent evaluator modeling and its application to emotional valence classification from speech , 2010, INTERSPEECH.

[15]  Ohad Shamir,et al.  Vox Populi: Collecting High-Quality Labels from a Crowd , 2009, COLT.

[16]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[17]  A. Dunker,et al.  Understanding protein non-folding. , 2010, Biochimica et biophysica acta.

[18]  Adrian E. Raftery,et al.  How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis , 1998, Comput. J..

[19]  Pietro Perona,et al.  The Multidimensional Wisdom of Crowds , 2010, NIPS.

[20]  Hagit Shatkay,et al.  How to Get the Most out of Your Curation Effort , 2009, PLoS Comput. Biol..

[21]  Jaime Prilusky,et al.  Assessment of disorder predictions in CASP8 , 2009, Proteins.

[22]  Gerardo Hermosillo,et al.  Supervised learning from multiple experts: whom to trust when everyone lies a bit , 2009, ICML '09.

[23]  Javier R. Movellan,et al.  Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise , 2009, NIPS.

[24]  Nasser M. Nasrabadi,et al.  Pattern Recognition and Machine Learning , 2006, Technometrics.

[25]  Christopher M. Bishop,et al.  Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .