Deep Learning From Crowdsourced Labels: Coupled Cross-entropy Minimization, Identifiability, and Regularization

Using noisy crowdsourced labels from multiple annotators, a deep learning-based end-to-end (E2E) system aims to learn the label correction mechanism and the neural classifier simultaneously. To this end, many E2E systems concatenate the neural classifier with multiple annotator-specific "label confusion" layers and co-train the two parts in a parameter-coupled manner. The resulting coupled cross-entropy minimization (CCEM)-type criteria are intuitive and work well in practice. Nonetheless, theoretical understanding of the CCEM criterion has been limited. The contribution of this work is twofold. First, performance guarantees of the CCEM criterion are presented. Our analysis reveals, for the first time, that the CCEM can indeed correctly identify the annotators' confusion characteristics and the desired "ground-truth" neural classifier under realistic conditions, e.g., when only incomplete annotator labeling and finite samples are available. Second, based on the insights from our analysis, two regularized variants of the CCEM are proposed. The regularization terms provably enhance the identifiability of the target model parameters in various more challenging cases. A series of synthetic and real data experiments is presented to showcase the effectiveness of our approach.
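
To make the coupled structure concrete, the following is a minimal sketch of a CCEM-type loss, not the authors' implementation. It assumes each annotator m has a K x K column-stochastic confusion matrix A_m, that the classifier outputs a class-probability vector f(x), and that annotator m's label distribution is modeled as A_m f(x). All function names (`column_softmax`, `noisy_label_probs`, `ccem_loss`) are hypothetical, and missing annotations are encoded as `None` to mimic incomplete labeling. The regularized variants mentioned above are not shown.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def column_softmax(raw):
    """Map a K x K matrix of free parameters to a column-stochastic matrix.

    Assumption: column j holds P(annotator label = i | true class = j).
    """
    K = len(raw)
    cols = [softmax([raw[i][j] for i in range(K)]) for j in range(K)]
    return [[cols[j][i] for j in range(K)] for i in range(K)]

def noisy_label_probs(confusion, class_probs):
    """Compute A_m f(x): the modeled distribution of one annotator's label."""
    return [sum(a * p for a, p in zip(row, class_probs)) for row in confusion]

def ccem_loss(class_probs_per_item, confusions, labels):
    """Average cross-entropy over all observed (item, annotator) pairs.

    class_probs_per_item[n] is f(x_n); confusions[m] is A_m;
    labels[n][m] is annotator m's label for item n, or None if unlabeled.
    """
    total, count = 0.0, 0
    for probs, row in zip(class_probs_per_item, labels):
        for A, y in zip(confusions, row):
            if y is None:  # incomplete labeling: skip unobserved pairs
                continue
            q = noisy_label_probs(A, probs)
            total -= math.log(q[y] + 1e-12)  # small constant avoids log(0)
            count += 1
    return total / max(count, 1)

# Usage: two items, two annotators, two classes; annotator 2 skips item 1.
f_x = [softmax([2.0, 0.0]), softmax([0.0, 1.0])]
A = [column_softmax([[3.0, 0.0], [0.0, 3.0]]),   # mostly reliable annotator
     column_softmax([[0.0, 0.0], [0.0, 0.0]])]   # uninformative annotator
y = [[0, None], [1, 0]]
print(ccem_loss(f_x, A, y))
```

In an actual E2E system both the classifier parameters and the entries of each A_m would be trained jointly by gradient descent on this loss; that joint training is what couples the two parts.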
