Analysing the Noise Model Error for Realistic Noisy Label Data

Distant and weak supervision make it possible to obtain large amounts of labeled training data quickly and cheaply, but these automatic annotations tend to contain many errors. A popular technique for overcoming the negative effects of such noisy labels is noise modelling, in which the underlying noise process is modelled explicitly. In this work, we study the quality of these estimated noise models from the theoretical side by deriving the expected error of the noise model. Apart from evaluating the theoretical results on commonly used synthetic noise, we also publish NoisyNER, a new noisy label dataset from the NLP domain that was obtained through a realistic distant supervision technique. It provides seven sets of labels with differing noise patterns, allowing different noise levels to be evaluated on the same instances. Parallel, clean labels are also available, making it possible to study scenarios in which a small amount of gold-standard data can be leveraged. Our theoretical results and the corresponding experiments give insights into the factors that influence noise model estimation, such as the noise distribution and the sampling technique.
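
To make the setting concrete, the sketch below illustrates one common formulation of such a noise model: a class-conditional noise (confusion) matrix estimated from a small sample of instances that carry both a gold-standard and a noisy label. This is a minimal illustration under that assumption; the function and variable names are ours, not taken from the paper.

```python
import numpy as np

def estimate_noise_matrix(clean_labels, noisy_labels, num_classes):
    """Estimate a class-conditional noise matrix from paired labels.

    Entry [i, j] approximates p(noisy label = j | clean label = i),
    computed as a relative frequency over the gold-labeled sample.
    """
    counts = np.zeros((num_classes, num_classes))
    for c, n in zip(clean_labels, noisy_labels):
        counts[c, n] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Normalise each row to a distribution; classes never observed in the
    # clean sample fall back to a uniform row.
    return np.divide(counts, row_sums,
                     out=np.full_like(counts, 1.0 / num_classes),
                     where=row_sums > 0)

# Toy example: 3 classes and a handful of paired clean/noisy labels.
clean = [0, 0, 1, 1, 2, 2, 2]
noisy = [0, 1, 1, 1, 2, 0, 2]
print(estimate_noise_matrix(clean, noisy, num_classes=3))
```

The expected error studied in the paper concerns how far such a frequency-based estimate deviates from the true noise distribution, for instance when the gold-labeled sample used for estimation is small or sampled in a particular way.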
