Learning to Rectify for Robust Learning with Noisy Labels

Label noise significantly degrades the generalization ability of deep models in practical applications. Effective strategies, e.g., sample re-weighting or loss correction, have been designed to alleviate the negative impact of label noise when training a neural network. However, existing methods usually rely on a pre-specified architecture and manual tuning of additional hyper-parameters. In this paper, we propose warped probabilistic inference (WarPI), which adaptively rectifies the training procedure of the classification network within a meta-learning scenario. In contrast to deterministic models, WarPI is formulated as a hierarchical probabilistic model that learns an amortized meta-network, which can resolve sample ambiguity and is therefore more robust to severe label noise. Unlike existing approximated weighting functions that directly generate weight values from losses, our meta-network learns to estimate a rectifying vector from the logits and labels, allowing it to leverage the richer information they carry. This provides an effective way to rectify the learning procedure of the classification network, yielding a significant improvement in generalization. Moreover, by modeling the rectifying vector as a latent variable, learning the meta-network can be seamlessly integrated into the SGD optimization of the classification network. We evaluate WarPI on four benchmarks of robust learning with noisy labels and achieve new state-of-the-art results under various noise types. Extensive studies and analyses further demonstrate the effectiveness of our model.
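
To make the amortized inference step concrete, the following is a minimal PyTorch sketch of the kind of meta-network the abstract describes: it maps a sample's logits and (possibly noisy) label to the parameters of a diagonal Gaussian over a rectifying vector, draws a sample via the reparameterization trick, and applies it to the logits before the loss. The names (RectifyingMetaNet, rectified_loss), the element-wise multiplicative form of the rectification, and all dimensions are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RectifyingMetaNet(nn.Module):
    """Amortized inference network (hypothetical design): maps the
    concatenation of logits and a one-hot label to the mean and
    log-variance of a diagonal Gaussian over a rectifying vector z."""

    def __init__(self, num_classes: int, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(2 * num_classes, hidden_dim),
            nn.ReLU(),
        )
        self.mu_head = nn.Linear(hidden_dim, num_classes)
        self.logvar_head = nn.Linear(hidden_dim, num_classes)

    def forward(self, logits, labels_onehot):
        h = self.encoder(torch.cat([logits, labels_onehot], dim=-1))
        return self.mu_head(h), self.logvar_head(h)


def rectified_loss(logits, labels, meta_net):
    """Draw one Monte Carlo sample of the rectifying vector with the
    reparameterization trick and apply it to the logits before the
    cross-entropy loss (element-wise rectification is an assumption)."""
    labels_onehot = F.one_hot(labels, num_classes=logits.size(-1)).float()
    mu, logvar = meta_net(logits, labels_onehot)
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
    return F.cross_entropy(logits * z, labels)


# Usage: rectify a mini-batch of 10-class logits with noisy labels.
meta_net = RectifyingMetaNet(num_classes=10)
logits = torch.randn(32, 10)
labels = torch.randint(0, 10, (32,))
loss = rectified_loss(logits, labels, meta_net)
```

In the full meta-learning setup described above, the meta-network's parameters would additionally be updated on a small clean meta set through a bi-level (meta-gradient) loop, while the classification network is trained with the rectified loss on the noisy data.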
