A Second-Order Approach to Learning with Instance-Dependent Label Noise

The presence of label noise often misleads the training of deep neural networks. Whereas the recent literature largely assumes that the label noise rate depends only on the true class, errors in human-annotated labels are more likely to depend on the difficulty of each example, resulting in settings with instance-dependent label noise. We show theoretically that heterogeneous instance-dependent label noise effectively down-weights examples with higher noise rates in a non-uniform way and thus induces imbalances, rendering the strategy of directly applying methods designed for class-dependent label noise questionable. In this paper, we propose and study a second-order approach that leverages the estimation of several covariance terms defined between the instance-dependent noise rates and the Bayes optimal label. We show that this second-order information successfully captures the induced imbalances. We further show that, with the help of the estimated second-order information, we can identify a new loss function under which the expected risk of a classifier trained with instance-dependent label noise is equivalent to that of a new problem with only class-dependent label noise. This equivalence allows us to develop effective loss functions that evaluate models correctly. We provide an efficient procedure to perform the estimation without access to either ground-truth labels or prior knowledge of the noise rates. Experiments on CIFAR-10 and CIFAR-100 with synthetic instance-dependent label noise and on Clothing1M with real-world human label noise verify our approach.
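As a hedged illustration of why second-order (covariance) information arises, and not a reproduction of the paper's derivation, consider binary labels Y in {+1, -1}, a noisy label $\tilde{Y}$, and the classical class-dependent correction of Natarajan et al. (2013) with constant flip rates $\rho_{+1}=\mathbb{P}(\tilde{Y}=-1\mid Y=+1)$ and $\rho_{-1}=\mathbb{P}(\tilde{Y}=+1\mid Y=-1)$:

$$\tilde{\ell}\big(f(x),\tilde{y}\big)\;=\;\frac{\big(1-\rho_{-\tilde{y}}\big)\,\ell\big(f(x),\tilde{y}\big)\;-\;\rho_{\tilde{y}}\,\ell\big(f(x),-\tilde{y}\big)}{1-\rho_{+1}-\rho_{-1}}, \qquad \mathbb{E}_{\tilde{y}\mid y}\big[\tilde{\ell}(f(x),\tilde{y})\big]=\ell\big(f(x),y\big).$$

With instance-dependent noise the flip rates become functions $\rho_{\pm 1}(x)$ of the example, and averaging the corrected loss over the data distribution no longer recovers the clean risk; the residual mismatch involves covariance-type terms such as $\mathrm{Cov}\big(\rho_{\pm 1}(X),\,\ell(f(X),Y^{*})\big)$ between the noise rates and quantities involving the Bayes optimal label $Y^{*}$. This is the kind of second-order information the abstract refers to: once such terms are estimated, the remaining correction behaves as an effectively class-dependent problem.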
