Learning with Instance-Dependent Label Noise: A Sample Sieve Approach

Human-annotated labels are often prone to noise, and the presence of such noise will degrade the performance of the resulting deep neural network (DNN) models. Much of the literature on learning with noisy labels (with several recent exceptions) focuses on the case where the label noise is independent of features. In practice, annotation errors tend to be instance-dependent and often reflect how difficult each instance is to recognize. Applying existing results from the instance-independent setting would require a significant amount of noise-rate estimation, so learning with instance-dependent label noise remains a challenge. In this paper, we propose CORES^2 (COnfidence REgularized Sample Sieve), which progressively sieves out corrupted samples. CORES^2 does not require specifying noise rates, yet we are able to provide theoretical guarantees on its ability to filter out corrupted examples. This high-quality sample sieve allows us to treat clean examples and corrupted ones separately when training a DNN, and such a separation is shown to be advantageous in the instance-dependent noise setting. We demonstrate the performance of CORES^2 on the CIFAR10 and CIFAR100 datasets with synthetic instance-dependent label noise and on Clothing1M with real-world human noise. Of independent interest, our sample sieve provides generic machinery for anatomizing noisy datasets and a flexible interface for various robust training techniques to further improve performance.
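
To make the idea concrete, below is a minimal PyTorch sketch of a confidence-regularized loss and the sample sieve it induces. This is our own illustrative reading of the abstract, not the authors' reference implementation: we assume cross-entropy as the base loss, use the average cross-entropy over all classes as the confidence term, and treat a sample as clean when its loss on the given label does not exceed that per-class average. The names `confidence_regularized_loss`, `sieve_mask`, and the weight `beta` are hypothetical.

```python
import torch
import torch.nn.functional as F

def confidence_regularized_loss(logits, noisy_labels, beta=0.5):
    """Per-sample cross-entropy minus a beta-weighted confidence term.

    Subtracting the average cross-entropy over all classes rewards
    confident (low-entropy) predictions; beta controls the strength.
    A sketch of the idea, not the paper's reference implementation.
    """
    ce = F.cross_entropy(logits, noisy_labels, reduction="none")  # loss on the given (noisy) labels
    avg_ce = -F.log_softmax(logits, dim=1).mean(dim=1)            # average loss over all classes
    return ce - beta * avg_ce

def sieve_mask(logits, noisy_labels):
    """Flag a sample as clean (True) when its loss on the given label
    is no larger than its average loss over all classes."""
    with torch.no_grad():
        ce = F.cross_entropy(logits, noisy_labels, reduction="none")
        avg_ce = -F.log_softmax(logits, dim=1).mean(dim=1)
        return ce <= avg_ce
```

In a training loop, one would compute the mask on each batch and back-propagate the regularized loss only through the samples the sieve keeps, while handling the sieved-out samples separately (e.g., with a semi-supervised consistency objective), in the spirit of the flexible interface the abstract describes.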
