Provably End-to-end Label-Noise Learning without Anchor Points

In label-noise learning, the transition matrix plays a key role in building statistically consistent classifiers. Existing consistent estimators for the transition matrix have been developed by exploiting anchor points. However, the anchor-point assumption is not always satisfied in real scenarios. In this paper, we propose an end-to-end framework for solving label-noise learning without anchor points, in which we simultaneously optimize two objectives: the cross-entropy loss between the noisy labels and the probabilities predicted by the neural network, and the volume of the simplex formed by the columns of the transition matrix. Our proposed framework can identify the transition matrix if the clean class-posterior probabilities are sufficiently scattered. This is by far the mildest assumption under which the transition matrix is provably identifiable and the learned classifier is statistically consistent. Experimental results on benchmark datasets demonstrate the effectiveness and robustness of the proposed method.

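The joint objective described above lends itself to a compact sketch. The following is a minimal, hypothetical PyTorch illustration (not the authors' released code): a learnable row-stochastic transition matrix T is composed with the classifier's clean class-posterior, the cross-entropy is taken against the noisy labels, and the log-determinant of T serves as a proxy for the volume of the simplex spanned by its columns. All names, shapes, initializations, and the regularization weight `lam` are assumptions made for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TransitionMatrix(nn.Module):
    """Learnable transition matrix T, where T[i, j] approximates P(noisy = j | clean = i)."""

    def __init__(self, num_classes: int, init_diag: float = 2.0):
        super().__init__()
        # Unconstrained parameters; a row-wise softmax keeps each row a probability distribution.
        # Initializing near the identity reflects the assumption that most labels are correct.
        self.logits = nn.Parameter(init_diag * torch.eye(num_classes))

    def forward(self) -> torch.Tensor:
        return F.softmax(self.logits, dim=1)


def volmin_loss(clean_logits, noisy_labels, T, lam=1e-4):
    """Cross-entropy on noisy labels plus a log-det proxy for the simplex volume of T."""
    clean_posterior = F.softmax(clean_logits, dim=1)   # estimated P(Y | x) from the classifier
    noisy_posterior = clean_posterior @ T              # P(noisy Y | x) = P(Y | x) T
    ce = F.nll_loss(torch.log(noisy_posterior + 1e-12), noisy_labels)
    # log|det T| grows with the volume of the simplex spanned by T; penalizing it
    # encourages a minimal-volume simplex that still fits the noisy posteriors.
    vol = torch.slogdet(T).logabsdet
    return ce + lam * vol


# Toy usage with random data (shapes are assumptions).
num_classes, batch = 10, 32
net = nn.Linear(784, num_classes)        # stand-in for the real backbone network
T_module = TransitionMatrix(num_classes)
x = torch.randn(batch, 784)
y_noisy = torch.randint(0, num_classes, (batch,))
loss = volmin_loss(net(x), y_noisy, T_module())
loss.backward()                          # gradients flow to both the classifier and T
```

In this sketch the classifier and the transition matrix are trained jointly with a single optimizer step, which is what makes the procedure end-to-end: no separate anchor-point-based estimation stage for T is required.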