Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels

A noisy training set usually degrades the generalization and robustness of neural networks. In this paper, we propose a novel, theoretically guaranteed clean-sample selection framework for learning with noisy labels. Specifically, we first present a Scalable Penalized Regression (SPR) method that models the linear relation between network features and one-hot labels. In SPR, clean data are identified as the samples whose mean-shift parameters are solved to zero in the regression model. We theoretically show that SPR can recover the clean data under certain conditions. In general scenarios, however, these conditions may no longer hold, and some noisy data are falsely selected as clean. To solve this problem, we propose a data-adaptive method, Scalable Penalized Regression with Knockoff filters (Knockoffs-SPR), which provably controls the False-Selection-Rate (FSR) of the selected clean data. To improve efficiency, we further present a split algorithm that divides the whole training set into small pieces solved in parallel, making the framework scalable to large datasets. While Knockoffs-SPR can be regarded as a sample selection module for a standard supervised training pipeline, we further combine it with a semi-supervised algorithm to exploit the support of noisy data as unlabeled data. Experimental results on several benchmark datasets and real-world noisy datasets demonstrate the effectiveness of our framework and validate the theoretical results of Knockoffs-SPR. Our code and pre-trained models are available at https://github.com/Yikai-Wang/Knockoffs-SPR.
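The mean-shift formulation behind SPR admits a compact illustration. Below is a minimal NumPy sketch (not the authors' implementation) of clean-sample selection via a group-lasso-penalized mean-shift regression, solving min_{B,G} 0.5*||Y - XB - G||_F^2 + lam * sum_i ||G_i||_2 by block coordinate descent; the function name select_clean and the fixed penalty lam are illustrative assumptions, and the knockoff filter that makes the selection threshold data-adaptive (and controls the FSR) is omitted.

import numpy as np

def select_clean(X, Y, lam=0.5, n_iters=50):
    # Sketch: alternate between B (regression weights) and G (per-sample
    # mean-shift parameters). Rows with G_i == 0 are flagged as clean.
    # X: (n, d) network features; Y: (n, c) one-hot labels.
    G = np.zeros_like(Y, dtype=float)
    for _ in range(n_iters):
        # B-step: multi-response least squares on the shifted labels.
        B, *_ = np.linalg.lstsq(X, Y - G, rcond=None)
        # G-step: because G enters with an identity design, the exact
        # group-lasso update is row-wise soft-thresholding of the residuals.
        R = Y - X @ B
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        G = R * np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return np.linalg.norm(G, axis=1) == 0.0  # zero mean-shift => clean

# Toy usage (synthetic data): 100 samples, 16-d features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))
Y = np.eye(3)[rng.integers(0, 3, size=100)]
clean_mask = select_clean(X, Y, lam=0.5)

In the full framework, the penalty is instead swept along a regularization path and knockoff copies of the labels calibrate a data-adaptive threshold with FSR control; the sketch above fixes lam for brevity.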
