Two-Phase Learning for Overcoming Noisy Labels

To counter the challenge of noisy labels, the learning strategy of a deep neural network must change over the course of training. We therefore propose a novel two-phase learning method, MORPH, which automatically switches its learning phase at the point when the network begins to rapidly memorize false-labeled samples. In the first phase, MORPH updates the network using all training samples. Without any supervision, it then transitions to the second phase at the estimated best transition point. From then on, MORPH trains the network only on a maximal safe set, a collection of almost certainly true-labeled samples that is maintained at each epoch. Owing to this two-phase learning, MORPH realizes noise-free training under any type of label noise encountered in practice. Moreover, extensive experiments on six datasets verify that MORPH significantly outperforms five state-of-the-art methods in terms of test error and training time.
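
To make the two-phase procedure concrete, the following is a minimal, hypothetical sketch of such a training loop in PyTorch. The transition test (a plateau in training-loss improvement) and the safe-set rule (keeping the smallest-loss 80% of samples each epoch) are illustrative stand-ins, not MORPH's actual criteria, which the abstract does not specify.

```python
# Sketch of a two-phase noisy-label training loop (assumptions: plateau-based
# transition detection and a small-loss safe set; not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic 2-class data with 20% symmetric label noise.
X = torch.randn(1000, 10)
y_true = (X[:, 0] > 0).long()
noisy = torch.rand(1000) < 0.2
y = torch.where(noisy, 1 - y_true, y_true)
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

def per_sample_losses():
    """Per-sample losses, used for both transition detection and safe-set selection."""
    with torch.no_grad():
        return F.cross_entropy(model(X), y, reduction="none")

phase, prev_loss = 1, float("inf")
for epoch in range(30):
    if phase == 1:
        # Phase 1: update the network with all training samples.
        for xb, yb in loader:
            opt.zero_grad()
            F.cross_entropy(model(xb), yb).backward()
            opt.step()
        # Heuristic transition test: loss improvement stalls, suggesting that
        # memorization of false-labeled samples is about to begin.
        cur_loss = per_sample_losses().mean().item()
        if prev_loss - cur_loss < 1e-3:
            phase = 2
        prev_loss = cur_loss
    else:
        # Phase 2: refresh the safe set each epoch (smallest-loss 80% of samples)
        # and train only on it.
        keep = per_sample_losses().argsort()[: int(0.8 * len(X))]
        safe_loader = DataLoader(TensorDataset(X[keep], y[keep]),
                                 batch_size=64, shuffle=True)
        for xb, yb in safe_loader:
            opt.zero_grad()
            F.cross_entropy(model(xb), yb).backward()
            opt.step()

print("finished in phase", phase)
```

In this sketch, the safe set is recomputed at every epoch of the second phase, mirroring the abstract's description of a per-epoch maximal safe set; the fixed 80% retention ratio is purely for illustration.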
