Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

Prior works have found it beneficial to combine provably noise-robust loss functions, e.g., mean absolute error (MAE), with standard loss functions, e.g., cross entropy (CE), to improve their learnability. Here, we propose to use the Jensen-Shannon divergence as a noise-robust loss function and show that it interestingly interpolates between CE and MAE with a controllable mixing parameter. Furthermore, we make a crucial observation that CE exhibits lower consistency around noisy data points. Based on this observation, we adopt a generalized version of the Jensen-Shannon divergence for multiple distributions to encourage consistency around data points. Using this loss function, we show state-of-the-art results on both synthetic (CIFAR) and real-world (e.g., WebVision) noise with varying noise rates.
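
As a minimal sketch of how such a loss could look in practice (not the paper's reference implementation), the snippet below computes a weighted generalized Jensen-Shannon divergence between the label distribution and the predictions for several augmented views of each input. The function name, the clamping constant, and the choice of PyTorch are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def generalized_js_loss(logits_list, targets, weights):
    """Weighted generalized Jensen-Shannon divergence between the one-hot
    label distribution and the predictive distributions of several
    (e.g. differently augmented) views of the same inputs.

    logits_list: list of [batch, classes] logit tensors, one per view.
    targets:     [batch] integer class labels.
    weights:     len(logits_list) + 1 mixing weights summing to one;
                 the first weight belongs to the label distribution.
    """
    num_classes = logits_list[0].size(1)

    # Distributions to mix: the labels first, then each view's predictions.
    dists = [F.one_hot(targets, num_classes).float()]
    dists += [F.softmax(logits, dim=1) for logits in logits_list]

    # Weighted mixture m = sum_i pi_i * p_i and its log (clamped for stability).
    mixture = sum(w * p for w, p in zip(weights, dists))
    log_mixture = mixture.clamp(min=1e-7).log()

    # GJS = sum_i pi_i * KL(p_i || m), averaged over the batch.
    loss = logits_list[0].new_zeros(())
    for w, p in zip(weights, dists):
        p = p.clamp(min=1e-7)
        loss = loss + w * (p * (p.log() - log_mixture)).sum(dim=1).mean()
    return loss
```

With only two distributions (the label and a single prediction), the same expression reduces to the two-term Jensen-Shannon loss that, as the abstract notes, interpolates between CE and MAE as the weight on the label distribution is varied.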
