Stochastic Feature Averaging for Learning with Long-Tailed Noisy Labels

Deep neural networks have shown promising results on a wide variety of tasks using large-scale, well-annotated training datasets. However, data collected from real-world applications often suffer from two prevalent biases: a long-tailed class distribution and label noise. Previous efforts on long-tailed learning and label-noise learning each address only a single type of data bias, so their performance deteriorates severely when both biases coexist. In this paper, we propose a distance-based sample selection algorithm called Stochastic Feature Averaging (SFA), which fits a Gaussian to the exponential running average of each class centroid to capture the uncertainty in representation space caused by label noise and data scarcity. With SFA, we detect noisy samples based on their distances to class centroids sampled from this Gaussian distribution. Based on the identified clean samples, we then train an auxiliary balanced classifier to improve generalization on minority classes and to facilitate the update of the Gaussian parameters. Extensive experiments show that SFA enhances the performance of existing methods on both simulated and real-world datasets. Furthermore, combining SFA with sample-selection approaches and with distribution-robust and noise-robust loss functions yields significant improvements over the baselines. Our code is available at https://github.com/HotanLee/SFA
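
To make the selection mechanism concrete, the sketch below illustrates one plausible reading of SFA in PyTorch: per-class centroids are tracked with an exponential moving average, a Gaussian is fit over each centroid to model its uncertainty, and a sample is kept as clean when its labeled class is the nearest class under repeatedly sampled centroids. All names, the variance update, and the voting threshold are illustrative assumptions, not the authors' released implementation (see the repository above for that).

```python
import torch


class StochasticFeatureAveraging:
    """Hypothetical sketch of SFA-style sample selection.

    Maintains an exponential moving average (EMA) of per-class feature
    centroids plus a running variance estimate, modeling each centroid
    as a Gaussian N(mu_c, sigma_c^2). Clean samples are selected by
    their distance to centroids drawn from these Gaussians.
    """

    def __init__(self, num_classes: int, feat_dim: int, momentum: float = 0.9):
        self.momentum = momentum
        self.mu = torch.zeros(num_classes, feat_dim)   # EMA centroid mean
        self.var = torch.ones(num_classes, feat_dim)   # centroid uncertainty

    @torch.no_grad()
    def update(self, feats: torch.Tensor, labels: torch.Tensor) -> None:
        """Update the per-class Gaussian parameters from a feature batch."""
        for c in labels.unique():
            batch_mean = feats[labels == c].mean(dim=0)
            delta = batch_mean - self.mu[c]
            self.mu[c] += (1.0 - self.momentum) * delta
            # Illustrative running estimate of centroid variance.
            self.var[c] = (self.momentum * self.var[c]
                           + (1.0 - self.momentum) * delta.pow(2))

    @torch.no_grad()
    def select_clean(self, feats: torch.Tensor, labels: torch.Tensor,
                     num_draws: int = 10, threshold: float = 0.5) -> torch.Tensor:
        """Flag a sample as clean when its labeled class is, on average,
        the nearest class under stochastically sampled centroids."""
        votes = torch.zeros(feats.size(0), device=feats.device)
        for _ in range(num_draws):
            # Sample one centroid per class from the fitted Gaussians.
            centroids = self.mu + self.var.sqrt() * torch.randn_like(self.mu)
            dists = torch.cdist(feats, centroids)      # (batch, num_classes)
            votes += (dists.argmin(dim=1) == labels).float()
        return votes / num_draws >= threshold          # boolean clean mask
```

In a training loop, `update` would be called on each mini-batch of backbone features, and `select_clean` would produce the mask that routes clean samples to the main loss and the auxiliary balanced classifier; sampling centroids rather than using the EMA mean directly is what injects the uncertainty the abstract describes.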
