Gradient Reweighting: Towards Imbalanced Class-Incremental Learning

Class-Incremental Learning (CIL) trains a model to continually recognize new classes from non-stationary data while retaining previously learned knowledge. A major challenge arises when CIL is applied to real-world data with non-uniform distributions, which introduces a dual imbalance problem involving (i) disparities between stored exemplars of old tasks and new class data (inter-phase imbalance), and (ii) severe class imbalances within each individual task (intra-phase imbalance). We show that this dual imbalance causes skewed gradient updates and biased weights in the FC layers, inducing over-/under-fitting and catastrophic forgetting in CIL. Our method addresses this by reweighting the gradients towards balanced optimization and unbiased classifier learning. Additionally, we observe imbalanced forgetting, where, paradoxically, instance-rich classes suffer higher performance degradation during CIL because a larger amount of their training data becomes unavailable in subsequent learning phases. To tackle this, we further introduce a distribution-aware knowledge distillation loss that mitigates forgetting by aligning output logits proportionally with the distribution of lost training data. We validate our method on CIFAR-100, ImageNetSubset, and Food101 across various evaluation protocols and demonstrate consistent improvements over existing works, showing strong potential for applying CIL in real-world scenarios with enhanced robustness and effectiveness.
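
The abstract describes two components: per-class gradient reweighting in the FC layer and a distribution-aware knowledge distillation loss. Below is a minimal PyTorch-style sketch of the general idea only; the specific weighting choices (inverse-frequency gradient scaling, distillation weights proportional to the share of lost data per class), the hook-based implementation, and the helper names `fc_gradient_reweight_hook` and `distribution_aware_kd` are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of (i) reweighting FC-layer gradients per class and (ii) a distillation
# term weighted by how much old-class training data is no longer available.
# The concrete formulas below are assumptions for illustration.
import torch
import torch.nn.functional as F

def fc_gradient_reweight_hook(class_counts, eps=1e-8):
    """Return a backward hook that rescales each class's gradient row of an FC
    weight matrix inversely to its (imbalanced) number of training samples."""
    counts = torch.as_tensor(class_counts, dtype=torch.float)
    scale = counts.mean() / (counts + eps)           # rare classes get larger updates
    def hook(grad):                                  # grad: [num_classes, feat_dim]
        return grad * scale.to(grad.device).unsqueeze(1)
    return hook

def distribution_aware_kd(student_logits, teacher_logits, lost_counts, T=2.0):
    """Distillation loss whose per-class contribution is weighted by the share
    of that class's training data lost in earlier phases (illustrative)."""
    lost = torch.as_tensor(lost_counts, dtype=torch.float,
                           device=student_logits.device)
    w = lost / (lost.sum() + 1e-8)                   # classes that lost more data are distilled harder
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    per_class = -(p_teacher * log_p_student)         # [batch, num_old_classes]
    return (per_class * w).sum(dim=1).mean() * (T * T)

# Example usage (hypothetical model / counts):
# model.fc.weight.register_hook(fc_gradient_reweight_hook(per_class_counts))
# loss = ce_loss + distribution_aware_kd(new_logits[:, :n_old], old_logits, lost_counts)
```

One design note on this sketch: registering a hook on the classifier weights keeps the feature extractor's optimization untouched and rebalances only the classifier, which matches the abstract's focus on biased FC-layer weights, though the paper's actual reweighting scheme may differ.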
