A Tale of Two CILs: The Connections between Class Incremental Learning and Class Imbalanced Learning, and Beyond

Catastrophic forgetting, the main challenge of Class Incremental Learning, is closely related to the classifier’s bias due to imbalanced data, and most researchers resort to empirical techniques to remove the bias. Such anti-bias tricks share many ideas with the field of Class Imbalanced Learning, which encourages us to reflect on why these tricks work, and how we can design more principled solutions from a different perspective. In this paper, we comprehensively analyze the connections and seek possible collaborations between these two fields, i.e. Class Incremental Learning and Class Imbalanced Learning. Specifically, we first provide a panoramic view of recent bias correction tricks from the perspective of handling class imbalance. Then, we show that an adapted post-scaling technique which originates from Class Imbalanced Learning is on par with or even outperforms SOTA Class Incremental Learning method. Visualization via violin plots and polar charts further sheds light on how SOTA methods address the class imbalance problem from a more intuitive geometric perspective. These findings may encourage further infiltration between the two closely connected fields, but also raise concerns about whether it is correct that Class Incremental Learning degenerates into a class imbalance problem.

[1]  Bianca Zadrozny,et al.  Transforming classifier scores into accurate multiclass probability estimates , 2002, KDD.

[2]  Terrance E. Boult,et al.  Towards Open World Recognition , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Yang Song,et al.  The iNaturalist Species Classification and Detection Dataset , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4]  Wouter M. Kouw An introduction to domain adaptation and transfer learning , 2018, ArXiv.

[5]  Michael McCloskey,et al.  Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .

[6]  Zhi-Hua Zhou,et al.  Ieee Transactions on Knowledge and Data Engineering 1 Training Cost-sensitive Neural Networks with Methods Addressing the Class Imbalance Problem , 2022 .

[7]  Yandong Guo,et al.  Large Scale Incremental Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Patrick Jähnichen,et al.  Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Yandong Guo,et al.  Incremental Classifier Learning with Generative Adversarial Networks , 2018, ArXiv.

[10]  Seetha Hari,et al.  Learning From Imbalanced Data , 2019, Advances in Computer and Electrical Engineering.

[11]  Ling Shao,et al.  Gaussian Affinity for Max-Margin Class Imbalanced Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[12]  Wei Liu,et al.  Class Confidence Weighted kNN Algorithms for Imbalanced Data Sets , 2011, PAKDD.

[13]  Ross B. Girshick,et al.  LVIS: A Dataset for Large Vocabulary Instance Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Richard Lippmann,et al.  Neural Network Classifiers Estimate Bayesian a posteriori Probabilities , 1991, Neural Computation.

[15]  Ryota Tomioka,et al.  Norm-Based Capacity Control in Neural Networks , 2015, COLT.

[16]  Nikos Komodakis,et al.  Dynamic Few-Shot Visual Learning Without Forgetting , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[17]  Jinhui Tang,et al.  Few-Shot Image Recognition With Knowledge Transfer , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[18]  Yang Song,et al.  Class-Balanced Loss Based on Effective Number of Samples , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  J. Hintze,et al.  Violin plots : A box plot-density trace synergism , 1998 .

[20]  Teuvo Kohonen,et al.  The self-organizing map , 1990, Neurocomputing.

[21]  Mark Sandler,et al.  MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Dragomir Anguelov,et al.  Capturing Long-Tail Distributions of Object Subcategories , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Charles X. Ling,et al.  Data Mining for Direct Marketing: Problems and Solutions , 1998, KDD.

[24]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[25]  Francisco Herrera,et al.  A Taxonomy and Experimental Study on Prototype Generation for Nearest Neighbor Classification , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[26]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[27]  Bianca Zadrozny,et al.  Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers , 2001, ICML.

[28]  Marcus Rohrbach,et al.  Decoupling Representation and Classifier for Long-Tailed Recognition , 2020, ICLR.

[29]  Philip H. S. Torr,et al.  GDumb: A Simple Approach that Questions Our Progress in Continual Learning , 2020, ECCV.

[30]  Fengqing Zhu,et al.  Incremental Learning in Online Scenario , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Frank Hutter,et al.  A Downsampled Variant of ImageNet as an Alternative to the CIFAR datasets , 2017, ArXiv.

[32]  Qingming Huang,et al.  Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks , 2015, ECCV.

[33]  Atsuto Maki,et al.  A systematic study of the class imbalance problem in convolutional neural networks , 2017, Neural Networks.

[34]  Fernando Nogueira,et al.  Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning , 2016, J. Mach. Learn. Res..

[35]  Haibo He,et al.  ADASYN: Adaptive synthetic sampling approach for imbalanced learning , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).

[36]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[37]  R. Solomonoff A SYSTEM FOR INCREMENTAL LEARNING BASED ON ALGORITHMIC PROBABILITY , 1989 .

[38]  Lei Zhang,et al.  One-shot Face Recognition by Promoting Underrepresented Classes , 2017, ArXiv.

[39]  Yen-Cheng Liu,et al.  Re-evaluating Continual Learning Scenarios: A Categorization and Case for Strong Baselines , 2018, ArXiv.

[40]  Peter A. Flach,et al.  Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration , 2019, NeurIPS.

[41]  Tao Lin,et al.  Generalized Class Incremental Learning , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[42]  Derek Hoiem,et al.  Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  N. Japkowicz Learning from Imbalanced Data Sets: A Comparison of Various Strategies * , 2000 .

[44]  Christopher Joseph Pal,et al.  Brain tumor segmentation with Deep Neural Networks , 2015, Medical Image Anal..

[45]  Francisco Herrera,et al.  A unifying view on dataset shift in classification , 2012, Pattern Recognit..

[46]  Seong Joon Oh,et al.  CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[47]  Ming-Hsuan Yang,et al.  Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition From a Domain Adaptation Perspective , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[48]  Gaël Varoquaux,et al.  The NumPy Array: A Structure for Efficient Numerical Computation , 2011, Computing in Science & Engineering.

[49]  Nathan Srebro,et al.  Exploring Generalization in Deep Learning , 2017, NIPS.

[50]  Christoph H. Lampert,et al.  iCaRL: Incremental Classifier and Representation Learning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Nathalie Japkowicz,et al.  The class imbalance problem: A systematic study , 2002, Intell. Data Anal..

[52]  Carla E. Brodley,et al.  Class Imbalance, Redux , 2011, 2011 IEEE 11th International Conference on Data Mining.

[53]  Fei Yin,et al.  Robust Classification with Convolutional Prototype Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[54]  Frank Hutter,et al.  SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.

[55]  Adrian Popescu,et al.  IL2M: Class Incremental Learning With Dual Memory , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[56]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[57]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[58]  Ling Shao,et al.  Striking the Right Balance With Uncertainty , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[59]  Shutao Xia,et al.  Maintaining Discrimination and Fairness in Class Incremental Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[60]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[61]  Faisal Shafait,et al.  Revisiting Distillation and Incremental Classifier Learning , 2018, ACCV.

[62]  Cordelia Schmid,et al.  End-to-End Incremental Learning , 2018, ECCV.

[63]  Jiwon Kim,et al.  Continual Learning with Deep Generative Replay , 2017, NIPS.

[64]  Philip H. S. Torr,et al.  Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence , 2018, ECCV.

[65]  Peter A. Flach,et al.  Beyond sigmoids: How to obtain well-calibrated probabilities from binary classifiers with beta calibration , 2017 .

[66]  Shiguang Shan,et al.  Exemplar-Supported Generative Reproduction for Class Incremental Learning , 2018, BMVC.

[67]  Shih-Fu Chang,et al.  Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks , 2018, NeurIPS.

[68]  Barbara Hammer,et al.  Incremental learning algorithms and applications , 2016, ESANN.

[69]  Xiu-Shen Wei,et al.  BBN: Bilateral-Branch Network With Cumulative Learning for Long-Tailed Visual Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[70]  Andreas S. Tolias,et al.  Three scenarios for continual learning , 2019, ArXiv.

[71]  Kilian Q. Weinberger,et al.  On Calibration of Modern Neural Networks , 2017, ICML.

[72]  Shigeo Abe DrEng Pattern Classification , 2001, Springer London.

[73]  Ruiping Wang,et al.  CVPR 2020 Continual Learning in Computer Vision Competition: Approaches, Results, Current Challenges and Future Directions , 2020, Artif. Intell..

[74]  Ah Chung Tsoi,et al.  Neural Network Classification and Prior Class Probabilities , 1996, Neural Networks: Tricks of the Trade.