Class Incremental Learning with Self-Supervised Pre-Training and Prototype Learning

Deep Neural Networks (DNNs) have achieved great success on datasets with closed class sets. In the real world, however, new classes, such as new categories of social media topics, emerge continuously, making incremental learning necessary. This is hard for DNNs because they tend to fit the new classes while ignoring the old ones, a phenomenon known as catastrophic forgetting. State-of-the-art methods rely on knowledge distillation and data replay but still have limitations. In this work, we analyze the causes of catastrophic forgetting in class incremental learning (CIL) and attribute it to three factors: representation drift, representation confusion, and classifier distortion. Based on this view, we propose a two-stage learning framework with a fixed encoder and an incrementally updated prototype classifier. The encoder is trained with self-supervised learning to produce a feature space of high intrinsic dimensionality, which improves its transferability and generality. The classifier incrementally learns prototypes for new classes while retaining the prototypes of previously learned classes, which is crucial for preserving the decision boundary. Our method does not rely on preserved samples of old classes and is thus a non-exemplar-based CIL method. Experiments on public datasets show that, under the 10-phase incremental setting, our method significantly outperforms state-of-the-art exemplar-based methods that reserve 5 exemplars per class, by 18.24% on CIFAR-100 and 9.37% on ImageNet100.
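
The two-stage idea can be sketched as follows, assuming a PyTorch-style frozen encoder; the class structure and the nearest-prototype decision rule below are illustrative assumptions for exposition, not the paper's exact formulation.

    # Minimal sketch (illustrative, not the authors' code): a frozen self-supervised
    # encoder plus a prototype classifier that only adds prototypes for new classes.
    import torch

    class PrototypeClassifier:
        def __init__(self, encoder: torch.nn.Module):
            self.encoder = encoder.eval()   # stage 1: encoder is fixed after self-supervised pre-training
            self.prototypes = {}            # class label -> prototype feature vector

        @torch.no_grad()
        def add_classes(self, images_by_class: dict):
            # Stage 2: for each new class, store the mean feature as its prototype.
            # Old prototypes are never modified, which preserves old decision boundaries.
            for label, images in images_by_class.items():
                feats = torch.nn.functional.normalize(self.encoder(images), dim=1)  # (N, D)
                self.prototypes[label] = feats.mean(dim=0)

        @torch.no_grad()
        def predict(self, images: torch.Tensor) -> list:
            feats = torch.nn.functional.normalize(self.encoder(images), dim=1)      # (B, D)
            labels = list(self.prototypes.keys())
            protos = torch.stack([self.prototypes[c] for c in labels])              # (C, D)
            dists = torch.cdist(feats, protos)          # nearest-prototype classification rule
            return [labels[i] for i in dists.argmin(dim=1).tolist()]

Each incremental phase would call add_classes on the new classes only, so no exemplars of old classes are stored or replayed.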
