A Theoretical Study on Solving Continual Learning

Continual learning (CL) learns a sequence of tasks incrementally. There are two popular CL settings: class incremental learning (CIL) and task incremental learning (TIL). A major challenge of CL is catastrophic forgetting (CF). While a number of techniques are already available to effectively overcome CF for TIL, CIL remains highly challenging. So far, little theoretical study has been done to provide principled guidance on how to solve the CIL problem. This paper performs such a study. It first shows that, probabilistically, the CIL problem can be decomposed into two sub-problems: Within-task Prediction (WP) and Task-id Prediction (TP). It further proves that TP is correlated with out-of-distribution (OOD) detection, which connects CIL and OOD detection. The key conclusion of this study is that, regardless of whether a CIL algorithm defines WP and TP (or OOD detection) explicitly or implicitly, good WP together with good TP or OOD detection is necessary and sufficient for good CIL performance. Additionally, TIL is simply WP. Based on these theoretical results, new CIL methods are designed; they outperform strong baselines in both the CIL and TIL settings by a large margin.
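
To make the decomposition concrete, the following is a minimal sketch of the product rule described above, using notation introduced here for illustration (an assumption, not drawn from the abstract itself): tasks are numbered 1, ..., T, X_k denotes the set of classes of task k, and X_{k,j} denotes the j-th class of task k. For a test instance x, the CIL probability of class j of task k factors as

    P(X_{k,j} | x) = P(X_{k,j} | x, X_k) \cdot P(X_k | x),

where the first factor is the within-task prediction (WP) and the second is the task-id prediction (TP); the CIL prediction is the pair (k, j) that maximizes the product. The Python sketch below shows one hedged way to apply this factorization at inference time, assuming per-task WP softmax outputs and normalized task-id (or OOD-detector) scores are available; the function cil_predict and its inputs are hypothetical illustrations, not the paper's implementation.

    import numpy as np

    def cil_predict(wp_probs, tp_probs):
        """Combine within-task prediction (WP) with task-id prediction (TP).

        wp_probs: list of length T; wp_probs[k] is a softmax vector over the
                  classes of task k, i.e. P(class j | x, task k).
        tp_probs: length-T sequence of task-id probabilities P(task k | x),
                  e.g. normalized OOD-detector scores summing to 1.

        Returns the (task, within-task class) pair that maximizes the
        product P(class j | x, task k) * P(task k | x), plus its score.
        """
        best_pair, best_score = None, -np.inf
        for k, wp in enumerate(wp_probs):
            # Product decomposition: CIL score = WP probability * TP probability.
            scores = np.asarray(wp, dtype=float) * float(tp_probs[k])
            j = int(np.argmax(scores))
            if scores[j] > best_score:
                best_pair, best_score = (k, j), float(scores[j])
        return best_pair, best_score

For example, with two tasks of two classes each, wp_probs = [[0.7, 0.3], [0.1, 0.9]] and tp_probs = [0.2, 0.8] yield the prediction (task 1, class 1) with score 0.72, since 0.9 * 0.8 exceeds every other product.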
