Towards Better Plasticity-Stability Trade-off in Incremental Learning: A Simple Linear Connector

The plasticity-stability dilemma is a central problem in incremental learning: plasticity refers to the ability to learn new knowledge, while stability refers to retaining the knowledge of previous tasks. Because training samples from previous tasks are unavailable, balancing plasticity and stability is difficult. For example, recent null-space projection methods (e.g., Adam-NSCL) have shown promising performance in preserving previous knowledge, but such a strong projection also degrades performance on the current task. To achieve a better plasticity-stability trade-off, we show in this paper that simply averaging two independently optimized optima of a network, one trained with null-space projection for past tasks and one with plain SGD for the current task, attains a meaningful balance between preserving already learned knowledge and granting sufficient flexibility for learning the new task. This simple linear connector also provides a new perspective on, and a mechanism for, controlling the trade-off between plasticity and stability. We evaluate the proposed method on several benchmark datasets. The results show that our simple method achieves notable improvements and performs well on both past and current tasks. In short, our method is an extremely simple approach that yields a better-balanced model.
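To make the recipe concrete, below is a minimal PyTorch sketch of the connector step, assuming two already-trained checkpoints of the same architecture: one trained with null-space-projected updates (the stability side) and one trained with plain SGD on the current task (the plasticity side). The function name `linear_connector` and the interpolation knob `alpha` are illustrative assumptions, not the paper's API; the simple average described in the abstract corresponds to `alpha = 0.5`.

```python
# A minimal sketch of the linear connector, assuming two state dicts for the
# same architecture: `theta_proj` (null-space-projected training, stability)
# and `theta_sgd` (plain SGD on the current task, plasticity). The function
# name and the `alpha` parameter are illustrative, not the authors' API.
import copy
import torch


def linear_connector(theta_proj: dict, theta_sgd: dict, alpha: float = 0.5) -> dict:
    """Return alpha * theta_proj + (1 - alpha) * theta_sgd, parameter-wise.

    alpha -> 1 favors stability (past tasks); alpha -> 0 favors plasticity
    (the current task); alpha = 0.5 is the simple average from the abstract.
    """
    connected = copy.deepcopy(theta_proj)
    for name, param in theta_sgd.items():
        if torch.is_floating_point(param):
            connected[name] = alpha * theta_proj[name] + (1.0 - alpha) * param
        else:
            # Non-float buffers (e.g., BatchNorm's num_batches_tracked)
            # cannot be meaningfully interpolated; copy them instead.
            connected[name] = param.clone()
    return connected


# Usage: load the averaged weights back into the network for evaluation.
# model.load_state_dict(linear_connector(model_proj.state_dict(),
#                                        model_sgd.state_dict(), alpha=0.5))
```

Because the two optima start from the same initialization and are averaged weight-wise, this is a direct application of linear mode connectivity: moving along the straight line between the two solutions trades stability against plasticity through a single scalar.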

[1] Razvan Pascanu et al. Overcoming catastrophic forgetting in neural networks, 2016, Proceedings of the National Academy of Sciences.

[2] Gintare Karolina Dziugaite et al. Linear Mode Connectivity and the Lottery Ticket Hypothesis, 2019, ICML.

[3] H. Jónsson et al. Nudged elastic band method for finding minimum energy paths of transitions, 1998.

[4] Kibok Lee et al. Overcoming Catastrophic Forgetting With Unlabeled Data in the Wild, 2019, ICCV.

[5] Shan Yu et al. Continual learning of context-dependent processing in neural networks, 2018, Nature Machine Intelligence.

[6] Philip H. S. Torr et al. Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence, 2018, ECCV.

[7] Trevor Darrell et al. Uncertainty-guided Continual Learning with Bayesian Neural Networks, 2019, ICLR.

[8] Zongben Xu et al. Training Networks in Null Space of Feature Covariance for Continual Learning, 2021, CVPR.

[9] Sung Ju Hwang et al. Lifelong Learning with Dynamically Expandable Networks, 2017, ICLR.

[10] Ying Fu et al. Incremental Learning Using Conditional Adversarial Networks, 2019, ICCV.

[11] Marc'Aurelio Ranzato et al. Efficient Lifelong Learning with A-GEM, 2018, ICLR.

[12] Andrew Gordon Wilson et al. Averaging Weights Leads to Wider Optima and Better Generalization, 2018, UAI.

[13] Visvanathan Ramesh et al. A Wholistic View of Continual Learning with Deep Neural Networks: Forgotten Lessons and the Bridge to Active and Open World Learning, 2020, arXiv.

[14] Mingrui Liu et al. Improved Schemes for Episodic Memory-based Lifelong Learning, 2020, NeurIPS.

[15] Taesup Moon et al. Uncertainty-based Continual Learning with Adaptive Regularization, 2019, NeurIPS.

[16] Marcus Rohrbach et al. Memory Aware Synapses: Learning what (not) to forget, 2017, ECCV.

[17] Surya Ganguli et al. Continual Learning Through Synaptic Intelligence, 2017, ICML.

[18] Tinne Tuytelaars et al. More Classifiers, Less Forgetting: A Generic Multi-classifier Paradigm for Incremental Learning, 2020, ECCV.

[19] Alexandros Karatzoglou et al. Overcoming Catastrophic Forgetting with Hard Attention to the Task, 2018.

[20] Xuming He et al. DER: Dynamically Expandable Representation for Class Incremental Learning, 2021, CVPR.

[21] Jiwon Kim et al. Continual Learning with Deep Generative Replay, 2017, NIPS.

[22] Stanislav Fort et al. Large Scale Structure of Neural Network Loss Landscapes, 2019, NeurIPS.

[23] Matthew B. Blaschko et al. Encoder Based Lifelong Learning, 2017, ICCV.

[24] Razvan Pascanu et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.

[25] Andrew Gordon Wilson et al. Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs, 2018, NeurIPS.

[26] Derek Hoiem et al. Learning without Forgetting, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27] Christoph H. Lampert et al. iCaRL: Incremental Classifier and Representation Learning, 2017, CVPR.

[28] Alexander Gepperth et al. A Bio-Inspired Incremental Learning Architecture for Applied Perceptual Problems, 2016, Cognitive Computation.

[29] Alex Krizhevsky et al. Learning Multiple Layers of Features from Tiny Images, 2009.

[30] Matthieu Cord et al. PODNet: Pooled Outputs Distillation for Small-Tasks Incremental Learning, 2020, ECCV.

[31] Jiayu Wu et al. Tiny ImageNet Challenge, 2017.

[32] Yann LeCun et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.

[33] Tinne Tuytelaars et al. A Continual Learning Survey: Defying Forgetting in Classification Tasks, 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34] Fred A. Hamprecht et al. Essentially No Barriers in Neural Network Energy Landscape, 2018, ICML.

[35] Martial Mermillod et al. The stability-plasticity dilemma: investigating the continuum from catastrophic forgetting to age-limited learning effects, 2013, Frontiers in Psychology.

[36] Marc'Aurelio Ranzato et al. Gradient Episodic Memory for Continual Learning, 2017, NIPS.

[37] Ali Farhadi et al. Learning Neural Network Subspaces, 2021, ICML.

[38] David Isele et al. Selective Experience Replay for Lifelong Learning, 2018, AAAI.

[39] Richard Socher et al. Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting, 2019, ICML.

[40] Min Sun et al. Mitigating Forgetting in Online Continual Learning via Instance-Aware Parameterization, 2020, NeurIPS.