Generative Feature Replay with Orthogonal Weight Modification for Continual Learning

The ability of intelligent agents to learn and remember multiple tasks sequentially is crucial to achieving artificial general intelligence. Many continual learning (CL) methods have been proposed to overcome catastrophic forgetting, which arises from the non-i.i.d. data encountered when neural networks learn sequentially. In this paper we focus on class-incremental learning, a challenging CL scenario. For this scenario, generative replay is a promising strategy: a generative model produces and replays pseudo-data for previous tasks to alleviate catastrophic forgetting. However, it is hard to train a generative model continually on relatively complex data. Building on the recently proposed orthogonal weight modification (OWM) algorithm, which approximately keeps previously learned features invariant while learning new tasks, we propose to 1) replay penultimate-layer features with a generative model, and 2) leverage a self-supervised auxiliary task to further enhance feature stability. Empirical results on several datasets show that our method consistently achieves substantial improvements over the strong OWM baseline, whereas conventional generative replay consistently hurts performance. Our method also beats several strong baselines, including one that stores real data. In addition, we conduct experiments to study why our method is effective.
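To make the two ingredients concrete, below is a minimal PyTorch sketch, not the authors' implementation: an OWM-projected linear classifier head combined with replay of penultimate-layer features. For brevity the generative model is replaced here by a class-conditional Gaussian fitted to stored feature statistics (the paper trains an actual generative model on features), the self-supervised auxiliary task is omitted, and every name below (OWMLinear, GaussianFeatureReplay, train_step) is an illustrative assumption.

    import torch
    import torch.nn.functional as F

    class OWMLinear(torch.nn.Linear):
        # Linear layer trained with orthogonal weight modification (OWM):
        # gradients are projected toward the subspace orthogonal to inputs
        # seen so far, so updates barely disturb earlier responses.
        def __init__(self, in_features, out_features, alpha=1e-3):
            super().__init__(in_features, out_features)
            self.alpha = alpha
            self.register_buffer("P", torch.eye(in_features))

        @torch.no_grad()
        def project_grad(self):
            # Project on the input side: grad has shape (out, in), so grad @ P.
            # The bias gradient is left untouched in this simplified sketch.
            if self.weight.grad is not None:
                self.weight.grad = self.weight.grad @ self.P

        @torch.no_grad()
        def update_projector(self, x):
            # Recursive update, one sample at a time:
            #   P <- P - (P x)(P x)^T / (alpha + x^T P x)
            for xi in x:
                Px = self.P @ xi.unsqueeze(1)                  # (in, 1)
                self.P -= (Px @ Px.T) / (self.alpha + xi @ Px.squeeze(1))

    class GaussianFeatureReplay:
        # Stand-in "generative model": a class-conditional Gaussian over
        # penultimate-layer features, fitted once per class.
        def __init__(self):
            self.stats = {}                                    # label -> (mean, std)

        @torch.no_grad()
        def fit(self, feats, label):
            self.stats[label] = (feats.mean(0), feats.std(0) + 1e-6)

        @torch.no_grad()
        def sample(self, n_per_class):
            feats, labels = [], []
            for y, (mu, sigma) in self.stats.items():
                feats.append(mu + sigma * torch.randn(n_per_class, mu.numel()))
                labels.append(torch.full((n_per_class,), y, dtype=torch.long))
            return torch.cat(feats), torch.cat(labels)

    def train_step(backbone, head, replayer, x, y, optimizer, n_replay=32):
        real_feats = backbone(x)                               # penultimate-layer features
        feats, labels = real_feats, y
        if replayer.stats:                                     # mix in replayed old-class features
            old_f, old_y = replayer.sample(n_replay)
            feats = torch.cat([feats, old_f])
            labels = torch.cat([labels, old_y])
        loss = F.cross_entropy(head(feats), labels)
        optimizer.zero_grad()
        loss.backward()
        head.project_grad()                                    # OWM: project the gradient before stepping
        optimizer.step()
        head.update_projector(real_feats.detach())             # protect the newly used input directions
        return loss.item()

After each task one would call replayer.fit(...) once per class with that task's features. The projector update is what keeps the head's responses to previously seen (and replayed) feature directions approximately stable, which is why replaying features at the penultimate layer composes well with OWM even when replaying raw pixels does not.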
