Generative Feature Replay with Orthogonal Weight Modification for Continual Learning

The ability of intelligent agents to learn and remember multiple tasks sequentially is crucial to achieving artificial general intelligence. Many continual learning (CL) methods have been proposed to overcome the catastrophic forgetting that results from non-i.i.d. data in the sequential training of neural networks. In this paper we focus on class-incremental learning, a challenging CL scenario. For this scenario, generative replay is a promising strategy that generates and replays pseudo-data for previous tasks to alleviate catastrophic forgetting. However, it is hard to train a generative model continually on relatively complex data. Building on the recently proposed orthogonal weight modification (OWM) algorithm, which approximately keeps previously learned features invariant while learning new tasks, we propose to 1) replay penultimate-layer features with a generative model, and 2) leverage a self-supervised auxiliary task to further enhance feature stability. Empirical results on several datasets show that our method consistently achieves substantial improvements over the strong OWM baseline, whereas conventional generative replay consistently hurts performance. Our method also outperforms several strong baselines, including one that stores real data. Finally, we conduct experiments to investigate why our method is effective.
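Although the abstract only outlines the method, the two ingredients compose naturally: OWM constrains backbone updates to directions approximately orthogonal to the input subspace of earlier tasks, while a generative model replays penultimate-layer features so the classifier head keeps seeing old classes without storing raw data. The PyTorch sketch below is an illustrative rendering under those assumptions, not the authors' implementation; `OWMProjector`, `train_step`, and the generator's `sample(labels)` interface are hypothetical names, and the abstract does not say which generative model (e.g., GAN or VAE) is used.

```python
import random

import torch
import torch.nn.functional as F


class OWMProjector:
    """Projector P that makes weight updates approximately orthogonal to
    the input subspace spanned by data from earlier tasks, following the
    recursive update used by OWM (Zeng et al., 2019)."""

    def __init__(self, input_dim: int, alpha: float = 0.1):
        self.alpha = alpha
        self.P = torch.eye(input_dim)

    @torch.no_grad()
    def update(self, x_mean: torch.Tensor) -> None:
        """x_mean: mean input to the layer over a mini-batch, shape (d, 1)."""
        k = self.P @ x_mean                                    # (d, 1)
        self.P -= (k @ k.T) / (self.alpha + (x_mean.T @ k).item())

    @torch.no_grad()
    def project(self, grad: torch.Tensor) -> torch.Tensor:
        """grad: gradient of a weight matrix of shape (out_dim, d)."""
        return grad @ self.P                                   # P is symmetric


def train_step(backbone, head, feat_gen, projectors, optimizer,
               x, y, old_labels):
    """One step of class-incremental training: real penultimate-layer
    features for the current task, generated features replayed for earlier
    classes, and OWM projection applied to the backbone gradients."""
    optimizer.zero_grad()

    feats = backbone(x)                      # penultimate-layer features
    loss = F.cross_entropy(head(feats), y)   # current-task classification

    if old_labels:                           # replay features, not raw images
        y_old = torch.tensor(random.choices(old_labels, k=x.size(0)))
        feats_old = feat_gen.sample(y_old)   # hypothetical generator API
        loss = loss + F.cross_entropy(head(feats_old), y_old)

    loss.backward()

    # Project each registered layer's gradient so the update barely disturbs
    # features learned on earlier tasks; proj.update() should be fed that
    # layer's mean input (e.g., captured with a forward hook).
    for layer, proj in projectors.items():
        layer.weight.grad = proj.project(layer.weight.grad)

    optimizer.step()
    return loss.item()
```

The self-supervised auxiliary loss mentioned in the abstract (for instance, rotation prediction in the style of Gidaris et al.) would be an additional term added to `loss` before `backward()`; since the abstract does not specify the pretext task, it is omitted from this sketch.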
