Condensed Composite Memory Continual Learning

Deep Neural Networks (DNNs) suffer from a rapid decrease in performance when trained on a sequence of tasks where only data of the most recent task is available. This phenomenon, known as catastrophic forgetting, prevents DNNs from accumulating knowledge over time. Overcoming catastrophic forgetting and enabling continual learning is of great interest, since it would allow DNNs to be applied in settings where unrestricted access to all the training data is not always possible, e.g., due to storage limitations or legal constraints. While many recently proposed methods for continual learning use some training examples for rehearsal, their performance depends strongly on the number of stored examples. To improve the performance of rehearsal-based continual learning, especially when only few examples can be stored, we propose a novel way of learning a small set of synthetic examples that captures the essence of a complete dataset. Instead of learning these synthetic examples directly, we learn, for each example, a weighted combination of shared components, which significantly increases memory efficiency. We demonstrate the performance of our method on commonly used datasets and compare it to recently proposed related methods and baselines.
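To make the composite parameterization concrete, the following is a minimal sketch, assuming a PyTorch setup; the class and parameter names are illustrative, not taken from the paper. Each synthetic memory example is composed as a differentiable weighted sum of a small set of shared, learnable components. The optimization of these parameters against a condensation objective (e.g., a distillation- or gradient-matching-style loss over the full task data) is omitted here.

```python
import torch
import torch.nn as nn


class CondensedCompositeMemory(nn.Module):
    """Sketch: each synthetic example is a weighted combination of shared components."""

    def __init__(self, num_examples, num_components, image_shape):
        super().__init__()
        # Shared components (a small "dictionary") reused by every synthetic example.
        self.components = nn.Parameter(torch.randn(num_components, *image_shape))
        # Per-example mixing weights: one weight vector per synthetic example.
        self.weights = nn.Parameter(torch.randn(num_examples, num_components))

    def forward(self):
        # Compose synthetic examples as weighted sums of the shared components:
        # x_i = sum_k w_ik * c_k
        flat = self.components.flatten(start_dim=1)   # (K, C*H*W)
        composed = self.weights @ flat                # (N, C*H*W)
        return composed.view(self.weights.size(0), *self.components.shape[1:])


# Illustrative usage: 10 synthetic examples built from 3 shared 1x28x28 components.
memory = CondensedCompositeMemory(num_examples=10, num_components=3,
                                  image_shape=(1, 28, 28))
synthetic_batch = memory()  # (10, 1, 28, 28), differentiable w.r.t. weights and components
```

Under this parameterization, the memory stores only the K shared components plus N·K scalar weights instead of N full images, which is where the claimed gain in memory efficiency for small rehearsal budgets would come from.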
