Closed-Loop GAN for Continual Learning

Sequential learning of tasks using gradient descent leads to an unremitting decline in accuracy on tasks for which training data is no longer available, a phenomenon termed catastrophic forgetting. Generative models have been explored as a means of approximating the distribution of old tasks and bypassing storage of real data. Here we propose a cumulative closed-loop generator and embedded classifier based on an AC-GAN architecture, externally regularized by a small memory buffer. We evaluate incremental learning under a notoriously hard paradigm, single-headed learning, in which each task is a disjoint subset of classes in the overall dataset and performance is evaluated on all previously seen classes. First, we show that the variability contained in a small percentage of the dataset (the memory buffer) accounts for a significant portion of the reported accuracy, in both multi-task and continual learning settings. Second, we show that using the generator to continuously output new images during training effectively up-samples the buffer, which prevents catastrophic forgetting and yields superior performance compared to a fixed buffer. We achieve an average accuracy over all classes of 92.26% on MNIST and 76.15% on FASHION-MNIST after 5 tasks, using GAN sampling with a buffer of only 0.17% of the dataset. We compare against a network regularized with elastic weight consolidation (EWC), whose average performance deteriorates to 29.19% (MNIST) and 26.5% (FASHION-MNIST). The baseline with no regularization (plain gradient descent) reaches 99.84% (MNIST) and 99.79% (FASHION-MNIST) on the last task, but below 3% on all previous tasks. Our method has a very low long-term memory cost (the buffer) and negligible intermediate memory storage.
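
To make the closed-loop replay idea concrete, the sketch below shows one possible training step in PyTorch: each batch for the current task is mixed with a small exemplar buffer and with images sampled from the conditional generator for previously seen classes, and the AC-GAN discriminator doubles as the classifier through its auxiliary head. This is a minimal illustration under stated assumptions, not the authors' implementation; the tiny module definitions, the helpers replay_batch and train_step, and all sizes and hyper-parameters are hypothetical placeholders.

# Minimal sketch (assumed PyTorch) of closed-loop GAN replay with an AC-GAN.
# Batches mix (i) new-task images, (ii) a small buffer of stored exemplars,
# and (iii) generator samples for old classes, so the buffer is up-sampled.
# Module names, sizes, and losses are illustrative, not the paper's exact setup.

import torch
import torch.nn as nn
import torch.nn.functional as F

Z_DIM, N_CLASSES, IMG = 64, 10, 28 * 28  # toy sizes for MNIST-like data in [-1, 1]

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(Z_DIM + N_CLASSES, 256), nn.ReLU(),
            nn.Linear(256, IMG), nn.Tanh())

    def forward(self, z, labels):
        y = F.one_hot(labels, N_CLASSES).float()
        return self.net(torch.cat([z, y], dim=1))

class Discriminator(nn.Module):
    """AC-GAN discriminator: real/fake head plus an embedded classifier head."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(IMG, 256), nn.ReLU())
        self.adv = nn.Linear(256, 1)          # real vs. fake logit
        self.cls = nn.Linear(256, N_CLASSES)  # class logits

    def forward(self, x):
        h = self.body(x)
        return self.adv(h), self.cls(h)

def replay_batch(G, old_classes, n, buffer_x, buffer_y):
    """Up-sample the small buffer with generated images of previously seen classes.
    old_classes is a 1-D LongTensor of class ids from earlier tasks."""
    with torch.no_grad():
        labels = old_classes[torch.randint(len(old_classes), (n,))]
        fake = G(torch.randn(n, Z_DIM), labels)
    idx = torch.randint(len(buffer_x), (n,))
    return torch.cat([fake, buffer_x[idx]]), torch.cat([labels, buffer_y[idx]])

def train_step(G, D, opt_g, opt_d, new_x, new_y, old_classes, buffer_x, buffer_y):
    # Mix current-task data with replayed old-class data (buffer + GAN samples).
    rep_x, rep_y = replay_batch(G, old_classes, len(new_x), buffer_x, buffer_y)
    real_x = torch.cat([new_x, rep_x])
    real_y = torch.cat([new_y, rep_y])

    # Discriminator / embedded-classifier update.
    z = torch.randn(len(real_x), Z_DIM)
    gen_y = torch.randint(N_CLASSES, (len(real_x),))
    fake_x = G(z, gen_y).detach()
    adv_r, cls_r = D(real_x)
    adv_f, _ = D(fake_x)
    d_loss = (F.binary_cross_entropy_with_logits(adv_r, torch.ones_like(adv_r))
              + F.binary_cross_entropy_with_logits(adv_f, torch.zeros_like(adv_f))
              + F.cross_entropy(cls_r, real_y))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update: fool the adversarial head and match the conditioned class.
    fake_x = G(z, gen_y)
    adv_f, cls_f = D(fake_x)
    g_loss = (F.binary_cross_entropy_with_logits(adv_f, torch.ones_like(adv_f))
              + F.cross_entropy(cls_f, gen_y))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

Because the generator re-expands the stored exemplars at every step, the buffer itself can stay tiny (0.17% of the dataset in the experiments reported above) while still anchoring the generated distribution to real data.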
