Continual Learning Through Synaptic Intelligence

While deep learning has led to remarkable advances across diverse applications, it struggles in domains where the data distribution changes over the course of learning. In stark contrast, biological neural networks continually adapt to changing domains, possibly by leveraging complex molecular machinery to solve many tasks simultaneously. In this study, we introduce intelligent synapses that bring some of this biological complexity into artificial neural networks. Each synapse accumulates task-relevant information over time, and exploits this information to rapidly store new memories without forgetting old ones. We evaluate our approach on continual learning of classification tasks, and show that it dramatically reduces forgetting while maintaining computational efficiency.
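To make the idea of per-synapse accumulation concrete, the sketch below shows one way the bookkeeping could look in code: each parameter tracks a running path-integral term (gradient times parameter change) during training, and at the end of a task this is folded into an importance weight that scales a quadratic penalty anchoring the parameter to its consolidated value. This is a minimal illustration in PyTorch, not the reference implementation; the class name `SynapticImportance`, the hyperparameter names `c` and `xi`, and the framework choice are all assumptions for the example.

```python
# Minimal sketch of per-synapse importance tracking (illustrative, not the
# authors' reference code). Assumes a standard PyTorch model and optimizer.
import torch


class SynapticImportance:
    """Tracks how much each parameter contributed to reducing past task losses."""

    def __init__(self, model, c=0.1, xi=1e-3):
        self.model = model
        self.c = c    # strength of the consolidation penalty (assumed name)
        self.xi = xi  # damping term to avoid division by zero (assumed name)
        # Per-task running contribution of each parameter to the loss decrease.
        self.w = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
        # Accumulated importance over all tasks seen so far.
        self.omega = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
        # Reference (consolidated) parameter values from the end of the last task.
        self.theta_ref = {n: p.detach().clone() for n, p in model.named_parameters()}

    def accumulate(self, prev_params):
        """Call after each optimizer step: w_k += -g_k * delta_theta_k."""
        for n, p in self.model.named_parameters():
            if p.grad is not None:
                self.w[n] += -p.grad.detach() * (p.detach() - prev_params[n])

    def consolidate(self):
        """Call at the end of a task: fold w into omega, reset the reference point."""
        for n, p in self.model.named_parameters():
            delta = p.detach() - self.theta_ref[n]
            self.omega[n] += self.w[n] / (delta ** 2 + self.xi)
            self.theta_ref[n] = p.detach().clone()
            self.w[n].zero_()

    def penalty(self):
        """Quadratic surrogate loss that discourages moving important synapses."""
        return self.c * sum(
            (self.omega[n] * (p - self.theta_ref[n]) ** 2).sum()
            for n, p in self.model.named_parameters()
        )


# Typical per-batch usage (sketch):
#   prev = {n: p.detach().clone() for n, p in model.named_parameters()}
#   loss = task_loss(model(x), y) + si.penalty()
#   loss.backward(); optimizer.step(); si.accumulate(prev)
# and si.consolidate() once training on the current task is finished.
```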
