Learning Rapid-Temporal Adaptations

A hallmark of human intelligence and cognition is its flexibility. One of the long-standing goals in AI research is to replicate this flexibility in a learning machine. In this work we describe a mechanism by which artificial neural networks can learn rapid-temporal adaptation, the ability to adapt quickly to new environments or tasks. We call this mechanism adaptive neurons. Adaptive neurons modify their activations with task-specific values retrieved from a working memory. On standard meta-learning and few-shot learning benchmarks in both the vision and language domains, models augmented with adaptive neurons achieve state-of-the-art results.
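To make the mechanism concrete, the following is a minimal sketch of what such a layer might look like, assuming the task-specific values are retrieved by soft attention over a small key-value working memory and added as per-neuron shifts before the nonlinearity. The class, parameter names, and retrieval scheme below are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of an "adaptive neuron" layer: a base linear layer
# produces pre-activations, and a task-specific shift retrieved from a small
# working memory is added before the nonlinearity. All names are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveNeuronLayer(nn.Module):
    def __init__(self, in_dim, out_dim, key_dim):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)    # slow, task-agnostic weights
        self.to_key = nn.Linear(in_dim, key_dim)  # query used to address memory

    def forward(self, x, mem_keys, mem_shifts):
        """
        x:          (batch, in_dim) current input
        mem_keys:   (num_slots, key_dim) keys written during task conditioning
        mem_shifts: (num_slots, out_dim) per-neuron shift values stored in memory
        """
        pre = self.base(x)                              # (batch, out_dim)
        query = self.to_key(x)                          # (batch, key_dim)
        attn = F.softmax(query @ mem_keys.t(), dim=-1)  # soft memory addressing
        shift = attn @ mem_shifts                       # (batch, out_dim)
        return torch.relu(pre + shift)                  # activation modified by retrieved shift


# Usage sketch: after observing a few labeled examples of a new task, write
# (key, shift) pairs into memory, then process queries with the shifted layer.
layer = AdaptiveNeuronLayer(in_dim=64, out_dim=128, key_dim=32)
mem_keys = torch.randn(5, 32)     # e.g., one slot per support example
mem_shifts = torch.randn(5, 128)
out = layer(torch.randn(8, 64), mem_keys, mem_shifts)
print(out.shape)  # torch.Size([8, 128])
```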
