Self-Attentive Associative Memory

Neural networks with external memory have so far been restricted to a single memory with lossy representations of memory interactions. A rich representation of the relationships between memory pieces calls for a high-order and segregated relational memory. In this paper, we propose to separate the storage of individual experiences (item memory) from that of their occurring relationships (relational memory). The idea is implemented through a novel Self-attentive Associative Memory (SAM) operator. Founded upon the outer product, SAM forms a set of associative memories that represent hypothetical high-order relationships between arbitrary pairs of memory elements, through which a relational memory is constructed from an item memory. The two memories are wired into a single sequential model capable of both memorization and relational reasoning. Our proposed two-memory model achieves competitive results across a diversity of machine learning tasks, from challenging synthetic problems to practical testbeds such as geometry, graph, reinforcement learning, and question answering.
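To make the core idea concrete, the following is a minimal NumPy sketch of how an outer-product operator could lift an item memory into a higher-order relational memory via self-attention. All names, shapes, and the single-head attention form here are illustrative assumptions, not the paper's actual SAM implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical item memory: n slots, each a d-dimensional vector.
n, d = 5, 8
item_memory = rng.standard_normal((n, d))

# A single self-attention pass over the item memory (illustrative only):
# queries, keys, and values are linear projections of the memory slots.
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
Q, K, V = item_memory @ Wq, item_memory @ Wk, item_memory @ Wv

scores = Q @ K.T / np.sqrt(d)                       # (n, n) pairwise scores
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)            # softmax over slots
attended = attn @ V                                 # (n, d) attended items

# Outer products between each attended item and the value slots give one
# associative (matrix) memory per slot; stacking them yields a third-order
# relational tensor that encodes pairwise element interactions.
relational_memory = np.einsum('id,ie->ide', attended, V)  # (n, d, d)
```

The key design point this sketch mirrors is that the relational store is one tensor order higher than the item store: each slot of the item memory contributes a full matrix of feature interactions rather than a single vector.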
