Self-Attentive Associative Memory

Neural networks with external memory have so far been restricted to a single memory with lossy representations of memory interactions. Richly representing the relationships between memory pieces calls for a high-order, segregated relational memory. In this paper, we propose to separate the storage of individual experiences (item memory) from the storage of their occurring relationships (relational memory). The idea is implemented through a novel Self-attentive Associative Memory (SAM) operator. Founded upon the outer product, SAM forms a set of associative memories that represent the hypothetical high-order relationships between arbitrary pairs of memory elements, through which a relational memory is constructed from an item memory. The two memories are wired into a single sequential model capable of both memorization and relational reasoning. Our proposed two-memory model achieves competitive results on a diverse set of machine learning tasks, from challenging synthetic problems to practical testbeds such as geometry, graph, reinforcement learning, and question answering.
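As a rough illustration of the outer-product idea, the sketch below builds a small set of associative matrices from an item memory via self-attention: slots attend to each other, and outer products of attended pairs are pooled into per-head relational matrices. The names (sam_sketch, item_memory, n_heads) and the specific projection and pooling choices are assumptions made for this example, not the paper's exact formulation.

# Minimal sketch, assuming an item memory of n slot vectors of size d.
# Not the authors' implementation; illustrative only.
import torch
import torch.nn.functional as F

def sam_sketch(item_memory: torch.Tensor, n_heads: int = 4) -> torch.Tensor:
    """item_memory: (n, d) -> relational tensor: (n_heads, d, d)."""
    n, d = item_memory.shape
    # Hypothetical random projections standing in for learned query/key/value maps.
    Wq = torch.randn(n_heads, d, d) / d ** 0.5
    Wk = torch.randn(d, d) / d ** 0.5
    Wv = torch.randn(d, d) / d ** 0.5

    keys = item_memory @ Wk       # (n, d)
    values = item_memory @ Wv     # (n, d)
    relational = []
    for h in range(n_heads):
        queries = item_memory @ Wq[h]                            # (n, d)
        attn = F.softmax(queries @ keys.t() / d ** 0.5, dim=-1)  # (n, n)
        attended = attn @ values                                 # (n, d)
        # Outer products between attended items and values capture
        # pairwise (second-order) associations; summing over slots
        # pools them into one d x d associative matrix per head.
        relational.append(torch.einsum('nd,ne->de', attended, values))
    return torch.stack(relational)  # (n_heads, d, d)

if __name__ == "__main__":
    M = torch.randn(8, 16)   # toy item memory: 8 slots, 16 dimensions
    R = sam_sketch(M)
    print(R.shape)           # torch.Size([4, 16, 16])

In this toy version the relational tensor is simply a stack of pooled outer-product matrices; the point is only to show how pairwise associations between item-memory slots can be materialized as a higher-order memory separate from the items themselves.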
