Self-Attentive Associative Memory

Neural networks with external memory have so far been restricted to a single memory with lossy representations of memory interactions. A rich representation of the relationships between memory pieces calls for a high-order and segregated relational memory. In this paper, we propose to separate the storage of individual experiences (item memory) from that of their occurring relationships (relational memory). The idea is implemented through a novel Self-attentive Associative Memory (SAM) operator. Founded upon the outer product, SAM forms a set of associative memories that represent hypothetical high-order relationships between arbitrary pairs of memory elements, through which a relational memory is constructed from an item memory. The two memories are wired into a single sequential model capable of both memorization and relational reasoning. Our proposed two-memory model achieves competitive results across a diversity of machine learning tasks, from challenging synthetic problems to practical testbeds such as geometry, graph, reinforcement learning, and question answering.
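To make the core idea concrete, the following is a minimal NumPy sketch of how an outer-product operator could lift an item memory into a higher-order relational memory via self-attention. All names, shapes, and the single-head attention form here are illustrative assumptions, not the paper's actual SAM implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical item memory: n slots, each a d-dimensional vector.
n, d = 5, 8
item_memory = rng.standard_normal((n, d))

# A single self-attention pass over the item memory (illustrative only):
# queries, keys, and values are linear projections of the memory slots.
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
Q, K, V = item_memory @ Wq, item_memory @ Wk, item_memory @ Wv

scores = Q @ K.T / np.sqrt(d)                       # (n, n) pairwise scores
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)            # softmax over slots
attended = attn @ V                                 # (n, d) attended items

# Outer products between each attended item and the value slots give one
# associative (matrix) memory per slot; stacking them yields a third-order
# relational tensor that encodes pairwise element interactions.
relational_memory = np.einsum('id,ie->ide', attended, V)  # (n, d, d)
```

The key design point this sketch mirrors is that the relational store is one tensor order higher than the item store: each slot of the item memory contributes a full matrix of feature interactions rather than a single vector.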
