Lie-Access Neural Turing Machines

External neural memory structures have recently become a popular tool for algorithmic deep learning (Graves et al. 2014, Weston et al. 2014). These models generally utilize differentiable versions of traditional discrete memory-access structures (random access, stacks, tapes) to provide the storage necessary for computational tasks. In this work, we argue that these neural memory systems lack specific structure important for relative indexing, and propose an alternative model, Lie-access memory, that is explicitly designed for the neural setting. In this paradigm, memory is accessed using a continuous head in a key-space manifold. The head is moved via Lie group actions, such as shifts or rotations, generated by a controller, and memory access is performed by linear smoothing in key space. We argue that Lie groups provide a natural generalization of discrete memory structures, such as Turing machines, as they provide inverse and identity operators while maintaining differentiability. To experiment with this approach, we implement a simplified Lie-access neural Turing machine (LANTM) with different Lie groups. We find that this approach is able to perform well on a range of algorithmic tasks.
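To make the access mechanism concrete, the following sketch (not the paper's implementation) illustrates the idea under some assumptions: a 2-D key space, NumPy, inverse-square-distance smoothing as the read weighting, and illustrative helper names (lie_access_read, shift_action, rotation_action) chosen here for clarity. It shows a read as a smoothed average over slot keys, with the head moved by a translation or a planar rotation, two examples of the Lie group actions described above.

    import numpy as np

    def lie_access_read(keys, values, head, eps=1e-6):
        # keys: (N, 2) slot positions in key space; values: (N, D) slot contents;
        # head: (2,) read-head position. Returns a (D,) smoothed read vector.
        d2 = np.sum((keys - head) ** 2, axis=1)   # squared distance to each key
        w = 1.0 / (d2 + eps)                      # inverse-square weights
        w = w / w.sum()                           # normalize to a distribution
        return w @ values

    def shift_action(head, delta):
        # Action of the translation group (R^2, +): slide the head by delta.
        return head + delta

    def rotation_action(head, theta):
        # Action of SO(2): rotate the head about the origin by angle theta.
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s], [s, c]]) @ head

    # Toy example: three slots with one-hot contents.
    keys = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    values = np.eye(3)
    head = np.array([0.9, 0.1])
    print(lie_access_read(keys, values, head))        # weight concentrated near slot 1
    head = shift_action(head, np.array([-0.9, 0.8]))  # controller-chosen shift
    print(lie_access_read(keys, values, head))        # weight moves to slot 2
    head = rotation_action(head, np.pi / 2)           # quarter-turn rotation
    print(lie_access_read(keys, values, head))        # weight moves toward slot 0

Because both the group actions and the smoothing are differentiable in the head position, a controller that emits the shift or rotation parameters can be trained end-to-end, which is the property the abstract highlights.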

[1]  Omer Levy, et al.  Simulating Action Dynamics with Neural Process Networks, 2018, ICLR.

[2]  Jason Weston, et al.  End-To-End Memory Networks, 2015, NIPS.

[3]  P. J. Werbos.  Backpropagation Through Time: What It Does and How to Do It, 1990.

[4]  John M. Lee.  Introduction to Smooth Manifolds, 2002.

[5]  Arne Marthinsen, et al.  Interpolation in Lie Groups, 1999, SIAM J. Numer. Anal.

[6]  Yoshua Bengio, et al.  Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes, 2016, ArXiv.

[7]  Lukasz Kaiser, et al.  Neural GPUs Learn Algorithms, 2015, ICLR.

[8]  Xinyun Chen, et al.  Delving into Transferable Adversarial Examples and Black-box Attacks, 2016, ICLR.

[9]  Jürgen Schmidhuber, et al.  Long Short-Term Memory, 1997, Neural Computation.

[10]  Richard Socher, et al.  Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, 2015, ICML.

[11]  Marcin Andrychowicz, et al.  Neural Random Access Machines, 2015, ERCIM News.

[12]  Wojciech Zaremba, et al.  Reinforcement Learning Neural Turing Machines - Revised, 2015.

[13]  Tomas Mikolov, et al.  Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets, 2015, NIPS.

[14]  Sergio Gomez Colmenarejo, et al.  Hybrid computing using a neural network with dynamic external memory, 2016, Nature.

[15]  Jason Weston, et al.  Key-Value Memory Networks for Directly Reading Documents, 2016, EMNLP.

[16]  Phil Blunsom, et al.  Learning to Transduce with Unbounded Memory, 2015, NIPS.

[17]  Alex Graves, et al.  Neural Turing Machines, 2014, ArXiv.

[18]  Quoc V. Le, et al.  Sequence to Sequence Learning with Neural Networks, 2014, NIPS.

[19]  Tatiana Shingel, et al.  Interpolation in special orthogonal groups, 2009.

[20]  Alex Graves, et al.  Grid Long Short-Term Memory, 2015, ICLR.