Dynamic Neural Turing Machine with Continuous and Discrete Addressing Schemes

We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. Under this addressing scheme, each memory cell maintains two separate vectors: a content vector and an address vector. This allows the D-NTM to learn a wide variety of location-based addressing strategies, both linear and nonlinear. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read from and write to memory through experiments on the Facebook bAbI tasks, using both feedforward and GRU controllers. We provide an extensive analysis of our model and compare different variants of neural Turing machines on these tasks. We show that our model outperforms long short-term memory (LSTM) and NTM variants. We provide further experimental results on the sequential MNIST, Stanford Natural Language Inference, associative recall, and copy tasks.
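To make the addressing scheme concrete, the sketch below pairs each memory cell with a trainable address vector and a writable content vector, matches the controller's key against their concatenation, and uses the resulting attention either as soft weights (continuous addressing) or as a sampled one-hot vector (discrete addressing). This is a minimal NumPy illustration under our own naming, not the authors' implementation; training details such as the REINFORCE estimator used for the discrete case are omitted.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    class DNTMMemory:
        # Each cell i holds a trainable address vector A[i] (kept fixed
        # within an episode) and a writable content vector C[i].
        def __init__(self, n_cells, addr_dim, content_dim, seed=0):
            self.rng = np.random.default_rng(seed)
            self.A = self.rng.normal(scale=0.1, size=(n_cells, addr_dim))
            self.C = np.zeros((n_cells, content_dim))

        def address(self, key, beta=1.0, discrete=False):
            # Score each cell by cosine similarity between the controller's
            # key and the concatenated [address; content] vector of the cell.
            mem = np.concatenate([self.A, self.C], axis=1)
            sim = mem @ key / (np.linalg.norm(mem, axis=1)
                               * np.linalg.norm(key) + 1e-8)
            w = softmax(beta * sim)
            if discrete:
                # Discrete addressing: sample one cell and return a one-hot
                # weight vector (argmax could be used at test time).
                onehot = np.zeros_like(w)
                onehot[self.rng.choice(len(w), p=w)] = 1.0
                return onehot
            return w  # continuous addressing: soft weights over all cells

        def read(self, w):
            return w @ self.C  # attention-weighted sum of cell contents

        def write(self, w, erase, add):
            # NTM-style erase/add update applied with the addressing weights.
            self.C = self.C * (1.0 - np.outer(w, erase)) + np.outer(w, add)

The learned address vectors are what distinguish this scheme from the original NTM: because they are trained rather than tied to a fixed shift-based mechanism, the model can in principle discover arbitrary linear or nonlinear location-based access patterns.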
