Attention and Augmented Recurrent Neural Networks

The basic RNN design struggles with longer sequences, but a special variant, the long short-term memory (LSTM) network, handles them much better. Such models have proven very powerful, achieving remarkable results in many tasks, including translation [11], speech recognition [4], and image captioning [12]. As a result, recurrent neural networks have become very widespread over the last few years. As this has happened, we've seen a growing number of attempts to augment RNNs with new properties. Four directions stand out as particularly exciting (a sketch of the mechanism they share follows the list):

- Neural Turing Machines [8] couple an RNN to an external memory bank that it can read from and write to.
- Attentional interfaces [11] allow an RNN to focus on relevant parts of its input at each step.
- Adaptive computation time [17] allows varying amounts of computation per time step.
- Neural programmers [2] let an RNN compose calls to external operations, building programs as they run.

Individually, these techniques are potent extensions of RNNs, but the striking thing is that they can be combined, because they all rely on the same underlying trick: attention.
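
To make that shared ingredient concrete, below is a minimal NumPy sketch of content-based attention and of the Neural Turing Machine's soft read and write. It is an illustrative simplification under stated assumptions, not the authors' implementation: dot-product scoring stands in for the learned additive scoring network of Bahdanau et al. [11], and the function names (`attend`, `ntm_read`, `ntm_write`) are ours.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend(query, memory):
    # Content-based attention, simplified: score every memory row against
    # the query, normalize into a distribution, and return the weighted
    # average. (Bahdanau et al. [11] learn an additive scoring network;
    # plain dot products are used here only to keep the sketch short.)
    scores = memory @ query           # (n,) relevance of each position
    weights = softmax(scores)         # (n,) attention distribution
    return weights @ memory, weights  # (d,) blended summary, plus weights

def ntm_read(weights, memory):
    # NTM soft read [8]: an attention-weighted sum of all memory rows.
    return weights @ memory

def ntm_write(weights, memory, erase, add):
    # NTM soft write [8]: every row is partially erased and added to,
    # in proportion to its attention weight, which keeps the whole
    # update differentiable end to end.
    memory = memory * (1.0 - np.outer(weights, erase))
    return memory + np.outer(weights, add)

# Toy usage: 5 memory slots (or encoder positions), 4-dimensional vectors.
rng = np.random.default_rng(0)
memory = rng.normal(size=(5, 4))
query = rng.normal(size=4)
context, weights = attend(query, memory)
print(weights.round(3))  # a distribution over the 5 positions
memory = ntm_write(weights, memory, erase=np.full(4, 0.5), add=query)
```

The pattern to notice is that every discrete choice (which input to look at, which memory cell to touch) is replaced by a softmax-weighted blend that does a little of everything; this is what lets each of the four augmentations be trained by gradient descent.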

[1] Marcin Andrychowicz, et al. Learning Efficient Algorithms with Hierarchical Attentive Memory, 2016, ArXiv.

[2] Quoc V. Le, et al. Neural Programmer: Inducing Latent Programs with Gradient Descent, 2015, ICLR.

[3] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.

[4] Quoc V. Le, et al. Listen, Attend and Spell, 2015, ArXiv.

[5] Marcin Andrychowicz, et al. Neural Random Access Machines, 2015, ArXiv.

[6] Phil Blunsom, et al. Learning to Transduce with Unbounded Memory, 2015, NIPS.

[7] Quoc V. Le, et al. A Neural Conversational Model, 2015, ArXiv.

[8] Alex Graves, et al. Neural Turing Machines, 2014, ArXiv.

[9] Lukasz Kaiser, et al. Neural GPUs Learn Algorithms, 2015, ICLR.

[10] Josef Urban, et al. DeepMath - Deep Sequence Models for Premise Selection, 2016, NIPS.

[11] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[12] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.

[13] Geoffrey E. Hinton, et al. Grammar as a Foreign Language, 2014, NIPS.

[14] Tomas Mikolov, et al. Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets, 2015, NIPS.

[15] Navdeep Jaitly, et al. Pointer Networks, 2015, NIPS.

[16] Nando de Freitas, et al. Neural Programmer-Interpreters, 2015, ICLR.

[17] Alex Graves, et al. Adaptive Computation Time for Recurrent Neural Networks, 2016, ArXiv.

[18] Jason Weston, et al. Memory Networks, 2014, ICLR.

[19] Wojciech Zaremba, et al. Reinforcement Learning Neural Turing Machines, 2015, ArXiv.

[20] Richard Socher, et al. Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, 2015, ICML.