Learning to Update Auto-associative Memory in Recurrent Neural Networks for Improving Sequence Memorization

Learning to remember long sequences remains a challenging task for recurrent neural networks. Register memory and attention mechanisms have both been proposed to address the issue, but they either incur high computational cost to keep the memory differentiable, or bias RNN representation learning toward encoding short local contexts rather than long sequences. Associative memory, which studies how multiple patterns can be compressed into a fixed-size memory, has rarely been considered in recent years. Although some recent work introduces associative memory into RNNs and mimics the energy decay process of Hopfield networks, it inherits the shortcomings of rule-based memory updates, and its memory capacity is limited. This paper proposes a method that learns the memory update rule jointly with the task objective to increase memory capacity for remembering long sequences. We also propose an architecture that uses multiple such associative memories for more complex input encoding. Experiments comparing against other RNN architectures on several well-studied sequence learning tasks reveal interesting properties of the proposed approach.
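To make the idea concrete, here is a minimal sketch of what a learned associative-memory update inside an RNN cell could look like, in contrast to a fixed rule-based update such as a Hebbian outer product with a constant decay. This is not the paper's actual implementation; the class and parameter names (`LearnedAssociativeMemoryCell`, `forget_gate`, `write_gate`) are illustrative assumptions, and the gating scheme is only one plausible way to make the update rule trainable jointly with the task objective.

```python
# Hypothetical sketch: an RNN cell whose auto-associative memory update is
# parameterized by learned gates instead of a fixed Hebbian rule.
import torch
import torch.nn as nn

class LearnedAssociativeMemoryCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.in_proj = nn.Linear(input_size + hidden_size, hidden_size)
        # Scalar gates that define the memory update rule; their weights are
        # trained end-to-end with the task loss.
        self.forget_gate = nn.Linear(input_size + hidden_size, 1)
        self.write_gate = nn.Linear(input_size + hidden_size, 1)

    def forward(self, x, h, A):
        # x: (batch, input_size)   current input
        # h: (batch, hidden_size)  previous hidden state
        # A: (batch, hidden_size, hidden_size)  auto-associative memory
        xh = torch.cat([x, h], dim=-1)
        key = torch.tanh(self.in_proj(xh))                     # pattern to store/retrieve
        f = torch.sigmoid(self.forget_gate(xh)).unsqueeze(-1)  # learned decay, (batch, 1, 1)
        w = torch.sigmoid(self.write_gate(xh)).unsqueeze(-1)   # learned write strength
        # Learned outer-product write replaces a fixed-decay Hebbian update.
        A = f * A + w * torch.einsum('bi,bj->bij', key, key)
        # Retrieve by one associative lookup through the updated memory.
        h_new = torch.tanh(torch.einsum('bij,bj->bi', A, key))
        return h_new, A
```

Under this sketch, the multi-memory architecture the abstract mentions could be approximated by running several such cells in parallel, each with its own memory matrix `A`, and concatenating their retrieved states to encode more complex inputs.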
