Internal Memory Gate for Recurrent Neural Networks with Application to Spoken Language Understanding

Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) require four gates to learn short- and long-term dependencies in a given sequence of basic elements. Recently, the "Gated Recurrent Unit" (GRU) has been introduced; it requires fewer gates than the LSTM (only reset and update gates) to code short- and long-term dependencies, and it reaches performance equivalent to the LSTM with less processing time during learning. The "Leaky Integration Unit" (LU) is a GRU with a single (update) gate that codes mostly long-term dependencies and learns faster than the LSTM or GRU (fewer operations during training). This paper proposes a novel RNN, called the "Internal Memory Gate" (IMG), that takes advantage of the LSTM and GRU (short- and long-term dependencies) and of the LU (fast learning). The effectiveness and robustness of the proposed IMG-RNN are evaluated on a classification task over a small corpus of spoken dialogues from the DECODA project, which allows us to assess the capability of each RNN to code short-term dependencies. The experiments show that IMG-RNNs reach better accuracies, with a gain of 0.4 points compared to LSTM- and GRU-RNNs and 0.7 points compared to the LU-RNN. Moreover, the IMG-RNN requires less processing time than the GRU- or LSTM-RNN, with gains of 19% and 50% respectively.
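For context, here is a minimal NumPy sketch of the two gated baselines the abstract contrasts: the standard GRU cell (reset and update gates) and a single-gate, leaky-integration-style unit (update gate only). The IMG cell itself is not specified in this abstract, so its equations are not reproduced here; the weight and bias names (Wz, Uz, bz, ...) are illustrative placeholders, not the paper's notation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, params):
    """One step of a standard GRU cell: reset gate r and update gate z."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev + br)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)  # candidate state
    return (1.0 - z) * h_prev + z * h_tilde

def leaky_unit_step(x, h_prev, params):
    """One step of a single-gate (update-only) unit in the leaky-integration style."""
    Wz, Uz, bz, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)              # single update gate
    h_tilde = np.tanh(Wh @ x + Uh @ h_prev + bh)        # no reset gate on h_prev
    return (1.0 - z) * h_prev + z * h_tilde
```

Dropping the reset gate removes one matrix-vector product and one nonlinearity per step, which is the source of the LU's speed advantage that the proposed IMG aims to retain.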
