Internal Memory Gate for Recurrent Neural Networks with Application to Spoken Language Understanding

Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) require four gates to learn short- and long-term dependencies in a given sequence of basic elements. Recently, the "Gated Recurrent Unit" (GRU) has been introduced; it requires fewer gates than the LSTM (only reset and update gates) to code short- and long-term dependencies, and it reaches performance equivalent to the LSTM with less processing time during learning. The "Leaky Integration Unit" (LU) is a GRU with a single (update) gate that codes mostly long-term dependencies and learns faster than the LSTM or GRU (fewer operations during training). This paper proposes a novel RNN, called the "Internal Memory Gate" (IMG), that takes advantage of the LSTM and GRU (short- and long-term dependencies) and of the LU (fast learning). The effectiveness and robustness of the proposed IMG-RNN are evaluated on a classification task over a small corpus of spoken dialogues from the DECODA project, which allows us to assess the capability of each RNN to code short-term dependencies. The experiments show that IMG-RNNs reach better accuracies, with a gain of 0.4 points compared to LSTM- and GRU-RNNs and 0.7 points compared to the LU-RNN. Moreover, the IMG-RNN requires less processing time than the GRU- or LSTM-RNN, with gains of 19% and 50% respectively.
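For context, here is a minimal NumPy sketch of the two gated baselines the abstract contrasts: the standard GRU cell (reset and update gates) and a single-gate, leaky-integration-style unit (update gate only). The IMG cell itself is not specified in this abstract, so its equations are not reproduced here; the weight and bias names (Wz, Uz, bz, ...) are illustrative placeholders, not the paper's notation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, params):
    """One step of a standard GRU cell: reset gate r and update gate z."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev + br)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)  # candidate state
    return (1.0 - z) * h_prev + z * h_tilde

def leaky_unit_step(x, h_prev, params):
    """One step of a single-gate (update-only) unit in the leaky-integration style."""
    Wz, Uz, bz, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)              # single update gate
    h_tilde = np.tanh(Wh @ x + Uh @ h_prev + bh)        # no reset gate on h_prev
    return (1.0 - z) * h_prev + z * h_tilde
```

Dropping the reset gate removes one matrix-vector product and one nonlinearity per step, which is the source of the LU's speed advantage that the proposed IMG aims to retain.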
