Temporal Hierarchies in Sequence to Sequence for Sentence Correction

This work tackles sentence correction in the language domain by approaching it as a sequence-to-sequence (seq2seq) problem with the help of temporal hierarchies. It does so by implementing a Multiple Timescale Gated Recurrent Unit (MTGRU) in a Recurrent Neural Network (RNN) Encoder-Decoder framework, which can perform more meaningful abstraction of the input even in the presence of errors. The proposed language correction model is compared to three baselines: a conventional RNN, Long Short-Term Memory (LSTM), and the Gated Recurrent Unit (GRU), using a newly built dataset whose inputs and targets are incorrect and correct sentences, respectively. The results show that the MTGRU model generalizes better and outperforms all three baselines on the BLEU-n evaluation metric.
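The core idea, a GRU whose dynamics are slowed by a per-layer timescale constant, can be sketched as follows. This is a minimal illustrative implementation, not the paper's code: the weight names, initialization, and `step` interface are assumptions, and the timescale interpolation follows the common MTGRU formulation in which a standard GRU update is blended with the previous hidden state by a factor 1/tau.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MTGRUCell:
    """Sketch of a Multiple Timescale GRU (MTGRU) cell.

    A standard GRU update is interpolated with the previous hidden
    state through a fixed timescale constant tau: tau = 1 recovers the
    plain GRU, while tau > 1 slows the layer's dynamics so it can track
    longer-range structure. Names and initialization are illustrative.
    """

    def __init__(self, input_size, hidden_size, tau=1.0, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_size)
        w = (hidden_size, input_size)
        u = (hidden_size, hidden_size)
        # Update gate (z), reset gate (r), candidate state (h~)
        self.Wz, self.Uz = rng.uniform(-s, s, w), rng.uniform(-s, s, u)
        self.Wr, self.Ur = rng.uniform(-s, s, w), rng.uniform(-s, s, u)
        self.Wh, self.Uh = rng.uniform(-s, s, w), rng.uniform(-s, s, u)
        self.tau = tau

    def step(self, x, h_prev):
        z = sigmoid(self.Wz @ x + self.Uz @ h_prev)   # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ h_prev)   # reset gate
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h_prev))
        gru_out = (1.0 - z) * h_prev + z * h_cand     # ordinary GRU update
        # Timescale interpolation: larger tau retains more of the old state
        return (1.0 / self.tau) * gru_out + (1.0 - 1.0 / self.tau) * h_prev
```

In an encoder-decoder built from such cells, lower layers would use small tau (fast dynamics, local word-level errors) and higher layers larger tau (slow dynamics, sentence-level structure), forming the temporal hierarchy the abstract refers to.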
