The Context-Dependent Additive Recurrent Neural Net

Contextual sequence mapping is one of the fundamental problems in Natural Language Processing (NLP). Here, instead of relying solely on the information presented in the text, the learning agent has access to a strong external signal that assists the learning process. In this paper, we propose a novel family of Recurrent Neural Network units, the Context-dependent Additive Recurrent Neural Network (CARNN), designed specifically for this type of problem. Experimental results on public datasets for dialog (bAbI dialog Task 6 and Frames), contextual language modeling (Switchboard and Penn Treebank), and question answering (TREC QA) show that our CARNN-based architectures outperform previous methods.
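
To make the idea concrete, below is a minimal Python/NumPy sketch of a context-dependent additive recurrent cell. This is an illustration under our own assumptions, not the paper's exact formulation: in the spirit of additive recurrences, the candidate state is computed from the current input alone (no recurrent weight matrix applied to the hidden state), and an external context vector `z` modulates the gate that mixes the candidate into the running state. All names here (`ContextAdditiveCell`, `step`, `W_gz`, etc.) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ContextAdditiveCell:
    """Hypothetical context-dependent additive recurrent cell (sketch only).

    The candidate state is built from the current input alone (the
    "additive" part: h_prev is never passed through a weight matrix),
    and an external context vector z modulates the update gate.
    """

    def __init__(self, input_dim, hidden_dim, context_dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        self.W_c = rng.normal(0.0, scale, (hidden_dim, input_dim))     # candidate projection
        self.W_gx = rng.normal(0.0, scale, (hidden_dim, input_dim))    # gate: input term
        self.W_gz = rng.normal(0.0, scale, (hidden_dim, context_dim))  # gate: context term
        self.b_g = np.zeros(hidden_dim)

    def step(self, h_prev, x_t, z):
        # Candidate depends only on the current input token.
        c_t = np.tanh(self.W_c @ x_t)
        # Context-dependent gate: how much of the candidate to admit.
        g_t = sigmoid(self.W_gx @ x_t + self.W_gz @ z + self.b_g)
        # Additive update: gated interpolation, no matrix multiply on h_prev.
        return (1.0 - g_t) * h_prev + g_t * c_t


# Toy run: five random inputs processed under a fixed external context.
cell = ContextAdditiveCell(input_dim=8, hidden_dim=16, context_dim=4)
h = np.zeros(16)
z = np.full(4, 0.5)  # stands in for the external signal (e.g., dialog context)
for x_t in np.random.default_rng(1).normal(size=(5, 8)):
    h = cell.step(h, x_t, z)
print(h.shape)  # (16,)
```

Because the hidden state only ever enters through the gated interpolation, gradients flow through elementwise operations rather than repeated matrix products, which is the usual motivation for additive recurrences; the context vector then lets the same cell behave differently depending on the external signal.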
