Long-Term Memory Networks for Question Answering

Question answering is an important and difficult task in the natural language processing domain, because many basic natural language processing tasks can be cast into a question answering task. Several deep neural network architectures have been developed recently, which employ memory and inference components to memorize and reason over text information, and generate answers to questions. However, a major drawback of many such models is that they are capable of only generating single-word answers. In addition, they require large amount of training data to generate accurate answers. In this paper, we introduce the Long-Term Memory Network (LTMN), which incorporates both an external memory module and a Long Short-Term Memory (LSTM) module to comprehend the input data and generate multi-word answers. The LTMN model can be trained end-to-end using back-propagation and requires minimal supervision. We test our model on two synthetic data sets (based on Facebook's bAbI data set) and the real-world Stanford question answering data set, and show that it can achieve state-of-the-art performance.

[1]  Jiebo Luo,et al.  Image Captioning with Semantic Attention , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Michael C. Mozer,et al.  A Connectionist Symbol Manipulator that Discovers the Structure of Context-Free Languages , 1992, NIPS.

[3]  Jason Weston,et al.  Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks , 2015, ICLR.

[4]  Jonathan Berant,et al.  Semantic Parsing via Paraphrasing , 2014, ACL.

[5]  Alex Graves,et al.  Neural Turing Machines , 2014, ArXiv.

[6]  Richard Socher,et al.  Ask Me Anything: Dynamic Memory Networks for Natural Language Processing , 2015, ICML.

[7]  Jason Weston,et al.  Memory Networks , 2014, ICLR.

[8]  Luke S. Zettlemoyer,et al.  Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars , 2005, UAI.

[9]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[10]  Marius Paca Open-Domain Question Answering from Large Text Collections , 2003, Computational Linguistics.

[11]  Phil Blunsom,et al.  Teaching Machines to Read and Comprehend , 2015, NIPS.

[12]  Alex Graves,et al.  Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.

[13]  Jason Weston,et al.  Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing , 2012, AISTATS.

[14]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[15]  Jian Zhang,et al.  SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.

[16]  Jason Weston,et al.  End-To-End Memory Networks , 2015, NIPS.

[17]  Bert F. Green,et al.  Baseball: an automatic question-answerer , 1899, IRE-AIEE-ACM '61 (Western).

[18]  Quoc V. Le,et al.  Distributed Representations of Sentences and Documents , 2014, ICML.

[19]  Richard Socher,et al.  A Neural Network for Factoid Question Answering over Paragraphs , 2014, EMNLP.

[20]  Jens Lehmann,et al.  DBpedia - A crystallization point for the Web of Data , 2009, J. Web Semant..

[21]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[22]  Jun Zhao,et al.  A Joint Model for Question Answering over Multiple Knowledge Bases , 2016, AAAI.

[23]  Yoshua Bengio,et al.  On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.

[24]  Andrew Chou,et al.  Semantic Parsing on Freebase from Question-Answer Pairs , 2013, EMNLP.

[25]  Colin Cherry,et al.  A Systematic Comparison of Smoothing Techniques for Sentence-Level BLEU , 2014, WMT@ACL.

[26]  Jun Zhao,et al.  Employing External Rich Knowledge for Machine Comprehension , 2016, IJCAI.

[27]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[28]  Jing Gao,et al.  A MultiTask Framework for Monitoring Health Conditions via Attention-based Recurrent Neural Networks , 2017 .

[29]  J. Elman Distributed representations, simple recurrent networks, and grammatical structure , 1991, Machine Learning.

[30]  Fenglong Ma,et al.  A Multi-task Framework for Monitoring Health Conditions via Attention-based Recurrent Neural Networks , 2017, AMIA.

[31]  Tomas Mikolov,et al.  Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets , 2015, NIPS.

[32]  Fenglong Ma,et al.  Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks , 2017, KDD.

[33]  Praveen Paritosh,et al.  Freebase: a collaboratively created graph database for structuring human knowledge , 2008, SIGMOD Conference.