DIM Reader: Dual Interaction Model for Machine Comprehension

Enabling a computer to understand a document well enough to answer comprehension questions is a central, yet unsolved, goal of Natural Language Processing, making reading comprehension an important problem in NLP research. In this paper, we propose a novel dual interaction model, called DIM Reader, which constructs a dual iterative alternating attention mechanism over multiple hops (our code is available at https://github.com/dlt/mrc-dim). DIM Reader continually refines its view of the query and the document while aggregating the information required to answer the query; it computes attention not only over the document but also over the query, so that each side benefits from their mutual information. DIM Reader uses multiple turns to perform deeper inference over queries and documents. We conduct extensive experiments on the CNN/Daily Mail news datasets, and our model achieves the best results among published results on both machine comprehension datasets.
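The core idea of dual attention with iterative alternating hops can be sketched as follows. This is a minimal illustration, not the paper's exact architecture: the variable names, random encodings in place of real RNN outputs, and the hop count of 3 are all assumptions made for the example. The key points it demonstrates are (a) attention is computed in both directions from a shared match matrix, and (b) query and document glimpses are refined alternately over multiple hops.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_doc, n_query, hidden = 6, 4, 8
D = rng.standard_normal((n_doc, hidden))    # document token encodings (stand-in for encoder states)
Q = rng.standard_normal((n_query, hidden))  # query token encodings

# dual attention: one match matrix, normalized along each axis
M = D @ Q.T                      # pairwise match scores, shape (n_doc, n_query)
doc2query = softmax(M, axis=1)   # for each document token, attention over query tokens
query2doc = softmax(M, axis=0)   # for each query token, attention over document tokens

# iterative alternating refinement over multiple hops
glimpse = Q.mean(axis=0)         # initial query summary
for _ in range(3):               # number of hops (illustrative hyperparameter)
    doc_att = softmax(D @ glimpse)        # attend to document using the query glimpse
    doc_glimpse = doc_att @ D             # aggregate a document summary
    query_att = softmax(Q @ doc_glimpse)  # attend back to the query using the document glimpse
    glimpse = query_att @ Q               # refined query summary for the next hop

print(doc_att)  # final per-token document attention; a real reader would score answer candidates with it
```

Each hop re-reads the query in light of what was gathered from the document, which is what lets the model refine its view of both sides rather than committing to a single fixed query representation.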
