Emergent Logical Structure in Vector Representations of Neural Readers

Reading comprehension is a question answering task in which the answer must be found in a given passage describing entities and events that are not covered by general knowledge sources. A significant number of neural architectures for this task (neural readers) have recently been developed and evaluated on large cloze-style datasets. We present experiments supporting the existence of logical structure in the hidden state vectors of "aggregation readers" such as the Attentive Reader and the Stanford Reader. The logical structure of aggregation readers reflects the architecture of "explicit reference readers" such as the Attention-Sum Reader, the Gated Attention Reader, and the Attention-over-Attention Reader. This relationship between aggregation readers and explicit reference readers presents a case study in emergent logical structure. In an independent contribution, we show that adding linguistic features to the input of existing neural readers significantly boosts performance, yielding the best results to date on the Who-did-What datasets.
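
To make the architectural contrast concrete, the sketch below (not the paper's implementation; all tensors are random placeholders and the candidate positions are invented for illustration) shows, under simplified assumptions, how an aggregation reader scores answers through a single attention-weighted summary vector, while an explicit reference reader scores each candidate by summing attention mass over the passage positions where that candidate occurs.

    # Minimal sketch contrasting the two readout styles named in the abstract.
    # Hidden states H, question vector q, candidate embeddings E, and the
    # candidate occurrence positions are all assumed placeholders.
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    T, d = 6, 4                          # passage length, hidden size
    H = rng.normal(size=(T, d))          # per-token passage hidden states
    q = rng.normal(size=d)               # question representation
    candidates = {"@entity1": [0, 3], "@entity2": [2, 5]}   # token positions
    E = {c: rng.normal(size=d) for c in candidates}         # answer embeddings

    alpha = softmax(H @ q)               # attention over passage tokens

    # Aggregation reader (Attentive/Stanford style): build one summary vector,
    # then score each candidate by an inner product with its answer embedding.
    summary = alpha @ H
    agg_scores = {c: float(E[c] @ summary) for c in candidates}

    # Explicit reference reader (Attention-Sum style): score each candidate by
    # summing the attention mass at the positions where it occurs in the passage.
    ref_scores = {c: float(alpha[pos].sum()) for c, pos in candidates.items()}

    print(agg_scores)
    print(ref_scores)

The paper's claim is that, after training, the hidden state vectors of aggregation readers organize themselves so that the first readout effectively approximates the second.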
