Entity Tracking Improves Cloze-style Reading Comprehension

Recent work has improved modeling for reading comprehension tasks with simple approaches such as the Attention Sum Reader; however, automatic systems still trail human performance by a significant margin. Analysis suggests that many of the remaining hard instances stem from an inability to track entity references throughout a document. This work targets these hard entity-tracking cases with two extensions: (1) additional entity features, and (2) training with a multi-task tracking objective. We show that these simple modifications improve performance both independently and in combination; we outperform the previous state of the art on the LAMBADA dataset by 8 points, particularly on difficult entity examples, and effectively match the performance of more complex models on the named-entity portion of the CBT dataset.

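To make the base model concrete, the sketch below illustrates the attention-sum readout that the Attention Sum Reader family uses: the model attends from the cloze query over document tokens and scores each candidate entity by summing the attention mass over all of its mentions. This is a minimal illustration only; the shapes, function names, and use of plain NumPy are assumptions, and the paper's actual contributions (extra entity features and the multi-task tracking objective) would be added on top of an encoder that produces these token states.

```python
# Minimal sketch of an attention-sum readout (in the spirit of the
# Attention Sum Reader). Names and shapes are illustrative assumptions,
# not the authors' implementation.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_sum(doc_states, query_state, candidate_positions):
    """doc_states: (T, d) contextual encodings of document tokens.
    query_state: (d,) encoding of the cloze query.
    candidate_positions: dict mapping each candidate entity to the list
    of token positions where it is mentioned in the document.
    Returns a score per candidate: attention pointed at document tokens,
    summed over each entity's mentions."""
    scores = doc_states @ query_state                     # dot-product attention, shape (T,)
    attn = softmax(scores)                                # distribution over document tokens
    return {cand: attn[pos].sum() for cand, pos in candidate_positions.items()}

# Toy usage: 6 document tokens, 4-dim states, two hypothetical candidates.
rng = np.random.default_rng(0)
doc = rng.normal(size=(6, 4))
query = rng.normal(size=4)
probs = attention_sum(doc, query, {"@entity1": [0, 3], "@entity2": [2, 5]})
print(probs)  # the predicted answer is the argmax over candidates
```

In a full model, the entity features described in the abstract would presumably be concatenated to the token representations before encoding, and the multi-task tracking objective would be trained jointly with this cloze objective; both are sketched here only at the level of where they would plug in.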