The Referential Reader: A Recurrent Entity Network for Anaphora Resolution

We present a new architecture for storing and accessing entity mentions during online text processing. While reading the text, entity references are identified and may be stored by either updating or overwriting a cell in a fixed-length memory. The update operation implies coreference with the other mentions that are stored in the same cell; the overwrite operation causes these mentions to be forgotten. By encoding the memory operations as differentiable gates, it is possible to train the model end-to-end, using both a supervised anaphora resolution objective and a supplementary language modeling objective. Evaluation on a dataset of pronoun-name anaphora demonstrates that the model achieves state-of-the-art performance with purely left-to-right processing of the text.
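To make the memory operations concrete, the following is a minimal PyTorch sketch of a fixed-length entity memory with differentiable update and overwrite gates. The class name GatedEntityMemory, the linear gate parameterization, and the candidate computation are illustrative assumptions, not the paper's exact equations.

```python
import torch
import torch.nn as nn

class GatedEntityMemory(nn.Module):
    """Sketch of a fixed-length entity memory with differentiable
    update and overwrite gates (gate parameterization and candidate
    computation are assumptions for illustration)."""

    def __init__(self, hidden_dim: int, num_cells: int):
        super().__init__()
        self.num_cells = num_cells
        self.update_gate = nn.Linear(2 * hidden_dim, 1)     # blend mention into a cell
        self.overwrite_gate = nn.Linear(2 * hidden_dim, 1)  # replace a cell's contents
        self.candidate = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, h_t: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # h_t:    (batch, hidden_dim) token state from a left-to-right encoder
        # memory: (batch, num_cells, hidden_dim) current entity cells
        h = h_t.unsqueeze(1).expand(-1, self.num_cells, -1)
        pair = torch.cat([h, memory], dim=-1)
        u = torch.sigmoid(self.update_gate(pair))     # (batch, num_cells, 1)
        o = torch.sigmoid(self.overwrite_gate(pair))
        cand = torch.tanh(self.candidate(h))
        # Update accumulates the mention into a cell, implying coreference
        # with the mentions already stored there; overwrite replaces the
        # cell, causing those mentions to be forgotten.
        updated = memory + u * cand
        return (1 - o) * updated + o * cand


# Usage: process token states left to right, carrying the memory forward.
mem_net = GatedEntityMemory(hidden_dim=64, num_cells=4)
memory = torch.zeros(1, 4, 64)
for h_t in torch.randn(10, 1, 64):  # ten token states, batch of one
    memory = mem_net(h_t, memory)
```

Because both gates are sigmoid-valued, every write to the memory is differentiable, so a reader built around such a module can be trained end-to-end with the supervised anaphora resolution and language modeling objectives described above.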
