PeTra: A Sparsely Supervised Memory Model for People Tracking

We propose PeTra, a memory-augmented neural network that tracks entities in its memory slots. PeTra is trained with sparse annotation from the GAP pronoun resolution dataset and outperforms a prior memory model on the task while using a simpler architecture. We empirically compare key modeling choices, finding that several aspects of the memory module's design can be simplified while retaining strong performance. To measure the people-tracking capability of memory models, we (a) propose a new diagnostic evaluation based on counting the number of unique entities in a text, and (b) conduct a small-scale human evaluation comparing evidence of people tracking in the memory logs of PeTra against a previous approach. PeTra is highly effective in both evaluations, demonstrating its ability to track people in its memory despite being trained with limited annotation.
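To make the slot-based tracking idea concrete, the sketch below shows a toy memory module in which each incoming token representation softly attends over a fixed set of memory slots and writes a gated update to them. This is an illustrative simplification of the general memory-model family, not PeTra's actual architecture; all names, dimensions, and update rules here are assumptions.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

class SlotMemory:
    """Toy entity-slot memory (illustrative only, not PeTra itself).

    Each token softly attends over K slots; slots are updated by
    interpolating toward a candidate write vector, weighted by the
    attention. Reading off the attention distribution per mention
    gives a (soft) assignment of mentions to entity slots.
    """

    def __init__(self, num_slots, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.slots = np.zeros((num_slots, dim))
        self.W_key = rng.normal(scale=0.1, size=(dim, dim))
        self.W_write = rng.normal(scale=0.1, size=(dim, dim))

    def step(self, token_vec):
        # score each slot against a projection of the token
        scores = self.slots @ (self.W_key @ token_vec)
        attn = softmax(scores)                       # soft slot assignment
        update = np.tanh(self.W_write @ token_vec)   # candidate slot content
        # gated write: slots move toward the update by their attention weight
        self.slots = (1 - attn[:, None]) * self.slots + attn[:, None] * update
        return attn
```

Running `step` over a sequence of contextual token vectors yields, per token, a distribution over slots; inspecting those distributions at mention positions is the kind of "memory log" the paper's human evaluation examines.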
