GRIT: Generative Role-filler Transformers for Document-level Event Entity Extraction

We revisit the classic problem of document-level role-filler entity extraction (REE) for template filling. We argue that sentence-level approaches are ill-suited to the task and introduce a generative transformer-based encoder-decoder framework (GRIT) that is designed to model context at the document level: it can make extraction decisions across sentence boundaries; is implicitly aware of noun phrase coreference structure, and has the capacity to respect cross-role dependencies in the template structure. We evaluate our approach on the MUC-4 dataset, and show that our model performs substantially better than prior work. We also show that our modeling choices contribute to model performance, e.g., by implicitly capturing linguistic knowledge such as recognizing coreferent entity mentions.

[1]  Beth Sundheim,et al.  Overview of the Third Message Understanding Evaluation and Conference , 1991, MUC.

[2]  Xiaodong Liu,et al.  Unified Language Model Pre-training for Natural Language Understanding and Generation , 2019, NeurIPS.

[3]  Guillaume Lample,et al.  Neural Architectures for Named Entity Recognition , 2016, NAACL.

[4]  Heyan Huang,et al.  Open Domain Event Extraction Using Neural Latent Variable Models , 2019, ACL.

[5]  Nanyun Peng,et al.  Cross-Sentence N-ary Relation Extraction with Graph LSTMs , 2017, TACL.

[6]  Ralph Grishman,et al.  Using Document Level Cross-Event Inference to Improve Event Extraction , 2010, ACL.

[7]  Ralph Grishman,et al.  Joint Event Extraction via Recurrent Neural Networks , 2016, NAACL.

[8]  Yue Zhao,et al.  Document Embedding Enhanced Event Detection with Hierarchical and Supervised Attention , 2018, ACL.

[9]  Jackie Chi Kit Cheung,et al.  Probabilistic Frame Induction , 2013, NAACL.

[10]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[11]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[12]  Hannaneh Hajishirzi,et al.  Entity, Relation, and Event Extraction with Contextualized Span Representations , 2019, EMNLP.

[13]  Hoifung Poon,et al.  Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision , 2018, EMNLP.

[14]  Heng Ji,et al.  Joint Event Extraction via Structured Prediction with Global Features , 2013, ACL.

[15]  Makoto Miwa,et al.  End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures , 2016, ACL.

[16]  Dan Klein,et al.  An Empirical Investigation of Statistical Significance in NLP , 2012, EMNLP.

[17]  Nathanael Chambers,et al.  Template-Based Information Extraction without the Templates , 2011, ACL.

[18]  Ralph Grishman,et al.  Design of the MUC-6 evaluation , 1995, MUC.

[19]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[20]  Nathanael Chambers,et al.  Event Schema Induction with a Probabilistic Entity-Driven Model , 2013, EMNLP.

[21]  Jun Zhao,et al.  Exploiting Argument Information to Improve Event Detection via Supervised Attention Mechanisms , 2017, ACL.

[22]  Jun Zhao,et al.  Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks , 2015, ACL.

[23]  Mari Ostendorf,et al.  Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction , 2018, EMNLP.

[24]  Xiaocheng Feng,et al.  A language-independent neural network for event detection , 2018, ACL.

[25]  Heng Ji,et al.  Joint Entity and Event Extraction with Generative Adversarial Imitation Learning , 2019, Data Intelligence.

[26]  Benjamin Van Durme,et al.  Multi-Sentence Argument Linking , 2020, ACL.

[27]  Mari Ostendorf,et al.  A general framework for information extraction using dynamic span graphs , 2019, NAACL.

[28]  Di He,et al.  Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation , 2018, NeurIPS.

[29]  Xiao Liu,et al.  Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation , 2018, EMNLP.

[30]  Hoifung Poon,et al.  Document-Level N-ary Relation Extraction with Multiscale Representation Learning , 2019, NAACL.

[31]  Heng Ji,et al.  Refining Event Extraction through Cross-Document Inference , 2008, ACL.

[32]  Tom M. Mitchell,et al.  Joint Extraction of Events and Entities within a Document Context , 2016, NAACL.

[33]  Maosong Sun,et al.  DocRED: A Large-Scale Document-Level Relation Extraction Dataset , 2019, ACL.

[34]  Ellen Riloff,et al.  Peeling Back the Layers: Detecting Event Role Fillers in Secondary Contexts , 2011, ACL.

[35]  Beth M. Sundheim The Message Understanding Conferences , 1996, TIPSTER.

[36]  James H. Martin,et al.  Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition , 2000, Prentice Hall series in artificial intelligence.

[37]  Ralph Grishman,et al.  Improving Event Detection with Abstract Meaning Representation , 2015 .

[38]  Siddharth Patwardhan,et al.  A Unified Model of Phrasal and Sentential Evidence for Information Extraction , 2009, EMNLP.

[39]  Xiaoqiang Luo,et al.  On Coreference Resolution Performance Metrics , 2005, HLT.

[40]  Ellen Riloff,et al.  Modeling Textual Cohesion for Event Extraction , 2012, AAAI.

[41]  Harold W. Kuhn,et al.  The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.

[42]  Xinya Du,et al.  Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding , 2020, ACL.

[43]  Ralph Grishman,et al.  Event Detection and Domain Adaptation with Convolutional Neural Networks , 2015, ACL.

[44]  J. Munkres ALGORITHMS FOR THE ASSIGNMENT AND TRANSIORTATION tROBLEMS* , 1957 .