Document-level Event-based Extraction Using Generative Template-filling Transformers

We revisit the classic information extraction problem of document-level template filling. We argue that sentence-level approaches are ill-suited to the task and introduce a generative transformer-based encoder-decoder framework designed to model context at the document level: it can make extraction decisions across sentence boundaries, is implicitly aware of noun phrase coreference structure, and has the capacity to respect cross-role dependencies in the template structure. We evaluate our approach on the MUC-4 dataset and show that our model performs substantially better than prior work. We also show that our modeling choices contribute to model performance, e.g., by implicitly capturing linguistic knowledge such as the recognition of coreferent entity mentions. Our code for the evaluation script and models will be open-sourced for reproduction purposes.
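To make the generative framing concrete, the sketch below shows one way a filled template could be flattened into a single target string for a decoder and then recovered from generated text. It is an illustration only, not the paper's actual format: the role names, the `[SEP]` role delimiter, the `;` filler separator, and the functions `linearize` and `parse` are all assumptions introduced here.

```python
from typing import Dict, List

# Hypothetical role inventory; the names are illustrative assumptions.
ROLES = ["PerpInd", "PerpOrg", "Target", "Victim", "Weapon"]
ROLE_SEP = "[SEP]"   # assumed marker that closes each role's span
FILLER_SEP = ";"     # assumed separator between fillers of one role


def linearize(template: Dict[str, List[str]]) -> str:
    """Flatten a role -> fillers mapping into one target string.

    Roles are emitted in a fixed order, so the position of each
    ROLE_SEP identifies the role without spelling out its name.
    """
    parts: List[str] = []
    for role in ROLES:
        fillers = template.get(role, [])
        parts.append(f" {FILLER_SEP} ".join(fillers))
        parts.append(ROLE_SEP)
    # Drop the empty strings produced by roles with no fillers.
    return " ".join(p for p in parts if p)


def parse(sequence: str) -> Dict[str, List[str]]:
    """Invert linearize: split on role markers, then on filler separators."""
    chunks = sequence.split(ROLE_SEP)
    template: Dict[str, List[str]] = {}
    for role, chunk in zip(ROLES, chunks):
        fillers = [f.strip() for f in chunk.split(FILLER_SEP)]
        template[role] = [f for f in fillers if f]
    return template


example = {"PerpInd": ["two men"], "Target": ["the embassy", "a bus"]}
target_sequence = linearize(example)
recovered = parse(target_sequence)
```

Because the whole template is a single output sequence, each generated filler is conditioned on the fillers already produced for earlier roles, which is one way a decoder could respect cross-role dependencies. A filler that itself contains the separator character would break this toy round trip.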
