Improving Event Detection using Contextual Word and Sentence Embeddings

Event Detection (ED) is a subtask of Information Extraction (IE) that consists of recognizing event mentions in natural language texts. Several applications can take advantage of an ED system, including alert systems, text summarization, question-answering systems, and any system that needs to extract structured information about events from unstructured texts. ED is a complex task hampered by two main challenges: the lack of a dataset large enough to train and test the developed models, and the variety of event type definitions that exist in the literature. These problems make generalization hard to achieve, resulting in poor adaptation to different domains and targets. The main contribution of this paper is the design, implementation, and evaluation of a recurrent neural network model for ED that combines several features. In particular, the paper makes the following contributions: (1) it uses BERT embeddings to define contextual word and contextual sentence embeddings as attributes, which to the best of our knowledge had never been used before for the ED task; (2) it proposes a model that can use its first layer to learn good feature representations; (3) it introduces a new public dataset with a general definition of event; and (4) it presents an extensive empirical evaluation that includes (i) the exploration of different architectures and hyperparameters, (ii) an ablation test to study the impact of each attribute, and (iii) a comparison with a replication of a state-of-the-art model. The results offer several insights into the importance of contextual embeddings and indicate that the proposed approach is effective in the ED task, outperforming the baseline models.
