Empower event detection with bi-directional neural language model

Abstract Event detection is an essential and challenging task in Information Extraction (IE). Recent advances in neural networks make it possible to build reliable models without complicated feature engineering. However, data scarcity hinders further performance gains. Moreover, training data has been underused, since the majority of labels in existing datasets are not event triggers and contribute little to the training process. In this paper, we propose a novel multi-task learning framework that extracts more general patterns from raw data and makes better use of the training data. Specifically, we present two paradigms for incorporating a neural language model into the event detection model, at both the word and character levels: (1) we use the features extracted by the language model as an additional input to the event detection model; (2) we share parameters between the language model and the event detection model via hard parameter sharing. Extensive experiments demonstrate the benefits of the proposed multi-task learning framework for event detection. Compared with previous methods, our approach does not rely on any additional supervision, yet outperforms the majority of them and achieves competitive performance on the ACE 2005 benchmark.
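The second paradigm, hard parameter sharing, can be sketched as a shared encoder whose weights receive gradients from both the language-modeling loss and the event-detection loss. The toy feed-forward encoder, vocabulary sizes, and variable names below are illustrative assumptions, not the paper's actual bi-directional LSTM architecture:

```python
import numpy as np

# Hard parameter sharing sketch: one shared encoder, two task heads.
# (Assumption: a toy lookup-table encoder stands in for the paper's
# bi-directional LSTM; sizes and labels here are arbitrary.)
rng = np.random.default_rng(0)

vocab, hidden, n_event_types = 50, 8, 5
W_shared = rng.normal(size=(vocab, hidden))      # shared encoder weights
W_lm = rng.normal(size=(hidden, vocab))          # language-model head
W_ed = rng.normal(size=(hidden, n_event_types))  # event-detection head

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

token_id, next_token_id, event_label = 3, 7, 2
h = W_shared[token_id]          # shared representation used by both tasks
lm_probs = softmax(h @ W_lm)    # auxiliary task: predict the next token
ed_probs = softmax(h @ W_ed)    # main task: predict the event type

# Joint training objective: summing the two cross-entropy losses means
# gradients from both tasks flow into W_shared.
loss = -np.log(lm_probs[next_token_id]) - np.log(ed_probs[event_label])
print(float(loss) > 0.0)
```

Because the auxiliary language-modeling task is supervised by raw text alone, it lets the shared encoder learn from every token, not just the scarce event-trigger labels.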
