Similar but not the Same: Word Sense Disambiguation Improves Event Detection via Neural Representation Matching

Event detection (ED) and word sense disambiguation (WSD) are similar tasks: both identify the class (i.e., the event type or the word sense) of a word in a given sentence. It should therefore be possible to extract the knowledge hidden in WSD data and use it to improve performance on ED. In this work, we propose a method that transfers the knowledge learned on WSD to ED by matching the neural representations learned for the two tasks. Our experiments on two widely used ED datasets demonstrate the effectiveness of the proposed method.
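
As a rough illustration of what matching the representations of the two tasks could look like, the sketch below adds a penalty on the distance between the ED and WSD representations of the same word to the usual ED classification loss. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the mean squared distance, the weight `lam`, and the function names `matching_penalty` and `combined_loss` are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, gold_index):
    """Negative log-likelihood of the gold class under softmax(logits)."""
    return -math.log(softmax(logits)[gold_index])


def matching_penalty(ed_repr, wsd_repr):
    """Mean squared distance between two representations of the same word.

    Assumption: both networks produce vectors of equal dimension.
    """
    if len(ed_repr) != len(wsd_repr):
        raise ValueError("representations must have the same dimension")
    return sum((a - b) ** 2 for a, b in zip(ed_repr, wsd_repr)) / len(ed_repr)


def combined_loss(ed_logits, gold_event_type, ed_repr, wsd_repr, lam=0.1):
    """ED classification loss plus a weighted representation-matching term.

    `lam` is a hypothetical trade-off weight, not a value from the paper.
    """
    return cross_entropy(ed_logits, gold_event_type) + lam * matching_penalty(
        ed_repr, wsd_repr
    )


if __name__ == "__main__":
    # Toy values: scores over 3 event types and 4-dimensional representations.
    ed_logits = [2.0, 0.5, -1.0]
    ed_repr = [0.2, -0.1, 0.4, 0.0]
    wsd_repr = [0.1, -0.1, 0.5, 0.2]
    print(combined_loss(ed_logits, 0, ed_repr, wsd_repr))
```

With `lam` set to 0 the loss reduces to plain ED training; larger values pull the ED representation toward the one learned for WSD.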
