How Does Context Matter? On the Robustness of Event Detection with Context-Selective Mask Generalization

Event detection (ED) aims to identify and classify event triggers in text and is a crucial subtask of event extraction (EE). Despite many advances in ED, existing studies typically focus on improving the overall performance of an ED model and rarely consider its robustness. This paper aims to fill this research gap by stressing the importance of robustness modeling in ED. We first pinpoint three stark cases that demonstrate the brittleness of existing ED models. After analyzing the underlying causes, we propose a new training mechanism for ED, called context-selective mask generalization, which effectively mines context-specific patterns for learning and thereby makes an ED model more robust. Experimental results confirm the effectiveness of our approach in defending against adversarial attacks, handling unseen predicates, and resolving ambiguous cases. Moreover, a deeper analysis suggests that our approach learns a predictive bias complementary to that of most ED models, which use the full context for feature learning.
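The abstract only names the mechanism; as a rough, hypothetical illustration of what selective context masking during training could look like, the sketch below randomly replaces non-trigger tokens with [MASK] before feeding the sentence to a BERT-based trigger classifier. This is a minimal sketch under assumptions not stated here: the Bernoulli masking scheme, the mask_context helper, and the label count are all illustrative, not the authors' actual implementation.

```python
# Hypothetical sketch of context-selective masking for ED training.
# The masking scheme and all names here are assumptions, not the
# paper's actual method.
import random
from transformers import BertTokenizerFast, BertForTokenClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
# e.g., 33 ACE 2005 event subtypes plus a "no event" label (assumed)
model = BertForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=34
)

def mask_context(tokens, trigger_idx, p=0.5):
    """Replace each non-trigger token with [MASK] with probability p,
    so the classifier must rely on whichever context cues survive."""
    return [tok if i == trigger_idx or random.random() > p
            else tokenizer.mask_token
            for i, tok in enumerate(tokens)]

tokens = ["The", "troops", "fired", "on", "the", "crowd"]
masked = mask_context(tokens, trigger_idx=2, p=0.5)
enc = tokenizer(masked, is_split_into_words=True, return_tensors="pt")
logits = model(**enc).logits  # per-token event-type scores
```

Training on such masked views alongside the full-context input is one plausible way to obtain the complementary predictive bias the abstract describes: a model exposed to both masked and unmasked contexts cannot rely on any single surface cue.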
