DPNPED: Dynamic Perception Network for Polysemous Event Trigger Detection

Event detection is the task of analyzing text to detect occurrences of events and categorize them. The standard approach is to identify and classify event triggers. Most previous work has focused on improving the recognition and classification networks while neglecting the representation of polysemous event triggers. Polysemous words are semantically ambiguous, which makes them especially hard to detect as triggers. To improve polysemous trigger detection, this paper proposes a novel framework called DPNPED, which dynamically adjusts network depth between polysemous and common words. First, to measure polysemy, a difficulty factor is devised based on how frequently a word occurs as an event trigger. Second, DPNPED uses a confidence measure to automatically adjust the network depth by comparing the predicted and initial probability distributions. Finally, the model applies focal loss to dynamically integrate the difficulty factor and confidence measure, enhancing the learning of polysemous triggers. Experimental results show that the method achieves a noticeable improvement in polysemous event trigger detection.
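The two mechanisms above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the KL divergence as the concrete distance between the predicted and initial distributions, and the `threshold`, `gamma`, and `difficulty` values are all assumptions made for the example.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions. Used here as a
    hedged stand-in for the paper's confidence measure comparing a
    layer's predicted distribution with the initial one."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def should_exit(predicted, initial, threshold=0.5):
    """Depth adaptation: stop at a shallower layer once the prediction
    has moved far enough from the initial distribution, i.e. the model
    is confident. `threshold` is a hypothetical hyperparameter."""
    return kl_divergence(predicted, initial) > threshold

def focal_loss(p_true, gamma=2.0, difficulty=1.0):
    """Focal loss for one token: scales cross-entropy (-log p) by
    (1 - p)^gamma so easy, high-confidence examples are down-weighted
    (Lin et al., 2017). The extra `difficulty` factor (hypothetical
    name) lets more polysemous triggers contribute more to the loss.

    p_true     -- model probability assigned to the gold label
    gamma      -- focusing parameter of the focal loss
    difficulty -- per-word weight, e.g. derived from how often the
                  word occurs as a trigger versus a non-trigger
    """
    return difficulty * (1.0 - p_true) ** gamma * (-math.log(p_true))

# A confident prediction diverges strongly from the uniform prior,
# so the network can exit early; an uncertain one keeps computing.
uniform = [0.25, 0.25, 0.25, 0.25]
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.30, 0.25, 0.25, 0.20]

# A well-classified common word contributes little loss, while an
# uncertain polysemous trigger with a high difficulty factor dominates.
easy_loss = focal_loss(0.95)
hard_loss = focal_loss(0.40, difficulty=2.0)
```

The design point of the focal-loss weighting is that `(1 - p)^gamma` already suppresses easy examples; multiplying in a corpus-level difficulty factor additionally shifts gradient mass toward words that are ambiguous regardless of the current prediction.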
