Classifier-adaptation knowledge distillation framework for relation extraction and event detection with imbalanced data

Abstract

Fundamental information extraction tasks such as relation extraction and event detection suffer from a data imbalance problem. To alleviate this problem, existing methods rely mostly on carefully designed loss functions that reduce the negative influence of imbalanced data. However, such loss functions introduce additional hyper-parameters and limit scalability, and they typically benefit only a specific task rather than providing a unified framework across relation extraction and event detection. In this paper, a Classifier-Adaptation Knowledge Distillation (CAKD) framework is proposed to address these issues and thereby improve relation extraction and event detection performance. First, sentence-level identification information, which is common to both relation extraction and event detection, is exploited; it reduces identification errors caused by data imbalance without introducing additional hyper-parameters. A teacher network then uses this sentence-level identification information to guide the training of the baseline model by sharing its classifier. Like an instructor, the shared classifier improves the baseline model's ability to extract this sentence-level identification information from raw text, which benefits overall performance. Experiments were conducted on relation extraction and event detection using the Text Analysis Conference Relation Extraction Dataset (TACRED) and the Automatic Content Extraction (ACE) 2005 English dataset, respectively. The results demonstrate the effectiveness of the proposed framework.
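As a rough illustration of the classifier-sharing idea sketched in the abstract, the following PyTorch snippet shows one way a teacher's classifier head could be frozen and reused while training a student (baseline) encoder, so that the student must learn features the teacher's classifier can already separate. This is a minimal sketch under assumed module names, dimensions, and a toy encoder; it is not the paper's exact architecture or training procedure.

```python
import torch
import torch.nn as nn

HIDDEN, NUM_CLASSES = 256, 42  # illustrative sizes, e.g. 41 TACRED relations + no_relation

class Encoder(nn.Module):
    """Toy sentence encoder; a real system would use an LSTM- or BERT-based encoder."""
    def __init__(self, vocab_size=30000, hidden=HIDDEN):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, hidden)  # mean-pools token embeddings

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> sentence representation (batch, hidden)
        return self.embed(token_ids)

# Stage 1 (assumed): the teacher encoder and its classifier are trained first.
teacher_enc = Encoder()
shared_clf = nn.Linear(HIDDEN, NUM_CLASSES)  # the classifier later shared with the student
# ... teacher training loop over labeled data would go here ...

# Stage 2: train the student encoder against the teacher's frozen, shared classifier.
student_enc = Encoder()
for p in shared_clf.parameters():
    p.requires_grad = False  # the shared classifier acts as the "instructor"

optimizer = torch.optim.Adam(student_enc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def student_step(token_ids, labels):
    # Student features are scored by the teacher's classifier; gradients flow
    # only into the student encoder because the classifier is frozen.
    logits = shared_clf(student_enc(token_ids))
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with random data:
ids = torch.randint(0, 30000, (8, 20))
labels = torch.randint(0, NUM_CLASSES, (8,))
print(student_step(ids, labels))
```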
