Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation

Document-level Relation Extraction (DocRE) is a more challenging task than its sentence-level counterpart, as it aims to extract relations among entities spread across multiple sentences. In this paper, we propose a semi-supervised framework for DocRE with three novel components. First, we use an axial attention module to learn the interdependency among entity pairs, which improves performance on two-hop relations. Second, we propose an adaptive focal loss to tackle the class-imbalance problem of DocRE. Third, we use knowledge distillation to overcome the differences between human-annotated data and distantly supervised data. We conducted experiments on two DocRE datasets. Our model consistently outperforms strong baselines, and its performance exceeds the previous SOTA by 1.36 F1 and 1.46 Ign_F1 on the DocRED leaderboard.
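To make the class-imbalance motivation concrete, the sketch below shows the standard binary focal loss (Lin et al., 2017) applied to multi-label relation classification over entity pairs. Note this is only the baseline formulation the paper builds on; the paper's *adaptive* variant modifies it, and the function name, array shapes, and `gamma` default here are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def focal_loss(logits, targets, gamma=2.0):
    """Standard multi-label focal loss (not the paper's adaptive variant).

    logits:  (num_pairs, num_relations) raw scores per entity pair
    targets: (num_pairs, num_relations) multi-hot relation labels
    gamma:   focusing parameter; gamma=0 recovers plain BCE
    """
    probs = sigmoid(logits)
    # p_t: probability the model assigns to the true outcome of each label
    p_t = probs * targets + (1.0 - probs) * (1.0 - targets)
    ce = -np.log(np.clip(p_t, 1e-12, 1.0))  # element-wise cross-entropy
    # (1 - p_t)^gamma down-weights easy, well-classified examples,
    # shifting the loss toward the rare positive relation labels
    return float(np.mean(((1.0 - p_t) ** gamma) * ce))
```

With `gamma > 0`, confident correct predictions (dominated by the abundant no-relation class in DocRE) contribute almost nothing, so the rare positive relations drive the gradient.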