Denoising Relation Extraction from Document-level Distant Supervision

Distant supervision (DS) has been widely used to generate auto-labeled data for sentence-level relation extraction (RE), which improves RE performance. However, the success of DS does not transfer directly to the more challenging task of document-level relation extraction (DocRE), because the noise inherent in DS can be multiplied at the document level and significantly harm RE performance. To address this challenge, we propose a novel pre-trained model for DocRE that denoises document-level DS data via multiple pre-training tasks. Experimental results on the large-scale DocRE benchmark show that our model can capture useful information from noisy DS data and achieve promising results.
