Finding Influential Instances for Distantly Supervised Relation Extraction

Distant supervision (DS) is an effective way to expand training datasets for relation extraction (RE) models, but it often suffers from high label noise. Existing approaches based on attention, reinforcement learning, or GANs are black-box models: they neither provide a meaningful interpretation of sample selection in DS nor remain stable across domains. In contrast, this work proposes REIF, a novel model-agnostic instance sampling method for DS based on influence functions (IFs). Our method identifies favorable and unfavorable instances in each bag via their influence scores, then performs dynamic instance sampling. We design a fast influence sampling algorithm that reduces the computational complexity from \mathcal{O}(mn) to \mathcal{O}(1), and we analyze its robustness with respect to the chosen sampling function. Experiments show that, by simply sampling the favorable instances during training, REIF outperforms a series of baselines with far more complicated architectures. We also demonstrate that REIF supports interpretable instance selection.
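To make the favorable/unfavorable distinction concrete, the following is a minimal sketch of influence-function-based instance scoring on a toy L2-regularized logistic-regression model, using the standard up-weighting formula I(z) = -∇L(z_val)ᵀ H⁻¹ ∇L(z). All variable names, the toy data, and the sign-based selection rule are illustrative assumptions; this is not the paper's REIF algorithm or its fast sampling scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary data; a few training labels are flipped to mimic DS label noise.
n_train, n_val, d = 200, 50, 5
w_true = rng.normal(size=d)
X_tr = rng.normal(size=(n_train, d))
y_tr = (X_tr @ w_true > 0).astype(float)
noisy = rng.choice(n_train, size=20, replace=False)
y_tr[noisy] = 1.0 - y_tr[noisy]          # injected label noise
X_va = rng.normal(size=(n_val, d))
y_va = (X_va @ w_true > 0).astype(float)

# Fit L2-regularized logistic regression by plain gradient descent.
lam = 1e-2
w = np.zeros(d)
for _ in range(2000):
    p = sigmoid(X_tr @ w)
    grad = X_tr.T @ (p - y_tr) / n_train + lam * w
    w -= 0.5 * grad

# Hessian of the mean training loss at the fitted parameters.
p = sigmoid(X_tr @ w)
H = (X_tr * (p * (1.0 - p))[:, None]).T @ X_tr / n_train + lam * np.eye(d)

# Per-instance influence on the mean validation loss:
#   I(z) = -g_val^T H^{-1} g_z, with g_z the loss gradient at instance z.
g_val = X_va.T @ (sigmoid(X_va @ w) - y_va) / n_val
H_inv_g = np.linalg.solve(H, g_val)
per_grads = X_tr * (sigmoid(X_tr @ w) - y_tr)[:, None]   # one row per instance
influence = -per_grads @ H_inv_g

# Favorable instances: up-weighting them is predicted to LOWER validation
# loss (negative influence); the rest are unfavorable and can be down-sampled.
favorable = influence < 0
```

In this sketch the model is retrained from scratch and the Hessian is inverted exactly, which already costs far more than the paper's \mathcal{O}(1) sampling; the point is only to show how an influence score splits a bag into favorable and unfavorable instances.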
