Revisiting the Negative Data of Distantly Supervised Relation Extraction

Distant supervision automatically generates abundant training samples for relation extraction. However, it also incurs two major problems: noisy labels and imbalanced training data. Previous works focus more on reducing wrongly labeled relations (false positives), while few explore the missing relations caused by the incompleteness of the knowledge base (false negatives). Furthermore, the quantity of negative labels overwhelmingly surpasses that of positive ones in previous problem formulations. In this paper, we first provide a thorough analysis of the above challenges caused by negative data. Next, we formulate relation extraction as a positive-unlabeled learning task to alleviate the false negative problem. Third, we propose a pipeline approach, dubbed RERE, that performs sentence-level relation detection and then subject/object extraction to achieve sample-efficient training. Experimental results show that the proposed method consistently outperforms existing approaches and maintains excellent performance even when learned with a large quantity of false positive samples.
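The abstract does not spell out the positive-unlabeled (PU) formulation, but the standard ingredient in PU learning is a risk estimator that treats unlabeled samples as negatives and corrects the bias using an assumed class prior. The sketch below illustrates the widely used non-negative PU risk estimator; the function name, the sigmoid surrogate loss, and the toy inputs are illustrative assumptions, not the paper's actual loss.

```python
import numpy as np

def nn_pu_risk(scores, labels, prior):
    """Non-negative PU risk estimator (illustrative sketch).

    scores : raw logits from a binary relation detector, shape (N,)
    labels : 1 = labeled positive, 0 = unlabeled, shape (N,)
    prior  : assumed class prior pi = P(y = 1)
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    pos, unl = scores[labels == 1], scores[labels == 0]
    # sigmoid surrogate loss: l(z, +1) = sigmoid(-z), l(z, -1) = sigmoid(z)
    risk_pos = sigmoid(-pos).mean()         # positives scored as positive
    risk_pos_as_neg = sigmoid(pos).mean()   # positives scored as negative
    risk_unl = sigmoid(unl).mean()          # unlabeled scored as negative
    # bias-corrected negative risk, clipped at 0 to prevent overfitting
    neg_risk = max(risk_unl - prior * risk_pos_as_neg, 0.0)
    return prior * risk_pos + neg_risk

# Toy usage: two labeled positives, two unlabeled samples
scores = np.array([2.0, -1.0, 0.5, -0.5])
labels = np.array([1, 0, 1, 0])
risk = nn_pu_risk(scores, labels, prior=0.3)
```

Because unlabeled data mixes true positives (missing KB facts) with true negatives, subtracting `prior * risk_pos_as_neg` removes the positive mass hiding in the unlabeled pool, which is exactly the false-negative issue the abstract targets.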
