Pre-training Entity Relation Encoder with Intra-span and Inter-span Information

In this paper, we integrate span-related information into a pre-trained encoder for the entity relation extraction task. Instead of using a general-purpose sentence encoder (e.g., an existing universal pre-trained model), we introduce a span encoder and a span pair encoder into the pre-training network, which makes it easier to inject intra-span and inter-span information into the pre-trained model. To learn these encoders, we devise three customized pre-training objectives from different perspectives, targeting tokens, spans, and span pairs respectively. In particular, the span encoder is trained to recover the original order of randomly shuffled tokens within a span, and the span pair encoder is trained with a contrastive loss to distinguish positive pairs (spans drawn from the same sentence) from negative pairs (spans drawn from different sentences). Experimental results show that the proposed pre-training method outperforms distantly supervised pre-training and achieves promising performance on two entity relation extraction benchmarks (ACE05 and SciERC).
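
To make the inter-span objective concrete, below is a minimal sketch of an InfoNCE-style contrastive loss over span pairs, where the positive is a span from the same sentence as the anchor and the negatives come from other sentences. This is an illustrative assumption of how such a loss could be implemented, not the paper's exact formulation; the function name `span_pair_contrastive_loss` and the assumption that span representations have already been pooled from the encoder are hypothetical.

```python
import torch
import torch.nn.functional as F

def span_pair_contrastive_loss(anchor, positive, negatives, temperature=0.07):
    """InfoNCE-style loss for span pairs (illustrative sketch).

    anchor:    (d,)   pooled representation of a span
    positive:  (d,)   representation of a span from the same sentence
    negatives: (n, d) representations of spans from different sentences
    """
    # Normalize so dot products act as cosine similarities.
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos_logit = (anchor * positive).sum(-1, keepdim=True)      # shape (1,)
    neg_logits = negatives @ anchor                             # shape (n,)
    logits = torch.cat([pos_logit, neg_logits]) / temperature   # shape (1 + n,)

    # The positive pair sits at index 0; cross-entropy pushes its
    # similarity above every negative pair's similarity.
    target = torch.zeros(1, dtype=torch.long)
    return F.cross_entropy(logits.unsqueeze(0), target)

# Toy usage with random span vectors (dimension 8, 4 negative spans).
d = 8
loss = span_pair_contrastive_loss(torch.randn(d), torch.randn(d), torch.randn(4, d))
print(loss.item())
```

In practice the negatives would typically be gathered in-batch from spans of other sentences, and the analogous intra-span objective would score permutations of the tokens inside a span rather than pairs of spans.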
