Contrastive Triple Extraction with Generative Transformer

Triple extraction is an essential task in information extraction for natural language processing and knowledge graph construction. In this paper, we revisit end-to-end triple extraction as a sequence generation problem. Since generative triple extraction may struggle to capture long-term dependencies and can generate unfaithful triples, we introduce a novel model: contrastive triple extraction with a generative transformer. Specifically, we use a single shared transformer module for encoder-decoder-based generation. To generate faithful results, we propose a novel triplet contrastive training objective. Moreover, we introduce two mechanisms to further improve model performance (i.e., batch-wise dynamic attention-masking and triple-wise calibration). Experimental results on three datasets (i.e., NYT, WebNLG, and MIE) show that our approach outperforms the baselines. Our code and datasets will be released after publication.
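The abstract does not spell out the triplet contrastive objective, so below is a minimal sketch of one plausible margin-based form, assuming pooled vector representations from the shared transformer for the input sentence, its gold triple sequence, and a corrupted (negative) triple. All function and tensor names here are our own illustrations, not the paper's; the exact formulation may differ.

    import torch
    import torch.nn.functional as F

    def triplet_contrastive_loss(src: torch.Tensor,
                                 pos: torch.Tensor,
                                 neg: torch.Tensor,
                                 margin: float = 1.0) -> torch.Tensor:
        """Margin-based triplet contrastive objective (illustrative sketch).

        src: (B, d) pooled representation of each input sentence
        pos: (B, d) representation of its gold (faithful) triple sequence
        neg: (B, d) representation of a corrupted triple, e.g. one with a
             swapped entity or relation
        """
        pos_sim = F.cosine_similarity(src, pos, dim=-1)   # (B,)
        neg_sim = F.cosine_similarity(src, neg, dim=-1)   # (B,)
        # Hinge: the gold triple should score higher than the corrupted
        # one by at least `margin`, pushing the decoder toward faithful output.
        return F.relu(margin - pos_sim + neg_sim).mean()

    # Toy usage with random vectors standing in for pooled transformer states:
    B, d = 4, 256
    loss = triplet_contrastive_loss(torch.randn(B, d),
                                    torch.randn(B, d),
                                    torch.randn(B, d))
    print(loss.item())

A contrastive term of this kind would be added to the standard generation (cross-entropy) loss, so the model is trained both to decode triples and to rank faithful triples above corrupted ones.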
