In-Context Few-Shot Relation Extraction via Pre-Trained Language Models

Relation extraction aims to infer structured human knowledge from textual documents. State-of-the-art methods based on language models commonly have two limitations: (1) they require named entities to be either given as input or inferred by the model, which introduces additional noise, and (2) they require human-annotated documents for training. As a remedy, we present a novel framework for in-context few-shot relation extraction via pre-trained language models. To the best of our knowledge, we are the first to reformulate the relation extraction task as a tailored in-context few-shot learning paradigm. This yields crucial benefits: we eliminate the need for both named entity recognition and human annotation of documents. Unlike existing methods based on fine-tuning, our framework is flexible in that it can be easily updated for a new set of relations without re-training. We evaluate our framework on DocRED, the largest publicly available dataset for document-level relation extraction, and demonstrate that it achieves state-of-the-art performance. Finally, our framework allows us to identify missing annotations, and we thus show that our framework actually performs far better than the original labels of the DocRED development set suggest.
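The following is a minimal, hypothetical sketch of how relation extraction can be cast as in-context few-shot learning: a task instruction, a handful of demonstration documents with their gold triples, and the query document are concatenated into a prompt, and the language model's completion is parsed into (head, relation, tail) triples. The `complete` callable, the prompt layout, and the triple format are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch (not the authors' implementation) of in-context few-shot
# relation extraction. `complete` is a hypothetical stand-in for whatever
# pre-trained language model completion API is used.

from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def build_prompt(demos: List[Tuple[str, List[Triple]]],
                 relations: List[str],
                 document: str) -> str:
    """Concatenate a task instruction, few-shot demonstrations, and the query document."""
    lines = [
        "Extract all relations between entities in the document.",
        "Allowed relations: " + ", ".join(relations),
        "",
    ]
    for demo_doc, demo_triples in demos:
        lines.append("Document: " + demo_doc)
        lines.append("Triples:")
        lines.extend(f"({h} ; {r} ; {t})" for h, r, t in demo_triples)
        lines.append("")
    lines.append("Document: " + document)
    lines.append("Triples:")
    return "\n".join(lines)


def parse_triples(completion: str) -> List[Triple]:
    """Parse lines of the form '(head ; relation ; tail)' emitted by the model."""
    triples = []
    for line in completion.splitlines():
        parts = [p.strip() for p in line.strip().strip("()").split(";")]
        if len(parts) == 3:
            triples.append((parts[0], parts[1], parts[2]))
    return triples


def extract_relations(complete: Callable[[str], str],
                      demos: List[Tuple[str, List[Triple]]],
                      relations: List[str],
                      document: str) -> List[Triple]:
    """Run one in-context extraction pass: prompt the model, parse its output."""
    return parse_triples(complete(build_prompt(demos, relations, document)))
```

In this sketch, adapting the extractor to a new relation set only requires changing the `relations` list in the prompt, which mirrors the flexibility claimed above: no fine-tuning or re-training is involved.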
