AdaPrompt: Adaptive Prompt-based Finetuning for Relation Extraction

In this paper, we reformulate the relation extraction task as masked language modeling and propose a novel adaptive prompt-based finetuning approach. We introduce an adaptive label word selection mechanism that scatters each relation label over a variable number of label tokens, allowing the model to handle the complex relation label space. We further add an auxiliary entity discriminator objective that encourages the model to focus on context representation learning. Extensive experiments on benchmark datasets demonstrate that our approach achieves better performance in both the few-shot and supervised settings.
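To make the reformulation concrete, below is a minimal sketch (not the authors' implementation) of relation extraction cast as masked language modeling: a cloze-style prompt is built around the entity pair, and each relation is scored by the MLM logit of its label word at the [MASK] position. The template, the one-word label map, and the example sentence are illustrative assumptions; AdaPrompt's adaptive mechanism instead selects a variable number of label tokens per relation.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Hypothetical one-word label map; the paper's adaptive selection maps each
# relation to a variable number of label tokens rather than a single word.
label_words = {
    "per:employee_of": "employee",
    "org:founded_by": "founder",
    "no_relation": "nothing",
}

sentence = "Steve Jobs founded Apple in 1976."
head, tail = "Steve Jobs", "Apple"

# Cloze-style prompt: the [MASK] slot is to be filled by a relation label word.
prompt = f"{sentence} {head} is the [MASK] of {tail}."
inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    mask_logits = model(**inputs).logits[0, mask_pos]  # shape: (1, vocab_size)

# Score each relation by the MLM logit of its label word at the masked position.
scores = {
    rel: mask_logits[0, tokenizer.convert_tokens_to_ids(word)].item()
    for rel, word in label_words.items()
}
print(max(scores, key=scores.get))
```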
