Improving Relation Extraction by Pre-trained Language Representations

Current state-of-the-art relation extraction methods typically rely on a set of lexical, syntactic, and semantic features, explicitly computed in a pre-processing step. Training feature extraction models requires additional annotated language resources, which severely restricts the applicability and portability of relation extraction to novel languages. In addition, pre-processing introduces another source of error. To address these limitations, we introduce TRE, a Transformer for Relation Extraction that extends the OpenAI Generative Pre-trained Transformer [Radford et al., 2018]. Unlike previous relation extraction models, TRE uses pre-trained deep language representations instead of explicit linguistic features to inform the relation classification, and combines them with the self-attentive Transformer architecture to effectively model long-range dependencies between entity mentions. TRE allows us to learn implicit linguistic features solely from plain text corpora by unsupervised pre-training, before fine-tuning the learned language representations on the relation extraction task. TRE obtains a new state-of-the-art result on the TACRED and SemEval 2010 Task 8 datasets, achieving test F1 scores of 67.4 and 87.1, respectively. Furthermore, we observe a significant increase in sample efficiency: with only 20% of the training examples, TRE matches the performance of our baselines and of our model trained from scratch on 100% of the TACRED dataset. We open-source our trained models, experiments, and source code.

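The abstract's approach, unsupervised pre-training of a Transformer language model followed by fine-tuning on relation extraction, can be pictured with a short sketch. The following PyTorch fragment is a minimal illustration, not the authors' implementation: the `pretrained_lm` module, the use of the final token's hidden state as a sequence summary, and all sizes are assumptions made for the example.

```python
# Minimal sketch (not the authors' TRE code) of fine-tuning a pre-trained
# Transformer language model for relation classification.
import torch
import torch.nn as nn

class RelationClassifier(nn.Module):
    def __init__(self, pretrained_lm: nn.Module, hidden_size: int, num_relations: int):
        super().__init__()
        self.lm = pretrained_lm  # Transformer pre-trained on plain text (assumed interface)
        self.head = nn.Linear(hidden_size, num_relations)  # newly initialized classifier

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        hidden = self.lm(token_ids)   # (batch, seq_len, hidden_size) contextual states
        summary = hidden[:, -1, :]    # final token's state as sequence summary (assumption)
        return self.head(summary)     # relation logits

# Fine-tuning updates the pre-trained weights together with the new head, e.g.:
#   model = RelationClassifier(pretrained_lm, hidden_size=768, num_relations=42)
#   loss = nn.functional.cross_entropy(model(token_ids), relation_labels)
```

Radford et al. [2018] additionally retain the language-modeling loss as an auxiliary objective during fine-tuning; the sketch above omits this and other details such as byte-pair-encoded input.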
[1] Danqi Chen et al. Position-aware Attention and Supervised Data Improve Slot Filling. EMNLP, 2017.

[2] Razvan C. Bunescu et al. A Shortest Path Dependency Kernel for Relation Extraction. HLT, 2005.

[3] Sanda M. Harabagiu et al. UTD: Classifying Semantic Relations by Combining Lexical and Semantic Resources. SemEval, 2010.

[4] Lukasz Kaiser et al. Attention Is All You Need. NIPS, 2017.

[5] Yoshua Bengio et al. Word Representations: A Simple and General Method for Semi-Supervised Learning. ACL, 2010.

[6] Andrew McCallum et al. Simultaneously Self-Attending to All Mentions for Full-Abstract Biological Relation Extraction. NAACL, 2018.

[7] Zhiyuan Liu et al. Hierarchical Relation Extraction with Coarse-to-Fine Grained Attention. EMNLP, 2018.

[8] Christopher D. Manning et al. Graph Convolution over Pruned Dependency Trees Improves Relation Extraction. EMNLP, 2018.

[9] Dongyan Zhao et al. Semantic Relation Classification via Convolutional Neural Networks with Simple Negative Sampling. EMNLP, 2015.

[10] Quoc V. Le et al. Grounded Compositional Semantics for Finding and Describing Images with Sentences. TACL, 2014.

[11] Jun Zhao et al. Relation Classification via Convolutional Deep Neural Network. COLING, 2014.

[12] Emmanuel Dupoux et al. Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies. TACL, 2016.

[13] Andrew Y. Ng et al. Semantic Compositionality through Recursive Matrix-Vector Spaces. EMNLP, 2012.

[14] Lukasz Kaiser et al. Generating Wikipedia by Summarizing Long Sequences. ICLR, 2018.

[15] Jun Zhao et al. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. EMNLP, 2015.

[16] Bowen Zhou et al. Improved Neural Relation Detection for Knowledge Base Question Answering. ACL, 2017.

[17] Jeffrey Pennington et al. GloVe: Global Vectors for Word Representation. EMNLP, 2014.

[18] Luke S. Zettlemoyer et al. Deep Contextualized Word Representations. NAACL, 2018.

[19] Sanja Fidler et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books. ICCV, 2015.

[20] Rico Sennrich et al. Neural Machine Translation of Rare Words with Subword Units. ACL, 2016.

[21] Dmitry Zelenko et al. Kernel Methods for Relation Extraction. Journal of Machine Learning Research, 2003.

[22] Houfeng Wang et al. Bidirectional Recurrent Convolutional Neural Network for Relation Classification. ACL, 2016.

[23] Andrew McCallum et al. Modeling Relations and Their Mentions without Labeled Text. ECML/PKDD, 2010.

[24] Jeffrey Dean et al. Efficient Estimation of Word Representations in Vector Space. ICLR, 2013.

[25] Dong Wang et al. Relation Classification via Recurrent Neural Network. arXiv, 2015.

[26] Preslav Nakov et al. SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals. SEW@NAACL-HLT, 2009.

[27] Oren Etzioni et al. Identifying Relations for Open Information Extraction. EMNLP, 2011.

[28] Sebastian Ruder et al. Universal Language Model Fine-tuning for Text Classification. ACL, 2018.

[29] Zhi Jin et al. Classifying Relations via Long Short Term Memory Networks along Shortest Dependency Paths. EMNLP, 2015.

[30] Heng Ji et al. Knowledge Base Population: Successful Approaches and Challenges. ACL, 2011.

[31] Daniel Jurafsky et al. Distant Supervision for Relation Extraction without Labeled Data. ACL, 2009.

[32] Alec Radford et al. Improving Language Understanding by Generative Pre-Training. 2018.

[33] Zhi Jin et al. Improved Relation Classification by Deep Recurrent Neural Networks with Data Augmentation. COLING, 2016.

[34] Jimmy Ba et al. Adam: A Method for Stochastic Optimization. ICLR, 2015.