JAKET: Joint Pre-training of Knowledge Graph and Language Understanding

Knowledge graphs (KGs) contain rich information about world knowledge, entities, and relations, and can therefore serve as valuable supplements to existing pre-trained language models. However, efficiently integrating information from a KG into language modeling remains a challenge, and understanding a knowledge graph in turn requires related textual context. We propose JAKET, a novel joint pre-training framework that models both the knowledge graph and language. The knowledge module and the language module provide essential information to mutually assist each other: the knowledge module produces embeddings for entities mentioned in text, while the language module generates context-aware initial embeddings for entities and relations in the graph. This design enables the pre-trained model to easily adapt to unseen knowledge graphs in new domains. Experimental results on several knowledge-aware NLP tasks show that the proposed framework achieves superior performance by effectively leveraging knowledge in language understanding.
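
The mutual-assistance loop described above can be illustrated with a minimal sketch: a language module first encodes entity and relation descriptions to produce context-aware initial KG embeddings, a knowledge module then propagates information over the graph, and the resulting entity embeddings are fused back into the token representations at mention positions. The module names, sizes, toy data, and fusion-by-addition step below are illustrative assumptions for this sketch, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class LanguageModule(nn.Module):
    """Produces contextual token embeddings (stand-in for a pre-trained LM)."""

    def __init__(self, vocab_size: int, dim: int):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=1,
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.encoder(self.tok(token_ids))  # (batch, seq, dim)


class KnowledgeModule(nn.Module):
    """One round of mean-aggregation message passing over KG triples."""

    def forward(self, ent_emb, rel_emb, triples):
        heads, rels, tails = triples.T
        messages = torch.zeros_like(ent_emb)
        counts = torch.zeros(ent_emb.size(0), 1)
        # Each tail entity receives (head embedding + relation embedding).
        messages.index_add_(0, tails, ent_emb[heads] + rel_emb[rels])
        counts.index_add_(0, tails, torch.ones(len(tails), 1))
        return ent_emb + messages / counts.clamp(min=1)


dim, vocab_size, n_ent, n_rel = 32, 100, 5, 2
lm, km = LanguageModule(vocab_size, dim), KnowledgeModule()

# Step 1: the language module encodes entity/relation description text to
# produce context-aware initial KG embeddings (mean-pooled here).
ent_desc = torch.randint(0, vocab_size, (n_ent, 6))  # toy description token ids
rel_desc = torch.randint(0, vocab_size, (n_rel, 6))
ent_emb = lm(ent_desc).mean(dim=1)                   # (n_ent, dim)
rel_emb = lm(rel_desc).mean(dim=1)                   # (n_rel, dim)

# Step 2: the knowledge module propagates information over KG triples
# given as (head, relation, tail) index rows.
triples = torch.tensor([[0, 0, 1], [1, 1, 2], [3, 0, 4]])
ent_emb = km(ent_emb, rel_emb, triples)

# Step 3: entity embeddings are fused back into the token representations
# at the positions where those entities are mentioned in the input text.
tokens = torch.randint(0, vocab_size, (1, 10))
hidden = lm(tokens).clone()                          # (1, 10, dim)
mention_pos, mention_ent = torch.tensor([2, 7]), torch.tensor([0, 3])
hidden[0, mention_pos] = hidden[0, mention_pos] + ent_emb[mention_ent]
print(hidden.shape)  # torch.Size([1, 10, 32])
```

In this sketch the cycle can be repeated: knowledge-enhanced token representations yield better entity and relation descriptions, which in turn yield better graph embeddings, which is the intuition behind jointly pre-training the two modules.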
