Combining pre-trained language models and structured knowledge

In recent years, transformer-based language models have achieved state-of-the-art performance on a variety of NLP benchmarks. These models extract mostly distributional information, along with some semantics, from unstructured text; however, integrating structured information, such as knowledge graphs, into them has proven challenging. We examine a variety of approaches for integrating structured knowledge into current language models and identify the challenges, as well as possible opportunities, in leveraging both structured and unstructured information sources. From our survey, we find that there are still open opportunities in exploiting adapter-based injection and that it may be possible to combine several of the explored approaches into a single system.
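To make the adapter-based injection direction mentioned above concrete, the following is a minimal sketch, not taken from any of the surveyed papers, of a bottleneck adapter module in PyTorch of the kind used by approaches such as K-Adapter and Parameter-Efficient Transfer Learning. The class name `BottleneckAdapter`, the dimensions, and the training setup described in the comments are illustrative assumptions.

```python
# Minimal sketch of a bottleneck adapter that could be inserted after a frozen
# transformer sub-layer to inject knowledge-graph-derived signals without
# updating the pretrained weights. Names and sizes are illustrative only.
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Down-project -> nonlinearity -> up-project, with a residual connection."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection preserves the pretrained representation;
        # only the small adapter learns the knowledge-specific transformation.
        residual = hidden_states
        x = self.down(self.norm(hidden_states))
        x = self.up(self.act(x))
        return residual + x


if __name__ == "__main__":
    # In an adapter-based injection setup, the backbone transformer would be
    # frozen and only the adapter parameters trained on knowledge-derived data
    # (e.g., verbalized knowledge-graph triples).
    adapter = BottleneckAdapter()
    dummy = torch.randn(2, 16, 768)  # (batch, sequence length, hidden size)
    print(adapter(dummy).shape)      # torch.Size([2, 16, 768])
```

Because only the adapter weights are updated, this kind of module avoids the catastrophic forgetting that can arise when a pretrained model is fully fine-tuned on knowledge-injection objectives.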
