Empowering Language Models with Knowledge Graph Reasoning for Open-Domain Question Answering

Answering open-domain questions requires world knowledge about the entities mentioned in context. Because pre-trained Language Models (LMs) cannot store all of the required knowledge, external knowledge sources such as knowledge graphs (KGs) are often used to augment them. In this work, we propose the knOwledge REasOning empowered Language Model (OREO-LM), which introduces a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs to interact collaboratively with a differentiable Knowledge Graph Reasoning module. In this way, the LM guides the KG to walk towards the desired answers, while the retrieved knowledge in turn improves the LM. By applying OREO-LM to RoBERTa and T5, we show significant performance gains, achieving state-of-the-art results in the Closed-Book setting. The improvement comes mainly from the KG reasoning module's capacity to infer missing relational facts. In addition, OREO-LM provides reasoning paths as rationales that interpret the model's decisions.
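
To make the described mechanism concrete, below is a minimal, hedged sketch of what a Knowledge Interaction Layer of this kind could look like. It is not the authors' implementation; all class, tensor, and parameter names (e.g., KnowledgeInteractionLayer, rel_adj, query_state) are illustrative assumptions. The sketch shows the three ingredients the abstract describes: predicting a relation distribution from an LM hidden state, taking one differentiable walk step over a KG stored as per-relation sparse adjacency matrices, and fusing the retrieved "soft" entity embedding back into the Transformer hidden state.

```python
# Illustrative sketch only; assumes the KG is encoded as one sparse
# (num_entities x num_entities) adjacency matrix per relation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class KnowledgeInteractionLayer(nn.Module):
    def __init__(self, hidden_dim, num_relations, num_entities, rel_adj):
        super().__init__()
        self.rel_adj = rel_adj                                   # list of sparse (E x E) matrices, one per relation
        self.relation_scorer = nn.Linear(hidden_dim, num_relations)
        self.entity_emb = nn.Embedding(num_entities, hidden_dim)
        self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, query_state, entity_dist):
        # query_state: (batch, hidden_dim) hidden state of a query/entity token.
        # entity_dist: (batch, num_entities) current probability mass over KG entities.
        rel_probs = F.softmax(self.relation_scorer(query_state), dim=-1)

        # One differentiable walk step: propagate the entity distribution along
        # every relation, then mix the results by the predicted relation weights.
        walked = [torch.sparse.mm(adj, entity_dist.t()).t() for adj in self.rel_adj]
        walked = torch.stack(walked, dim=1)                      # (batch, R, E)
        new_dist = (rel_probs.unsqueeze(-1) * walked).sum(dim=1) # (batch, E)
        new_dist = new_dist / new_dist.sum(dim=-1, keepdim=True).clamp(min=1e-9)

        # Retrieve a soft entity embedding and feed it back into the LM state.
        retrieved = new_dist @ self.entity_emb.weight            # (batch, hidden_dim)
        fused_state = self.fuse(torch.cat([query_state, retrieved], dim=-1))
        return fused_state, new_dist
```

In this reading, stacking such a layer between Transformer blocks lets the LM's hidden state steer which relations the walk follows, while the entity distribution it returns acts as retrieved knowledge injected back into the LM, matching the collaborative interaction the abstract describes.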
