Can Language Models Capture Graph Semantics? From Graphs to Language Model and Vice-Versa

Knowledge Graphs are a valuable resource for capturing semantic knowledge in the form of entities and the relationships between them. However, current deep learning models take distributed representations, or vectors, as input, so the graph must first be compressed into a vectorized representation. We conduct a study to examine whether a deep learning model can compress a graph and then reproduce the same graph with most of its semantics intact. Our experiments show that Transformer models are unable to express the full semantics of the input knowledge graph. We find that this is due to the disparity between the directed, relation- and type-based information contained in a Knowledge Graph and the fully connected, undirected token-token graph interpretation of the Transformer attention matrix.
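
As a rough illustration of the kind of round-trip the study describes, the Python sketch below linearizes a toy knowledge graph into text, passes it through a sequence-to-sequence Transformer, and then checks how many directed, typed triples survive the reconstruction. This is a minimal sketch under assumed choices: the t5-small checkpoint, the linearize/delinearize helpers, and the exact-match metric are illustrative and not the paper's reported setup; an off-the-shelf checkpoint would normally need fine-tuning before it reconstructs anything faithfully.

# Hypothetical sketch of a graph -> sequence -> graph round-trip check.
# The triple linearization, the "t5-small" checkpoint, and the exact-match
# metric are illustrative assumptions, not the authors' reported pipeline.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

def linearize(triples):
    # Serialize (head, relation, tail) triples into a flat string.
    return " | ".join(f"{h} {r} {t}" for h, r, t in triples)

def delinearize(text):
    # Parse a generated string back into a set of (head, relation, tail) triples.
    triples = set()
    for chunk in text.split("|"):
        parts = chunk.strip().split()
        if len(parts) >= 3:
            triples.add((parts[0], parts[1], " ".join(parts[2:])))
    return triples

graph = [("Paris", "capitalOf", "France"), ("France", "locatedIn", "Europe")]

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Encode the linearized graph into the model's vector space, then decode it back to text.
inputs = tokenizer(linearize(graph), return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
reconstructed = delinearize(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Fraction of original edges recovered with direction and relation type intact.
recovered = reconstructed & set(graph)
print(f"semantics retained: {len(recovered)}/{len(graph)} triples")

Under this framing, any edge that comes back with its direction flipped or its relation type dropped counts as lost semantics, which is exactly the gap the abstract attributes to the undirected, untyped view of the attention matrix.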
