Message Passing for Complex Question Answering over Knowledge Graphs

Question answering over knowledge graphs (KGQA) has evolved from simple single-fact questions to complex questions that require graph traversal and aggregation. We propose a novel approach for complex KGQA based on unsupervised message passing: confidence scores obtained by parsing an input question and matching its terms against the knowledge graph are propagated to a set of possible answers. First, we identify the entity, relationship, and class names mentioned in a natural language question and map them to their counterparts in the graph. Then, the confidence scores of these mappings are propagated through the graph structure to locate the answer entities. Finally, the answer entities are aggregated according to the identified question type. The approach can be implemented efficiently as a series of sparse matrix multiplications that mimic joins over small local subgraphs. Our evaluation shows that it outperforms the state of the art on the LC-QuAD benchmark. Moreover, we show that its performance depends only on the quality of the question interpretation results, i.e., given a correct relevance score distribution, it always produces a correct answer ranking. Our error analysis reveals correct answers missing from the benchmark dataset and inconsistencies in the DBpedia knowledge graph. We also provide a comprehensive evaluation of the approach, accompanied by an ablation study and an error analysis, which show the pitfalls of each question answering component in more detail.
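The propagation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the entity and predicate names, the confidence values, and the relative-threshold aggregation rule are all assumptions made for illustration, and plain nested dictionaries stand in for the sparse matrices a real system would use.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, predicate, object) triples.
# All names here are hypothetical and chosen only for illustration.
TRIPLES = [
    ("EU", "hasMember", "Germany"),
    ("EU", "hasMember", "France"),
    ("EFTA", "hasMember", "Norway"),
    ("Germany", "capital", "Berlin"),
    ("France", "capital", "Paris"),
    ("Norway", "capital", "Oslo"),
]


def build_adjacency(triples):
    """One sparse adjacency matrix per predicate: {predicate: {subject: {object: 1.0}}}."""
    adj = defaultdict(lambda: defaultdict(dict))
    for s, p, o in triples:
        adj[p][s][o] = 1.0
    return adj


def sparse_vec_mat(vec, mat):
    """Sparse vector-matrix product: out[o] = sum over s of vec[s] * mat[s][o]."""
    out = defaultdict(float)
    for s, score in vec.items():
        for o, weight in mat.get(s, {}).items():
            out[o] += score * weight
    return dict(out)


def propagate(entity_scores, predicate_scores, adj):
    """One hop: pass entity confidences along every matched predicate,
    weighting each message by that predicate's confidence."""
    answers = defaultdict(float)
    for p, p_score in predicate_scores.items():
        for o, value in sparse_vec_mat(entity_scores, adj.get(p, {})).items():
            answers[o] += p_score * value
    return dict(answers)


def aggregate(answer_scores, question_type, threshold=0.5):
    """Keep answers scoring at least `threshold` times the top score (an assumed rule),
    then shape the result by question type: SELECT, COUNT, or ASK."""
    top = max(answer_scores.values(), default=0.0)
    kept = sorted(
        (e for e, s in answer_scores.items() if top > 0 and s >= threshold * top),
        key=lambda e: (-answer_scores[e], e),
    )
    if question_type == "COUNT":
        return len(kept)
    if question_type == "ASK":
        return bool(kept)
    return kept


# "What are the capitals of EU members?" with a spurious low-confidence match on EFTA.
adj = build_adjacency(TRIPLES)
hop1 = propagate({"EU": 1.0, "EFTA": 0.25}, {"hasMember": 1.0}, adj)
hop2 = propagate(hop1, {"capital": 0.5}, adj)
print(aggregate(hop2, "SELECT"))  # ['Berlin', 'Paris']
print(aggregate(hop2, "COUNT"))   # 2
```

Each hop is a vector-matrix product restricted to the entities that currently carry a non-zero score, which is why the computation stays local to a small subgraph. Changing only the input confidences changes the ranking, in line with the claim that answer quality depends on the question interpretation step.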
