Towards Optimisation of Collaborative Question Answering over Knowledge Graphs

Collaborative Question Answering (CQA) frameworks for knowledge graphs aim to integrate existing question answering (QA) components into sequences of QA tasks (i.e. QA pipelines). The research community has paid substantial attention to CQA frameworks because they support the reusability and scalability of the available components as well as the flexibility of pipelines. CQA frameworks attempt to build such pipelines automatically by solving two optimisation problems: 1) the local collective performance of QA components per QA task and 2) the global performance of QA pipelines. Despite offering several advantages over monolithic QA systems, CQA frameworks remain limited in both effectiveness and efficiency when answering questions. In this paper, we tackle the problem of local optimisation of CQA frameworks and propose a three-fold approach, which applies feature selection techniques together with supervised machine learning approaches in order to identify the best performing components efficiently. We have empirically evaluated our approach on existing benchmarks and compared it with existing automatic CQA frameworks. The observed results provide evidence that our approach answers a higher number of questions than the state of the art while reducing i) the number of features used by 50% and ii) the number of components used by 76%.
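To make the idea of local optimisation concrete, the following is a minimal sketch of selecting a component for a single QA task by combining feature selection with a supervised model. It is an illustration under stated assumptions, not the approach evaluated in the paper: the question features, the component names (component_a, component_b), the toy data, the difference-of-class-means feature score and the nearest-centroid classifier are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from statistics import mean

# Hypothetical question features (assumed for illustration only):
# [question_length, has_wh_word, capitalised_tokens, unrelated_noise]
X = [
    [5, 1, 1, 3],
    [6, 1, 1, 7],
    [12, 0, 3, 3],
    [14, 0, 4, 7],
]

# Hypothetical training labels: 1 if the component handled the question
# correctly, 0 otherwise. One label list per component of the same QA task.
labels = {
    "component_a": [1, 1, 0, 0],
    "component_b": [0, 0, 1, 1],
}


def feature_scores(X, y):
    """Score each feature by the absolute difference of its class means."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return [
        abs(mean([r[j] for r in pos]) - mean([r[j] for r in neg]))
        for j in range(len(X[0]))
    ]


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, in index order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def train_centroids(X, y, selected):
    """Fit a nearest-centroid model restricted to the selected features."""
    centroids = {}
    for label in (0, 1):
        rows = [row for row, l in zip(X, y) if l == label]
        centroids[label] = [mean([r[j] for r in rows]) for j in selected]
    return centroids


def success_score(centroids, selected, x):
    """Positive when x lies closer to the success centroid than to failure."""
    v = [x[j] for j in selected]
    dist = {
        label: sum((a - b) ** 2 for a, b in zip(v, c))
        for label, c in centroids.items()
    }
    return dist[0] - dist[1]


def best_component(models, x):
    """Pick the component with the highest predicted success score."""
    return max(models, key=lambda name: success_score(*models[name], x))


# Train one model per component, keeping half of the features (k = 2).
models = {}
for name, y in labels.items():
    selected = select_top_k(feature_scores(X, y), k=2)
    models[name] = (train_centroids(X, y, selected), selected)

short_question = [5, 1, 1, 5]
long_question = [13, 0, 3, 5]
print(best_component(models, short_question))
print(best_component(models, long_question))
```

In this toy setting each per-component model is trained on only two of the four features, and only the single top-ranked component is chosen per question. This mirrors, in miniature, the goal of using fewer features and fewer components per task.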
