Discovering Neighborhood Pattern Queries by sample answers in knowledge base

Knowledge bases have shown their effectiveness in facilitating services like Web search and question answering. Nevertheless, it remains challenging for ordinary users to fully understand the structure of a knowledge base and to issue structural queries. In many cases, users have a natural language question and also know some popular (but not all) entities that serve as sample answers. In this paper, we study the Reverse top-k Neighborhood Pattern Query problem, with the aim of discovering structural queries for the question based on: (i) the structure of the knowledge base, and (ii) the sample answers to the question. The proposed solution contains two phases: filter and refine. In the filter phase, a search space of candidate queries is systematically explored, and invalid queries whose result sets do not fully cover the sample answers are filtered out. In the refine phase, all surviving queries are verified to ensure that they are sufficiently relevant to the sample answers, under the assumption that the sample answers are better known or more popular than other entities in the results of relevant queries. Several optimization techniques are proposed to accelerate the refine phase. For evaluation, we conduct extensive experiments using the DBpedia knowledge base and a set of real-life questions. Empirical results show that our algorithm is able to return a small set of possible queries that contains the query matching the user's natural language question.
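The two-phase idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy facts, the popularity scores, the representation of a query as a set of (predicate, object) constraints, and the reading of "sufficiently relevant" as "every sample answer ranks within the top k results by popularity". The neighborhood patterns, ranking function, and optimizations of the actual method are not reproduced here.

```python
from itertools import combinations

# Hypothetical toy knowledge base: entity -> set of (predicate, object) facts.
FACTS = {
    "Berlin": {("type", "City"), ("country", "Germany"), ("capitalOf", "Germany")},
    "Munich": {("type", "City"), ("country", "Germany")},
    "Hamburg": {("type", "City"), ("country", "Germany")},
    "Paris": {("type", "City"), ("country", "France"), ("capitalOf", "France")},
    "Lyon": {("type", "City"), ("country", "France")},
}

# Hypothetical popularity scores (higher means better known).
POPULARITY = {"Berlin": 0.9, "Paris": 0.95, "Munich": 0.7, "Hamburg": 0.6, "Lyon": 0.3}


def candidate_queries(max_size=2):
    """Enumerate all constraint sets of up to max_size atoms found in the KB."""
    atoms = sorted(set().union(*FACTS.values()))
    for size in range(1, max_size + 1):
        for combo in combinations(atoms, size):
            yield frozenset(combo)


def evaluate(query):
    """Return the entities whose facts satisfy every constraint in the query."""
    return {entity for entity, facts in FACTS.items() if query <= facts}


def filter_phase(candidates, samples):
    """Keep only queries whose result set fully covers the sample answers."""
    return [q for q in candidates if samples <= evaluate(q)]


def refine_phase(queries, samples, k):
    """Keep queries for which all samples are among the k most popular results."""
    kept = []
    for q in queries:
        ranked = sorted(evaluate(q), key=lambda e: (-POPULARITY[e], e))
        if samples <= set(ranked[:k]):
            kept.append(q)
    return kept


def discover(samples, k):
    """Run filter, then refine, and return the surviving queries."""
    survivors = filter_phase(candidate_queries(), samples)
    return refine_phase(survivors, samples, k)


if __name__ == "__main__":
    for query in discover({"Berlin", "Munich"}, k=2):
        print(sorted(query))
```

With the sample answers Berlin and Munich and k = 2, the filter phase in this toy setting discards queries such as "capital of Germany" (Munich is not covered), and the refine phase discards "is a City" because Paris outranks Munich among its results. The queries that survive constrain the country to Germany.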
