Considering User Intention in Differential Graph Queries

Empty answers are a major problem when processing pattern matching queries in graph databases. In particular, a query can fail for multiple reasons. To support users in such situations, differential queries can be used, which deliver the missing parts of a graph query. Multiple heuristics have been proposed for differential queries to reduce the search space. Although they succeed in increasing performance, they can discard query subgraphs that are relevant to a user. To address this issue, the authors extend the concept of differential queries and introduce top-k differential queries, which calculate the ranking based on users' preferences and significantly improve users' understanding of querying in database management systems. A user assigns relevance weights to elements of a graph query; these weights steer the search and are used for the ranking. In this paper, the authors propose different strategies for the selection of relevance weights and for their propagation. As a result, the search is modelled along the most relevant paths. The authors evaluate their solution and both strategies on the DBpedia data graph.
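The idea of ranking by user-assigned relevance weights can be illustrated with a minimal sketch. It is not the authors' algorithm: the scoring rule (sum of the weights of the matched query edges), the function names `score` and `top_k_differential`, the example query, and the precomputed list of candidate partial matches are all assumptions made for illustration. The sketch also leaves out weight propagation and the search itself.

```python
import heapq
from typing import Dict, FrozenSet, List, Tuple

# A query edge: (source variable, edge label, target variable).
Edge = Tuple[str, str, str]


def score(matched: FrozenSet[Edge], weights: Dict[Edge, float]) -> float:
    """Assumed scoring rule: sum of the user-assigned weights of matched edges."""
    return sum(weights.get(edge, 0.0) for edge in matched)


def top_k_differential(
    query_edges: FrozenSet[Edge],
    weights: Dict[Edge, float],
    candidates: List[FrozenSet[Edge]],
    k: int,
) -> List[Tuple[float, List[Edge], List[Edge]]]:
    """Return the k best partial matches as (score, matched part, missing part)."""
    ranked = heapq.nlargest(k, candidates, key=lambda m: score(m, weights))
    return [
        (score(m, weights), sorted(m), sorted(query_edges - m))
        for m in ranked
    ]


# Hypothetical query with three edges and user-assigned relevance weights.
E1: Edge = ("film", "director", "person")
E2: Edge = ("person", "birthPlace", "city")
E3: Edge = ("city", "country", "state")
query = frozenset({E1, E2, E3})
weights = {E1: 3.0, E2: 1.0, E3: 0.5}

# Hypothetical partial matches; a real system would discover these by search.
candidates = [frozenset({E1, E2}), frozenset({E2, E3}), frozenset({E1})]

result = top_k_differential(query, weights, candidates, k=2)
for total, matched, missing in result:
    print(total, "matched:", matched, "missing:", missing)
```

In this toy example the partial match that covers only the heavily weighted edge `E1` ranks above the one that covers two lightly weighted edges, which is the effect that relevance weights are meant to have on the ranking.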
