Quality Awareness over Graph Pattern Queries

We examine the problem of quality awareness when querying graph databases. Based on quality annotations that denote quality problems appearing in data subgraphs (such annotations typically result from collaborative practices in the context of open data usage, e.g. user feedback), we propose a notion of quality-aware (graph pattern) query based on (usage-dependent) quality profiles. In this paper, we present the formal foundations of the approach. We also show how a generic state-of-the-art algorithm for graph pattern query evaluation can be simply extended to implement quality awareness at evaluation time, and we study the complexity of the extended algorithm. We then present implementation guidelines, supported by a proof-of-concept prototype based on the Neo4J graph database management system.
