Containment of Data Graph Queries

The graph database model is currently one of the most popular paradigms for storing data, used in applications such as social networks, biological databases and the Semantic Web. Despite the popularity of this model, the development of graph database management systems is still in its infancy, and there are several fundamental issues regarding graph databases that are not fully understood. Indeed, while graph query languages that concentrate on topological properties are now well developed, not much is known about languages that can query both the topology of graphs and their underlying data. Our goal is to conduct a detailed study of static analysis problems for such languages. In this paper we consider the containment problem for several recently proposed classes of queries that manipulate both topology and data: regular queries with memory, regular queries with data tests, and graph XPath. Our results show that the problem is in general undecidable for all of these classes. However, by allowing only positive data comparisons we find natural fragments that enjoy much better static analysis properties: the containment problem is decidable, and its computational complexity ranges from PSPACE-complete to EXPSPACE-complete. We also propose extensions of regular queries with an inverse operator, and study query evaluation and query containment for them.
