Querying graph databases with XPath

XPath plays a prominent role as an XML navigational language due to several factors, including its ability to express queries of interest, its close connection to yardstick database query languages (e.g., first-order logic), and the low complexity of query evaluation for many fragments. Another common database model, graph databases, also requires heavy use of navigation in queries; yet it largely adopts a different approach to querying, relying on reachability patterns expressed with regular constraints. Our goal here is to investigate the behavior and applicability of XPath-like languages for querying graph databases, concentrating on their expressiveness and complexity of query evaluation. We are particularly interested in a model of graph data that combines navigation through graphs with querying data held in the nodes, as in a social network scenario. As navigational languages, we use analogs of core and regular XPath and augment them with various tests on data values. We relate these languages to first-order logic, its transitive closure extensions, and finite-variable fragments thereof, proving several capture results. In addition, we describe their relative expressive power. We then show that they behave very well computationally: they have a low-degree polynomial combined complexity, which becomes linear for several fragments. Furthermore, we introduce new types of tests for XPath languages that let them capture first-order logic with data comparisons and prove that the low complexity bounds continue to apply to such extended languages. Therefore, XPath-like languages seem to be very well suited to querying graphs.
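To make the idea of combining navigation with data tests concrete, here is a minimal Python sketch of a naive set-based evaluator for a few XPath-like operators on a small data graph. Everything in it is an illustrative assumption rather than the paper's formal syntax or algorithm: the toy social network (`DATA`, `EDGES`), the operator names (`step`, `inverse`, `compose`, `star`, `same_data`, `exists`), and the evaluation strategy, which is simple and polynomial but makes no attempt to reach the linear bounds mentioned above.

```python
from collections import defaultdict

# Hypothetical toy data graph: each node holds one data value (a city),
# and each edge is a (source, label, target) triple.
DATA = {"ann": "Paris", "bob": "Rome", "cy": "Paris", "dee": "Oslo"}
EDGES = {
    ("ann", "friend", "bob"),
    ("bob", "friend", "cy"),
    ("cy", "friend", "dee"),
}


def step(label):
    """Path expression for one forward edge with the given label."""
    return {(u, v) for (u, l, v) in EDGES if l == label}


def inverse(rel):
    """Traverse a path expression backwards."""
    return {(v, u) for (u, v) in rel}


def compose(r, s):
    """Concatenation of two path expressions (relational composition)."""
    by_source = defaultdict(set)
    for v, w in s:
        by_source[v].add(w)
    return {(u, w) for (u, v) in r for w in by_source[v]}


def star(rel):
    """Reflexive-transitive closure, the analog of the regular XPath star."""
    closure = {(n, n) for n in DATA}
    frontier = set(closure)
    while frontier:
        frontier = compose(frontier, rel) - closure
        closure |= frontier
    return closure


def same_data(rel):
    """Data test: keep pairs whose endpoints hold equal data values."""
    return {(u, v) for (u, v) in rel if DATA[u] == DATA[v]}


def exists(rel):
    """Node test: nodes from which the path expression has some match."""
    return {u for (u, _) in rel}


# People who can reach, via one or more friend edges, someone in the same city.
friend_plus = compose(step("friend"), star(step("friend")))
result = exists(same_data(friend_plus))
print(result)
```

On this toy graph the query selects only `ann`, who reaches `cy` (also in Paris) through `bob`. Each operator returns a plain set of node pairs or nodes, so expressions nest freely, which is the compositional style that XPath-like languages rely on.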
