Querying graph databases

Graph databases have attracted renewed interest in recent years, owing to their applications in areas such as the Semantic Web and social network analysis. We study the problem of querying graph databases and, in particular, the expressiveness and complexity of evaluation of several general-purpose query languages, such as regular path queries and their extensions with conjunctions and inverses. We distinguish between two semantics for these languages. The first, based on simple paths, easily leads to intractability, while the second, based on arbitrary paths, allows tractable evaluation for an expressive family of languages. We also study two recent extensions of these languages motivated by modern applications of graph databases. The first treats paths as first-class citizens, while the second can express queries that combine the topology of the graph with its underlying data.
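
To make the contrast between the two semantics concrete, the following is a minimal sketch of how a regular path query might be evaluated under arbitrary-path semantics, using the standard product of the graph with a finite automaton for the regular expression. It is an illustration under stated assumptions, not the paper's own algorithm: the function name `eval_rpq`, the edge-list encoding, and the example graph and automaton are all hypothetical.

```python
from collections import defaultdict, deque


def eval_rpq(edges, nfa_delta, nfa_start, nfa_accepting, source):
    """Return every node v such that some path from `source` to v
    (arbitrary, so nodes may repeat) spells a word accepted by the NFA.

    edges:         iterable of (u, label, v) triples (assumed encoding)
    nfa_delta:     dict mapping (state, label) -> set of next states
    nfa_start:     initial NFA state
    nfa_accepting: set of accepting NFA states
    """
    out = defaultdict(list)
    for u, label, v in edges:
        out[u].append((label, v))

    # Breadth-first search over (graph node, NFA state) pairs.
    # There are at most |nodes| * |states| pairs, so the search is polynomial.
    start = (source, nfa_start)
    seen = {start}
    queue = deque([start])
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in nfa_accepting:
            answers.add(node)
        for label, next_node in out[node]:
            for next_state in nfa_delta.get((state, label), ()):
                pair = (next_node, next_state)
                if pair not in seen:
                    seen.add(pair)
                    queue.append(pair)
    return answers


# Hypothetical example: a directed 3-cycle with every edge labelled "a".
EDGES = [(1, "a", 2), (2, "a", 3), (3, "a", 1)]

# NFA for the expression (aa)*: paths of even length.
DELTA = {(0, "a"): {1}, (1, "a"): {0}}

RESULT = eval_rpq(EDGES, DELTA, 0, {0}, source=1)
print(sorted(RESULT))
```

In this example node 2 is an answer only because the path 1, 2, 3, 1, 2 of length four revisits nodes. Under simple-path semantics the only even-length simple paths from node 1 end at nodes 1 and 3, and deciding such questions in general cannot be done by this kind of product search, which is where the intractability mentioned above comes from.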
