Enumeration Problems for Regular Path Queries

Evaluation of regular path queries (RPQs) is a central problem in graph databases. We investigate the corresponding enumeration problem: given a graph and an RPQ, enumerate all paths in the graph that match the RPQ. We consider several versions of this problem, corresponding to different semantics of RPQs that have recently been considered: arbitrary paths, shortest paths, simple paths, and trails. Whereas arbitrary and shortest paths can be enumerated with polynomial delay, the situation is much more intricate for simple paths and trails. For instance, even the question of whether a given graph contains a simple path or trail of a certain length has cases with highly non-trivial solutions and cases that are long-standing open problems. In this setting, we study RPQ evaluation from a parameterized complexity perspective. We define a class of simple transitive expressions that is prominent in practice and for which we can prove two dichotomy-like results: one for simple paths and one for trails. We observe that, even though simple path semantics and trail semantics are intractable for RPQs in general, they are feasible for the vast majority of the RPQs that users write in practice. At the heart of this study is a result of independent interest on the parameterized complexity of finding disjoint paths in graphs: the two disjoint paths problem is W[1]-hard if parameterized by the length of one of the two paths.
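
To make the difference between simple path semantics and trail semantics concrete, the following is a minimal sketch of a naive backtracking enumerator. It is not the algorithm of the paper and gives no delay guarantee (its running time can be exponential). It assumes the RPQ has already been compiled into a deterministic automaton given as a transition dictionary; the names `enumerate_matches`, `EDGES`, and `DFA_X_STAR` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterator, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, edge label, target node)


def enumerate_matches(
    edges: List[Edge],
    dfa: Dict[Tuple[int, str], int],  # (state, label) -> next state
    start_state: int,
    accepting: Set[int],
    source: str,
    target: str,
    mode: str = "simple",  # "simple": no repeated node; "trail": no repeated edge
) -> Iterator[List[Edge]]:
    """Yield every path from source to target whose label word the DFA accepts."""
    outgoing: Dict[str, List[Edge]] = {}
    for e in edges:
        outgoing.setdefault(e[0], []).append(e)

    path: List[Edge] = []
    seen_nodes: Set[str] = {source}
    seen_edges: Set[Edge] = set()

    def dfs(node: str, state: int) -> Iterator[List[Edge]]:
        if node == target and state in accepting:
            yield list(path)
        for e in outgoing.get(node, []):
            _, label, nxt = e
            nxt_state = dfa.get((state, label))
            if nxt_state is None:
                continue  # the automaton cannot read this label here
            if mode == "simple":
                if nxt in seen_nodes:
                    continue
                seen_nodes.add(nxt)
            else:
                if e in seen_edges:
                    continue
                seen_edges.add(e)
            path.append(e)
            yield from dfs(nxt, nxt_state)
            path.pop()
            # Undo only the bookkeeping that this mode performed.
            if mode == "simple":
                seen_nodes.discard(nxt)
            else:
                seen_edges.discard(e)

    yield from dfs(source, start_state)


# Illustrative data: a triangle a -> b -> c -> a plus a shortcut a -> c,
# with every edge labelled "x", and a one-state automaton for the RPQ x*.
EDGES: List[Edge] = [
    ("a", "x", "b"),
    ("b", "x", "c"),
    ("c", "x", "a"),
    ("a", "x", "c"),
]
DFA_X_STAR: Dict[Tuple[int, str], int] = {(0, "x"): 0}

simple_paths = list(enumerate_matches(EDGES, DFA_X_STAR, 0, {0}, "a", "c", "simple"))
trails = list(enumerate_matches(EDGES, DFA_X_STAR, 0, {0}, "a", "c", "trail"))
```

On this small graph, trail semantics admits more answers than simple path semantics, because a trail may revisit the node `a` as long as it never reuses an edge.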
