From relation algebra to semi-join algebra: an approach for graph query optimization

Many graph query languages rely on the composition operator to navigate graphs and select nodes of interest, even though evaluating compositions of relations can be costly. Often, the need for composition can be reduced by rewriting queries so that they use semi-joins instead, which can significantly reduce the cost of query evaluation. We study techniques to recognize and apply such rewritings. Concretely, we study the relationship between the expressive power of the relation algebras, which rely heavily on composition, and the semi-join algebras, which replace the composition operator with semi-join operators. As our main result, we show that each fragment of the relation algebras in which intersection and difference are used only on edges (and not on complex compositions) is expressively equivalent to a fragment of the semi-join algebras. This expressive equivalence holds for node queries, that is, queries that evaluate to sets of nodes. For practical relevance, we exhibit constructive steps for rewriting relation algebra queries to semi-join algebra queries, and we prove that these steps lead to only a well-bounded increase in the number of steps needed to evaluate the rewritten queries. In addition, on node-labeled graphs that are sibling-ordered trees, we establish new relationships among the expressive power of Regular XPath, Conditional XPath, FO-logic, and the semi-join algebra augmented with restricted fixpoint operators.
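To make the cost argument concrete, the following is a minimal sketch of the general idea, not the rewriting procedure of the paper. It assumes binary relations stored as Python sets of pairs and a node query that returns only the source nodes of a two-step path. The function names `compose`, `semijoin`, and `sources` are hypothetical, chosen for illustration. On a star-shaped graph the composition materializes a quadratic number of pairs, while the semi-join never grows beyond the size of its left operand, and both yield the same set of nodes.

```python
def compose(r, s):
    """Composition: all pairs (a, c) with (a, b) in r and (b, c) in s."""
    by_source = {}
    for b, c in s:
        by_source.setdefault(b, set()).add(c)
    return {(a, c) for a, b in r for c in by_source.get(b, ())}


def semijoin(r, s):
    """Semi-join: the pairs (a, b) of r whose target b is a source in s."""
    s_sources = {b for b, _ in s}
    return {(a, b) for a, b in r if b in s_sources}


def sources(r):
    """Node query: project a binary relation onto its source nodes."""
    return {a for a, _ in r}


# Illustrative star-shaped graph: n nodes point to a hub,
# and the hub points to n other nodes.
n = 100
r = {(f"a{i}", "hub") for i in range(n)}
s = {("hub", f"c{j}") for j in range(n)}

composed = compose(r, s)      # n * n intermediate pairs
filtered = semijoin(r, s)     # at most len(r) pairs

# Both strategies answer the node query identically.
assert sources(composed) == sources(filtered)
print(len(composed), len(filtered))
```

The sketch only covers the case where the part of the path after the first edge acts as an existence test; queries that return pairs of nodes, or that apply intersection or difference to composed paths, fall outside this simple pattern.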
