Efficient evaluation of n-ary conjunctive queries over trees and graphs

N-ary conjunctive queries, i.e., queries with any number of answer variables, are the formal core of many Web query languages, including XSLT, XQuery, SPARQL, and Xcerpt. Despite a considerable body of research on optimizing such queries over tree-shaped XML data, little attention has been paid so far to efficient access to graph-shaped XML, RDF, or Topic Maps. We propose the first evaluation technique for n-ary conjunctive queries that applies to both tree- and graph-shaped data and retains the same complexity as the best known approaches restricted to tree-shaped data. Furthermore, the approach treats tree- and graph-shaped queries uniformly without sacrificing evaluation complexity on the restricted class of tree-shaped queries. The core of the evaluation technique is dynamic programming over a memoization data structure called the "memoization matrix", which can be populated and consumed in different ways. We propose two algorithms for population and three for consumption, each with its own advantages. The complexity of the algorithms is compared analytically and experimentally.
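The abstract does not spell out the layout of the memoization matrix or the individual population and consumption algorithms, so the following is only a generic illustration of the populate-then-consume idea, not the paper's method. It assumes a hypothetical toy data graph (`DATA`, with a cycle), a hypothetical tree-shaped query (`QUERY`), and a dictionary keyed by (query variable, data node) standing in for the matrix; all names are invented for this sketch.

```python
from itertools import product

# Hypothetical toy data graph: node -> (label, successor nodes).
# The cycle a -> b -> a makes it graph-shaped rather than tree-shaped.
DATA = {
    "a": ("paper", ["b", "c"]),
    "b": ("author", ["a"]),
    "c": ("title", []),
}

# Hypothetical tree-shaped query: variable -> (required label, child variables).
QUERY = {
    "x": ("paper", ["y", "z"]),
    "y": ("author", []),
    "z": ("title", []),
}


def populate(query, data, root):
    """Fill matrix[(var, node)] bottom-up over the query tree.

    An entry exists only if the sub-query rooted at var matches at node;
    it maps each child variable to the successors of node that match it.
    """
    matrix = {}

    def fill(var):
        label, children = query[var]
        for child in children:
            fill(child)  # children first, so their entries can be reused
        for node, (node_label, succs) in data.items():
            if node_label != label:
                continue
            row = {c: [s for s in succs if (c, s) in matrix] for c in children}
            if all(row[c] for c in children):
                matrix[(var, node)] = row

    fill(root)
    return matrix


def answers(query, matrix, root):
    """Consume the matrix: enumerate all bindings of every query variable."""

    def expand(var, node):
        row = matrix[(var, node)]
        children = query[var][1]
        # For each child variable, collect every binding of its sub-query.
        per_child = [[b for s in row[c] for b in expand(c, s)] for c in children]
        for combo in product(*per_child):
            binding = {var: node}
            for part in combo:
                binding.update(part)
            yield binding

    for var, node in matrix:
        if var == root:
            yield from expand(root, node)


result = list(answers(QUERY, populate(QUERY, DATA, "x"), "x"))
```

In this sketch the recursion follows the query tree rather than the data, so the cycle in `DATA` causes no repeated work: each (variable, node) pair is examined once during population and then only looked up during consumption.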
