Order-preserving optimization of twig queries with structural preferences

Efficient processing of XPath and XQuery queries has inspired a large body of research. In contrast to classical exact-match retrieval, specifying preferences rather than simple hard constraints is essential in today's systems. Because the structure of XML documents plays a major part in retrieval, approximate query matching on structure has recently received attention. However, query processing for structural user preferences has not yet been considered. In this paper we enable users to express structural preferences and consider the problem of optimizing XML twig queries while preserving the ordering that such preferences induce on the result set. Evaluating such a query generally requires rewriting it into a set of queries in which each leaf node can be expanded by combinations of structural elements derived from the preference information. Since such structure expansions typically contain redundancies, and the efficiency of query evaluation strongly depends on the size of the set of rewritten queries, it is important to identify the necessary expansions and simplify them. We give a detailed analysis of this process and present an optimization algorithm that determines a minimal set of queries, each of which is minimal in its expanded nodes, while maintaining the ordering induced by the preference structure. Finally, we provide a comprehensive practical evaluation of our optimization on the XMark benchmark dataset.
