Extending XQuery for analytics

XQuery is a query language under development by the W3C XML Query Working Group. The language contains constructs for navigating, searching, and restructuring XML data. With XML gaining importance as the standard for representing business data, XQuery must support the types of queries that are common in business analytics. One such class is OLAP-style aggregation queries. Although these queries are expressible in XQuery Version 1, the lack of explicit grouping constructs makes them non-intuitive to write and places a burden on the XQuery engine to recognize and optimize the implicit grouping. Furthermore, although the flexibility of the XML data model provides an opportunity for advanced forms of grouping that are not easily represented in relational systems, such queries are difficult to express using the current XQuery syntax. In this paper, we propose extending the XQuery FLWOR expression with explicit syntax for grouping and for numbering of results. We show that these new XQuery constructs not only simplify the construction and evaluation of queries requiring grouping and ranking but also enable complex analytic queries, such as moving-window aggregation and rollups along dynamic hierarchies, to be expressed without additional language extensions.
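
To illustrate the implicit grouping that XQuery Version 1 requires, a per-group aggregate is commonly written by iterating over the distinct key values and re-selecting the matching items for each key. The following is a minimal sketch of that idiom, not a query taken from the paper; the document name sales.xml and the element names sales, sale, region, and amount are hypothetical.

```xquery
(: Hypothetical input: sales.xml holds <sales> with <sale> children,
   each carrying a <region> and a numeric <amount>. :)
for $r in distinct-values(doc("sales.xml")/sales/sale/region)
(: Re-select every sale for the current key: the "group" is implicit. :)
let $group := doc("sales.xml")/sales/sale[region = $r]
order by $r
return
  <region-total name="{$r}">{ sum($group/amount) }</region-total>
```

Nothing in this query states that it is a grouping operation. An engine has to infer that from the combination of distinct-values and the correlated predicate on region, which is the optimization burden described above.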
