XML is often used to represent objects that expose different sets of properties. This "property bag" scenario is a prominent use case for the XML support added to Microsoft SQL Server 2005. However, our initial implementation executed each property extraction as a separate relational subquery. As a result, query performance became unacceptable as the number of returned properties grew, even for small data sizes. We addressed this problem by developing a generalization of common subexpressions. This paper makes the following contributions. (1) It introduces an equivalence rewrite for relational query optimization that folds similar scalar subqueries: several such subqueries are merged into a single equivalent multi-column subquery using both predicate disjunction and rowset pivoting. The rewrite operates at the logical operator level, which makes it equally applicable to XML queries and SQL queries. (2) It explains how this optimization can be applied to the XML property bag scenario and how it has been implemented for the XML index in Microsoft SQL Server 2005. (3) It presents an experimental investigation with Microsoft SQL Server 2005 that studies the performance characteristics of the optimization. The results show that the optimization yields significant performance improvements without limiting essential optimizer execution plan choices.
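The folding rewrite described above can be illustrated with a small sketch. This is not the SQL Server implementation: it assumes a hypothetical relational property-bag layout (tables `objects` and `props`, invented for this example), runs on SQLite through Python's standard `sqlite3` module, and uses conditional aggregation (`MAX(CASE ...)`) as a stand-in for a pivot operator. It also assumes each object has at most one value per property name. The first query extracts each property with its own scalar subquery; the second merges them into one multi-column subquery whose filter is the disjunction of the individual predicates and whose rows are pivoted into columns.

```python
import sqlite3

# Hypothetical property-bag schema, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (id INTEGER PRIMARY KEY);
CREATE TABLE props (obj_id INTEGER, name TEXT, value TEXT);
INSERT INTO objects VALUES (1), (2);
INSERT INTO props VALUES
  (1, 'color', 'red'), (1, 'size', 'L'),
  (2, 'color', 'blue');
""")

# Unfolded form: one correlated scalar subquery per extracted property.
unfolded = """
SELECT o.id,
  (SELECT p.value FROM props p WHERE p.obj_id = o.id AND p.name = 'color'),
  (SELECT p.value FROM props p WHERE p.obj_id = o.id AND p.name = 'size')
FROM objects o
ORDER BY o.id
"""

# Folded form: a single multi-column subquery. The WHERE clause is the
# disjunction of the per-property predicates; MAX(CASE ...) pivots the
# matching rows into one column per property. The LEFT JOIN keeps objects
# that have none of the requested properties (they yield NULLs, as in the
# unfolded form).
folded = """
SELECT o.id, f.color, f.size
FROM objects o
LEFT JOIN (
  SELECT p.obj_id,
         MAX(CASE WHEN p.name = 'color' THEN p.value END) AS color,
         MAX(CASE WHEN p.name = 'size'  THEN p.value END) AS size
  FROM props p
  WHERE p.name = 'color' OR p.name = 'size'
  GROUP BY p.obj_id
) f ON f.obj_id = o.id
ORDER BY o.id
"""

rows_unfolded = conn.execute(unfolded).fetchall()
rows_folded = conn.execute(folded).fetchall()
print(rows_unfolded)
print(rows_folded)
```

In this sketch the unfolded query touches `props` once per property, while the folded query touches it once in total, which is the kind of saving the abstract attributes to the rewrite when many properties are returned.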