As XML becomes the standard for data presentation and information exchange, efficiently querying XML documents has become an active research topic. For large XML documents and complicated XQuery expressions, however, query processing on a single node can seldom meet users' performance needs. In this paper, we propose an algebra, PPXA (Pure Parallel XQuery Algebra), to support parallel processing of XQuery statements. Based on this algebra, we propose a query plan decomposition strategy for complex path queries and twig queries. We then propose three optimization algorithms based on PPXA. The logical parallel execution plan is optimized by rules on operators, which reduce local query execution costs. We implement the algebra and the query decomposition strategy in PureXBase, a native XML database system. Experimental results show that the system supports parallel XQuery processing effectively and can significantly improve query processing efficiency.