Web-Age Information Management

As XML becomes a standard for data representation and information exchange, querying XML documents efficiently has become an active research topic. For large XML documents and complex XQuery expressions, however, query processing on a single node seldom meets users' performance needs. In this paper, we propose PPXA (Pure Parallel XQuery Algebra), an algebra that supports parallel processing of XQuery statements. Based on this algebra, we propose a query plan decomposition strategy for complex path queries and twig queries. We then propose three optimization algorithms based on PPXA. The logical parallel execution plan is optimized by rules on operators, which reduce local query execution costs. We implement the algebra and the query decomposition strategy in PureXBase, a native XML database system. Experimental results show that the approach supports parallel XQuery processing effectively and significantly improves query processing efficiency.