Algebra for Parallel XQuery Processing

As XML becomes the standard for data presentation and information exchange, querying XML documents efficiently has become an active research topic. For large XML documents and complicated XQuery statements, however, query processing on a single node can seldom meet users' performance needs. In this paper, we propose PPXA (Pure Parallel XQuery Algebra), an algebra that supports parallel processing of XQuery statements. Based on this algebra, we propose a query plan decomposition strategy for complex path queries and twig queries. We then propose three optimization algorithms based on PPXA. The logical parallel execution plan is optimized by rules on operators, which reduce local query execution costs. We implement the algebra and the query decomposition strategy in PureXBase, a native XML database system. The experimental results show that the system supports parallel XQuery processing effectively and can significantly improve query processing efficiency.
