Performance Evaluation of a DOM-Based XML Database: Storage, Indexing and Query Optimization

DOM is an XML access interface proposed by the W3C. XML documents can be stored and accessed through it, and XML queries can be evaluated on top of it. In this paper, a persistent DOM storage method is designed with two clustering strategies, filiation-clustering and sibling-clustering, to improve the performance of queries evaluated through the DOM interface. XML indexing and query optimization techniques are also explored. Four structural indexes are proposed to speed up the basic operations on persistent DOM trees, and two value indexes are introduced to improve the performance of queries with predicates. Moreover, optimization rules for regular path expressions (RPEs) are proposed based on the path-shorten and path-complementing principles. Path-shorten reduces the number of joins by shortening the path, while path-complementing substitutes a different RPE for the path specified in a user query. Experimental results show that the proposed algorithms are quite efficient.
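To make the path-shorten idea concrete, the following is a minimal sketch of one possible reading of it, not the paper's actual rules: if a summary of the label paths occurring in a document shows that a shorter suffix of the query path can only be reached through the full path, the suffix can be evaluated instead, and each dropped step saves one parent-child join. The helper names `label_paths` and `shorten` and the sample document are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from xml.dom import minidom


def label_paths(doc):
    """Map every distinct root-to-element label path to the elements it reaches."""
    paths = defaultdict(list)

    def walk(node, prefix):
        if node.nodeType != node.ELEMENT_NODE:
            return  # skip text, comment and other non-element nodes
        path = prefix + (node.tagName,)
        paths[path].append(node)
        for child in node.childNodes:
            walk(child, path)

    walk(doc.documentElement, ())
    return paths


def shorten(query, paths):
    """Return the shortest suffix of `query` that selects only the label path `query`.

    Assumption of this sketch: a suffix is a safe replacement when the only
    label path in the document ending with that suffix is the query itself.
    """
    query = tuple(query)
    for k in range(1, len(query) + 1):
        suffix = query[-k:]
        matching = [p for p in paths if p[-k:] == suffix]
        if matching == [query]:
            return suffix
    return query  # nothing shorter is safe; keep the original path


# Hypothetical sample document, used only to exercise the sketch.
DOC = minidom.parseString(
    "<lib>"
    "<book><title>A</title><author><name>X</name></author></book>"
    "<book><title>B</title></book>"
    "<journal><title>J</title></journal>"
    "</lib>"
)
PATHS = label_paths(DOC)

# 'name' occurs only under lib/book/author, so three joins shrink to none.
print(shorten(("lib", "book", "author", "name"), PATHS))
# 'title' also occurs under 'journal', so one step of context must be kept.
print(shorten(("lib", "book", "title"), PATHS))
```

In this sketch the number of joins for a path is its length minus one, so replacing `lib/book/author/name` with the single step `name` removes all three joins, whereas `lib/book/title` can only be reduced to `book/title`. A persistent implementation would presumably consult a structural index rather than traverse the whole tree; that part is not modelled here.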