LMIX: A Dynamic XML Index Method Using Line Model

We propose a new method for indexing XML documents that supports twig queries and queries with wildcards, together with a single-pass index construction algorithm. Under the Line Model we design, an XML document is viewed as a line and each element of the document as a segment of that line. Querying an XML document then amounts to identifying the corresponding segments. Using a range-based dynamic tree labeling scheme, each segment of the line is assigned a range. We put all the paths of the XML document into a trie and organize the range sets in B+-trees, grouped by trie node. Three operations are defined that allow the range sets stored in the B+-trees of different trie nodes to be combined with one another. The worst-case time complexity of the algorithm we design for these operations is O(m+n). The final results of twig queries can be obtained directly through these operations, at a speed similar to that of a simple path query. Through extensive experiments, we compare our method with other popular techniques. In particular, we show that the processing cost and disk I/O of our index method are linearly proportional to the complexity of the query and the size of the query results. The experimental results demonstrate the substantial performance benefits of the proposed techniques.
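
To make the idea of ranges and linear-time range-set operations concrete, the following is a minimal illustrative sketch, not the LMIX implementation. It makes several assumptions that the abstract does not confirm: ranges are static integer pairs taken from a depth-first counter (the paper's labeling scheme is dynamic), a plain dictionary keyed by root-to-element path stands in for the trie, sorted Python lists stand in for the B+-trees, and the two functions `contained_in` and `containing` are hypothetical examples of the kind of operation described, not the paper's three operations. The sketch relies on one structural fact: elements sharing the same root path sit at the same depth, so their ranges are pairwise disjoint and a two-pointer merge over two such lists of sizes m and n takes O(m+n) steps.

```python
import xml.etree.ElementTree as ET


def label_ranges(xml_text):
    """Return {path: [(start, end), ...]}, each list sorted by start.

    Hypothetical stand-in for the trie plus B+-trees: one sorted list
    of ranges per distinct root-to-element path.
    """
    root = ET.fromstring(xml_text)
    index = {}
    counter = 0

    def visit(elem, parent_path):
        nonlocal counter
        path = parent_path + "/" + elem.tag
        start = counter          # position where the segment opens
        counter += 1
        for child in elem:
            visit(child, path)
        end = counter            # position where the segment closes
        counter += 1
        # Same-path elements are disjoint, so appending on exit
        # still yields document order (sorted by start).
        index.setdefault(path, []).append((start, end))

    visit(root, "")
    return index


def contained_in(inner, outer):
    """Ranges of `inner` lying inside some range of `outer`, in O(m+n)."""
    result = []
    j = 0
    for s, e in inner:
        # Skip outer ranges that close before this inner range opens.
        while j < len(outer) and outer[j][1] < s:
            j += 1
        if j < len(outer) and outer[j][0] < s and e < outer[j][1]:
            result.append((s, e))
    return result


def containing(outer, inner):
    """Ranges of `outer` that enclose at least one range of `inner`, in O(m+n)."""
    result = []
    i = 0
    for s, e in outer:
        # Skip inner ranges that open before this outer range opens.
        while i < len(inner) and inner[i][0] < s:
            i += 1
        if i < len(inner) and inner[i][1] < e:
            result.append((s, e))
    return result


DOC = ("<lib><book><title/><author/></book>"
       "<book><title/></book><journal><title/></journal></lib>")
idx = label_ranges(DOC)

# Twig query book[author]/title: keep books that have an author,
# then keep the titles that fall inside those books.
books = containing(idx["/lib/book"], idx["/lib/book/author"])
titles = contained_in(idx["/lib/book/title"], books)
```

In this sketch the twig query is answered purely by merging range lists, without revisiting the document, which mirrors the abstract's claim that twig results follow directly from operations on range sets.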