An XML Document Retrieval System Supporting Structure- and Content-Based Queries

In this paper, we design and implement an XML document retrieval system that supports both structure- and content-based queries. To support structure-based queries, we design four efficient index structures (keyword, structure, element, and attribute indexes) and implement them using the O2-Store storage system. To support content-based queries, we design and implement a high-dimensional index structure based on the CBF method that stores and retrieves both color and shape feature vectors efficiently. Finally, we analyze the performance of our XML document retrieval system in terms of system efficiency (retrieval time, insertion time, and storage overhead) and system effectiveness (recall and precision).
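
For readers unfamiliar with the effectiveness measures named above, the following is a minimal sketch of how recall and precision are conventionally computed for a single query. It uses the standard set-based definitions and is not taken from the system described here; the function name `precision_recall` and the document identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: ids returned by the retrieval system (hypothetical ids)
    relevant:  ids judged relevant for the query (hypothetical ids)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Illustrative data only, not results from the paper.
    p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d5"])
    print(f"precision={p:.2f} recall={r:.2f}")
```

Returning 0.0 when either set is empty is a convention assumed here to avoid division by zero; other conventions exist.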