Bitmap-Based Indexing for Multi-dimensional Multimedia XML Documents

XML is a new standard for exchanging and representing information on the Internet. Documents can be represented hierarchically as XML elements and made available for sophisticated content-based retrieval. For fast retrieval, XML documents may be indexed. Typical indexing techniques, however, are not satisfactory for multi-dimensional and irregularly hierarchical XML documents. In this paper, we propose a scalable bitmap indexing technique that can index not only document-path-content (or document-path-word) information but also additional information such as the occurrence and reference/de-reference information of words and paths, or multimedia features in digital libraries. Queries over XML document collections can be expressed as combinations of primitive operations such as slice, project, and dice. Bitmap indexes execute bit-wise operations efficiently. We also define a notion of distance in bitmap indexes that is suitable for sophisticated retrieval, including proximity-based and approximate retrieval. Experiments show that bitmap-based indexes over multiple features of XML documents can be constructed efficiently, and that distance operations can be performed more efficiently with the BitCube than with the alternatives.
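
As a rough illustration of the ideas summarized above, the following sketch builds a toy three-dimensional bitmap over (document, path, word) and runs slice, project, dice, and a distance computation on it. The class name `ToyBitCube`, its method names, the sample paths and words, and the use of Hamming distance are all assumptions made for this example; the abstract does not specify the actual data layout or the distance definition.

```python
from itertools import product


class ToyBitCube:
    """Toy 3-D bitmap: bit (d, p, w) is set when document d has word w under path p."""

    def __init__(self, docs, paths, words):
        self.docs, self.paths, self.words = list(docs), list(paths), list(words)
        # One integer per document; each bit position encodes a (path, word) pair.
        self.rows = {d: 0 for d in self.docs}

    def _bit(self, path, word):
        pos = self.paths.index(path) * len(self.words) + self.words.index(word)
        return 1 << pos

    def set(self, doc, path, word):
        self.rows[doc] |= self._bit(path, word)

    def slice_path(self, path):
        """Fix the path dimension: map each document to its words under that path."""
        return {d: [w for w in self.words if self.rows[d] & self._bit(path, w)]
                for d in self.docs}

    def project_docs(self, path, word):
        """Project onto the document dimension: documents holding (path, word)."""
        bit = self._bit(path, word)
        return [d for d in self.docs if self.rows[d] & bit]

    def dice(self, paths, words):
        """Restrict to a sub-range of paths and words; return documents with a bit there."""
        mask = 0
        for p, w in product(paths, words):
            mask |= self._bit(p, w)
        return [d for d in self.docs if self.rows[d] & mask]

    def distance(self, d1, d2):
        """Hamming distance between two document rows (XOR, then count set bits)."""
        return bin(self.rows[d1] ^ self.rows[d2]).count("1")


# Hypothetical sample data, for illustration only.
cube = ToyBitCube(["d1", "d2", "d3"],
                  ["/book/title", "/book/author"],
                  ["xml", "index", "bitmap"])
cube.set("d1", "/book/title", "xml")
cube.set("d1", "/book/title", "index")
cube.set("d2", "/book/title", "xml")
cube.set("d2", "/book/author", "bitmap")
cube.set("d3", "/book/title", "xml")
cube.set("d3", "/book/title", "index")

print(cube.project_docs("/book/title", "index"))
print(cube.dice(["/book/author"], ["index", "bitmap"]))
print(cube.distance("d1", "d2"), cube.distance("d1", "d3"))
```

Storing each document as a single integer keeps every query a combination of AND, OR, and XOR operations, which is the property that makes bitmap indexes attractive for this kind of workload.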