A new indexing strategy for XML keyword search

With the rapid growth in the number of XML documents on the web, how to index, store, and retrieve these documents has become an important and widely studied problem. There are currently two common approaches to retrieving XML documents: structure-based retrieval and keyword-based retrieval. Keyword search is becoming increasingly popular because it is easy to learn and use. In an XML keyword search system, a key problem is how to store structural information in the XML indices efficiently. Dewey numbers are commonly used to label XML nodes in such indices, but they can introduce redundancy. In this paper, we propose a new labeling method for XML indices, called LAF numbers, and we devise a new indexing structure, called the Two-Layer index, for XML keyword retrieval systems. We have conducted an extensive experimental study, and the results show that our indexing method achieves better space efficiency than the prevailing Dewey-number-based indexing method.
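
As an illustration of the redundancy attributed to Dewey numbers, the following is a minimal sketch of a Dewey-labeled keyword index. It is not the indexing method proposed in this paper, and it does not implement LAF numbers or the Two-Layer index. The function names `dewey_index` and `stored_components`, the sample document, and the choice to index only element text are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def dewey_index(xml_text):
    """Map each keyword to the Dewey labels of the elements whose text contains it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)

    def visit(node, label):
        # Index the words found in this element's own text (assumption:
        # attributes and tail text are ignored to keep the sketch short).
        for word in (node.text or "").lower().split():
            index[word].append(label)
        # The i-th child (1-based) extends its parent's label by i.
        for i, child in enumerate(node, start=1):
            visit(child, label + (i,))

    visit(root, (1,))
    return dict(index)


def stored_components(index):
    """Count the integers stored across all posting lists."""
    return sum(len(label) for labels in index.values() for label in labels)


# Hypothetical sample document, used only for illustration.
DOC = (
    "<bib>"
    "<book><title>xml search</title><author>lee</author></book>"
    "<book><title>xml index</title></book>"
    "</bib>"
)

index = dewey_index(DOC)
for keyword, labels in sorted(index.items()):
    print(keyword, [".".join(map(str, label)) for label in labels])
print("integers stored:", stored_components(index))
```

In this sketch every posting repeats the full path from the root, so sibling nodes and their descendants store the same ancestor prefix (for example `1.1`) again and again. That repetition is the kind of redundancy a more compact labeling scheme would aim to reduce.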