Efficient XML Keyword Search: From Graph Model to Tree Model

Keyword search, as opposed to traditional structured query, has been becoming more and more popular on querying XML data in recent years. XML documents usually contain some ID nodes and IDREF nodes to represent reference relationships among the data. An XML document with ID/IDREF is modeled as a graph by existing works, where the keyword query results are computed by graph traversal. As a comparison, if ID/IDREF is not considered, an XML document can be modeled as a tree. Keyword search on XML tree can be much more efficient using tree-based labeling techniques. A nature question is whether we need to abandon the efficient XML tree search methods and invent new, but less efficient search methods for XML graph. To address this problem, we propose a novel method to transform an XML graph to a tree model such that we can exploit existing XML tree search methods. The experimental results show that our solution can outperform the traditional XML graph search methods by orders of magnitude in efficiency while generating a similar set of results as existing XML graph search methods.

[1]  Xudong Lin,et al.  Fast SLCA and ELCA Computation for XML Keyword Queries Based on Set Intersection , 2012, 2012 IEEE 28th International Conference on Data Engineering.

[2]  Jianxin Li,et al.  Fast ELCA computation for keyword queries on XML data , 2010, EDBT '10.

[3]  Shan Wang,et al.  Finding Top-k Min-Cost Connected Trees in Databases , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[4]  Tok Wang Ling,et al.  An Effective Object-Level XML Keyword Search , 2010, DASFAA.

[5]  Vagelis Hristidis,et al.  Keyword proximity search on XML graphs , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[6]  Philip S. Yu,et al.  BLINKS: ranked keyword searches on graphs , 2007, SIGMOD '07.

[7]  S. E. Dreyfus,et al.  The steiner problem in graphs , 1971, Networks.

[8]  Yannis Papakonstantinou,et al.  Efficient keyword search for smallest LCAs in XML databases , 2005, SIGMOD '05.

[9]  Lin Guo XRANK : Ranked Keyword Search over XML Documents , 2003 .

[10]  S. Sudarshan,et al.  Keyword searching and browsing in databases using BANKS , 2002, Proceedings 18th International Conference on Data Engineering.

[11]  Tok Wang Ling,et al.  Effective XML Keyword Search with Relevance Oriented Ranking , 2009, 2009 IEEE 25th International Conference on Data Engineering.

[12]  S. Sudarshan,et al.  Bidirectional Expansion For Keyword Search on Graph Databases , 2005, VLDB.