Efficient subtree results computation for XML keyword queries

In this paper, we focus on the efficient construction of restricted subtree (RSubtree) results for XML keyword queries on a multicore system. We first show that the performance bottlenecks of existing methods lie in 1) computing the set of relevant keyword nodes (RKNs) for each subtree root node, 2) constructing the corresponding RSubtree, and 3) parallel execution. We then propose a two-step generic top-down subtree construction algorithm: the first step computes the SLCA/ELCA nodes, and the second step obtains the RKNs and generates the RSubtree results in parallel. Here, "generic" means that 1) our method can be used to compute different kinds of subtree results and 2) our method is independent of the query semantics; "top-down" means that our method constructs each RSubtree by visiting the nodes of the subtree built from an RKN set level by level, from left to right, so as to avoid visiting as many useless nodes as possible. The experimental results show that our method is much more efficient than existing ones according to various metrics.
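
The two-step structure described above can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes nodes are identified by Dewey labels (tuples of integers), uses a brute-force SLCA computation, and treats every keyword node under a root as a stand-in for the RKN set, whose exact definition the abstract does not give. All function names (`is_prefix`, `slca_roots`, `build_subtree`, `subtree_results`) are hypothetical. The thread pool only shows that the per-root work is independent; in CPython, threads do not give real CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prefix(a, b):
    """True if Dewey label a is an ancestor-or-self of Dewey label b."""
    return len(a) <= len(b) and b[:len(a)] == a


def slca_roots(inverted):
    """Step 1 (brute force): nodes that cover every keyword and have no
    proper descendant that also covers every keyword."""
    keywords = list(inverted)
    # Candidates: every ancestor-or-self of any keyword node.
    candidates = {label[:i]
                  for labels in inverted.values()
                  for label in labels
                  for i in range(1, len(label) + 1)}
    covering = [c for c in candidates
                if all(any(is_prefix(c, n) for n in inverted[k])
                       for k in keywords)]
    return sorted(c for c in covering
                  if not any(c != d and is_prefix(c, d) for d in covering))


def build_subtree(root, inverted):
    """Step 2 for one root: gather the keyword nodes below it (an assumed
    simplification of the RKN set) and the nodes on the paths to them."""
    keyword_nodes = sorted({n for labels in inverted.values()
                            for n in labels if is_prefix(root, n)})
    nodes = {n[:i] for n in keyword_nodes
             for i in range(len(root), len(n) + 1)}
    # Depth first, then document order: level by level, left to right.
    return root, keyword_nodes, sorted(nodes, key=lambda n: (len(n), n))


def subtree_results(inverted, workers=4):
    """Run step 1 once, then handle each root independently in step 2."""
    roots = slca_roots(inverted)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: build_subtree(r, inverted), roots))


if __name__ == "__main__":
    # Hypothetical inverted lists: keyword -> Dewey labels of matching nodes.
    inverted = {"xml": [(1, 1, 1), (1, 2, 1)], "query": [(1, 1, 2), (1, 3)]}
    for root, keyword_nodes, nodes in subtree_results(inverted):
        print(root, keyword_nodes, nodes)
```

In this toy input, node (1, 1) is the only SLCA root, because the document root (1,) also covers both keywords but has a covering descendant. Sorting the subtree nodes by depth and then by label mirrors the level-by-level, left-to-right visiting order mentioned in the abstract.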
