A review on XML keyword query processing

Keyword search is increasingly popular for querying XML data because it relieves users of the need to understand complex XML document schemas and query languages such as XQuery and XPath. Various query processing techniques and efficient algorithms have recently been proposed for keyword search over XML data. The most popular techniques today use the query semantics ELCA (Exclusive LCA) and SLCA (Smallest LCA), both based on LCA (Lowest Common Ancestor). Of these, ELCA captures more meaningful results than LCA and SLCA. However, these techniques can incur redundant computation because of problems such as common-ancestor-repetition (CAR) and visiting-useless-node (VUN). Irregular schemas and missing elements in the XML document are further problems to consider in keyword query processing over XML data. In this paper we review various XML keyword query processing techniques. We also highlight important issues associated with each technique and the improvements made to address them, which improve the overall efficiency of XML keyword query processing.
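To make the difference between the SLCA and ELCA semantics concrete, the following is a minimal brute-force sketch, not any of the optimized algorithms reviewed here. It assumes nodes are identified by Dewey labels written as tuples of integers, so that a node's ancestors are exactly the proper prefixes of its label. The function names (`lca`, `slca`, `elca`) and the sample match lists are hypothetical and chosen only for illustration.

```python
from itertools import product


def lca(labels):
    """Lowest common ancestor = longest common prefix of the Dewey labels."""
    prefix = []
    for parts in zip(*labels):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


def is_ancestor(a, b):
    """True if node a is a proper ancestor of node b."""
    return len(a) < len(b) and b[:len(a)] == a


def all_lcas(keyword_lists):
    """LCA of every combination that takes one match per keyword (brute force)."""
    return {lca(combo) for combo in product(*keyword_lists)}


def slca(keyword_lists):
    """SLCA: LCA nodes that have no other LCA node as a descendant."""
    cands = all_lcas(keyword_lists)
    return sorted(v for v in cands
                  if not any(is_ancestor(v, u) for u in cands))


def elca(keyword_lists):
    """ELCA: nodes that still contain every keyword after discarding matches
    that lie under a child subtree which itself contains every keyword."""
    cands = all_lcas(keyword_lists)
    result = []
    for v in cands:
        # Children of v whose subtree already contains all keywords.
        full_children = {u[:len(v) + 1] for u in cands if is_ancestor(v, u)}
        if all(
            any((m == v or is_ancestor(v, m))
                and m[:len(v) + 1] not in full_children
                for m in matches)
            for matches in keyword_lists
        ):
            result.append(v)
    return sorted(result)


# Hypothetical match lists for a two-keyword query.
xml_matches = [(0, 0, 0), (0, 1)]
search_matches = [(0, 0, 1), (0, 2)]

print(slca([xml_matches, search_matches]))  # [(0, 0)]
print(elca([xml_matches, search_matches]))  # [(0,), (0, 0)]
```

In this example node `(0, 0)` is both an SLCA and an ELCA. The root `(0,)` is an ELCA only, because it still has its own witnesses `(0, 1)` and `(0, 2)` once the matches under `(0, 0)` are excluded. Enumerating every combination is exponential in the number of keywords, which is the kind of redundant work the surveyed techniques aim to avoid.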
