Object Semantics for XML Keyword Search

It is well known that some XML elements correspond to objects (in the object-oriented sense) while others do not. This paper asks what benefits can be derived from paying attention to such object semantics, particularly for keyword queries. Keyword queries over XML data have been studied extensively in recent years, and several schemes based on the lowest common ancestor (LCA) have been proposed, including SLCA, MLCA, VLCA, and ELCA. Identifying objects helps these techniques return more meaningful answers than just the LCA node (or subtree), by returning objects instead of nodes. More interestingly, object semantics can also benefit the search itself. To this end, we introduce a novel Nearest Common Object Node (NCON) semantics, which includes not only common object ancestors but also common object descendants. We have developed XRich, a system implementing our NCON-based approach, and used it in an extensive experimental evaluation. The results show that our approach outperforms state-of-the-art approaches in both effectiveness and efficiency.
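The idea of returning objects instead of nodes can be illustrated with a minimal sketch. It is not the NCON algorithm or the XRich implementation, neither of which is specified here. It only shows plain LCA computation followed by lifting the result to the nearest enclosing object node. The Dewey-label encoding, the function names `lca` and `enclosing_object`, and the toy `OBJECT_NODES` set are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Set, Tuple

# Assumption: each node is identified by a Dewey label, i.e. the path of
# child positions from the root; the root itself is the empty tuple.
Dewey = Tuple[int, ...]


def lca(labels: Iterable[Dewey]) -> Dewey:
    """Return the lowest common ancestor, i.e. the longest common prefix."""
    labels = list(labels)
    prefix = labels[0]
    for label in labels[1:]:
        n = 0
        while n < min(len(prefix), len(label)) and prefix[n] == label[n]:
            n += 1
        prefix = prefix[:n]
    return prefix


def enclosing_object(node: Dewey, object_nodes: Set[Dewey]) -> Optional[Dewey]:
    """Walk up from `node` (inclusive) to the nearest node marked as an object."""
    for end in range(len(node), -1, -1):
        if node[:end] in object_nodes:
            return node[:end]
    return None


# Hypothetical toy document:
#   ()          bib
#   (0,)        book      <- object
#   (0, 1)      authors   (non-object grouping element)
#   (0, 1, 0)   author    <- object
#   (0, 1, 1)   author    <- object
OBJECT_NODES: Set[Dewey] = {(0,), (0, 1, 0), (0, 1, 1)}

# Two keywords match the names of two different authors of the same book.
matches = [(0, 1, 0, 0), (0, 1, 1, 0)]
plain_answer = lca(matches)                                  # the "authors" node
object_answer = enclosing_object(plain_answer, OBJECT_NODES)  # the "book" object
```

In this toy case the plain LCA is the non-object `authors` element, while the lifted answer is the enclosing `book` object, which is arguably the more meaningful unit to return. How object nodes are identified in practice is outside the scope of this sketch.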
