Exploiting semantics for XML keyword search

XML keyword search has attracted a lot of interest, and typical approaches are based on the lowest common ancestor (LCA). In this paper, however, we show several problems with LCA-based approaches, including meaningless answers, incomplete answers, duplicated answers, missing answers, and schema-dependent answers. To handle these problems, we exploit the semantics of object, object identifier, relationship, and attribute (referred to as the ORA-semantics). Based on the ORA-semantics, we introduce new ways of labeling and matching. More importantly, we propose a new semantics for XML keyword search, called CR (Common Relative), which returns answers that are independent of schema design. To find answers under the CR semantics, we identify properties of common relatives and propose an efficient algorithm. Experimental results show how serious the problems of LCA-based approaches are. They also show that the CR semantics possesses the properties of completeness, soundness, and independence, and that, thanks to our techniques, the response time of our approach is shorter than that of LCA-based approaches.
