Towards an intelligent keyword search over XML and relational databases

Keyword search has long been the dominant retrieval method in information retrieval systems, and has become an important way for novice users to explore data-centric XML and relational databases (RDB). Recent years have seen many approaches proposed for keyword search over XML and RDB. However, those approaches do not exploit the hidden semantics in XML or RDB, and thus encounter serious problems in processing keyword queries. In this paper, we point out mismatches between the query answers returned by existing approaches and the common expectations of keyword search over XML and RDB. We analyze these mismatches and find that they stem mainly from unawareness of the semantics of objects, relationships and attributes in databases. To capture these semantics, we construct an Object Relationship (OR) data graph for XML and an Object Relationship Mixed (ORM) data graph for RDB, and propose an intelligent keyword search based on the OR and ORM data graph models to retrieve more informative answers. Finally, to further improve the usability of keyword search, we also present our ongoing work on enhancing the expressive power of keyword queries. In particular, we 1) enable users to explicitly indicate their search intentions through relation, attribute and tag names in keyword queries; 2) handle recursive relationships and identifier-dependency relationships (IDD) in databases; and 3) incorporate aggregate functions into keyword queries so that users can explore databases with aggregate queries.
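
As an illustration only, the idea of searching over objects and relationships rather than over individual elements or tuples can be sketched as follows. This is a minimal sketch under stated assumptions, not the OR or ORM data graph construction from the paper: the names `Node`, `ObjectGraph`, `build_example_graph` and the Student/Enrol/Course data are hypothetical, attribute values are simply folded into the object or relationship node that owns them, and an answer is taken to be the shortest path of whole nodes connecting the two keyword matches.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node: an object or a relationship, carrying its own attributes."""
    node_id: str
    kind: str    # "object" or "relationship"
    label: str   # e.g. "Student", "Enrol"
    attributes: dict = field(default_factory=dict)

    def matches(self, keyword):
        # A node matches when any of its attribute values contains the keyword.
        kw = keyword.lower()
        return any(kw in str(v).lower() for v in self.attributes.values())


class ObjectGraph:
    """Hypothetical undirected graph over object and relationship nodes."""

    def __init__(self):
        self.nodes = {}
        self.edges = {}

    def add_node(self, node):
        self.nodes[node.node_id] = node
        self.edges.setdefault(node.node_id, set())

    def connect(self, a, b):
        self.edges[a].add(b)
        self.edges[b].add(a)

    def shortest_path(self, start, goal):
        # Plain breadth-first search; returns a list of node ids or None.
        queue = deque([[start]])
        seen = {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for nxt in sorted(self.edges[path[-1]]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return None

    def search(self, kw1, kw2):
        # Answers are paths of whole nodes, so every match is returned
        # together with the object or relationship it belongs to.
        hits1 = [n for n in self.nodes if self.nodes[n].matches(kw1)]
        hits2 = [n for n in self.nodes if self.nodes[n].matches(kw2)]
        answers = []
        for a in hits1:
            for b in hits2:
                path = self.shortest_path(a, b)
                if path is not None:
                    answers.append(path)
        return sorted(answers, key=len)


def build_example_graph():
    # Assumed toy data: one student enrolled in one course.
    g = ObjectGraph()
    g.add_node(Node("s1", "object", "Student", {"name": "Anna"}))
    g.add_node(Node("c1", "object", "Course", {"title": "Databases"}))
    g.add_node(Node("e1", "relationship", "Enrol", {"grade": "A"}))
    g.connect("s1", "e1")
    g.connect("e1", "c1")
    return g


if __name__ == "__main__":
    graph = build_example_graph()
    for path in graph.search("anna", "databases"):
        print([(graph.nodes[n].label, graph.nodes[n].attributes) for n in path])
```

In this toy setting the query "anna databases" yields the student, the enrolment relationship with its grade, and the course as one answer, rather than an isolated name and title.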
