PreCN: Preprocessing Candidate Networks for Efficient Keyword Search over Databases

Keyword Search Over Relational Databases (KSORD) has attracted much research interest because casual users and Web users can use these techniques to access databases through free-form keyword queries, just as they search the Web. However, improving the performance of KSORD systems remains a critical issue. In this paper, we focus on improving the performance of schema-graph-based online KSORD systems and propose a novel Preprocessing Candidate Network (PreCN) approach to support efficient keyword search over relational databases. Based on a given database schema, PreCN reduces candidate network (CN) generation time by preprocessing the maximum Tuple Sets Graph (Gts) to generate CNs in advance and store them in the database. When a user query arrives, its CNs are retrieved quickly from the database instead of being generated on the fly through a breadth-first traversal of its Gts. Extensive experiments show that the PreCN approach is efficient and effective.
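The precompute-then-retrieve idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy schema, the table layout, and the function names (`generate_cns`, `store_cns`, `lookup_cns`) are all hypothetical, and the enumeration is deliberately simplified to path-shaped CNs in which each keyword is assigned to exactly one tuple set. Under those assumptions, CNs are enumerated breadth-first over every tuple set that could exist for a query of a given length, stored in a table, and filtered at query time by the tuple sets that are actually non-empty.

```python
import json
import sqlite3
from collections import deque
from itertools import combinations

# Hypothetical toy schema: undirected foreign-key edges between relations.
SCHEMA = {
    "Author": ["Writes"],
    "Writes": ["Author", "Paper"],
    "Paper": ["Writes"],
}

def keyword_subsets(slots):
    """All subsets of the keyword slots as sorted tuples; () is a free tuple set."""
    slots = sorted(slots)
    return [c for r in range(len(slots) + 1) for c in combinations(slots, r)]

def generate_cns(schema, num_keywords, max_size):
    """Breadth-first enumeration of path-shaped CNs (a simplifying assumption)."""
    all_slots = set(range(num_keywords))
    # Every path starts at a non-free tuple set.
    queue = deque(
        ((rel, ks),) for rel in schema for ks in keyword_subsets(all_slots) if ks
    )
    found = set()
    while queue:
        path = queue.popleft()
        covered = {s for _, ks in path for s in ks}
        if covered == all_slots:
            # Total and minimal: all keywords covered, both end nodes non-free.
            found.add(min(path, path[::-1]))  # a path and its reverse are one CN
            continue
        if len(path) < max_size:
            for rel in schema[path[-1][0]]:
                for ks in keyword_subsets(all_slots - covered):
                    queue.append(path + ((rel, ks),))
    return sorted(found, key=lambda p: (len(p), p))

def store_cns(conn, cns, num_keywords):
    """Persist precomputed CNs; the table layout is an assumption of this sketch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cn (num_keywords INTEGER, size INTEGER, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO cn VALUES (?, ?, ?)",
        [(num_keywords, len(p), json.dumps(p)) for p in cns],
    )
    conn.commit()

def lookup_cns(conn, num_keywords, non_empty):
    """Fetch stored CNs whose non-free tuple sets are all non-empty for this query."""
    rows = conn.execute(
        "SELECT body FROM cn WHERE num_keywords = ? ORDER BY size, body",
        (num_keywords,),
    )
    result = []
    for (body,) in rows:
        path = [(rel, tuple(ks)) for rel, ks in json.loads(body)]
        if all((rel, ks) in non_empty for rel, ks in path if ks):
            result.append(path)
    return result

# Offline step: generate and store CNs once for two-keyword queries.
conn = sqlite3.connect(":memory:")
store_cns(conn, generate_cns(SCHEMA, num_keywords=2, max_size=3), 2)

# Online step: suppose keyword 0 matches only Author tuples, keyword 1 only Paper tuples.
non_empty = {("Author", (0,)), ("Paper", (1,))}
for cn in lookup_cns(conn, 2, non_empty):
    print(cn)
```

In this toy setup the lookup returns a single network, Author{0} joined through Writes to Paper{1}. The point of the design is that the breadth-first traversal runs once, offline, and each query pays only for a table read and a filter. A fuller treatment would also handle tree-shaped CNs and the pruning rules used by schema-graph-based systems, which this sketch omits.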
