Paper information - Universal Top-k Keyword Search over Relational Databases

Keyword search is one of the most effective paradigms for information discovery, and one of its key advantages is simplicity. There is an increasing need to let ordinary users issue keyword queries without any knowledge of the database schema. The retrieval unit of keyword search over relational databases differs from that of IR systems: in an IR system the retrieval unit is a document, whereas in our case the result is a synthesized document formed by joining a number of tuples, that is, a joined tuple tree (JTT). We measure result quality using two metrics: structural quality and content quality. The content quality of a JTT is an IR-style score that indicates how well its information nodes match the keywords, while the structural quality of a JTT is a score that evaluates the meaningfulness (semantics) of the connections between information nodes, for example, the closeness of the corresponding relationship. We design a hybrid approach and develop a buffer system that dynamically maintains a partial data graph in memory. To reuse intermediate results of SQL queries, we break complex SQL queries into two types of simple queries. This allows us to support very large databases and reduce redundant computation. In addition, we conduct extensive experiments on large-scale real datasets to study the performance of the proposed approaches. Experiments show that our approach outperforms previous approaches, especially in terms of result quality.
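The two quality metrics described above can be illustrated with a small sketch. The abstract does not give the actual scoring functions, so everything here is an assumption made for illustration: the `JTT` class, the keyword-coverage content score, the `1 / (1 + edges)` structural score, and the weight `alpha` are hypothetical stand-ins, not the paper's definitions.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class JTT:
    """Hypothetical joined tuple tree: tuple texts plus the join edges between them."""
    tuples: tuple  # text content of each tuple (node)
    edges: tuple   # pairs of indices into `tuples`


def content_score(jtt, keywords):
    # Assumed IR-style stand-in: fraction of query keywords found in any tuple.
    words = {w for text in jtt.tuples for w in text.lower().split()}
    return sum(k.lower() in words for k in keywords) / len(keywords)


def structural_score(jtt):
    # Assumed closeness measure: fewer join edges means a tighter relationship.
    return 1.0 / (1 + len(jtt.edges))


def top_k(candidates, keywords, k, alpha=0.7):
    # Assumed combination: weighted sum of content and structural quality.
    def score(jtt):
        return alpha * content_score(jtt, keywords) + (1 - alpha) * structural_score(jtt)
    return heapq.nlargest(k, candidates, key=score)


# Example candidates for the query "john caroline".
direct = JTT(("john kennedy", "caroline kennedy"), ((0, 1),))
indirect = JTT(("john kennedy", "senate", "caroline kennedy"), ((0, 1), (1, 2)))
partial = JTT(("john smith",), ())
best = top_k([partial, indirect, direct], ["john", "caroline"], k=2)
```

With these assumed scores, a tree that covers all keywords through a short join path ranks above one that covers them through a longer path, which in turn ranks above a single tuple matching only part of the query.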
