In recent years, keyword querying of relational databases has attracted considerable interest, and a variety of systems such as DBXplorer [ACD02], Discover [HP02] and ObjectRank [BHP04] have been proposed. Another such system is BANKS, which enables data and schema browsing together with keyword-based search for relational databases. It models tuples as nodes in a graph, connected by links induced by foreign key and other relationships. The memory that BANKS needs for this database graph is proportional to the sum of the number of nodes and edges in the graph. Systems such as SPIN, which search Personal Information Networks and use BANKS as the backend, maintain a large amount of information about the users' data. Since these systems run on users' workstations, which have other demands on memory, such heavy memory use is unreasonable and should be avoided where possible. To alleviate this problem, we introduce EMBANKS (External Memory BANKS), a framework for an optimized disk-based BANKS system. The complexity of this framework raises many questions, some of which we try to answer in this thesis. We demonstrate that the cluster representation proposed in EMBANKS enables in-memory processing of very large database graphs. We also present detailed experiments showing that EMBANKS can significantly reduce database load time and query execution time compared to the original BANKS algorithms.
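To make the graph model concrete, the following minimal sketch builds a tuple graph from a toy relational instance and reports the node and edge counts that drive its size. It is an illustration under stated assumptions, not the BANKS or EMBANKS implementation: the names `build_data_graph` and `graph_size`, the toy schema, and the choice of undirected, unweighted edges are all hypothetical simplifications.

```python
from collections import defaultdict

def build_data_graph(tables, foreign_keys):
    """Build an adjacency map with one node per tuple.

    tables:       {table_name: {primary_key: row_dict}}
    foreign_keys: list of (src_table, src_column, dst_table)
    Edges are undirected and unweighted here (a simplifying assumption).
    """
    adjacency = defaultdict(set)
    # Register every tuple as a node, even if it has no links.
    for table, rows in tables.items():
        for pk in rows:
            adjacency[(table, pk)]
    # Add one edge per foreign-key reference that resolves to a tuple.
    for src_table, column, dst_table in foreign_keys:
        for pk, row in tables[src_table].items():
            target = row.get(column)
            if target is not None and target in tables[dst_table]:
                src, dst = (src_table, pk), (dst_table, target)
                adjacency[src].add(dst)
                adjacency[dst].add(src)
    return adjacency

def graph_size(adjacency):
    """Return (node count, edge count); storage grows with their sum."""
    nodes = len(adjacency)
    edges = sum(len(neighbours) for neighbours in adjacency.values()) // 2
    return nodes, edges

# Hypothetical toy database: authors, papers, and a linking table.
TABLES = {
    "author": {"a1": {"name": "Ada"}, "a2": {"name": "Bob"}},
    "paper": {"p1": {"title": "Keyword search"}},
    "writes": {
        "w1": {"author_id": "a1", "paper_id": "p1"},
        "w2": {"author_id": "a2", "paper_id": "p1"},
    },
}
FOREIGN_KEYS = [
    ("writes", "author_id", "author"),
    ("writes", "paper_id", "paper"),
]

graph = build_data_graph(TABLES, FOREIGN_KEYS)
print(graph_size(graph))  # (5, 4): five tuples, four foreign-key links
```

Even in this toy instance the structure holds one entry per tuple and per link, which is why an in-memory representation of a large database becomes costly on a workstation.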
[1] Vagelis Hristidis et al. ObjectRank: Authority-Based Keyword Search in Databases. VLDB, 2004.
[2] M. F. et al. Bibliography. Experimental Gerontology, 1985.
[3] J. Leon Zhao et al. Spatial data traversal in road map databases: a graph indexing approach. CIKM '94, 1994.
[4] Vagelis Hristidis et al. DISCOVER: Keyword Search in Relational Databases. VLDB, 2002.
[5] Surajit Chaudhuri et al. DBXplorer: enabling keyword search over relational databases. SIGMOD '02, 2002.
[6] Micah Adler et al. Towards compressing Web graphs. Proceedings DCC 2001, Data Compression Conference, 2001.
[7] S. Sudarshan et al. Bidirectional Expansion For Keyword Search on Graph Databases. VLDB, 2005.
[8] Per-Åke Larson et al. A file structure supporting traversal recursion. SIGMOD '89, 1989.
[9] Gerhard Weikum et al. The SphereSearch Engine for Unified Ranked Retrieval of Heterogeneous XML and Web Documents. VLDB, 2005.
[10] Suresh Venkatasubramanian et al. On external memory graph traversal. SODA '00, 2000.
[11] S. Sudarshan et al. Keyword searching and browsing in databases using BANKS. Proceedings 18th International Conference on Data Engineering, 2002.
[12] Sriram Raghavan et al. Representing Web graphs. Proceedings 19th International Conference on Data Engineering, 2003.