Searching for Related Objects in Relational Databases

To discover knowledge or retrieve information from a relational database, a user often needs to find objects related to certain source objects. There are two main challenges in building an effective object search system: the huge number of objects in the database and the large number of different relationships between objects. In this paper we introduce Ross, an efficient and accurate relational object search system. Ross accepts complex queries in which users specify the relationships among objects. To measure the strength of the relationship represented by a join path, Ross considers the different semantics of different joins and combines the selectivity and the length of the path. To find the best join paths between relations, Ross uses a novel approach that converts the database schema into a graph in which the shortest paths correspond to the best join paths in the database. Ross uses a stream-based system architecture to handle complex queries containing logical operators and to return the most related objects at the user's request. Comprehensive experiments show the high scalability and effectiveness of Ross.
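The schema-to-graph idea can be illustrated with a minimal sketch. It is not the Ross algorithm itself: the abstract does not give the weighting, so the sketch assumes each join carries a hypothetical strength in (0, 1], uses -log(strength) plus a constant length penalty as the edge cost, and runs plain Dijkstra. The relation names, the strengths, and the `length_penalty` parameter are all made up for illustration.

```python
import heapq
import math


def best_join_path(schema, source, target, length_penalty=0.5):
    """Dijkstra over a schema graph; returns (cost, list_of_relations).

    schema maps a relation name to [(neighbor_relation, join_strength), ...].
    join_strength in (0, 1] is an assumed, selectivity-style score.
    """
    frontier = [(0.0, source, [source])]
    settled = set()
    while frontier:
        cost, rel, path = heapq.heappop(frontier)
        if rel == target:
            return cost, path
        if rel in settled:
            continue
        settled.add(rel)
        for nxt, strength in schema.get(rel, []):
            if nxt not in settled:
                # -log turns a product of strengths into a sum of costs;
                # the constant penalty makes longer paths more expensive.
                step = -math.log(strength) + length_penalty
                heapq.heappush(frontier, (cost + step, nxt, path + [nxt]))
    return math.inf, []


# Hypothetical schema: relations are nodes, joinable pairs are edges.
schema = {
    "Author": [("Writes", 0.9), ("Reviews", 0.2)],
    "Writes": [("Author", 0.9), ("Paper", 0.9)],
    "Reviews": [("Author", 0.2), ("Paper", 0.2)],
    "Paper": [("Writes", 0.9), ("Reviews", 0.2), ("Venue", 0.8)],
    "Venue": [("Paper", 0.8)],
}

cost, path = best_join_path(schema, "Author", "Venue")
print(path, round(cost, 3))
```

Under these assumed weights, the cheapest path from Author to Venue goes through Writes rather than Reviews, because the stronger joins cost less. A different strength measure would only change how `step` is computed.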
