Supporting ancillary values from user defined functions in Oracle

Most commercial SQL database systems support user-defined functions that can be used in WHERE clause filters, SELECT list items, or in sorting/grouping clauses. Often, user-defined functions are used as inexact search filters and then the filtered rows are sorted by a relevance measure. This is commonplace in Web search engines, multimedia, and personalization applications. We refer to the values, such as relevance measure, associated with the filtered rows as ancillary values, and address the problem of efficiently and expressively supporting queries involving them in Oracle. In our approach, the filtering operator is designated as the primary operator, and the associated ancillary values are modeled by additional operators that are declared to be ancillary to the primary operator. An ancillary operator can represent any auxiliary value for the filtered rows, including relevance values (e.g. a score which describes how well a document matches the text search query) and additional properties (e.g. the nature of spatial relationship for objects that overlap a given region). The query execution is optimized by allowing the primary and ancillary operator invocations to share computations via a shared context. Also, queries involving ancillary values can exploit user defined indexes and their capability to return results in the order of ancillary values. We present the key concepts, describes our implementation scheme and optimization techniques, and discusses alternative approaches for supporting ancillary values. Finally, we provide an experimental study that illustrates the scalability and effectiveness of our approach.

[1]  Michael Stonebraker,et al.  Inclusion of new types in relational data base systems , 1986, 1986 IEEE Second International Conference on Data Engineering.

[2]  Nelson Mendonça Mattos,et al.  Integrating SQL Databases with Content-Specific Search Engines , 1997, VLDB.

[3]  Surajit Chaudhuri,et al.  Optimization of queries with user-defined predicates , 1996, TODS.

[4]  Joseph M. Hellerstein,et al.  Optimization techniques for queries with expensive methods , 1998, TODS.

[5]  Samuel DeFazio,et al.  Extensible indexing: a framework for integrating domain-specific indexing schemes into Oracle8i , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[6]  Ricardo Baeza-Yates,et al.  Information Retrieval: Data Structures and Algorithms , 1992 .

[7]  Ronald Fagin,et al.  Combining Fuzzy Information from Multiple Systems , 1999, J. Comput. Syst. Sci..

[8]  Rajiv Chopra,et al.  Indexing images in Oracle8i , 2000, SIGMOD '00.

[9]  Chad Carson,et al.  Optimizing queries over multimedia repositories , 1996, SIGMOD '96.

[10]  Hamid Pirahesh,et al.  Heterogeneous query processing through SQL table functions , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[11]  Yun Wang,et al.  High Level Indexing of User-Defined Types , 1999, VLDB.

[12]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[13]  Donald D. Chamberlin,et al.  Using the New DB2: IBM's Object-Relational Database System , 1996 .