Improving Federated Database Queries Using Declarative Rewrite Rules for Quantified Subqueries

Transforming queries for efficient execution is particularly important in federated database systems, since a more efficient execution plan can require far fewer data requests to be sent to the component databases. It is also important to perform as much of the selection and processing as possible close to where the data are stored, making the best use of the facilities provided by the federation's component database management systems. In this paper we address the problem of processing complex queries that include quantifiers and that have to be executed against different databases in an expanding heterogeneous federation. Queries are transformed within a mediator for global query improvement, and within wrappers to make the best use of the query processing capabilities of external databases. Our approach is based on pattern matching and query rewriting. We introduce a high-level language for expressing rewrite rules declaratively, and demonstrate the use and flexibility of such rules in improving query performance for existentially quantified subqueries. Extensions to this language that allow generic rewrite rules to be expressed are also presented. The value of performing final transformations within a wrapper for a given remote database is shown in several examples that use AMOS II, an SQL3-like system.
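
As an illustration of the general idea of declarative, pattern-based rewriting, the following minimal Python sketch represents a query as nested tuples and a rewrite rule as a (pattern, template) pair. It is not the rule language introduced in the paper: the term encoding, the `?`-prefixed pattern variables, the helper names (`match`, `substitute`, `rewrite`), the example rule, and the `protein`/`structure` tables are all assumptions made for this example. The rule turns a selection with an existentially quantified subquery into a semi-join, a form that a wrapper might then hand to a remote database as a single request.

```python
def is_var(t):
    """Pattern variables are strings that start with '?' (an assumed convention)."""
    return isinstance(t, str) and t.startswith("?")


def match(pattern, term, bindings=None):
    """Return a dict of variable bindings if term matches pattern, else None."""
    bindings = dict(bindings or {})
    if is_var(pattern):
        # A variable must bind consistently if it occurs more than once.
        if pattern in bindings and bindings[pattern] != term:
            return None
        bindings[pattern] = term
        return bindings
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for p, t in zip(pattern, term):
            bindings = match(p, t, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None


def substitute(template, bindings):
    """Instantiate a template by replacing variables with their bound terms."""
    if is_var(template):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


def rewrite(term, rules):
    """Rewrite bottom-up, applying the first matching rule once at each node."""
    if isinstance(term, tuple):
        term = tuple(rewrite(t, rules) for t in term)
    for pattern, template in rules:
        bindings = match(pattern, term)
        if bindings is not None:
            return substitute(template, bindings)
    return term


# Hypothetical rule: SELECT cols FROM outer WHERE EXISTS (inner WHERE cond)
# becomes a semi-join of outer with inner on cond.
EXISTS_TO_SEMIJOIN = (
    ("select", "?cols", "?outer", ("exists", "?inner", "?cond")),
    ("semijoin", "?cols", "?outer", "?inner", "?cond"),
)

# Hypothetical query: names of proteins that have at least one structure.
query = (
    "select", ("name",), "protein",
    ("exists", "structure", ("=", "structure.protein_id", "protein.id")),
)

rewritten = rewrite(query, [EXISTS_TO_SEMIJOIN])
print(rewritten)
```

Because the rule is plain data rather than procedural code, further rules can be added to the list passed to `rewrite` without touching the matcher, which is the kind of flexibility a declarative rule language aims for. A realistic system would also need side conditions (for example, checks on free variables) that this sketch omits.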
