Optimizing Queries Across Diverse Data Sources

Businesses today need to interrelate data stored in diverse systems with differing capabilities, ideally via a single high-level query interface. We present the design of a query optimizer for Garlic [C 95], a middleware system designed to integrate data from a broad range of data sources with very different query capabilities. Garlic’s optimizer extends the rule-based approach of [Loh88] to work in a heterogeneous environment, by defining generic rules for the middleware and using wrapper-provided rules to encapsulate the capabilities of each data source. This approach offers great advantages in terms of plan quality, extensibility to new sources, incremental implementation of rules for new sources, and the ability to express the capabilities of a diverse set of sources. We describe the design and implementation of this optimizer, and illustrate its actions through an example.
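The division of labor between generic middleware rules and wrapper-provided rules can be illustrated with a small sketch. This is not Garlic's rule language or its actual optimizer: the names `Plan`, `Query`, `relational_wrapper_rule`, `file_wrapper_rule`, `middleware_filter_rule`, and `optimize` are hypothetical, and the cost figures are invented purely for illustration. The sketch assumes each wrapper rule reports which predicates its source can evaluate, and that a generic middleware rule compensates for whatever the source cannot do.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Query:
    collection: str
    predicates: FrozenSet[str]


@dataclass(frozen=True)
class Plan:
    description: str
    cost: float                # illustrative cost units, not a real cost model
    applied: FrozenSet[str]    # predicates this plan has already evaluated


# A wrapper rule maps a query to the plans its data source can execute.
WrapperRule = Callable[[Query], List[Plan]]


def relational_wrapper_rule(query: Query) -> List[Plan]:
    """Hypothetical wrapper for a source that can evaluate every predicate."""
    desc = f"PUSHDOWN({query.collection}, {sorted(query.predicates)})"
    return [Plan(desc, 10.0, query.predicates)]


def file_wrapper_rule(query: Query) -> List[Plan]:
    """Hypothetical wrapper for a source that can only scan a collection."""
    return [Plan(f"SCAN({query.collection})", 50.0, frozenset())]


def middleware_filter_rule(plan: Plan, query: Query) -> Plan:
    """Generic middleware rule: evaluate any predicates the source skipped."""
    missing = query.predicates - plan.applied
    if not missing:
        return plan
    desc = f"FILTER({plan.description}, {sorted(missing)})"
    return Plan(desc, plan.cost + 5.0 * len(missing), query.predicates)


def optimize(query: Query, wrapper_rules: List[WrapperRule]) -> Plan:
    """Enumerate wrapper plans, complete them in middleware, keep the cheapest."""
    candidates = [
        middleware_filter_rule(plan, query)
        for rule in wrapper_rules
        for plan in rule(query)
    ]
    return min(candidates, key=lambda p: p.cost)


query = Query("emp", frozenset({"salary > 100"}))
scan_only = optimize(query, [file_wrapper_rule])
with_pushdown = optimize(query, [relational_wrapper_rule, file_wrapper_rule])
```

In this sketch, adding a new source means adding one wrapper rule; the middleware rule and the search in `optimize` stay unchanged, which mirrors the extensibility argument made above.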

[1] Ronald Fagin, et al. Combining Fuzzy Information from Multiple Systems, 1999, J. Comput. Syst. Sci.

[2] Weimin Du, et al. Query Optimization in a Heterogeneous DBMS, 1992, VLDB.

[3] Patrick Valduriez, et al. Using Heterogeneous Equivalences for Query Rewriting in Multidatabase Systems, 1995, CoopIS.

[4] Jeffrey D. Ullman, et al. Answering Queries Using Limited External Query Processors, 1999, J. Comput. Syst. Sci.

[5] Jennifer Widom, et al. Object exchange across heterogeneous information sources, 1995, Proceedings of the Eleventh International Conference on Data Engineering.

[6] Jeffrey D. Ullman, et al. Answering queries using limited external query processors (extended abstract), 1996, PODS.

[7] Patricia G. Selinger, et al. Access path selection in a relational database management system, 1979, SIGMOD '79.

[8] Hamid Pirahesh, et al. Extensible query processing in starburst, 1989, SIGMOD '89.

[9] Björn Þór Jónsson, et al. Performance tradeoffs for client-server query processing, 1996, SIGMOD '96.

[10] Weimin Du, et al. Pegasus: A Heterogeneous Information Management System, 1995, Modern Database Systems.

[11] Joann J. Ordille, et al. Querying Heterogeneous Information Sources Using Source Descriptions, 1996, VLDB.

[12] Eugene J. Shekita, et al. Fundamental techniques for order optimization, 1996, SIGMOD '96.

[13] ZhaoHui Tang, et al. Calibrating the Query Optimizer Cost Model of IRO-DB, an Object-Oriented Federated Database System, 1996, VLDB.

[14] Xiaolei Qian, et al. Query folding, 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[15] Kyuseok Shim, et al. Query Optimization in the Presence of Foreign Functions, 1993, VLDB.

[16] Patrick Valduriez, et al. Scaling heterogeneous databases and the design of Disco, 1996, Proceedings of 16th International Conference on Distributed Computing Systems.

[17] Guy M. Lohman, et al. Grammar-like functional rules for representing query optimization alternatives, 1988, SIGMOD '88.

[18] Umeshwar Dayal, et al. Processing Queries Over Generalization Hierarchies in a Multidatabase System, 1983, VLDB.

[19] Gio Wiederhold, et al. Intelligent integration of information, 1993, SIGMOD Conference.

[20] Laura M. Haas, et al. Towards heterogeneous multimedia information systems: the Garlic approach, 1995, Proceedings RIDE-DOM'95. Fifth International Workshop on Research Issues in Data Engineering-Distributed Object Management.

[21] David Jordan, et al. The Object Database Standard: ODMG 2.0, 1997.

[22] Laura M. Haas, et al. The Garlic project, 1996, SIGMOD '96.

[23] Mary Roth, et al. Don't Scrap It, Wrap It! A Wrapper Architecture for Legacy Data Sources, 1997, VLDB.

[24] Tom Atwood, et al. Object Database Standard: ODMG-93, Release 1.2, 1995.

[25] R. G. G. Cattell, et al. The Object Database Standard: ODMG-93, 1993.

[26] Guy M. Lohman. Grammar-like Functional Rules for Representing Query Optimization Alternatives, 1998.

[27] Guy M. Lohman, et al. Query Optimization in the IBM DB2 Family, 1993.

[28] Jeffrey D. Ullman, et al. A Query Translation Scheme for Rapid Implementation of Wrappers, 1995, DOOD.

[29] Goetz Graefe, et al. The Volcano optimizer generator: extensibility and efficient search, 1993, Proceedings of IEEE 9th International Conference on Data Engineering.

[30] Christos Faloutsos, et al. QBIC project: querying images by content, using color, texture, and shape, 1993, Electronic Imaging.

[31] William J. McKenna, et al. EROC: A Toolkit for Building NEATO Query Optimizers, 1996, VLDB.

[32] K. Selçuk Candan, et al. Query caching and optimization in distributed mediator systems, 1996, SIGMOD '96.

[33] David J. DeWitt, et al. The EXODUS optimizer generator, 1987, SIGMOD '87.

[34] Guy M. Lohman, et al. Implementing an Interpreter for Functional Rules in a Query Optimizer, 1988, VLDB.

[35] José A. Blakeley, et al. Data access for the masses through OLE DB, 1996, SIGMOD '96.