Garlic: a new flavor of federated query processing for DB2

In a large modern enterprise, information is almost inevitably distributed among several database management systems. Despite considerable attention from the research community, relatively few commercial systems have attempted to address this issue. This paper describes new technology that enables clients of IBM's DB2 Universal Database to access the data and specialized computational capabilities of a wide range of non-relational data sources. This technology, based on the Garlic prototype developed at the Almaden Research Center, complements and extends DB2's existing ability to federate relational data sources.

The paper focuses on three topics. First, we show how the DB2 catalogs are used as an extensible repository for the metadata needed to access remotely stored information. Second, we describe how the Garlic approach to query planning, in which source-specific modules and the federated server cooperate to develop an optimized execution plan, has been realized in DB2. Finally, we describe how DB2's query execution engine has been extended to support queries and functions that are evaluated remotely.
