Business enterprises, and society in general, are becoming increasingly dependent on computer systems. As a result, we are now awash in a sea of data (data of all shapes and sizes), making heterogeneous data management a tremendously relevant challenge today. Moreover, the problem of data heterogeneity is itself varied, with different applications posing a variety of requirements. Some applications need to access and/or manage data in several, possibly many, different database systems, some with different data models. Other applications need to access and/or manage external data, e.g., data stored in file systems or other specialized data repositories, together with data sets residing in one or more databases. Still other applications need to compose business objects from a combination of legacy data and legacy transactions (e.g., travel reservation systems) provided by multiple legacy database management systems. Of course, the systems that contain all this data differ in many ways: they have different data access languages and APIs, different search capabilities, different integrity guarantees, different data types (and even type systems), and so on.

In this paper, we provide a brief and necessarily incomplete overview of what IBM is doing to address some of the aforementioned challenges. In particular, we describe how IBM’s DB2 Universal Database product family is responding to these challenges in order to provide heterogeneous data access capabilities to DB2 customers. To address the problem of accessing and managing data across multiple databases and database systems, the DB2 family includes DataJoiner technology; this technology provides transparent SQL-based access to legacy data that may be managed by any of a number of vendors’ database systems.
To address problems related to external data access and management, the DB2 family offers several relevant technologies; these include Table Functions for customized user-defined access to external data, DataLinks for keeping file system data in sync with database data, and various Extenders for managing new types of data such as text, imagery, and spatial data. To more naturally model new data being brought into the system, and to extend the performance and transparency benefits of the DataJoiner technology to cover a broader range of sources, these systems are being extended with new Object-Relational capabilities and with support for Wrappers.

In the remainder of this paper, we provide more information about each of the DB2 extensions mentioned above. We explain what each of the extensions is about, summarizing the capabilities that DB2 currently provides, or will soon provide, in each area. Readers with an interest in the related area of composing business objects from legacy transactions as well as data are encouraged to look at the Component Broker technology [IBM98] that IBM is developing to address that problem.