A knowledge-based architecture for query formulation and processing in federated heterogeneous databases

Existing (or legacy) databases are typified by differences in data representation, data access languages, and data models. Data representation differences include name, format, and structural differences for identical and similar data stored in more than one legacy database. Data access language differences may require multiple queries to retrieve all values of a data element stored in more than one legacy database. Differences in data model constructs may result in similarly named data elements being represented at different levels of abstraction and exhibiting different properties. These differences make access difficult for most users. To resolve such problems, this dissertation addresses a user's need to formulate queries to multiple heterogeneous databases easily and to have confidence in the results that are returned. The Intelligent Heterogeneous Autonomous Database Architecture (InHead) approach uses Artificial Intelligence tools and techniques to construct "domain models," that is, data and knowledge representations of the constituent databases, together with an overall domain model of the semantic interactions among the databases. These domain models are represented as Knowledge Sources (KSs) in a blackboard architecture. The work described in this dissertation makes four major contributions. The first is the specification of an active and intelligent global thesaurus. The second is the extension of the traditional notion of an export schema into that of an "Export Data/Knowledge/Task" schema. The third is the specification and use of "Data/Knowledge Packets," which encapsulate object structure, relationships, operations, constraints, and rules into a meaningful unit, or packet. The fourth is the specification of an intelligent heterogeneous database architecture that provides a framework for the above.
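
As a rough illustration of how Knowledge Sources might cooperate on a blackboard to resolve naming differences, the following minimal sketch may help. It is not the InHead implementation: the class names (`Blackboard`, `ThesaurusKS`, `QueryBuilderKS`), the database names (`hr_db`, `payroll_db`), the column names, and the query text are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Blackboard:
    """Shared workspace that knowledge sources read from and write to."""
    entries: dict = field(default_factory=dict)


class ThesaurusKS:
    """Hypothetical KS: maps a global term to each database's local name."""

    def __init__(self, synonyms):
        self.synonyms = synonyms  # {global_term: {db_name: local_name}}

    def can_contribute(self, bb):
        return "global_term" in bb.entries and "local_names" not in bb.entries

    def contribute(self, bb):
        term = bb.entries["global_term"]
        bb.entries["local_names"] = self.synonyms.get(term, {})


class QueryBuilderKS:
    """Hypothetical KS: emits one placeholder subquery per constituent database."""

    def can_contribute(self, bb):
        return "local_names" in bb.entries and "subqueries" not in bb.entries

    def contribute(self, bb):
        # Using the database name as the table name is a simplification.
        bb.entries["subqueries"] = {
            db: f"SELECT {col} FROM {db}"
            for db, col in bb.entries["local_names"].items()
        }


def run(bb, sources):
    """Simple control loop: fire any KS that can contribute until none can."""
    progressed = True
    while progressed:
        progressed = False
        for ks in sources:
            if ks.can_contribute(bb):
                ks.contribute(bb)
                progressed = True
    return bb


def formulate(term):
    """Formulate per-database subqueries for one global term (illustrative data)."""
    thesaurus = ThesaurusKS(
        {"employee_name": {"hr_db": "emp_nm", "payroll_db": "NAME"}}
    )
    bb = Blackboard({"global_term": term})
    # Listing order does not matter: each KS fires only when its inputs exist.
    return run(bb, [QueryBuilderKS(), thesaurus]).entries
```

In this sketch a single user-level term yields one subquery per database, which mirrors the multiple-query problem described above. A real system would also need to reconcile format, structural, and abstraction-level differences, which this example does not attempt.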