Intent: an integrated environment for distributed heterogeneous databases
Distributed database technology evolved from the need to integrate large volumes of corporate information to lower production and maintenance costs. Most contemporary distributed database systems lack a pervasive component that furnishes the entire system with appropriate structural and semantic capabilities. Instead, these systems are likely to comprise a number of disconnected subsystems patched together to provide ad hoc, temporary functionality. Such systems are poorly engineered and thus unreliable and expensive to maintain, extend, or modify. By contrast, the architectural framework envisaged here tries to remedy this situation by suggesting that database management facilities should be the converging point in a data-intensive application environment, whether it is centralized or distributed.
INTENT is an ongoing project that reconsiders several of the long-standing assumptions and perspectives pervading the field of distributed database management. INTENT proposes alternatives and solutions leading towards an integrated architectural framework that supports independence from the physical distribution of the component databases while providing users with a transparent view of the information scattered across the nodes of a common network. This can be achieved by introducing new software components that provide an assembly of well-defined logical interfaces for integration with the existing data management systems.
INTENT relieves users of the serious problem of integrated retrieval by providing them with a single, unifying view of the heterogeneous data stored in the diverse local databases. The INTENT distributed conceptual schema is a highly logical view of the information content of the integrated system, which does not require that the individual databases be physically integrated: rather, all global database access and manipulation operations are mediated through this new form of conceptual schema. As individual schemas may contain redundant or possibly conflicting information, the distributed conceptual schema has to be aware of the logical relationships among seemingly disjoint components of the local schemas. This approach attempts to embed the advantages of logical centralization in a system of a semi-decentralized nature [1].
The initial distributed schema comprises a collection of global data properties and a set of appropriate transformation-translation rules between global and local data properties. Moreover, it contains the metaknowledge that enables it to specify consistency and integrity constraints as well as to establish useful assertions concerning the locally defined properties. The distributed conceptual schema is dynamic in nature: by automatically recording any changes at the local levels, it increases its metaknowledge about the entire database complex. Furthermore, it incorporates properties that allow it to adjust itself to changes effected by its environment, so as to meet the ever-changing information requirements of its users [2], and it provides the basis for the materialization of the unifying query language of the system [3].
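As a rough illustration of the idea above, the following sketch models a distributed conceptual schema as a set of global properties, each carrying translation rules to local properties. It is not INTENT's design: the names `TranslationRule`, `DistributedSchema`, the site names, and the sample fields are all hypothetical, and the local databases are simulated with in-memory dictionaries.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class TranslationRule:
    """Maps one global property onto a property of one local database."""
    site: str                        # hypothetical node name
    local_property: str              # name used in the local schema
    to_global: Callable[[Any], Any]  # converts a local value to the global form


@dataclass
class DistributedSchema:
    """Toy global schema: global property name -> translation rules."""
    rules: Dict[str, List[TranslationRule]] = field(default_factory=dict)

    def add_rule(self, global_property: str, rule: TranslationRule) -> None:
        self.rules.setdefault(global_property, []).append(rule)

    def fetch(self, global_property: str, sites: Dict[str, List[dict]]) -> List[Any]:
        """Collect values of a global property from every site that has a rule."""
        results = []
        for rule in self.rules.get(global_property, []):
            for record in sites.get(rule.site, []):
                if rule.local_property in record:
                    results.append(rule.to_global(record[rule.local_property]))
        return results


# Two simulated local databases that name and scale "salary" differently.
sites = {
    "site_a": [{"sal_usd": 50000}],
    "site_b": [{"salary_cents": 4200000}],
}

schema = DistributedSchema()
schema.add_rule("salary", TranslationRule("site_a", "sal_usd", lambda v: v))
schema.add_rule("salary", TranslationRule("site_b", "salary_cents", lambda v: v // 100))

# The caller asks for the global property only; site names and units stay hidden.
salaries = schema.fetch("salary", sites)
```

The local data is never copied into a physically integrated store here; each request is mediated through the rules, which is the property the paragraph above emphasizes.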
During our research on heterogeneous distributed database management systems (HD-DBMS), we have postulated a number of advanced features to assist the designer in the formidable task of retrieving and manipulating information in such an environment. These features mainly emanate from the object-oriented paradigm, functional programming, and knowledge- and rule-based approaches. In general, we envisage a new generation of DBMSs that should provide, among other things, support for:
Objects as advanced abstraction mechanisms.
Functional operations.
Uniform data and meta-data representation.
Rule-based consistency checks.
The approach for achieving this is to use a higher-level object-oriented data model as the unifying data model of the system. One major benefit of this approach is that the common behaviors and descriptions of both object types and instances can be shared. In addition, the internal behavior of objects is hidden by external communication protocols. As processing in an object-oriented programming environment emphasizes behavior, application code can be shared by similar object types. More specifically, one can write application code to manipulate a particular data type, but specify special handlers for each specialization of the data type that needs to perform different processing. The overall impact of this work is that multiple users can transparently and concurrently manipulate heterogeneous databases distributed over several communicating nodes in a common network.
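The sharing of application code across specializations described above can be sketched with ordinary class inheritance. This is only an illustration in Python, not the project's data model; the `Employee` and `Manager` types and their fields are hypothetical.

```python
class Employee:
    """Generic object type; application code shared by all specializations."""

    def __init__(self, name: str, base_pay: int):
        self.name = name
        self.base_pay = base_pay

    def pay(self) -> int:
        # Default handler, inherited by every specialization that does not override it.
        return self.base_pay

    def describe(self) -> str:
        # Shared code: written once, it calls whichever pay() handler applies.
        return f"{self.name} earns {self.pay()}"


class Clerk(Employee):
    """Specialization that reuses the default handler unchanged."""


class Manager(Employee):
    """Specialization that needs different processing, so it supplies its own handler."""

    def __init__(self, name: str, base_pay: int, bonus: int):
        super().__init__(name, base_pay)
        self.bonus = bonus

    def pay(self) -> int:
        return self.base_pay + self.bonus


staff = [Clerk("Ann", 100), Manager("Bob", 100, 50)]
descriptions = [person.describe() for person in staff]
```

Callers interact only with `describe()` and `pay()`; how each type computes its result stays hidden behind that interface.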
The aforementioned features can be thought of as the main structural components of a modeling facility that copes with the predicament of distributed modeling. Under these assumptions, we are currently exploring the adaptation of a higher-level object-oriented data model called the Extended Semantic Data Model (ESDM) [3] to serve as the mapping front-end between diametrically different data representations.
The purpose of the ESDM is to support the richer semantic structures and dynamic aspects of database applications whose substance tends to evolve during their lifespan. ESDM encompasses concepts and primitives underlying functional data models [4], [5] as well as fundamentals of object-oriented programming languages. To provide a conceptually natural data manipulation language, we have borrowed concepts from functional languages, which provide the environment for syntactically more readable and concise forms of queries. The ESDM language constructs combine many important features, such as rich typing facilities, increased expressive power, optimization facilities, and interactive user interfaces that support high-level set-based query and update languages.
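To give a feel for the functional style mentioned above, the sketch below treats properties as functions over entities and composes them in a set-based query. It is written in Python and is not ESDM or DAPLEX syntax; the entity types, the functions `name`, `dept`, and `dept_name`, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Department:
    label: str


@dataclass(frozen=True)
class Student:
    full_name: str
    department: Department


# Properties are exposed as functions over entities, so they can be composed.
def name(student: Student) -> str:
    return student.full_name


def dept(student: Student) -> Department:
    return student.department


def dept_name(department: Department) -> str:
    return department.label


ee = Department("EE")
cs = Department("CS")
students = {Student("Ann", ee), Student("Bob", cs), Student("Cy", ee)}

# Set-based query: names of all students whose department is named "EE".
ee_student_names = {name(s) for s in students if dept_name(dept(s)) == "EE"}
```

The query reads as a composition of functions applied to each member of a set, which is the kind of concise, readable form the paragraph above refers to.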
The innovative aspect of this research is to extend the expedience of a distributed data representation substrate, capable of supporting the coexistence of diverse data models and incorporating knowledge representation techniques, to manage distributed data integration and global information processing.
[1] Jay Banerjee, et al. Data model issues for object-oriented applications, 1987, TOIS.
[2] Malcolm P. Atkinson, et al. EFDM: Extended Functional Data Model, 1986, Comput. J.
[3] David W. Shipman, et al. The functional data model and the data languages DAPLEX, 1981, TODS.
[4] David W. Shipman. The functional data model and the data language DAPLEX, 1979, SIGMOD '79.