Quality-oriented and Metadata-driven Integration in Information Grids

The goal of information grids is to provide a virtually integrated view of information that is physically stored across many distributed nodes of the grid. A user should be able to query the grid through a uniform query interface using a common data model, without knowing how the data is distributed. Information grids that integrate information from heterogeneous resources have to resolve semantic and structural heterogeneity and must also take into account the differing quality characteristics of the sources. This paper addresses integration in information grids and adapts results from data integration in data warehouses to the information grid. The contributions of this paper are (i) an extended metadata framework that captures the different types of metadata in an information grid, (ii) the integration of quality aspects into this framework, and (iii) a methodology for quality-oriented and metadata-driven integration of information.
