A Service Oriented Approach for Distributed Data Mediation on the Grid

Seamless, integrated access to data stored in globally distributed databases has become a major challenge in many scientific disciplines. In this paper, we describe a Grid-based, service-oriented infrastructure that addresses this challenge by provisioning virtual data sources, which are realized by means of distributed data mediation services. A virtual data source offers the user a single integrated view of distributed, heterogeneous databases and information sources while hiding the location, formats, and access mechanisms of the underlying physical data sources. Different virtual data sources may offer personalized views of the same physical sources, tailored to specific usage scenarios. Virtual data sources rely on flexible mediation techniques and use distributed query processing to optimize complex data integration scenarios. The distributed data mediation services have been built on top of standard Grid and Web Services technologies, including OGSA-DAI and OGSA-DQP. The generic data service infrastructure described in this paper is being used in the European @neurIST project, which develops an advanced service-oriented Grid infrastructure for the management and treatment of multi-factorial diseases. Details of distributed data mediation and query processing are presented in the context of an experimental scenario that integrates clinical databases.
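
To make the idea of a virtual data source concrete, the following is a minimal sketch, not the implementation described in the paper. It assumes two in-memory SQLite databases standing in for heterogeneous clinical sources, and a hypothetical `VirtualDataSource` class that maps each local schema onto one global schema (`patient_id`, `age`) in a global-as-view style. All table names, column names, and data values are invented for illustration; no OGSA-DAI or OGSA-DQP API is used.

```python
import sqlite3


def make_source(schema_sql, insert_sql, rows):
    """Create an in-memory database that stands in for one physical source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    return conn


class VirtualDataSource:
    """Hypothetical mediator exposing one global schema: (patient_id, age)."""

    def __init__(self):
        self._sources = []  # list of (name, connection, mapping query)

    def add_source(self, name, conn, mapping_sql):
        # mapping_sql rewrites the local schema into the global schema.
        self._sources.append((name, conn, mapping_sql))

    def query(self, min_age=0):
        """Answer a global query by querying every source and merging rows."""
        results = []
        for name, conn, mapping_sql in self._sources:
            for patient_id, age in conn.execute(mapping_sql):
                if age >= min_age:
                    results.append(
                        {"patient_id": patient_id, "age": age, "source": name}
                    )
        return sorted(results, key=lambda r: r["patient_id"])


def build_demo():
    """Wire two sources with different schemas into one virtual source."""
    # Source A stores the age directly.
    source_a = make_source(
        "CREATE TABLE patients (id TEXT, age INTEGER)",
        "INSERT INTO patients VALUES (?, ?)",
        [("A-1", 54), ("A-2", 37)],
    )
    # Source B stores birth and examination years instead of an age.
    source_b = make_source(
        "CREATE TABLE cases (case_no TEXT, birth_year INTEGER, exam_year INTEGER)",
        "INSERT INTO cases VALUES (?, ?, ?)",
        [("B-7", 1950, 2008), ("B-9", 1980, 2008)],
    )
    vds = VirtualDataSource()
    vds.add_source("hospital_a", source_a, "SELECT id, age FROM patients")
    vds.add_source(
        "hospital_b", source_b, "SELECT case_no, exam_year - birth_year FROM cases"
    )
    return vds


if __name__ == "__main__":
    for row in build_demo().query(min_age=50):
        print(row)
```

The caller sees only the global schema; where each row came from and how its age was derived stay inside the per-source mapping queries. In this sketch the age filter is applied in the mediator after the rows are fetched, whereas a distributed query processor would typically try to push such predicates down to the sources.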
