Using Complex Correspondences for Integrating Relational Data Sources

Data Integration (DI) is the problem of combining a set of heterogeneous, autonomous data sources and providing the user with a unified view of these data. Integrating data raises several challenges, since the designer usually encounters incompatible data models characterized by differences in structure and semantics. One of the hardest challenges is to define correspondences between schema elements (e.g., attributes) to determine how they relate to each other. Since most business data is currently stored in relational databases, we present here a declarative and formal approach to specify 1-to-1, 1-to-m, and m-to-n correspondences between relational schema components. Unlike usual approaches, our correspondence assertions (CAs) have semantics and can deal with outer joins and data-metadata relationships. Finally, we demonstrate how to use the CAs to generate mapping expressions in the form of SQL queries, and we present some preliminary tests to verify the performance of the generated queries.
