Ontologies and Functional Dependencies for Data Integration and Reconciliation

Integrating data sources is key to the success of business intelligence systems. The exponential growth of autonomous data sources on the Internet and enterprise intranets makes integration solutions more complex to develop. Two main factors are responsible: (i) managing the heterogeneity of the sources and (ii) reconciling query results. To address the first factor, several research efforts have proposed using ontologies to make the semantics of each source explicit. Two main approaches are used to reconcile query results: (i) assuming that entities from different sources that represent the same concept share the same key, a strong hypothesis that violates the autonomy of the sources; and (ii) using statistical methods, which are not usually suitable for sensitive applications. In this paper, we propose a methodology for integrating sources that reference a shared domain ontology enriched with functional dependencies (FDs) in a mediation architecture. The presence of FDs gives sources more autonomy in choosing their primary keys and facilitates result reconciliation. Our methodology is validated using a dataset from the Lehigh University Benchmark.
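
To illustrate why functional dependencies can loosen the shared-key assumption, the following is a minimal sketch and not the methodology of the paper. It assumes a hypothetical ontology class with attributes `id`, `email`, `name` and `dept`, and two hypothetical FDs (`id` determines every other attribute, `email` determines `id`). The helper names `closure`, `candidate_keys` and `reconcile` are introduced here for illustration only. Candidate keys are derived from the FDs by attribute closure, each source is free to pick any of them as its primary key, and two query results are merged when they agree on some candidate key that both expose.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the attribute closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs of frozensets, read as lhs -> rhs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Enumerate minimal attribute sets whose closure covers `all_attrs`.

    Brute force over subsets by increasing size; fine for a small example.
    """
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # a smaller key is contained, so not minimal
            if closure(candidate, fds) == set(all_attrs):
                keys.append(candidate)
    return keys


def reconcile(records_a, records_b, keys):
    """Merge records that agree on every attribute of some shared candidate key."""
    merged = []
    for ra in records_a:
        for rb in records_b:
            same_entity = any(
                all(a in ra and a in rb and ra[a] == rb[a] for a in key)
                for key in keys
            )
            if same_entity:
                merged.append({**ra, **rb})
    return merged


# Hypothetical class description and FDs (assumptions, not from the paper).
ATTRS = {"id", "email", "name", "dept"}
FDS = [
    (frozenset({"id"}), frozenset({"email", "name", "dept"})),
    (frozenset({"email"}), frozenset({"id"})),
]
KEYS = candidate_keys(ATTRS, FDS)

# Source A uses `id` as its primary key; source B uses `email`.
source_a = [{"id": 1, "email": "a@x.org", "name": "Ann"}]
source_b = [
    {"email": "a@x.org", "dept": "CS"},
    {"email": "b@x.org", "dept": "EE"},
]
RESULT = reconcile(source_a, source_b, KEYS)
```

In this sketch both `{id}` and `{email}` come out as candidate keys, so the two sources can keep different primary keys while their results are still matched on `email`, the key attribute they both expose. Exact value equality is used for matching, which is a simplification.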
