Data Integration through DL-Lite_A Ontologies

The goal of data integration is to provide uniform access to a set of heterogeneous data sources, freeing the user from having to know where the data are, how they are stored, and how they can be accessed. One outcome of the research on data integration in recent years is a clear conceptual architecture comprising a global schema, a source schema, and a mapping between the two. In this paper, we present a comprehensive approach to, and a complete system for, ontology-based data integration. In this system, the global schema is expressed as a TBox of the tractable Description Logic DL-Lite_A, the sources are relations, and the mapping language allows for expressing sound GAV mappings between the sources and the global schema. The mapping language has specific mechanisms for addressing the so-called impedance mismatch problem, which arises because the data sources store values, whereas the instances of concepts in the ontology are objects. By virtue of the careful design of the languages used in our system, unions of conjunctive queries can be answered through a very efficient technique (LOGSPACE with respect to data complexity) that reduces this task to standard SQL query evaluation. We also show that even very slight extensions of the expressive power of our system lead beyond this complexity bound.
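To make the architecture concrete, the following is a minimal sketch of how a GAV mapping can build objects from source values and how a conjunctive query over the global schema can be unfolded into plain SQL. It is not the system described in the paper: the source table emp_src, the predicates Employee, worksFor and name, the object constructors pers(...) and dept(...), and the helper unfold are all hypothetical names chosen for illustration. The sketch ignores the TBox entirely and shows only the mapping-unfolding step.

```python
import sqlite3

# Hypothetical relational source: it stores plain values, not objects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp_src (ssn TEXT, name TEXT, dept_code TEXT);
    INSERT INTO emp_src VALUES ('123', 'Ada', 'D1'), ('456', 'Bob', 'D2');
""")

# Hypothetical GAV mappings: each global-schema predicate is defined by an
# SQL query over the sources. Object identifiers are built from values by
# wrapping them in a constructor term such as pers(ssn), which is one way to
# bridge the gap between stored values and ontology objects.
MAPPINGS = {
    "Employee": "SELECT 'pers(' || ssn || ')' AS x FROM emp_src",
    "worksFor": "SELECT 'pers(' || ssn || ')' AS x, "
                "'dept(' || dept_code || ')' AS y FROM emp_src",
    "name": "SELECT 'pers(' || ssn || ')' AS x, name AS y FROM emp_src",
}


def unfold(head, atoms):
    """Unfold a conjunctive query into SQL by replacing each atom with its mapping.

    head  -- list of answer variables
    atoms -- list of (predicate, [variables]) with at most two variables each
    """
    cols = ("x", "y")
    first_seen = {}  # variable -> column reference of its first occurrence
    joins = []       # equality conditions for repeated variables
    froms = []
    for i, (pred, args) in enumerate(atoms):
        froms.append(f"({MAPPINGS[pred]}) AS t{i}")
        for col, var in zip(cols, args):
            ref = f"t{i}.{col}"
            if var in first_seen:
                joins.append(f"{first_seen[var]} = {ref}")
            else:
                first_seen[var] = ref
    select = ", ".join(first_seen[v] for v in head)
    sql = f"SELECT DISTINCT {select} FROM {', '.join(froms)}"
    if joins:
        sql += " WHERE " + " AND ".join(joins)
    return sql


# q(n, d) :- Employee(e), worksFor(e, d), name(e, n)
sql = unfold(["n", "d"], [("Employee", ["e"]),
                          ("worksFor", ["e", "d"]),
                          ("name", ["e", "n"])])
rows = sorted(conn.execute(sql).fetchall())
print(rows)
```

The resulting query runs directly on the source database, so the relational engine does all the work. A full system would first rewrite the query with respect to the TBox and only then unfold it through the mappings; that rewriting step is omitted here.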
