Data Integration through ${\textit{DL-Lite}_{\mathcal A}}$ Ontologies

The goal of data integration is to provide uniform access to a set of heterogeneous data sources, freeing the user from the need to know where the data are, how they are stored, and how they can be accessed. One of the outcomes of the research carried out on data integration in recent years is a clear conceptual architecture, comprising a global schema, the source schema, and the mapping between the source and the global schema. In this paper, we present a comprehensive approach to, and a complete system for, ontology-based data integration. In this system, the global schema is expressed in terms of a TBox of the tractable Description Logic ${\textit{DL-Lite}_{\mathcal A}}$, the sources are relations, and the mapping language allows for expressing GAV sound mappings between the sources and the global schema. The mapping language has specific mechanisms for addressing the so-called impedance mismatch problem, which arises from the fact that, while the data sources store values, the instances of concepts in the ontology are objects. By virtue of the careful design of the various languages used in our system, answering unions of conjunctive queries can be done through a very efficient technique (LogSpace with respect to data complexity) that reduces this task to standard SQL query evaluation. We also show that even very slight extensions of the expressive abilities of our system lead beyond this complexity bound.
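To make the pipeline described in the abstract concrete, here is a minimal sketch in Python of the general idea: GAV mappings define ontology concepts as SQL queries over the sources, object identifiers are built from source values to bridge the impedance mismatch, and a query over the ontology is answered by evaluating plain SQL. Everything in it is an illustrative assumption rather than the paper's system: the relations `emp` and `mgr`, the concepts `Employee` and `Manager`, the term constructor `pers(...)`, and the helpers `rewrite`, `unfold`, and `answer` are hypothetical, and the rewriting step handles only atomic concept inclusions, far less than a full ${\textit{DL-Lite}_{\mathcal A}}$ TBox.

```python
import sqlite3

# Hypothetical source relations (illustrative names, not from the paper).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp(ssn TEXT, name TEXT);
    CREATE TABLE mgr(ssn TEXT);
    INSERT INTO emp VALUES ('111', 'Ann'), ('222', 'Bob');
    INSERT INTO mgr VALUES ('333');
""")

# GAV mappings (assumed): each ontology concept is defined by SQL queries over
# the sources. The sources store values (ssn), while concept instances are
# objects, so each query builds an object term pers(ssn) from the value.
MAPPINGS = {
    "Employee": ["SELECT 'pers(' || ssn || ')' AS x FROM emp"],
    "Manager": ["SELECT 'pers(' || ssn || ')' AS x FROM mgr"],
}

# Toy TBox (assumed): Manager is subsumed by Employee, stored as
# super-concept -> list of direct sub-concepts.
SUBCONCEPTS = {"Employee": ["Manager"]}


def rewrite(concept):
    """Return the concept plus every concept it transitively subsumes."""
    result, stack = [], [concept]
    while stack:
        current = stack.pop()
        if current not in result:
            result.append(current)
            stack.extend(SUBCONCEPTS.get(current, []))
    return result


def unfold(concepts):
    """Replace each concept by its mapping queries and combine them with UNION."""
    parts = [sql for c in concepts for sql in MAPPINGS.get(c, [])]
    return " UNION ".join(parts)


def answer(concept):
    """Answer the atomic query q(x) :- concept(x) by evaluating SQL on the sources."""
    sql = unfold(rewrite(concept))
    if not sql:
        return []
    return sorted(row[0] for row in conn.execute(sql))


print(answer("Employee"))  # ['pers(111)', 'pers(222)', 'pers(333)']
print(answer("Manager"))   # ['pers(333)']
```

In this sketch the ontology is used only to rewrite the query; all data access is delegated to the SQL engine, which is the property the abstract highlights. The manager with `ssn` 333 is returned as an employee even though no row in `emp` says so, because the assumed inclusion of `Manager` in `Employee` is compiled into the generated SQL.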
