Data Quality in Ontology-based Data Access: The Case of Consistency

Ontology-based data access (OBDA) is a new paradigm for accessing and managing data by means of an ontology, i.e., a conceptual representation of the domain of interest in the underlying information system. In recent years, this paradigm has been used to provide users with abstract (independent of technological and system-oriented aspects), effective, and reasoning-intensive mechanisms for querying the data residing at the information system sources. In this paper we argue that OBDA, besides querying data, provides the right principles for devising a formal approach to data quality. In particular, we concentrate on one of the most important dimensions considered in both the literature and the practice of data quality, namely consistency. We define a general framework for data consistency in OBDA, and present algorithms and complexity analysis for several relevant tasks related to checking data quality under this dimension, both at the extensional level (content of the data sources) and at the intensional level (schema of the data sources).
