Paper information - High-Level Why-Not Explanations using Ontologies

High-Level Why-Not Explanations using Ontologies

We propose a novel foundational framework for why-not explanations, that is, explanations for why a tuple is missing from a query result. Our why-not explanations leverage concepts from an ontology to provide high-level and meaningful reasons for why a tuple is missing from the result of a query. A key algorithmic problem in our framework is that of computing a most-general explanation for a why-not question, relative to an ontology, which can either be provided by the user or be automatically derived from the data and/or schema. We study the complexity of this problem and associated problems, and present concrete algorithms for computing why-not explanations. In the case where an external ontology is provided, we first show that the problem of deciding the existence of an explanation to a why-not question is NP-complete in general. However, the problem is solvable in polynomial time for queries of bounded arity, provided that the ontology is specified in a suitable language, such as a member of the DL-Lite family of description logics, which allows for efficient concept subsumption checking. Furthermore, we show that a most-general explanation can be computed in polynomial time in this case. In addition, we propose a method for deriving a suitable (virtual) ontology from a database and/or a schema, and we present an algorithm for computing a most-general explanation to a why-not question, relative to such ontologies. This algorithm runs in polynomial time when concepts are defined in a selection-free language, or when the underlying schema is fixed. Finally, we also study the problem of computing short most-general explanations, and we briefly discuss alternative definitions of what it means to be an explanation, and to be most general.
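
The notion of a most-general explanation can be illustrated with a small sketch. It assumes, as one plausible reading of the framework (the abstract does not spell out the definition), that an explanation for a missing tuple (a1, ..., ak) is a tuple of concepts (C1, ..., Ck) such that each ai belongs to the extension of Ci and no answer tuple lies in the product of those extensions. The toy ontology, the `EXTENSION` table, and all function names are hypothetical. Subsumption is approximated by extension containment rather than by a DL-Lite reasoner, and candidates are enumerated by brute force, which is polynomial only when the arity is fixed.

```python
from itertools import product

# Hypothetical toy ontology: concept name -> extension (set of constants).
EXTENSION = {
    "City": {"Amsterdam", "Berlin", "Rome", "New York", "Tokyo"},
    "EuropeanCity": {"Amsterdam", "Berlin", "Rome"},
    "DutchCity": {"Amsterdam"},
    "USCity": {"New York"},
}


def subsumed(c, d):
    """Return True if concept c is subsumed by concept d.

    Assumption: subsumption is approximated by extension containment.
    """
    return EXTENSION[c] <= EXTENSION[d]


def is_explanation(concepts, missing, answers):
    """Check the assumed definition of an explanation for `missing`."""
    # Every component of the missing tuple must fall under its concept.
    if any(a not in EXTENSION[c] for a, c in zip(missing, concepts)):
        return False
    # No answer tuple may lie in the product of the concept extensions.
    return not any(
        all(v in EXTENSION[c] for v, c in zip(t, concepts)) for t in answers
    )


def strictly_more_general(e, f):
    """Return True if f generalizes e component-wise, and not vice versa."""
    forward = all(subsumed(c, d) for c, d in zip(e, f))
    backward = all(subsumed(d, c) for c, d in zip(e, f))
    return forward and not backward


def most_general_explanations(missing, answers):
    """Enumerate all concept tuples and keep the maximal explanations."""
    candidates = [
        e
        for e in product(EXTENSION, repeat=len(missing))
        if is_explanation(e, missing, answers)
    ]
    return [
        e
        for e in candidates
        if not any(strictly_more_general(e, f) for f in candidates)
    ]


# Hypothetical query result: pairs of cities with a direct train connection.
ANSWERS = {("Amsterdam", "Berlin"), ("Berlin", "Rome"), ("Rome", "Amsterdam")}

if __name__ == "__main__":
    # Why is ("Amsterdam", "New York") not in the result?
    print(most_general_explanations(("Amsterdam", "New York"), ANSWERS))
```

On this toy data the only most-general explanation is `("City", "USCity")`, which reads as "no city in the result is connected to a US city". The more specific tuples `("EuropeanCity", "USCity")` and `("DutchCity", "USCity")` are also explanations under the assumed definition, but they are discarded because they are strictly less general.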
