The role of reasoning for RDF validation

For data practitioners embracing the world of RDF and Linked Data, openness and flexibility are a mixed blessing. For them, validating data against predefined constraints is a much sought-after feature, particularly because it is taken for granted in the XML world. Based on our work in the DCMI RDF Application Profiles Task Group and in cooperation with the W3C Data Shapes Working Group, we have published 81 types of constraints to date that various stakeholders require for data applications. These constraint types form the basis for investigating the role that reasoning and different semantics play in practical data validation, why reasoning is beneficial for RDF validation, and how performing reasoning prior to validation can overcome major shortcomings of validating RDF data. For each constraint type, we examine (1) whether reasoning may improve data quality, (2) how efficiently, in terms of runtime, validation is performed with and without reasoning, and (3) whether validation results depend on the underlying semantics, which differ between reasoning and validation. Using these findings, we determine which constraint types the most common constraint languages can express, and we give directions for the further development of constraint languages.
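
As a rough illustration of why reasoning prior to validation can change the outcome, the following minimal sketch checks a "property must be present" constraint with and without a simple rdfs:subClassOf inference step. It is not taken from the paper: the triples, the class and property names, and the helper functions `infer_types` and `missing_property` are all hypothetical, and a real setup would use an RDF store and a reasoner rather than Python sets.

```python
# Illustrative sketch only: triples are plain (subject, predicate, object)
# tuples, and every name below is a hypothetical example.
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

data = {
    (":Hamlet", TYPE, ":Book"),
    (":Book", SUBCLASS, ":Publication"),
}

def infer_types(triples):
    """Add the rdf:type triples entailed by rdfs:subClassOf, up to a fixpoint."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closed):
            if p != TYPE:
                continue
            for c, q, d in list(closed):
                if q == SUBCLASS and c == o and (s, TYPE, d) not in closed:
                    closed.add((s, TYPE, d))
                    changed = True
    return closed

def missing_property(triples, cls, prop):
    """Return the instances of cls that have no value for prop (violations)."""
    instances = {s for s, p, o in triples if p == TYPE and o == cls}
    with_prop = {s for s, p, o in triples if p == prop}
    return sorted(instances - with_prop)

# Constraint: every :Publication must have a dcterms:title.
without_reasoning = missing_property(data, ":Publication", "dcterms:title")
with_reasoning = missing_property(infer_types(data), ":Publication", "dcterms:title")

print(without_reasoning)
print(with_reasoning)
```

Without the inference step, `:Hamlet` is not recognized as a `:Publication`, so the missing title goes unnoticed; after inference, the violation is reported. The nested loop is deliberately naive and only hints at the runtime cost that reasoning adds.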
