Validating RDF Data

Abstract

RDF and Linked Data have broad applicability across many fields, from aircraft manufacturing to zoology. Requirements for detecting bad data differ across communities, fields, and tasks, but nearly all involve some form of data validation. This book introduces data validation and describes its practical use in day-to-day data exchange.

The Semantic Web offers a bold, new take on how to organize, distribute, index, and share data. Using Web addresses (URIs) as identifiers for data elements enables the construction of distributed databases on a global scale. Like the Web, the Semantic Web is heralded as an information revolution, and also like the Web, it is encumbered by data quality issues. The quality of Semantic Web data is compromised by the lack of resources for data curation, for maintenance, and for developing globally applicable data models. At the enterprise scale, these problems have conventional solutions. Master data management provides an enterprise-wide vocabulary, while constraint l...
