Shape expressions: an RDF validation and transformation language

RDF is a graph-based data model that is widely used for semantic web and linked data applications. In this paper we describe Shape Expressions, a definition language that enables RDF validation through the declaration of constraints on the RDF model. Shape Expressions can be used to validate RDF data, communicate expected graph patterns for interfaces, and generate user interface forms. We describe the syntax and the formal semantics of Shape Expressions using inference rules. Shape Expressions can be seen as a domain-specific language for defining Shapes of RDF graphs based on regular expressions. Attached to Shape Expressions are semantic actions, which provide an extension point for validation or for arbitrary code execution, similar to the semantic actions in parser generators. Using semantic actions, it is possible to augment the validation expressiveness of Shape Expressions and to transform RDF graphs in an easy way. We have implemented several validation tools that check whether an RDF graph matches a Shape Expressions schema and infer the corresponding Shapes. We have also implemented two extensions, called GenX and GenJ, that leverage the predictability of the graph traversal to create ordered, closed-content XML/JSON documents, providing a simple, declarative mapping from RDF data to XML and JSON documents.
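To make the idea of validating a node against a Shape more concrete, the following is a minimal Python sketch, not the implementation described in the paper and not Shape Expressions syntax. It assumes a deliberately simplified model: a graph is a set of string triples, and a shape is a closed set of arc constraints, each pairing a predicate with a value test and a cardinality range. The names `Shape`, `matches_shape`, `PERSON`, and the sample data are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), all plain strings

# Hypothetical shape model: predicate -> (value test, min count, max count).
# A max count of None means "unbounded", like * or + in a regular expression.
Shape = Dict[str, Tuple[Callable[[str], bool], int, Optional[int]]]


def matches_shape(graph: Set[Triple], node: str, shape: Shape) -> bool:
    """Return True if the outgoing arcs of `node` satisfy `shape`.

    Assumption of this sketch: the shape is closed, so any predicate
    not mentioned in the shape makes the node fail validation.
    """
    outgoing = [(p, o) for (s, p, o) in graph if s == node]

    # Closed-shape check: no arcs outside the declared predicates.
    if any(p not in shape for p, _ in outgoing):
        return False

    for predicate, (value_ok, min_count, max_count) in shape.items():
        values = [o for p, o in outgoing if p == predicate]
        # Cardinality check, the analogue of ?, *, + and {m,n}.
        if len(values) < min_count:
            return False
        if max_count is not None and len(values) > max_count:
            return False
        # Value check on every object reached through this predicate.
        if not all(value_ok(v) for v in values):
            return False
    return True


# Illustrative shape: exactly one name, an optional numeric age,
# and any number of foaf:knows arcs pointing at ":"-prefixed nodes.
PERSON: Shape = {
    "foaf:name": (lambda v: len(v) > 0, 1, 1),
    "foaf:age": (str.isdigit, 0, 1),
    "foaf:knows": (lambda v: v.startswith(":"), 0, None),
}

GRAPH: Set[Triple] = {
    (":alice", "foaf:name", "Alice"),
    (":alice", "foaf:age", "30"),
    (":alice", "foaf:knows", ":bob"),
    (":bob", "foaf:age", "unknown"),
}

if __name__ == "__main__":
    for person in (":alice", ":bob"):
        print(person, matches_shape(GRAPH, person, PERSON))
```

In this sketch `:alice` conforms to the shape, while `:bob` does not, because the required `foaf:name` arc is missing and the age value is not numeric. A fuller treatment would also need shape references, alternatives, and the semantic actions mentioned above, none of which are modelled here.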
