Toward RDF Normalization

Billions of RDF triples are currently available on the Web through the Linked Open Data cloud (e.g., DBpedia, LinkedGeoData, and New York Times). Governments, universities, and companies (e.g., BBC, CNN) are also producing huge collections of RDF triples and exchanging them in different serialization formats (e.g., RDF/XML, Turtle, N-Triples). However, RDF descriptions (i.e., graphs and serializations) are syntactically verbose, often contain redundancies, and can be generated differently even when they describe the same resources, all of which negatively affects their processing. We therefore propose an approach that cleans RDF descriptions and eliminates their redundancies, transforming different descriptions of the same information into a single representation that can then be tuned to the target application (information retrieval, compression, etc.). Experimental tests show significant improvements, namely reductions in RDF description loading time and file size.
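As a rough illustration of the problem only, and not the approach proposed in the paper, the sketch below treats an RDF description as a list of (subject, predicate, object) string tuples, removes repeated statements, and imposes a fixed ordering so that two differently generated descriptions of the same information compare equal. The function name `normalize`, the `ex:` and `foaf:` prefixed names, and the sample data are assumptions made for this example; blank nodes, namespace handling, and serialization-level redundancies are deliberately left out.

```python
from typing import Iterable, List, Tuple

# A triple is modeled as plain strings: (subject, predicate, object).
Triple = Tuple[str, str, str]


def normalize(triples: Iterable[Triple]) -> List[Triple]:
    """Drop duplicate statements and return the rest in sorted order.

    Hypothetical helper for illustration; it does not handle blank nodes.
    """
    return sorted(set(triples))


# Two hypothetical descriptions of the same resource: different statement
# order, and the first one repeats a statement.
description_a = [
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),  # redundant duplicate
]
description_b = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),
]

normalized_a = normalize(description_a)
normalized_b = normalize(description_b)

# Both inputs now share one representation.
print(normalized_a == normalized_b)
```

Sorting a set of tuples is the simplest way to obtain a deterministic order here; a real normalizer would also need a consistent treatment of blank node identifiers, which this sketch does not attempt.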
