Graph Pattern Based RDF Data Compression

The growing volume of RDF documents and their inter-linking make storing and transferring such documents a challenge. One solution is to reduce the size of RDF documents via compression. Existing approaches either apply well-known generic compression technologies, which seldom exploit the graph structure of RDF documents, or focus on minimal compact serialisations that leave the graph nature implicit, which creates obstacles to applying higher-level compression techniques. In this paper we propose graph pattern based technologies which, on the one hand, reduce the number of triples in RDF documents and, on the other hand, serialise RDF graphs in a data pattern based way that handles syntactic redundancies which existing techniques cannot eliminate. Evaluation on real-world datasets shows that our approach can substantially reduce the size of RDF documents by complementing the abilities of existing approaches. Furthermore, evaluation results on rule mining operations show the potential of the proposed serialisation format to support efficient data access.
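To make the idea of a pattern based serialisation concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a deliberately simple notion of "pattern": the sorted tuple of predicates attached to a subject. Subjects sharing a pattern are stored as rows of object values, so the predicate names are written once per pattern rather than once per triple. The function names `group_by_predicate_pattern` and `expand` and the `ex:` identifiers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_by_predicate_pattern(triples):
    """Group subjects by the sorted tuple of predicates they use.

    Returns a dict mapping pattern -> list of (subject, objects) rows,
    where objects are aligned position by position with the pattern.
    This is an illustrative assumption, not the method from the paper.
    """
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    patterns = defaultdict(list)
    for s, pairs in by_subject.items():
        pairs.sort()  # fix a canonical predicate order per subject
        pattern = tuple(p for p, _ in pairs)
        objects = tuple(o for _, o in pairs)
        patterns[pattern].append((s, objects))
    return dict(patterns)


def expand(patterns):
    """Inverse of the grouping: recover the original set of triples."""
    return {
        (s, p, o)
        for pattern, rows in patterns.items()
        for s, objects in rows
        for p, o in zip(pattern, objects)
    }


# Hypothetical example data.
triples = [
    ("ex:s1", "rdf:type", "ex:Sensor"),
    ("ex:s1", "ex:locatedIn", "ex:RoomA"),
    ("ex:s2", "rdf:type", "ex:Sensor"),
    ("ex:s2", "ex:locatedIn", "ex:RoomB"),
    ("ex:s3", "rdf:type", "ex:Room"),
]

patterns = group_by_predicate_pattern(triples)
for pattern, rows in patterns.items():
    print(pattern, rows)
```

In this toy input, `ex:s1` and `ex:s2` share one pattern, so the predicates `ex:locatedIn` and `rdf:type` appear once instead of twice, and `expand` shows that the grouping is lossless. A real pattern based format would need richer patterns (for example, shared objects or multi-node subgraphs) than this subject-centred grouping.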
