HDTcrypt: Compression and encryption of RDF datasets

The publication and interchange of RDF datasets online has grown significantly in recent years, driven by different but complementary efforts such as Linked Open Data, the Web of Things, and RDF stream processing systems. However, the current Linked Data infrastructure does not cater for the storage and exchange of sensitive or private data. On the one hand, data publishers need a means to limit access to confidential data (e.g. health, financial, personal, or other sensitive data). On the other hand, the infrastructure needs to compress RDF graphs in a manner that minimises the amount of data that is both stored and transferred over the wire. In this paper, we demonstrate how HDT, a compressed serialization format for RDF, can be extended to support encryption. We propose a number of different graph partitioning strategies and discuss the benefits and tradeoffs of each approach.
