PURE: A Privacy Aware Rule-Based Framework over Knowledge Graphs

Open data initiatives and FAIR data principles have encouraged the publication of large volumes of data that encode knowledge relevant to the advancement of science and technology. However, mining this knowledge usually requires processing data collected from sources regulated by diverse access and privacy policies. We address the problem of enforcing data privacy and access regulations (EDPR) and propose PURE, a framework that solves this problem during query processing. PURE relies on the local-as-view (LAV) approach to define the rules that represent the access control policies imposed over a federation of RDF knowledge graphs. Moreover, PURE maps the problem of checking whether a query meets the privacy regulations to the problem of query rewriting using views (QRP); it resorts to state-of-the-art QRP solutions to determine whether a query violates the defined policies. We have evaluated the efficiency of PURE over the Berlin SPARQL Benchmark (BSBM). The observed results suggest that PURE is able to scale up to complex scenarios where a large number of rules represent diverse types of policies.
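
To make the idea of policy checking as rewriting over views more concrete, the following is a minimal sketch, not PURE's implementation. It assumes that each access rule is a local-as-view definition whose body is a list of triple patterns, and it replaces a full QRP solver with a much simpler bucket-style coverage test: a query is treated as permitted only if every one of its triple patterns is covered by at least one rule. The rule names, the prefixed predicates, and the functions `covers`, `build_buckets`, and `is_permitted` are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Tuple

# A triple pattern is (subject, predicate, object); variables start with "?".
Triple = Tuple[str, str, str]


def is_var(term: str) -> bool:
    """Return True if the term is a variable."""
    return term.startswith("?")


def covers(view_triple: Triple, query_triple: Triple) -> bool:
    """A view pattern covers a query pattern if each position of the view
    is a variable or carries the same constant as the query pattern.
    This is a conservative simplification, not full query containment."""
    return all(is_var(v) or v == q for v, q in zip(view_triple, query_triple))


def build_buckets(query: List[Triple],
                  views: Dict[str, List[Triple]]) -> Dict[Triple, List[str]]:
    """For each query pattern, collect the names of the views (rules)
    that have at least one body pattern covering it."""
    return {
        qt: [name for name, body in views.items()
             if any(covers(vt, qt) for vt in body)]
        for qt in query
    }


def is_permitted(query: List[Triple], views: Dict[str, List[Triple]]) -> bool:
    """Treat the query as permitted only if no bucket is empty, that is,
    every triple pattern is granted by some rule."""
    return all(build_buckets(query, views).values())


# Hypothetical rules (assumption: predicates are illustrative only).
views = {
    "v_product_label": [("?p", "rdfs:label", "?l")],
    "v_offer_vendor": [("?o", "bsbm:vendor", "?v")],
}

query_ok = [("?x", "rdfs:label", "?name")]
query_bad = [("?x", "rdfs:label", "?name"), ("?o", "bsbm:price", "?price")]

print(is_permitted(query_ok, views))   # every pattern has a covering rule
print(is_permitted(query_bad, views))  # the price pattern has no covering rule
```

An actual QRP solution must also check that the chosen views can be combined consistently (shared variables, join conditions), which this coverage test deliberately ignores.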
