Formal Policy-Based Provenance Audit

Data processing within large organisations is often complex, which impedes both the traceability of data and the compliance of processing with usage policies. The provenance of data, that is, the chronology of its ownership, custody, or location, provides the information needed to restore traceability. However, to be of practical use, provenance records should be designed to be expressive enough to support a posteriori analysis, e.g. the verification of their compliance with usage policies. Additionally, they ought to be combined with systematic reasoning about their correctness. In this paper, we introduce a formal framework for policy-based provenance audit. We show how it can be used to demonstrate the correctness and consistency of provenance records and their compliance with machine-readable usage policies. We also analyse the suitability of our framework for the special case of privacy protection. A formalised perspective on provenance is useful in this area as well, but to be effective it must be integrated into a larger accountability process involving data protection authorities. The practical applicability of our approach is demonstrated using a provenance record involving medical data and corresponding privacy policies whose goal is personal data protection.
