Hiding Data and Structure in Workflow Provenance

In this paper we discuss the use of views to address the problem of providing useful answers to provenance queries while ensuring that privacy concerns are met. In particular, we propose a hierarchical workflow model, based on context-free graph grammars, in which fine-grained dependencies between the inputs and outputs of a module are explicitly specified. Using this model, we examine how privacy concerns surrounding data, module function, and workflow structure can be addressed.
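As a rough illustration of what "fine-grained dependencies between the inputs and outputs of a module" could look like in practice, the following is a minimal sketch rather than the model proposed in the paper. It omits the hierarchical, grammar-based expansion of composite modules. The names `Module`, `provenance`, and `hidden`, as well as the example modules and data items, are hypothetical and chosen only for illustration. A "view" is simplified here to a set of data items that are withheld from the answer to a provenance query.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """A module whose outputs each list the inputs they depend on (hypothetical)."""
    name: str
    deps: dict  # output name -> set of input names that output depends on


def provenance(modules, item, hidden=frozenset()):
    """Return all data items that `item` transitively depends on.

    Items in `hidden` are still traversed but are withheld from the answer,
    a deliberately simplistic stand-in for a privacy-preserving view.
    """
    # Merge the per-module dependency maps into one output -> inputs map.
    parents = {}
    for module in modules:
        for output, inputs in module.deps.items():
            parents.setdefault(output, set()).update(inputs)

    # Depth-first traversal over the dependency edges.
    seen, stack = set(), [item]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen - set(hidden)


# Hypothetical two-module workflow: "log" depends only on "reads",
# while "alignment" depends on both "reads" and "reference".
align = Module("align", {"alignment": {"reads", "reference"}, "log": {"reads"}})
call = Module("call_variants", {"variants": {"alignment"}})
workflow = [align, call]

full_answer = provenance(workflow, "variants")
view_answer = provenance(workflow, "variants", hidden={"reads"})
log_answer = provenance(workflow, "log")
```

Because dependencies are recorded per output rather than per module, the query for `log` does not report `reference`, even though both are handled by the same module. Hiding `reads` removes it from the answer for `variants` while the remaining dependencies stay visible.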
