Provenance for collaboration: Detecting suspicious behaviors and assessing trust in information

Data collaborations allow users to draw upon diverse resources to solve complex problems. While collaborations increase users' ability to manipulate data and services, they also create new security vulnerabilities. Collaboration participants need methods to detect suspicious behaviors (potentially caused by malicious insiders) and to assess trust in information that has passed through many hands. In this work, we describe these challenges and introduce provenance as a way to address them. We describe a provenance system, PLUS, and show how it can be used to assist in assessing trust and detecting suspicious behaviors. A preliminary study shows this to be a promising direction for future research.
