Provenance Tipping Point

Capture is a well-known and difficult problem in provenance. Outside of database and workflow systems, obtaining an exact record of what happened from systems and programs has been a continuing struggle. The provenance research community has created libraries for logging provenance and has embedded capture agents within operating systems, specific programs, and elsewhere. However, it is impossible to know whether capture agents are being inserted at the optimal locations and with the optimal frequency to yield a high-quality provenance graph for a given system. In this work, we develop an initial agent-based model to simulate Activity and Entity interactions in a complex system of software. Using this model, we attempt to define generalized principles about the type, frequency, and distribution of provenance capture agents for a new system.
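To make the idea concrete, the following is a minimal illustrative sketch of how such a simulation might be set up. It is not the model developed in this work. The function names (`simulate`, `coverage`), the parameters (`instrumented`, `capture_prob`), and the rule that each step uses one existing Entity and generates one new Entity are all assumptions chosen for illustration.

```python
import random


def simulate(num_activities=20, num_steps=200, instrumented=frozenset(),
             capture_prob=1.0, seed=0):
    """Run one toy simulation and return (true_edges, captured_edges).

    Hypothetical setup: at each step a randomly chosen Activity uses one
    existing Entity and generates one new Entity. A capture agent records
    the interaction only if that Activity is instrumented, and then only
    with probability capture_prob (a stand-in for capture frequency).
    """
    rng = random.Random(seed)
    entities = ["e0"]        # start from a single seed Entity
    true_edges = []          # every interaction that actually happened
    captured_edges = []      # the subset logged by capture agents
    for _ in range(num_steps):
        activity = rng.randrange(num_activities)
        used = rng.choice(entities)
        generated = f"e{len(entities)}"
        entities.append(generated)
        edge = (used, activity, generated)
        true_edges.append(edge)
        if activity in instrumented and rng.random() < capture_prob:
            captured_edges.append(edge)
    return true_edges, captured_edges


def coverage(true_edges, captured_edges):
    """Fraction of true interactions present in the captured graph."""
    return len(captured_edges) / len(true_edges) if true_edges else 0.0


if __name__ == "__main__":
    # Compare a few hypothetical placements of capture agents.
    for count in (0, 5, 10, 20):
        placed = frozenset(range(count))
        true_e, cap_e = simulate(instrumented=placed, capture_prob=0.8)
        print(count, "instrumented activities -> coverage",
              round(coverage(true_e, cap_e), 2))
```

Under these assumptions, sweeping the set of instrumented Activities and the capture probability gives a rough picture of how agent placement and frequency affect the completeness of the captured graph. Edge coverage is only one possible quality measure and is used here purely as a placeholder.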
