ProvMark: A Provenance Expressiveness Benchmarking System

System-level provenance is of widespread interest for applications such as security enforcement and information protection. However, testing the correctness or completeness of provenance capture tools is challenging and is currently done manually. In some cases, there is not even a clear consensus about what behavior is correct. We present an automated tool, ProvMark, that uses an existing provenance system as a black box and reliably identifies the provenance graph structure recorded for a given activity, by a reduction to subgraph isomorphism problems handled by an external solver. ProvMark is a first step toward the much-needed ability to test and compare the expressiveness of provenance systems. We demonstrate ProvMark's usefulness in comparing three capture systems with different architectures and distinct design philosophies.
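To make the reduction to subgraph isomorphism concrete, the following is a minimal sketch and not ProvMark's implementation. It assumes, hypothetically, that one graph is recorded without the target activity (the "background") and one with it (the "foreground"), and that the structure attributed to the activity is whatever remains after the background is embedded in the foreground. The graph encoding, the function names, and the brute-force search (used here in place of the external solver mentioned above) are all illustrative assumptions.

```python
from itertools import permutations

# Assumed encoding: nodes are {node_id: label}; edges are a set of
# (source_id, target_id, label) triples. Not ProvMark's actual format.

def find_embedding(small_nodes, small_edges, big_nodes, big_edges):
    """Brute-force search for a label-preserving embedding of the small
    graph into the big graph. Returns a node mapping, or None."""
    small_ids = list(small_nodes)
    for candidate in permutations(big_nodes, len(small_ids)):
        mapping = dict(zip(small_ids, candidate))
        # Node labels must agree under the mapping.
        if any(small_nodes[s] != big_nodes[b] for s, b in mapping.items()):
            continue
        # Every edge of the small graph must appear in the big graph.
        if all((mapping[u], mapping[v], lbl) in big_edges
               for u, v, lbl in small_edges):
            return mapping
    return None

def extra_structure(bg_nodes, bg_edges, fg_nodes, fg_edges):
    """Return the foreground nodes and edges not matched by the background,
    or None if the background does not embed in the foreground."""
    mapping = find_embedding(bg_nodes, bg_edges, fg_nodes, fg_edges)
    if mapping is None:
        return None
    matched_nodes = set(mapping.values())
    matched_edges = {(mapping[u], mapping[v], lbl) for u, v, lbl in bg_edges}
    new_nodes = {n: l for n, l in fg_nodes.items() if n not in matched_nodes}
    return new_nodes, fg_edges - matched_edges

# Hypothetical example: the background run reads one file; the
# foreground run additionally creates a second file.
bg_nodes = {"a": "process", "b": "file"}
bg_edges = {("a", "b", "used")}
fg_nodes = {"x": "process", "y": "file", "z": "file"}
fg_edges = {("x", "y", "used"), ("z", "x", "wasGeneratedBy")}

new_nodes, new_edges = extra_structure(bg_nodes, bg_edges, fg_nodes, fg_edges)
```

In this toy example the leftover structure is the single new file node `z` and its `wasGeneratedBy` edge. The exhaustive search is exponential in the number of nodes, which is one reason a dedicated solver is a more realistic choice for graphs of practical size.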
