Hi-Fi: collecting high-fidelity whole-system provenance

Data provenance, a record of the origin and evolution of data in a system, is a useful tool for forensic analysis. However, existing provenance collection mechanisms fail to achieve sufficient breadth or fidelity to provide a holistic view of a system's operation over time. We present Hi-Fi, a kernel-level provenance system that leverages the Linux Security Modules framework to collect high-fidelity whole-system provenance. We demonstrate that Hi-Fi is able to record a variety of malicious behavior within a compromised system. In addition, our benchmarks show the collection overhead from Hi-Fi to be less than 1% for most system calls and 3% in a representative workload, while simultaneously generating a system measurement that fully reflects system evolution. In this way, we show that we can collect broad, high-fidelity provenance data that is capable of supporting detailed forensic analysis.
