RAIN: Refinable Attack Investigation with On-demand Inter-Process Information Flow Tracking

As modern attacks become stealthier and more persistent, detecting or preventing them at an early stage becomes virtually impossible. Instead, an attack investigation or provenance system continuously monitors and logs interesting system events with minimal overhead. Later, if the system observes anomalous behavior, it analyzes the log to identify who initiated the attack and which resources were affected, and then to assess and recover from any damage incurred. However, because of a fundamental tradeoff between log granularity and system performance, existing systems either record system-call events without the detailed program-level activities (e.g., memory operations) required to accurately reconstruct attack causality, or demand that every monitored program be instrumented to provide program-level information. To address this issue, we propose RAIN, a Refinable Attack INvestigation system based on record-replay technology that records system-call events at runtime and performs instruction-level dynamic information flow tracking (DIFT) during on-demand process replay. Instead of replaying every process with DIFT, RAIN conducts system-call-level reachability analysis to filter out unrelated processes and minimize the number of processes to be replayed, making inter-process DIFT feasible. Evaluation results show that RAIN effectively prunes unrelated processes and determines attack causality with negligible false positive rates. In addition, the runtime overhead of RAIN is similar to that of existing system-call-level provenance systems, and its analysis overhead is much smaller than that of full-system DIFT.
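
The pruning idea in the abstract can be illustrated with a minimal sketch. It is not RAIN's implementation: the edge list, the `proc:`/`file:`/`sock:` node naming, and the helper names `reachable` and `processes_to_replay` are all assumptions made for illustration, and the sketch ignores event timestamps, which a real causality analysis would have to respect. It treats logged system-call events as directed edges between processes and resources, and keeps only the processes that lie on some path from a suspected origin to an affected resource; only those would then be candidates for replay with DIFT.

```python
from collections import defaultdict, deque


def reachable(edges, start):
    """Return every node reachable from `start` by following directed edges."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def processes_to_replay(edges, origin, target):
    """Processes on some path origin -> target in the coarse event graph."""
    forward = reachable(edges, origin)
    # Backward reachability is forward reachability on the reversed graph.
    backward = reachable([(dst, src) for src, dst in edges], target)
    return sorted(n for n in forward & backward if n.startswith("proc:"))


# Hypothetical system-call-level log, one (source, destination) pair per event.
EDGES = [
    ("file:/tmp/payload", "proc:editor"),     # editor reads the payload
    ("proc:editor", "file:/home/u/doc"),      # editor writes a document
    ("file:/home/u/doc", "proc:uploader"),    # uploader reads the document
    ("proc:uploader", "sock:10.0.0.5:443"),   # uploader sends data out
    ("file:/etc/hosts", "proc:backup"),       # unrelated activity
    ("proc:backup", "file:/var/backup.tar"),  # unrelated activity
]

if __name__ == "__main__":
    print(processes_to_replay(EDGES, "file:/tmp/payload", "sock:10.0.0.5:443"))
```

In this toy log, `proc:backup` is never selected because it is neither reachable from the payload nor an ancestor of the socket, which mirrors how coarse reachability can shrink the set of processes before the expensive instruction-level analysis.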
