DECAF++: Elastic Whole-System Dynamic Taint Analysis

Whole-system dynamic taint analysis has many unique applications such as malware analysis and fuzz testing. Compared to process-level taint analysis, it offers a wider analysis scope, a better transparency and tamper resistance. The main barrier of applying whole-system dynamic taint analysis in practice is the large slowdown that can be sometimes up to 30 times. Existing optimization schemes have either considerable baseline overheads (when there is no tainted data) or specific hardware dependencies. In this paper, we propose an elastic wholesystem dynamic taint analysis approach, and implement it in a prototype called DECAF++. Elastic whole-system dynamic taint analysis strives to perform taint analysis as least frequent as possible while maintaining the precision and accuracy. Although similar ideas are explored before for process-level taint analysis, we are the first to successfully achieve true elasticity for whole-system taint analysis via pure software approaches. We evaluated our prototype DECAF++ on nbench, apache bench, and SPEC CPU2006. Under taint analysis loads, DECAF++ achieves 202% speedup on nbench and 66% speedup on apache bench. Under no taint analysis load, DECAF++ imposes only 4% overhead on SPEC CPU2006.

[1]  Alessandro Orso,et al.  Dytan: a generic dynamic taint analysis framework , 2007, ISSTA '07.

[2]  Jun Wang,et al.  StraightTaint: Decoupled offline symbolic taint analysis , 2016, 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE).

[3]  Nicholas Nethercote,et al.  Valgrind: a framework for heavyweight dynamic binary instrumentation , 2007, PLDI '07.

[4]  Cheng Wang,et al.  LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[5]  Brendan Dolan-Gavitt,et al.  Repeatable Reverse Engineering with PANDA , 2015, PPREW@ACSAC.

[6]  Fabrice Bellard,et al.  QEMU, a Fast and Portable Dynamic Translator , 2005, USENIX ATC, FREENIX Track.

[7]  Harish Patil,et al.  Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.

[8]  Herbert Bos,et al.  VUzzer: Application-aware Evolutionary Fuzzing , 2017, NDSS.

[9]  Cheng Wang,et al.  Software-based transparent and comprehensive control-flow error detection , 2006, International Symposium on Code Generation and Optimization (CGO'06).

[10]  Peter J. Denning,et al.  The locality principle , 2005, CACM.

[11]  Zhenkai Liang,et al.  Polyglot: automatic extraction of protocol message format using dynamic binary analysis , 2007, CCS '07.

[12]  Lok K. Yan,et al.  On Soundness and Precision of Dynamic Taint Analysis , 2014 .

[13]  Alessandro Orso,et al.  RAIN: Refinable Attack Investigation with On-demand Inter-Process Information Flow Tracking , 2017, CCS.

[14]  Tal Garfinkel,et al.  Understanding data lifetime via whole system simulation , 2004 .

[15]  Heng Yin,et al.  Capturing Malware Propagations with Code Injections and Code-Reuse Attacks , 2017, CCS.

[16]  Angelos D. Keromytis,et al.  ShadowReplica: efficient parallelization of dynamic data flow tracking , 2013, CCS.

[17]  Andrew Warfield,et al.  Practical taint-based protection using demand emulation , 2006, EuroSys.

[18]  Ali Davanian Effective granularity in Internet badhood detection:Detection rate, Precision and Implementation performance , 2017 .

[19]  Herbert Bos,et al.  Minemu: The World's Fastest Taint Tracker , 2011, RAID.

[20]  Zhenkai Liang,et al.  HookFinder: Identifying and Understanding Malware Hooking Behaviors , 2008, NDSS.

[21]  Xiangyu Zhang,et al.  LDX: Causality Inference by Lightweight Dual Execution , 2016, ASPLOS.

[22]  Jun Wang,et al.  TaintPipe: Pipelined Symbolic Taint Analysis , 2015, USENIX Security Symposium.

[23]  Heng Yin,et al.  Make it work, make it right, make it fast: building a platform-neutral whole-system dynamic binary analysis platform , 2014, ISSTA 2014.

[24]  James Newsome,et al.  Dynamic Taint Analysis for Automatic Detection, Analysis, and SignatureGeneration of Exploits on Commodity Software , 2005, NDSS.

[25]  Angelos D. Keromytis,et al.  libdft: practical dynamic data flow tracking for commodity systems , 2012, VEE '12.

[26]  Intel Corp,et al.  Virtualization Without Direct Execution or Jitting: Designing a Portable Virtual Machine Infrastructure , 2008 .