DECAF: A Platform-Neutral Whole-System Dynamic Binary Analysis Platform

Dynamic binary analysis is a prevalent and indispensable technique in program analysis. While several dynamic binary analysis tools and frameworks have been proposed, all suffer from one or more of: prohibitive performance degradation, a semantic gap between the analysis code and the program being analyzed, architecture/OS specificity, being user-mode only, and lacking APIs. We present DECAF, a virtual machine based, multi-target, whole-system dynamic binary analysis framework built on top of QEMU. DECAF provides Just-In-Time Virtual Machine Introspection and a plugin architecture with a simple-to-use event-driven programming interface. DECAF implements a new instruction-level taint tracking engine at bit granularity, which exercises fine control over the QEMU Tiny Code Generator (TCG) intermediate representation to accomplish on-the-fly optimizations while ensuring that the taint propagation is sound and highly precise. We perform a formal analysis of DECAF's taint propagation rules to verify that most instructions introduce neither false positives nor false negatives. We also present three platform-neutral plugins—Instruction Tracer, Keylogger Detector, and API Tracer, to demonstrate the ease of use and effectiveness of DECAF in writing cross-platform and system-wide analysis tools. Implementation of DECAF consists of 9,550 lines of C++ code and 10,270 lines of C code and we evaluate DECAF using CPU2006 SPEC benchmarks and show average overhead of 605 percent for system wide tainting and 12 percent for VMI.

[1]  Herbert Bos,et al.  Argos: an emulator for fingerprinting zero-day attacks for advertised honeypots with automatic signature generation , 2006, EuroSys.

[2]  Jinpeng Wei,et al.  MOSE: Live Migration Based On-the-Fly Software Emulation , 2015, ACSAC.

[3]  Heng Yin TEMU: Binary Code Analysis via Whole-System Layered Annotative Execution , 2010 .

[4]  J. Meseguer,et al.  Security Policies and Security Models , 1982, 1982 IEEE Symposium on Security and Privacy.

[5]  Alessandro Orso,et al.  Dytan: a generic dynamic taint analysis framework , 2007, ISSTA '07.

[6]  Nicholas Nethercote,et al.  Valgrind: a framework for heavyweight dynamic binary instrumentation , 2007, PLDI '07.

[7]  Heng Yin,et al.  Panorama: capturing system-wide information flow for malware detection and analysis , 2007, CCS '07.

[8]  Angelos D. Keromytis,et al.  A General Approach for Efficiently Accelerating Software-based Dynamic Data Flow Tracking on Commodity Hardware , 2012, NDSS.

[9]  Joe D. Warren,et al.  The program dependence graph and its use in optimization , 1987, TOPL.

[10]  Tal Garfinkel,et al.  Understanding data lifetime via whole system simulation , 2004 .

[11]  Zhenkai Liang,et al.  BitBlaze: A New Approach to Computer Security via Binary Analysis , 2008, ICISS.

[12]  Heng Yin,et al.  MACE: high-coverage and robust memory analysis for commodity operating systems , 2014, ACSAC '14.

[13]  Herbert Bos,et al.  Minemu: The World's Fastest Taint Tracker , 2011, RAID.

[14]  Clark W. Barrett,et al.  The SMT-LIB Standard Version 2.0 , 2010 .

[15]  Stephen McCamant,et al.  Path-exploration lifting: hi-fi tests for lo-fi emulators , 2012, ASPLOS XVII.

[16]  Zhenkai Liang,et al.  HookFinder: Identifying and Understanding Malware Hooking Behaviors , 2008, NDSS.

[17]  David L. Dill,et al.  A Decision Procedure for Bit-Vectors and Arrays , 2007, CAV.

[18]  Jelena Mirkovic,et al.  Safe and Automated Live Malware Experimentation on Public Testbeds , 2014, CSET.

[19]  Lok K. Yan,et al.  On Soundness and Precision of Dynamic Taint Analysis , 2014 .

[20]  Ankur Taly,et al.  Automated synthesis of symbolic instruction encodings from I/O samples , 2012, PLDI.

[21]  George Candea,et al.  S2E: a platform for in-vivo multi-path analysis of software systems , 2011, ASPLOS XVI.

[22]  Nikolaj Bjørner,et al.  Z3: An Efficient SMT Solver , 2008, TACAS.

[23]  Heng Yin,et al.  Renovo: a hidden code extractor for packed executables , 2007, WORM '07.

[24]  Mu Zhang,et al.  Extract Me If You Can: Abusing PDF Parsers in Malware Detectors , 2016, NDSS.

[25]  Stephen McCamant,et al.  DTA++: Dynamic Taint Analysis with Targeted Control-Flow Propagation , 2011, NDSS.

[26]  Nicholas Nethercote,et al.  Using Valgrind to Detect Undefined Value Errors with Bit-Precision , 2005, USENIX Annual Technical Conference, General Track.

[27]  Xuxian Jiang,et al.  Stealthy malware detection through vmm-based "out-of-the-box" semantic view reconstruction , 2007, CCS '07.

[28]  Yangchun Fu,et al.  Space Traveling across VM: Automatically Bridging the Semantic Gap in Virtual Machine Introspection via Online Kernel Data Redirection , 2012, 2012 IEEE Symposium on Security and Privacy.

[29]  Frederic T. Chong,et al.  Minos: Control Data Attack Prevention Orthogonal to Memory Model , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).

[30]  David Brumley,et al.  BAP: A Binary Analysis Platform , 2011, CAV.

[31]  Jonathon T. Giffin,et al.  2011 IEEE Symposium on Security and Privacy Virtuoso: Narrowing the Semantic Gap in Virtual Machine Introspection , 2022 .

[32]  Heng Yin,et al.  DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis , 2012, USENIX Security Symposium.

[33]  Wenke Lee,et al.  Ether: malware analysis via hardware virtualization extensions , 2008, CCS.

[34]  James Newsome,et al.  Dynamic Taint Analysis for Automatic Detection, Analysis, and SignatureGeneration of Exploits on Commodity Software , 2005, NDSS.

[35]  Angelos D. Keromytis,et al.  libdft: practical dynamic data flow tracking for commodity systems , 2012, VEE '12.

[36]  Fabrice Bellard,et al.  QEMU, a Fast and Portable Dynamic Translator , 2005, USENIX ATC, FREENIX Track.

[37]  Harish Patil,et al.  Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.

[38]  Cheng Wang,et al.  LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[39]  Nils Klarlund,et al.  MONA 1.x: New Techniques for WS1S and WS2S , 1998, CAV.