SelectiveTaint: Efficient Data Flow Tracking With Static Binary Rewriting

Taint analysis has been widely used in many security applications such as exploit detection, information flow tracking, malware analysis, and protocol reverse engineering. State-of-the-art taint analysis tools are usually built atop dynamic binary instrumentation, which instruments every possible instruction and relies on runtime information to decide whether a particular instruction involves taint; as a result, these tools usually incur high performance overhead. This paper presents SelectiveTaint, an efficient selective taint analysis framework for binary executables. The key idea is to instrument only the instructions involved in taint analysis, using static binary rewriting instead of dynamic binary instrumentation. At a high level, SelectiveTaint statically scans the binary code for taint sources of interest, leverages value set analysis to conservatively determine whether an instruction operand needs to be tainted, and then selectively instruments the instructions of interest. We have implemented SelectiveTaint and evaluated it with a set of binary programs including 16 coreutils (focusing on file I/O) and five network daemon programs (focusing on network I/O) such as the nginx web server. Our evaluation results show that binaries statically instrumented by SelectiveTaint have superior performance compared to state-of-the-art dynamic taint analysis frameworks (e.g., 1.7x faster than libdft).
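The selection step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy instruction format (`Instr`), the function `select_instructions`, and the sample program are hypothetical, and a simple flow-insensitive fixpoint over named abstract locations stands in for value set analysis. The sketch only shows the conservative idea: any location that may ever receive tainted data is treated as possibly tainted, and only instructions touching such locations are selected for instrumentation.

```python
from typing import NamedTuple, Tuple

class Instr(NamedTuple):
    """Hypothetical toy instruction: one destination, several sources."""
    addr: int
    dst: str                # abstract location written (register or memory region)
    srcs: Tuple[str, ...]   # abstract locations read

def select_instructions(program, taint_sources):
    """Return sorted addresses of instructions that may touch tainted data.

    Conservative and flow-insensitive: once a location may be tainted,
    it is treated as possibly tainted everywhere in the program.
    """
    maybe_tainted = set(taint_sources)
    changed = True
    while changed:  # iterate until no new location becomes possibly tainted
        changed = False
        for ins in program:
            if ins.dst not in maybe_tainted and any(s in maybe_tainted for s in ins.srcs):
                maybe_tainted.add(ins.dst)
                changed = True
    # Instrument instructions that read a possibly tainted location, or that
    # write one (the write may need to set or clear its taint tag).
    return sorted(
        ins.addr
        for ins in program
        if ins.dst in maybe_tainted or any(s in maybe_tainted for s in ins.srcs)
    )

# Assumed example: "buf" is a memory region filled by a taint source such as read().
PROGRAM = [
    Instr(0x10, "eax", ("buf",)),        # loads from the tainted buffer
    Instr(0x14, "ebx", ("ecx",)),        # never touches tainted data
    Instr(0x18, "edx", ("eax", "ebx")),  # combines tainted and untainted values
    Instr(0x1C, "esi", ("edi",)),        # never touches tainted data
]

selected = select_instructions(PROGRAM, {"buf"})
print([hex(a) for a in selected])
```

In this toy program only the instructions at `0x10` and `0x18` are selected, so the other two would run without any taint-tracking code. A dynamic framework would instead instrument all four and decide at runtime.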
