Towards Paving the Way for Large-Scale Windows Malware Analysis: Generic Binary Unpacking with Orders-of-Magnitude Performance Boost

Binary packing, which encodes binary code prior to execution and decodes it at run time, is the most common obfuscation adopted by malware authors to camouflage malicious code. In particular, most packers recover the original code by going through a set of "written-then-executed" layers, which makes determining the end of unpacking increasingly difficult. Many generic binary unpacking approaches have been proposed to extract packed binaries without prior knowledge of the packers. However, high runtime overhead and a lack of anti-analysis resistance have severely limited their adoption. Over the past two decades, packed malware has remained a veritable challenge to the anti-malware landscape. This paper revisits the long-standing binary unpacking problem from a new angle: packers consistently obfuscate the standard use of API calls. Our in-depth study of a wide variety of current Windows malware packers reveals a common property: the malware's Import Address Table (IAT), which acts as a lookup table for dynamically linked API calls, is typically erased by packers for further obfuscation; the unpacking routine, acting like a custom dynamic loader, then reconstructs the IAT before the original code resumes execution. During the execution of packed malware, an API call made by looking up a rebuilt IAT indicates that the original payload has been restored. This insight motivates us to design an efficient unpacking approach, called BinUnpack. Compared to previous methods that suffer from multiple "written-then-executed" unpacking layers, BinUnpack is free from tedious memory access monitoring and therefore introduces very small runtime overhead. To defeat a variety of ever-evolving evasion tricks, we design BinUnpack's API monitor module via a novel kernel-level DLL hijacking technique. We have evaluated BinUnpack's efficacy extensively with more than 238K packed malware samples and multiple Windows utilities. BinUnpack's success rate is significantly better than that of existing tools, with a performance boost of several orders of magnitude. Our study demonstrates that BinUnpack can be applied to speed up large-scale malware analysis.
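
The rebuilt-IAT heuristic described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not BinUnpack's implementation: the `ApiCall` record, the `IatRegion` type, the `find_unpacking_end` function, and all addresses are hypothetical, and the location of the rebuilt IAT is taken as given rather than discovered. The sketch scans a trace of intercepted API calls and reports the first one dispatched through a pointer slot inside the rebuilt IAT, which is treated as the signal that the original payload is running.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ApiCall:
    """One intercepted API call (hypothetical event record)."""
    api_name: str
    # Address of the memory slot the call was dispatched through,
    # or None if the target was obtained some other way (e.g. a register).
    pointer_slot: Optional[int]


@dataclass(frozen=True)
class IatRegion:
    """Address range assumed to hold the rebuilt Import Address Table."""
    start: int
    size: int

    def contains(self, address: int) -> bool:
        return self.start <= address < self.start + self.size


def find_unpacking_end(calls: Sequence[ApiCall],
                       rebuilt_iat: IatRegion) -> Optional[int]:
    """Return the index of the first API call made through the rebuilt IAT.

    Calls issued by the unpacking routine itself (for example, resolving
    imports with LoadLibraryA/GetProcAddress) do not go through the rebuilt
    table, so they are skipped. Returns None if no such call is observed.
    """
    for index, call in enumerate(calls):
        if call.pointer_slot is not None and rebuilt_iat.contains(call.pointer_slot):
            return index
    return None


# Illustrative trace with made-up addresses.
iat = IatRegion(start=0x402000, size=0x40)
trace = [
    ApiCall("LoadLibraryA", 0x7F1000),   # unpacker stub, slot outside the rebuilt IAT
    ApiCall("GetProcAddress", None),     # unpacker stub, no table lookup
    ApiCall("CreateFileW", 0x402008),    # payload call through the rebuilt IAT
]
end = find_unpacking_end(trace, iat)
print(trace[end].api_name if end is not None else "payload not reached")
```

In this toy trace the first two calls belong to the unpacking routine, so the third call is the one flagged. The point of the design is that only API calls need to be observed, with no per-layer tracking of memory writes and executions.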
