Covert Computation - Hiding code in code through compile-time obfuscation

Abstract Recently, the concept of semantic-aware malware detection has been proposed in the literature. Instead of relying on a syntactic analysis (i.e., comparison of a program to pre-generated signatures of malware samples), semantic-aware malware detection tries to model the effects a malware sample has on the machine. Thus, it does not depend on a specific syntactic implementation. For this purpose a model of the underlying machine is used. While it is possible to construct more and more precise models of hardware architectures, we show that there are ways to implement hidden functionality based on side effects in the microprocessor that are difficult to cover with a model. In this paper we give a comprehensive analysis of side effects in the x86 architecture and describe an implementation concept based on the idea of compile-time obfuscation, where obfuscating transformations are applied to the code at compile time. Finally, we provide an evaluation based on a prototype implementation to show the practicability of our approach and estimate complexity and space overhead using actual malware samples.

[1]  Paul C. van Oorschot,et al.  White-Box Cryptography and an AES Implementation , 2002, Selected Areas in Cryptography.

[2]  Rhiannon Weaver,et al.  A Probabilistic Population Study of the Conficker-C Botnet , 2010, PAM.

[3]  Daniel Bilar,et al.  Opcodes as predictor for malware , 2007, Int. J. Electron. Secur. Digit. Forensics.

[4]  Christian S. Collberg,et al.  A Taxonomy of Obfuscating Transformations , 1997 .

[5]  Zhendong Su,et al.  On deriving unknown vulnerabilities from zero-day polymorphic and metamorphic worm exploits , 2005, CCS '05.

[6]  Kieran McLaughlin,et al.  Obfuscation: The Hidden Malware , 2011, IEEE Security & Privacy.

[7]  Christopher Krügel,et al.  AccessMiner: using system-centric models for malware protection , 2010, CCS '10.

[8]  A. Turing On Computable Numbers, with an Application to the Entscheidungsproblem. , 1937 .

[9]  Christopher Krügel,et al.  Limits of Static Analysis for Malware Detection , 2007, Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007).

[10]  Vikram S. Adve,et al.  LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004..

[11]  Vinod Yegneswaran,et al.  Eureka: A Framework for Enabling Static Malware Analysis , 2008, ESORICS.

[12]  Donald F. Towsley,et al.  Code red worm propagation modeling and analysis , 2002, CCS '02.

[13]  Engin Kirda,et al.  A View on Current Malware Behaviors , 2009, LEET.

[14]  John H. Reif,et al.  Autonomous programmable biomolecular devices using self-assembled DNA nanostructures , 2007, CACM.

[15]  Mattia Monga,et al.  Detecting Self-mutating Malware Using Control-Flow Graph Matching , 2006, DIMVA.

[16]  Christopher Krügel,et al.  A survey on automated dynamic malware-analysis techniques and tools , 2012, CSUR.

[17]  Felix C. Freiling,et al.  Toward Automated Dynamic Malware Analysis Using CWSandbox , 2007, IEEE Secur. Priv..

[18]  Carey Nachenberg,et al.  Computer virus-antivirus coevolution , 1997, Commun. ACM.

[19]  Somesh Jha,et al.  A semantics-based approach to malware detection , 2008, TOPL.

[20]  Gregory R. Andrews,et al.  Disassembly of executable code revisited , 2002, Ninth Working Conference on Reverse Engineering, 2002. Proceedings..

[21]  Salvatore J. Stolfo,et al.  On the infeasibility of modeling polymorphic shellcode , 2007, CCS '07.

[22]  Robert Lyda,et al.  Using Entropy Analysis to Find Encrypted and Packed Malware , 2007, IEEE Security & Privacy.

[23]  JhaSomesh,et al.  A semantics-based approach to malware detection , 2007 .

[24]  Stefan Katzenbeisser,et al.  Detecting Malicious Code by Model Checking , 2005, DIMVA.

[25]  Koen De Bosschere,et al.  Instruction Set Limitation in Support of Software Diversity , 2009, ICISC.

[26]  Christopher Krügel,et al.  Effective and Efficient Malware Detection at the End Host , 2009, USENIX Security Symposium.

[27]  Somesh Jha,et al.  Testing malware detectors , 2004, ISSTA '04.

[28]  Tzi-cker Chiueh,et al.  Automatic Generation of String Signatures for Malware Detection , 2009, RAID.

[29]  Koen De Bosschere,et al.  On the Effectiveness of Source Code Transformations for Binary Obfuscation , 2006, Software Engineering Research and Practice.

[30]  Roberto Giacobazzi,et al.  Hiding Information in Completeness Holes: New Perspectives in Code Obfuscation and Watermarking , 2008, 2008 Sixth IEEE International Conference on Software Engineering and Formal Methods.

[31]  Somesh Jha,et al.  Semantics-aware malware detection , 2005, 2005 IEEE Symposium on Security and Privacy (S&P'05).

[32]  Steven Gianvecchio,et al.  Mimimorphism: a new approach to binary code obfuscation , 2010, CCS '10.