Wire -- A Formal Intermediate Language for Binary Analysis

Wire is an intermediate language that enables static program analysis of low-level objects such as native executables. It is of practical benefit in analysing the structure and semantics of malware and in identifying software defects in closed-source software. In this paper we describe how an executable program is disassembled and translated to the Wire intermediate language. We define the formal syntax and operational semantics of Wire and justify its language features. Wire is used in our previous work, Malwise, a malware variant detection system. We also examine applications that a formally defined intermediate language makes possible. Our results include demonstrations of semantic equivalence between obfuscated and non-obfuscated code samples, drawn from obfuscations commonly used by malware.
