Code Artificiality: A Metric for the Code Stealth Based on an N-Gram Model

This paper proposes a method for evaluating the artificiality of protected code using an N-gram model. The proposed artificiality metric measures the stealth of protected code, that is, the degree to which protected code can be distinguished from unprotected code. In a case study, we apply the method to programs transformed by well-known obfuscation techniques. The results show that static obfuscating transformations (e.g., control flow flattening) have little effect on artificiality. By contrast, dynamic obfuscating transformations (e.g., code encryption) and techniques that insert junk code fragments into the program tend to increase artificiality, which may have a significant impact on the stealth of the code.
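
As a rough illustration of how an N-gram model could yield such a score, the sketch below trains a model on instruction sequences from unprotected code and reports the average negative log-probability per instruction of a target sequence. The abstract does not give the paper's exact formula, so the scoring rule, the add-one smoothing, the opcode-only tokenization, and the names `ngrams`, `train`, and `artificiality` are all assumptions made for this example, not the authors' implementation.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical padding token marking the start of a sequence


def ngrams(tokens, n):
    """Return one n-gram per token, padding the left edge with START."""
    padded = [START] * (n - 1) + list(tokens)
    return [tuple(padded[i:i + n]) for i in range(len(tokens))]


def train(corpus, n):
    """Count n-grams and their (n-1)-token contexts over unprotected code."""
    ngram_counts = Counter()
    context_counts = Counter()
    vocab = set()
    for seq in corpus:
        vocab.update(seq)
        for gram in ngrams(seq, n):
            ngram_counts[gram] += 1
            context_counts[gram[:-1]] += 1
    return ngram_counts, context_counts, vocab


def artificiality(tokens, model, n):
    """Average negative log2-probability per token (assumed scoring rule).

    Higher values mean the sequence looks less like the training corpus.
    Add-one smoothing keeps unseen n-grams from having zero probability.
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    ngram_counts, context_counts, vocab = model
    vocab_size = len(vocab) + 1  # one extra slot for unseen tokens
    total = 0.0
    for gram in ngrams(tokens, n):
        p = (ngram_counts[gram] + 1) / (context_counts[gram[:-1]] + vocab_size)
        total -= math.log2(p)
    return total / len(tokens)


# Toy opcode sequences standing in for disassembled unprotected functions.
corpus = [
    ["push", "mov", "sub", "mov", "call", "add", "pop", "ret"],
    ["push", "mov", "mov", "call", "mov", "pop", "ret"],
]
model = train(corpus, n=2)

typical = ["push", "mov", "call", "pop", "ret"]
junk = ["xor", "ret", "xor", "push", "xor", "pop", "xor"]

print("typical:", artificiality(typical, model, n=2))
print("junk:   ", artificiality(junk, model, n=2))
```

Under these assumptions, a sequence full of instruction patterns that never occur in the training corpus, such as inserted junk code or encrypted bytes decoded as instructions, receives a higher score than a sequence resembling ordinary compiler output. A real evaluation would need a much larger corpus and a considered choice of N and smoothing method.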
