Term-Rewriting Deobfuscation for Static Client-Side Scripting Malware Detection

Providing users with a safe web experience has recently become a critical problem as fraud and privacy infringement on the Internet become commonplace. Web-scripting-based malware is also used intensively to carry out longer-term exploitation such as XSS worms or botnets. Server-side countermeasures are often ineffective against such threats, while client-side ones seldom deal with the problem of obfuscation. To provide a sounder and more complete analysis, we propose to deobfuscate web-scripting-language-based malware. In this paper, we study the possibility of automating the deobfuscation process using a term rewriting system based on automated deduction. Such a static approach is intended to withstand anti-analysis techniques and to cope with unknown obfuscation schemes. With some preliminary experiments in JavaScript, we show evidence that this is feasible and highlight several challenges we need to tackle in order to implement an effective script-based malware deobfuscator. This approach can be generalized to web scripting languages other than JavaScript, such as ActionScript or VBScript. Applications include static analysis of script-based malware and crawling of malware distribution websites. This paper is part of a wider project that aims to provide a client-based defense against Web 2.0 malware.
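To make the idea of rewriting-based deobfuscation concrete, the following is a minimal sketch in Python, not the system studied in the paper. It assumes a hypothetical encoding of JavaScript expressions as nested tuples and only two illustrative rules (folding of string concatenation and of String.fromCharCode on numeric constants); the names rewrite_once and normalize are likewise assumptions made for this example. The expression is never executed: it is only rewritten until no rule applies.

```python
# Hypothetical term encoding (an assumption for this sketch):
#   ("str", "ab"), ("num", 97), ("var", "x"),
#   ("concat", t1, t2), ("fromCharCode", t1, ..., tn),
#   ("call", function_name, t1, ..., tn)

def rewrite_once(term):
    """Apply one rule at the root of the term, or return it unchanged."""
    head = term[0]
    # Rule 1: concat(str(a), str(b)) -> str(a + b)
    if head == "concat" and term[1][0] == "str" and term[2][0] == "str":
        return ("str", term[1][1] + term[2][1])
    # Rule 2: fromCharCode(num(c1), ..., num(cn)) -> str(chr(c1)...chr(cn))
    if head == "fromCharCode" and all(a[0] == "num" for a in term[1:]):
        return ("str", "".join(chr(a[1]) for a in term[1:]))
    return term

def normalize(term):
    """Rewrite bottom-up until no rule applies (a normal form)."""
    if term[0] in ("str", "num", "var"):
        return term  # leaves are already in normal form
    if term[0] == "call":
        # term[1] is the callee name, not a sub-term
        rewritten = ("call", term[1]) + tuple(normalize(a) for a in term[2:])
    else:
        rewritten = (term[0],) + tuple(normalize(a) for a in term[1:])
    result = rewrite_once(rewritten)
    return result if result == rewritten else normalize(result)

# Encodes: eval("al" + String.fromCharCode(101, 114, 116) + "(1)")
obfuscated = (
    "call", "eval",
    ("concat",
        ("concat",
            ("str", "al"),
            ("fromCharCode", ("num", 101), ("num", 114), ("num", 116))),
        ("str", "(1)")),
)

deobfuscated = normalize(obfuscated)
print(deobfuscated)
```

In this sketch the argument of eval is reduced to a single string constant, which a later detection stage could inspect statically. Sub-terms that depend on unknown values, such as ("var", "x"), are simply left in place, which mirrors the fact that static rewriting can only simplify what is determined by the script text itself.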
