Detecting Obfuscated JavaScript Malware Using Sequences of Internal Function Calls

Web browsers are a popular means of compromising Internet hosts. An attacker may inject JavaScript malware into a web page; when a victim visits the page, the malware executes and attempts to exploit a specific browser vulnerability or download an unwanted program. Obfuscated JavaScript malware can easily evade signature-based detection by changing the appearance of its code. To address this problem, some previous studies have used static analysis, in which features are extracted from both benign and malicious web pages and a classifier is then trained to distinguish between them. Because benign JavaScript code is now often obfuscated as well, static analysis techniques generate many false alarms. In this paper, we use dynamic analysis to monitor a web page and detect obfuscated JavaScript malware. We first load a set of malicious web pages in a real web browser and, for each page, collect a sequence of predictive function calls using internal function debugging. We then group similar sequences into clusters based on the normalized Levenshtein distance (NLD) metric and generate a so-called behavioral signature for each cluster. A web page is flagged as malicious only if the sequence of its intercepted function calls matches at least one generated behavioral signature. Our evaluation results show that the generated behavioral signatures detect obfuscated JavaScript malware with a low false alarm rate.
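The clustering and matching steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how NLD is normalized, how clusters are formed, or how a behavioral signature is derived. The sketch assumes NLD is the edit distance divided by the longer sequence length, uses a simple greedy clustering pass, takes the first member of each cluster as a stand-in signature, and treats "matches" as "NLD within a threshold". The function-call names and the threshold value 0.3 are hypothetical.

```python
from typing import List, Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two call sequences (unit-cost insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def nld(a: Sequence[str], b: Sequence[str]) -> float:
    """Normalized Levenshtein distance in [0, 1] (assumed normalization: longer length)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest


def cluster(sequences: List[List[str]], threshold: float) -> List[List[List[str]]]:
    """Greedy clustering: a sequence joins the first cluster whose first member is close enough."""
    clusters: List[List[List[str]]] = []
    for seq in sequences:
        for members in clusters:
            if nld(seq, members[0]) <= threshold:
                members.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


def is_malicious(seq: Sequence[str], signatures: List[List[str]], threshold: float) -> bool:
    """Flag a page if its call sequence is within the threshold of any signature."""
    return any(nld(seq, sig) <= threshold for sig in signatures)


# Hypothetical call sequences collected from known-malicious pages.
malicious_traces = [
    ["unescape", "eval", "document.write"],
    ["unescape", "eval", "eval", "document.write"],
    ["String.fromCharCode", "setTimeout", "ActiveXObject"],
]
THRESHOLD = 0.3  # illustrative value, not taken from the paper

clusters = cluster(malicious_traces, THRESHOLD)
signatures = [members[0] for members in clusters]  # stand-in for a behavioral signature
```

With these traces, the first two sequences differ by a single inserted call and fall into the same cluster, while the third forms its own. A page whose intercepted calls resemble neither signature is not flagged, which mirrors the "only if at least one signature matches" rule in the text.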
