Detecting Malicious Javascript in PDF through Document Instrumentation

An emerging threat vector, embedded malware inside popular document formats, has become rampant since 2008. Owed to its wide-spread use and Javascript support, PDF has been the primary vehicle for delivering embedded exploits. Unfortunately, existing defenses are limited in effectiveness, vulnerable to evasion, or computationally expensive to be employed as an on-line protection system. In this paper, we propose a context-aware approach for detection and confinement of malicious Javascript in PDF. Our approach statically extracts a set of static features and inserts context monitoring code into a document. When an instrumented document is opened, the context monitoring code inside will cooperate with our runtime monitor to detect potential infection attempts in the context of Javascript execution. Thus, our detector can identify malicious documents by using both static and runtime features. To validate the effectiveness of our approach in a real world setting, we first conduct a security analysis, showing that our system is able to remain effective in detection and be robust against evasion attempts even in the presence of sophisticated adversaries. We implement a prototype of the proposed system, and perform extensive experiments using 18623 benign PDF samples and 7370 malicious samples. Our evaluation results demonstrate that our approach can accurately detect and confine malicious Javascript in PDF with minor performance overhead.

[1]  Mohamed Omar Rayes Protected Mode , 2011, Encyclopedia of Cryptography and Security.

[2]  Haining Wang,et al.  Characterizing insecure javascript practices on the web , 2009, WWW '09.

[3]  Evangelos P. Markatos,et al.  Comprehensive shellcode detection using runtime heuristics , 2010, ACSAC '10.

[4]  Evangelos P. Markatos,et al.  Combining static and dynamic analysis for the detection of malicious documents , 2011, EUROSEC '11.

[5]  Christopher Krügel,et al.  Detection and analysis of drive-by-download attacks and malicious JavaScript code , 2010, WWW '10.

[6]  Stephanie Forrest,et al.  A sense of self for Unix processes , 1996, Proceedings 1996 IEEE Symposium on Security and Privacy.

[7]  Angelos Stavrou,et al.  Malicious PDF detection using metadata and structural features , 2012, ACSAC '12.

[8]  Benjamin Livshits,et al.  ZOZZLE: Fast and Precise In-Browser JavaScript Malware Detection , 2011, USENIX Security Symposium.

[9]  Vitaly Shmatikov,et al.  Abusing File Processing in Malware Detectors for Fun and Profit , 2012, 2012 IEEE Symposium on Security and Privacy.

[10]  Michalis Polychronakis,et al.  An Empirical Study of Real-world Polymorphic Code Injection Attacks , 2009, LEET.

[11]  Pavel Laskov,et al.  Detection of Malicious PDF Files Based on Hierarchical Document Structure , 2013, NDSS.

[12]  Muhammad Zubair Shafiq,et al.  Embedded Malware Detection Using Markov n-Grams , 2008, DIMVA.

[13]  Joshua Mason,et al.  English shellcode , 2009, CCS.

[14]  Fabian Monrose,et al.  Automatic Hooking for Forensic Analysis of Document-based Code Injection Attacks Techniques and Empirical Analyses , 2012 .

[15]  Ariel J. Feldman,et al.  Lest we remember: cold-boot attacks on encryption keys , 2008, CACM.

[16]  Sherri Davidoff Cleartext Passwords in Linux Memory , 2008 .

[17]  Pavel Laskov,et al.  Static detection of malicious JavaScript-bearing PDF documents , 2011, ACSAC '11.

[18]  Felix C. Freiling,et al.  Toward Automated Dynamic Malware Analysis Using CWSandbox , 2007, IEEE Secur. Priv..

[19]  Christopher Krügel,et al.  Defending Browsers against Drive-by Downloads: Mitigating Heap-Spraying Code Injection Attacks , 2009, DIMVA.

[20]  Giorgio Giacinto,et al.  A Pattern Recognition System for Malicious PDF Files Detection , 2012, MLDM.

[21]  Christopher Krügel,et al.  Effective and Efficient Malware Detection at the End Host , 2009, USENIX Security Symposium.

[22]  Wei Xu,et al.  JStill: mostly static detection of obfuscated malicious JavaScript code , 2013, CODASPY.

[23]  Giorgio Giacinto,et al.  Looking at the bag is not enough to find the bomb: an evasion of structural methods for malicious PDF files detection , 2013, ASIA CCS '13.

[24]  Salvatore J. Stolfo,et al.  A Study of Malcode-Bearing Documents , 2007, DIMVA.