My script engines know what you did in the dark: converting engines into script API tracers

Malicious scripts have been crucial attack vectors in recent attacks such as malware spam (malspam) and fileless malware. Because malicious scripts are generally obfuscated, statically analyzing them is difficult because of their use of reflection. Dynamic analysis, which is not affected by obfuscation, is therefore used to analyze malicious scripts. Despite its wide adoption, however, some problems remain unsolved: current script analysis tools do not fulfill three requirements that are important for malicious script analysis. A tool should be (1) universally applicable to various script languages, (2) capable of outputting analysis logs that can precisely recover the behavior of malicious scripts, and (3) applicable to proprietary script engines. In this paper, we propose a method for automatically generating script API tracers by analyzing the target script engine binaries. The method mines the knowledge of script engine internals that is required to append behavior analysis capability. This enables the addition of analysis functionality to arbitrary script engines and the generation of script API tracers that fulfill the above requirements. Experimental results showed that the method can be applied to build malicious script analysis tools.
