Automatic Detection for JavaScript Obfuscation Attacks in Web Pages through String Pattern Analysis

Recently, most of malicious web pages include obfuscated codes in order to circumvent the detection of signature-based detection systems. It is difficult to decide whether the sting is obfuscated because the shape of obfuscated strings are changed continuously. In this paper, we propose a novel methodology that can detect obfuscated strings in the malicious web pages. We extracted three metrics as rules for detecting obfuscated strings by analyzing patterns of normal and malicious JavaScript codes. They are N-gram , Entropy , and Word Size . N-gram checks how many each byte code is used in strings. Entropy checks distributed of used byte codes. Word size checks whether there is used very long string. Based on the metrics, we implemented a practical tool for our methodology and evaluated it using read malicious web pages. The experiment results showed that our methodology can detect obfuscated strings in web pages effectively.