A Malicious Web Site Identification Technique Using Web Structure Clustering

Epidemic cyber incidents are caused by malicious websites that use exploit kits. Exploit kits make it easier for attackers to perform drive-by download (DBD) attacks. It has been reported, however, that malicious websites using an exploit kit have similar website structure (WS) trees. Hence, malicious website identification techniques leveraging WS-trees have been studied, where the WS-trees are estimated from HTTP traffic data. Nevertheless, the defensive components of exploit kits prevent WS-trees from being captured completely. This paper therefore presents a new WS-tree construction procedure that exploits the fact that a DBD attack takes place within a certain duration. It also proposes a new malicious website identification technique that clusters the WS-trees of exploit kits. Experimental results on the D3M dataset verify that the proposed technique identifies exploit kits with reasonable accuracy even when HTTP traffic from the malicious sites is partially lost.

key words: website structure, malicious website, exploit kit, clustering
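The abstract does not spell out the construction procedure, so the following is only a minimal sketch of how a time window might help rebuild a WS-tree when some HTTP traffic is missing. It assumes each request carries a URL, an optional Referer, and a timestamp; the names `Request`, `build_ws_tree`, and `window`, as well as the fallback rule of attaching orphaned requests to the landing page, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    url: str
    referer: Optional[str]  # None when the Referer header is absent
    time: float             # seconds since the start of the capture


def build_ws_tree(requests, window=30.0):
    """Return {url: parent_url}; the landing page maps to None.

    Assumption (not from the paper): a request whose referer was never
    observed is attached to the landing page if it occurs within
    `window` seconds of it, and is dropped otherwise.
    """
    ordered = sorted(requests, key=lambda r: r.time)
    if not ordered:
        return {}
    root = ordered[0]
    parent = {root.url: None}
    for req in ordered[1:]:
        if req.url in parent:
            continue  # keep the first observed parent for a URL
        if req.referer in parent:
            parent[req.url] = req.referer      # normal referer edge
        elif req.time - root.time <= window:
            parent[req.url] = root.url         # time-window fallback
    return parent


# Hypothetical capture: the referer of the third request was never seen.
capture = [
    Request("http://landing.example/", None, 0.0),
    Request("http://redir.example/a", "http://landing.example/", 1.0),
    Request("http://exploit.example/x", "http://lost.example/", 2.0),
    Request("http://other.example/", None, 100.0),
]
tree = build_ws_tree(capture, window=30.0)
```

In this sketch the third request is still placed in the tree even though its referring page is absent from the capture, while the unrelated request 100 seconds later is left out. A real procedure would need a more careful choice of attachment point and window length.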
