Malicious URL protection based on attackers' habitual behavioral analysis

Abstract: Some studies that rely on URL-based features treat malicious URLs as a single group sharing the same attributes. However, malicious URLs fall into two different types, each of which produces entirely different results. Depending on their intention, adversaries therefore leave slightly different behavioral traces within malicious URLs. This paper presents an in-depth empirical study based on 1,529,433 malicious URLs collected over the past two years. In particular, we analyze attackers' tactical behavior regarding URLs and extract common features. We then divide these features into three different feature pools to determine the level of compromise of unknown URLs. To improve detection rates, we employ a similarity matching technique. We believe that new URLs can be identified through attackers' habitual URL manipulation behaviors. This approach covers a large set of malicious URLs with small feature sets. The accuracy of the proposed approach (up to 70%) is reasonable, and the approach requires only the attributes of URLs to be examined. The model can be used during preprocessing to determine whether input URLs are benign, and as a web filter or a risk-level scaler to estimate whether a URL is malicious.
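As a rough illustration of similarity matching against feature pools, the following Python sketch scores an unknown URL by its token overlap with known malicious URLs. It is not the method of the paper: the abstract does not specify the features, the contents of the three pools, or the similarity measure, so the tokenization, the Jaccard measure, the pool names, and the example URLs are all assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit


def url_tokens(url):
    """Split a URL into a set of lowercase host, path, and query tokens."""
    # Add a scheme when missing so urlsplit can locate the host part.
    parts = urlsplit(url if "://" in url else "http://" + url)
    raw = " ".join([parts.hostname or "", parts.path, parts.query])
    return {t for t in re.split(r"[^A-Za-z0-9]+", raw.lower()) if t}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def risk_score(url, pools):
    """Return, per pool, the best similarity between `url` and any pool entry.

    `pools` maps a hypothetical pool name to a list of known malicious URLs.
    An empty pool scores 0.0.
    """
    tokens = url_tokens(url)
    return {
        name: max((jaccard(tokens, url_tokens(known)) for known in urls),
                  default=0.0)
        for name, urls in pools.items()
    }


if __name__ == "__main__":
    # Hypothetical pools; names and entries are placeholders.
    pools = {
        "pool_1": ["http://bad.example/update/flash.exe"],
        "pool_2": ["http://login.bad.example/secure/account.php?id=1"],
        "pool_3": [],
    }
    print(risk_score("http://other.example/update/flash.exe", pools))
```

A score close to 1 for some pool would suggest that the URL reuses tokens seen in that pool; where to set a threshold, and how to combine the per-pool scores into a risk level, is left open here.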
