Two-Stage Classification Model to Detect Malicious Web Pages

Malicious web pages are an emerging security concern on the Internet because of their prevalence and their potentially serious impact. Detecting and analyzing them is very costly because of their characteristics and complexity. Several research approaches have been proposed to detect them. These approaches fall into two main groups according to the features they analyze: static feature based and run-time feature based approaches. The static feature based approach has the strength of being lightweight, while the run-time feature based approach performs better in terms of detection accuracy. This paper presents a novel two-stage classification model to detect malicious web pages. Our approach divides the detection process into two stages: estimating the maliciousness of web pages, and then identifying malicious web pages. Static features are cheap to extract but less informative, so they are used in the first stage to identify potentially malicious web pages. Only those potentially malicious pages are forwarded to the second stage for further investigation. Run-time features, on the other hand, are costly but more informative, so they are used in the final stage to identify malicious web pages.
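The two-stage flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Page` type, the substring markers in `static_score`, the `threshold` value, and the `runtime_check` callable are all hypothetical names and values chosen for illustration, and a real first stage would use a trained classifier over static features rather than a marker count.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Page:
    url: str
    html: str


def static_score(page: Page) -> float:
    """Stage 1 (assumed): cheap maliciousness estimate in [0, 1].

    The markers below are illustrative placeholders, not the paper's features.
    """
    markers = ("<iframe", "eval(", "unescape(", "document.write(")
    html = page.html.lower()
    hits = sum(1 for marker in markers if marker in html)
    return hits / len(markers)


def two_stage_detect(
    pages: Iterable[Page],
    runtime_check: Callable[[Page], bool],
    threshold: float = 0.25,  # assumed cut-off, not taken from the paper
) -> List[str]:
    """Return URLs that stage 2 confirms as malicious.

    Pages scoring below the threshold in stage 1 are dropped, so the
    costly run-time analysis only runs on potentially malicious pages.
    """
    malicious = []
    for page in pages:
        if static_score(page) < threshold:
            continue  # considered benign by the lightweight stage
        if runtime_check(page):  # Stage 2 (assumed): costly run-time analysis
            malicious.append(page.url)
    return malicious
```

Here `runtime_check` stands in for whatever run-time analysis is available; the point of the design is that it is called only for pages the first stage forwards, which keeps the overall cost low at the risk of missing pages the static stage wrongly filters out.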
