A Markov Detection Tree-Based Centralized Scheme to Automatically Identify Malicious Webpages on Cloud Platforms

The effective detection of malicious webpages plays a paramount role in ensuring Web security on the Internet. However, current methods achieve poor detection results with low efficiency, so designing an efficient detection scheme that improves the classification accuracy for malicious webpages is both important and challenging. To address this challenge, this paper proposes a Markov detection tree scheme that automatically identifies and classifies malicious webpages by jointly using the link relations of uniform resource locators (URLs), the information gain ratio, a Markov decision process, and a decision tree. To increase the detection accuracy, two methods of filling missing values are presented to process the null attribute values of webpages. We compare the performance of our algorithms under the different filling methods in terms of information gain ratio, classification accuracy, and detection efficiency. Our experimental results show that the proposed methods can improve both the accuracy and the efficiency of malicious webpage classification.
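As a rough illustration of the information gain ratio named in the abstract, the following minimal sketch computes the standard C4.5-style gain ratio for one categorical webpage attribute. It is not the paper's algorithm: the function names, the toy attribute values, and the mode-based filling of null values are all assumptions made for this example, since the abstract does not specify the two filling methods.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of hashable items."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())


def gain_ratio(values, labels):
    """C4.5-style gain ratio of one categorical attribute.

    values: attribute value per webpage (hypothetical feature)
    labels: class label per webpage (e.g. "malicious" / "benign")
    """
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected label entropy after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def fill_missing_with_mode(values):
    """Assumed filling strategy: replace None with the most common value."""
    known = [v for v in values if v is not None]
    mode = Counter(known).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


# Toy data: does the page contain an obfuscated script? (hypothetical attribute)
has_obfuscation = ["yes", "yes", None, "no", "no"]
labels = ["malicious", "malicious", "malicious", "benign", "benign"]

filled = fill_missing_with_mode(has_obfuscation)
ratio = gain_ratio(filled, labels)
```

Under this sketch, an attribute with a higher gain ratio would be preferred when choosing a split in the decision tree; how the paper combines this with the Markov decision process is not stated in the abstract.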
