Mitigating Webshell Attacks through Machine Learning Techniques

A webshell is a command execution environment delivered as a web page, and attackers often use it as a backdoor for operating a compromised web server. Accurate webshell detection is therefore important for web server protection. Most security products detect webshells by feature matching, that is, by comparing input scripts against a pre-built collection of malicious code. This approach has a low detection rate for obfuscated webshells. Machine learning algorithms can detect webshells more efficiently and accurately. In this paper, we propose a new PHP webshell detection model, NB-Opcode, which combines naive Bayes classifiers with opcode sequences. Experiments on a large number of samples show that the proposed method effectively detects a range of webshells and, compared with traditional webshell detection methods, improves both the efficiency and the accuracy of detection.
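To make the idea of pairing naive Bayes with opcode sequences concrete, the following is a minimal sketch and not the authors' implementation. It assumes each PHP script has already been compiled to a space-separated opcode string (for example with an opcode dumper), uses opcode bigrams as features, and applies multinomial naive Bayes with Laplace smoothing. The class name `OpcodeNaiveBayes`, the bigram choice, the smoothing constant, and the four toy training sequences are all hypothetical illustrations.

```python
import math
from collections import Counter


def ngrams(opcodes, n=2):
    """Return the list of opcode n-grams (joined by spaces) for one script."""
    return [" ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


class OpcodeNaiveBayes:
    """Multinomial naive Bayes over opcode n-grams (illustrative sketch)."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n                    # n-gram length (assumption: bigrams)
        self.alpha = alpha            # Laplace smoothing constant (assumption)
        self.counts = {}              # label -> Counter of n-gram frequencies
        self.doc_counts = Counter()   # label -> number of training scripts
        self.vocab = set()            # all n-grams seen during training

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            feats = ngrams(seq.split(), self.n)
            self.counts.setdefault(label, Counter()).update(feats)
            self.doc_counts[label] += 1
            self.vocab.update(feats)
        return self

    def log_scores(self, seq):
        """Return the unnormalised log posterior of each class for one script."""
        # N-grams never seen in training are skipped.
        feats = [f for f in ngrams(seq.split(), self.n) if f in self.vocab]
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counter in self.counts.items():
            denom = sum(counter.values()) + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[label] / n_docs)  # class prior
            for f in feats:
                score += math.log((counter[f] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Hypothetical toy opcode sequences; real training data would come from
# compiling a labelled corpus of PHP scripts.
train_sequences = [
    "FETCH_R FETCH_DIM_R INCLUDE_OR_EVAL RETURN",
    "FETCH_R FETCH_DIM_R SEND_VAR DO_FCALL INCLUDE_OR_EVAL RETURN",
    "ASSIGN CONCAT ECHO RETURN",
    "INIT_FCALL SEND_VAL DO_FCALL ASSIGN ECHO RETURN",
]
train_labels = ["webshell", "webshell", "benign", "benign"]

model = OpcodeNaiveBayes(n=2).fit(train_sequences, train_labels)
print(model.predict("FETCH_R FETCH_DIM_R INCLUDE_OR_EVAL RETURN"))
print(model.predict("ASSIGN CONCAT ECHO RETURN"))
```

Working on opcodes rather than source text is one plausible reason such a model could cope with obfuscation: renamed variables or encoded strings change the source but tend to leave the compiled instruction pattern similar. Whether that holds for a given corpus would need to be checked experimentally.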
