Towards a PHP webshell taxonomy using deobfuscation-assisted similarity analysis

The abundance of PHP-based Remote Access Trojans (or web shells) found in the wild has led malware researchers to develop systems capable of tracking and analysing these shells. In the past, such shells were classified adequately using signature matching, a process that can no longer cope with the sheer volume and variety of web-based malware in circulation. Although a large percentage of newly created web shells incorporate code derived from seminal shells such as c99 and r57, their authors disguise this by making extensive use of obfuscation techniques intended to frustrate attempts to dissect or reverse engineer the code. This paper presents an approach to shell classification and analysis, based on similarity to a body of known malware, in an attempt to create a comprehensive taxonomy of PHP-based web shells. To this end, several different measures of similarity were used in conjunction with clustering algorithms and visualisation techniques. An auxiliary component capable of syntactically deobfuscating PHP code is also described; it was employed to reverse idiomatic obfuscation constructs used by shell authors. This deobfuscation was found to increase the observed levels of similarity dramatically by exposing additional code for analysis.
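
The abstract names neither the obfuscation idioms that were reversed nor the similarity measures that were used, so the following Python sketch is only an illustration under stated assumptions. It assumes that one common idiom is a string literal wrapped in `eval(base64_decode('...'))` or `eval(gzinflate(base64_decode('...')))`, and it substitutes a generic sequence-matching ratio from the standard library for the paper's (unspecified) measures. The names `deobfuscate`, `similarity` and `_WRAPPER` are hypothetical and do not come from the paper.

```python
import base64
import re
import zlib
from difflib import SequenceMatcher

# Assumed idiom: eval(<chain of decoder calls>('<string literal>'));
_WRAPPER = re.compile(r"eval\s*\(\s*((?:\w+\s*\(\s*)+)'([^']*)'\s*\)+\s*;")

# Python equivalents of two PHP decoders (gzinflate expects raw DEFLATE data).
_DECODERS = {
    "base64_decode": base64.b64decode,
    "gzinflate": lambda data: zlib.decompress(data, -15),
}


def _unwrap_once(match):
    """Statically evaluate one eval(...) wrapper; leave it alone on failure."""
    names = re.findall(r"\w+", match.group(1))
    data = match.group(2).encode("latin-1")
    try:
        for name in reversed(names):  # the innermost call is applied first
            data = _DECODERS[name](data)
    except (KeyError, ValueError, zlib.error):
        return match.group(0)  # unknown decoder or malformed payload
    return data.decode("latin-1")


def deobfuscate(source, max_layers=10):
    """Peel nested wrappers until nothing changes or max_layers is reached."""
    for _ in range(max_layers):
        unwrapped = _WRAPPER.sub(_unwrap_once, source)
        if unwrapped == source:
            break
        source = unwrapped
    return source


def similarity(a, b):
    """Stand-in similarity measure in [0, 1]; not the measure used in the paper."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    known = "system($_GET['cmd']);"
    payload = base64.b64encode(known.encode("latin-1")).decode("ascii")
    sample = "eval(base64_decode('" + payload + "'));"
    print("before deobfuscation:", similarity(sample, known))
    print("after deobfuscation: ", similarity(deobfuscate(sample), known))
```

Running the script compares an obfuscated sample against a known fragment before and after unwrapping, which mirrors the abstract's observation that deobfuscation exposes additional code for analysis. The payload is decoded statically and never executed. A real pipeline would need to handle many more idioms than the two assumed here.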
