LSTM RNN: detecting exploit kits using redirection chain sequences

While consumers use the web to perform routine activities, they are under constant threat of attack from malicious websites. Even when visiting ‘trusted’ sites, there is always a risk that the site has been compromised and is hosting a malicious script. In this scenario, the injected script would typically force the victim’s browser through a series of redirects before it reaches an attacker-controlled domain, which delivers the actual malware. Although these malicious redirection chains aim to frustrate detection and analysis efforts, they can also be used to help identify web-based attacks. Building upon previous work, this paper presents the first known application of a Long Short-Term Memory (LSTM) network to detect Exploit Kit (EK) traffic, utilising the structure of HTTP redirects. Samples are processed as sequences, where each timestep represents a redirect and contains a unique combination of 48 features. The experiment is conducted using a ground-truth dataset of 1279 EK and 5910 benign redirection chains. Hyper-parameters are tuned via 5-fold cross-validation (5f-CV), with the optimal configuration achieving an F1 score of 0.9878 against the unseen test set. Furthermore, we compare the results of isolated feature categories to assess their importance.
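As a rough illustration of the data layout described above, the following Python sketch pads variable-length redirection chains to a fixed number of timesteps with 48 features each, and computes an F1 score from binary labels. It is a minimal sketch under stated assumptions, not the paper's implementation: the zero-padding scheme, the `max_len` parameter, the mask representation, and the function names `pad_chains` and `f1_score` are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

NUM_FEATURES = 48  # features per redirect, as stated in the abstract

# One chain is a list of per-redirect feature vectors (one vector per timestep).
Chain = List[List[float]]


def pad_chains(chains: Sequence[Chain], max_len: int) -> Tuple[List[Chain], List[List[int]]]:
    """Pad or truncate each chain to max_len timesteps (assumed scheme).

    Returns the padded chains plus a 0/1 mask marking real timesteps,
    so a sequence model can ignore the zero-filled padding.
    """
    padded: List[Chain] = []
    masks: List[List[int]] = []
    for chain in chains:
        for step in chain:
            if len(step) != NUM_FEATURES:
                raise ValueError("each redirect must have 48 features")
        kept = [list(step) for step in chain[:max_len]]
        n_pad = max_len - len(kept)
        padded.append(kept + [[0.0] * NUM_FEATURES for _ in range(n_pad)])
        masks.append([1] * len(kept) + [0] * n_pad)
    return padded, masks


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (1 = EK chain, 0 = benign chain)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

The padded output has shape (samples, `max_len`, 48), which is the usual input layout for an LSTM layer. Because benign chains outnumber EK chains roughly five to one in the dataset, F1 on the EK class is more informative than plain accuracy.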
