Phishcasting: Deep Learning for Time Series Forecasting of Phishing Attacks

Phishing attacks remain pervasive and continue to be a source of significant monetary loss, identity theft, and malware. One challenge is that in most organizational settings, the detection paradigm is inherently reactive: threats are identified and responded to in real time, as they unfold. To complement these efforts with greater foresight, we introduce the idea of phishcasting: forecasting phishing threat levels weeks or months into the future. Because phishing attack volume time series are noisy and lack traditional seasonal and cyclical trends, we extend the time series forecasting framework to utilize multiple time series, auxiliary information, and alternate representations. We also introduce CoT-Net, a flexible, end-to-end CNN-LSTM-based deep learning method for forecasting complex phishing attack volume time series. CoT-Net uses time series embeddings to uncover correlations between organizational attack patterns within and across industry sectors. Using a publicly available test bed featuring multiple organizations’ attack volumes over time, we find that CoT-Net outperforms most state-of-the-art time series forecasting methods. By showing that phishcasting might be possible and practical, our work has important implications for proactive cybersecurity.
