Predicting Cyber Events by Leveraging Hacker Sentiment

Recent high-profile cyber attacks exemplify why organizations need better cyber defenses. Cyber threats are hard to predict accurately because attackers usually try to cover their tracks. However, attackers often discuss exploits and techniques on hacking forums, so the community behavior of hackers may provide insight into groups' collective malicious activity. We propose a novel approach to predicting cyber events using sentiment analysis. We test our approach on cyber attack data from two major business organizations. We consider three types of events: malicious software installation, malicious destination visits, and malicious emails that bypassed the target organizations' defenses. We construct predictive signals by applying sentiment analysis to hacker forum posts to better understand hacker behavior. We analyze over 400K posts generated between January 2016 and January 2018 on over 100 hacking forums on both the surface web and the Dark Web. We find that some forums have significantly more predictive power than others. Sentiment-based models that leverage specific forums can outperform state-of-the-art deep learning and time-series models at forecasting cyber attacks weeks ahead of the events.
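To make the idea of a sentiment-based predictive signal concrete, the following is a minimal sketch rather than the method used in the paper. It assumes a toy word lexicon in place of a real sentiment tool, weekly aggregation of post scores for a single forum, and a plain Pearson correlation between the sentiment in one week and the event count a fixed number of weeks later. All names, lexicon entries, and data shapes here are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

# Assumption: a toy lexicon standing in for a real sentiment analyzer.
# The words and weights are invented for illustration only.
LEXICON = {"great": 1.0, "works": 0.5, "easy": 0.5,
           "broken": -1.0, "patched": -0.5, "useless": -1.0}


def post_sentiment(text):
    """Mean lexicon score of the matching words in a post (0.0 if none match)."""
    scores = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def week_start(d):
    """Monday of the week containing date d."""
    return d - timedelta(days=d.weekday())


def weekly_signal(posts):
    """posts: iterable of (date, text) for one forum.
    Returns {week_start: mean post sentiment in that week}."""
    buckets = defaultdict(list)
    for d, text in posts:
        buckets[week_start(d)].append(post_sentiment(text))
    return {w: sum(v) / len(v) for w, v in buckets.items()}


def lagged_pairs(signal, events, lag_weeks):
    """Pair sentiment in week t with the event count in week t + lag_weeks.
    events: {week_start: number of observed attacks}."""
    pairs = []
    for w, s in sorted(signal.items()):
        target = w + timedelta(weeks=lag_weeks)
        if target in events:
            pairs.append((s, events[target]))
    return pairs


def pearson(pairs):
    """Pearson correlation of (x, y) pairs; 0.0 if it is undefined."""
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5
```

Computing `pearson(lagged_pairs(weekly_signal(posts), events, lag_weeks))` separately for each forum and each lag gives a rough way to compare which forums carry a leading signal. A real study would replace the toy lexicon with an established sentiment tool and the correlation with a proper forecasting model.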
