TIMiner: Automatically extracting and analyzing categorized cyber threat intelligence from social data

Abstract: Security organizations increasingly rely on Cyber Threat Intelligence (CTI) sharing to enhance resilience against cyber threats. However, the effectiveness of CTI sharing remains questionable because of two major limitations: first, existing approaches fail to identify unseen types of indicators of compromise (IOCs); second, they cannot automatically generate categorized CTIs with domain tags (e.g., finance, government), which makes CTI sharing ineffective. To address these challenges, this paper proposes TIMiner, a novel automated framework for CTI extraction and sharing based on social media data. First, an efficient domain recognizer based on a convolutional neural network is implemented to identify the domain that a CTI targets. Then, an IOC extraction approach based on word embedding and syntactic dependency is proposed, which makes it possible to identify unseen types of IOCs. Finally, each extracted IOC and its domain tag are integrated to generate a categorized, domain-specific CTI. With the categorized CTIs, a Threat-Index is presented to quantify the severity of the threats toward different domains. Experimental results confirm that the proposed CTI domain recognizer and IOC extraction achieve superior performance, with accuracy exceeding 84% and 94%, respectively. Moreover, TIMiner offers new insights into the evolution of cyber attacks across multiple domains.
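The three stages described above (domain recognition, IOC extraction, integration into a categorized CTI) can be pictured with a minimal Python sketch. It is not the TIMiner implementation: a keyword lookup stands in for the convolutional domain recognizer, and two regular expressions for well-known IOC types stand in for the word-embedding and syntactic-dependency extractor, so it cannot find unseen IOC types. The names `DOMAIN_KEYWORDS`, `IOC_PATTERNS`, `CategorizedCTI`, `recognize_domain`, `extract_iocs`, and `build_ctis`, as well as the sample post, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Assumption: a tiny keyword lookup replaces the paper's CNN domain recognizer.
DOMAIN_KEYWORDS = {
    "finance": {"bank", "payment", "swift"},
    "government": {"ministry", "embassy", "agency"},
}

# Assumption: regexes for two known IOC types replace the paper's
# word-embedding / syntactic-dependency extractor.
IOC_PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
}


@dataclass(frozen=True)
class CategorizedCTI:
    """One IOC paired with the domain tag of the post it came from."""
    domain: str
    ioc_type: str
    ioc_value: str


def recognize_domain(text):
    """Return the domain with the most keyword hits, or 'unknown'."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {d: len(words & kws) for d, kws in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract_iocs(text):
    """Return (ioc_type, value) pairs for every pattern match in the text."""
    found = []
    for ioc_type, pattern in IOC_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((ioc_type, match.group()))
    return found


def build_ctis(text):
    """Attach the post's domain tag to each extracted IOC."""
    domain = recognize_domain(text)
    return [CategorizedCTI(domain, t, v) for t, v in extract_iocs(text)]


post = "Phishing campaign hits a bank payment portal, C2 at 203.0.113.7"
for cti in build_ctis(post):
    print(cti)
```

The point of the sketch is the shape of the output record: each IOC carries the domain tag of its source text, which is what allows per-domain aggregation of the kind the Threat-Index performs.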