Crowdsourcing Cybersecurity: Cyber Attack Detection using Social Media

Social media is often viewed as a sensor for societal events such as disease outbreaks, protests, and elections. We describe the use of social media as a crowdsourced sensor to gain insight into ongoing cyber-attacks. Our approach detects a broad range of cyber-attacks (e.g., distributed denial of service (DDoS) attacks, data breaches, and account hijacking) in a weakly supervised manner: it starts from only a small set of seed event triggers and requires no training or labeled samples. A new query expansion strategy based on convolution kernels and dependency parses helps model semantic structure and aids in identifying key event characteristics. Through a large-scale analysis over Twitter, we demonstrate that our approach consistently identifies and encodes events, outperforming existing methods.
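To make the query expansion idea concrete, the following is a minimal sketch, not the method as published: the abstract does not specify the kernel, so the sketch assumes a simple n-gram (spectrum-style) convolution kernel over dependency paths. The paths are written by hand rather than produced by a parser, and the names `path_ngrams`, `kernel`, `normalized_kernel`, and `expand_query`, as well as the threshold value, are hypothetical.

```python
from collections import Counter
from math import sqrt

def path_ngrams(path, max_n=3):
    """Count every contiguous n-gram (1 <= n <= max_n) in a dependency path."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(path) - n + 1):
            counts[tuple(path[i:i + n])] += 1
    return counts

def kernel(path_a, path_b, max_n=3):
    """Convolution-style kernel: dot product of the two n-gram count vectors."""
    a = path_ngrams(path_a, max_n)
    b = path_ngrams(path_b, max_n)
    return sum(a[g] * b[g] for g in a if g in b)

def normalized_kernel(path_a, path_b, max_n=3):
    """Scale the kernel to [0, 1] so paths of different lengths are comparable."""
    denom = sqrt(kernel(path_a, path_a, max_n) * kernel(path_b, path_b, max_n))
    return kernel(path_a, path_b, max_n) / denom if denom else 0.0

def expand_query(seed_paths, candidate_paths, threshold=0.4):
    """Keep candidates whose best similarity to any seed path meets the threshold."""
    expanded = []
    for cand in candidate_paths:
        score = max(normalized_kernel(seed, cand) for seed in seed_paths)
        if score >= threshold:
            expanded.append((cand, score))
    return sorted(expanded, key=lambda item: -item[1])

# Hand-written dependency paths (assumed; a real pipeline would use a parser).
seed = ["hit", "nmod", "attack", "compound", "ddos"]
candidates = [
    ["suffer", "dobj", "attack", "compound", "ddos"],  # shares the "ddos attack" structure
    ["love", "dobj", "pizza"],                          # unrelated
]

for path, score in expand_query([seed], candidates):
    print(" ".join(path), round(score, 2))
```

In this sketch a candidate is kept when it shares enough structural fragments with a seed trigger path, so "suffer ... ddos attack" is retained while the unrelated path is dropped. The threshold trades recall for precision and would need tuning on real data.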
