Vulnerability Disclosure in the Age of Social Media: Exploiting Twitter for Predicting Real-World Exploits

In recent years, the number of software vulnerabilities discovered has grown significantly. This creates a need for prioritizing the response to new disclosures by assessing which vulnerabilities are likely to be exploited and by quickly ruling out the vulnerabilities that are not actually exploited in the real world. We conduct a quantitative and qualitative exploration of the vulnerability-related information disseminated on Twitter. We then describe the design of a Twitter-based exploit detector, and we introduce a threat model specific to our problem. In addition to response prioritization, our detection techniques have applications in risk modeling for cyber-insurance, and they highlight the value of information provided by the victims of attacks.
