Beyond heuristics: learning to classify vulnerabilities and predict exploits

The security demands on modern system administration are enormous and getting worse. Chief among these demands, administrators must monitor the continual ongoing disclosure of software vulnerabilities that have the potential to compromise their systems in some way. Such vulnerabilities include buffer overflow errors, improperly validated inputs, and other unanticipated attack modalities. In 2008, over 7,400 new vulnerabilities were disclosed--well over 100 per week. While no enterprise is affected by all of these disclosures, administrators commonly face many outstanding vulnerabilities across the software systems they manage. Vulnerabilities can be addressed by patches, reconfigurations, and other workarounds; however, these actions may incur down-time or unforeseen side-effects. Thus, a key question for systems administrators is which vulnerabilities to prioritize. From publicly available databases that document past vulnerabilities, we show how to train classifiers that predict whether and how soon a vulnerability is likely to be exploited. As input, our classifiers operate on high dimensional feature vectors that we extract from the text fields, time stamps, cross references, and other entries in existing vulnerability disclosure reports. Compared to current industry-standard heuristics based on expert knowledge and static formulas, our classifiers predict much more accurately whether and how soon individual vulnerabilities are likely to be exploited.

[1]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[2]  David D. Lewis,et al.  Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval , 1998, ECML.

[3]  William A. Arbaugh,et al.  IEEE 52 Computer , 1985 .

[4]  David Moore,et al.  Code-Red: a case study on the spread and victims of an internet worm , 2002, IMW '02.

[5]  Eric Rescorla Security Holes . . . Who Cares? , 2003, USENIX Security Symposium.

[6]  Andy Ozment,et al.  The Likelihood of Vulnerability Rediscovery and the Social Utility of Vulnerability Hunting , 2005, WEIS.

[7]  Dmitri Nizovtsev,et al.  Economic Analysis of Incentives to Disclose Software Vulnerabilities , 2005, WEIS.

[8]  Rahul Telang,et al.  Does information security attack frequency increase with vulnerability disclosure? An empirical analysis , 2006, Inf. Syst. Frontiers.

[9]  Steve W. Manzuik,et al.  Windows of Vulnerability , 2006 .

[10]  Steven M. Bellovin On the Brittleness of Software and the Infeasibility of Security Metrics , 2006, IEEE Security & Privacy Magazine.

[11]  Karen A. Scarfone,et al.  A Complete Guide to the Common Vulnerability Scoring System Version 2.0 | NIST , 2007 .

[12]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[13]  Hao Xu,et al.  Optimal Policy for Software Vulnerability Disclosure , 2008, Manag. Sci..

[14]  Manuel Suter,et al.  The Forum of Incident Response and Security Teams (FIRST) , 2008 .

[15]  Bernhard Plattner,et al.  Modelling the Security Ecosystem- The Dynamics of (In)Security , 2009, WEIS.