BP: Profiling Vulnerabilities on the Attack Surface

Security practitioners use the attack surface of software systems to prioritize which areas of a system to test and analyze. To date, approaches for predicting which code artifacts are vulnerable have relied on a binary classification of code as vulnerable or not vulnerable. To better understand the strengths and weaknesses of vulnerability prediction approaches, vulnerability datasets that include both classification and severity data are needed. The goal of this paper is to help researchers and practitioners make security effort prioritization decisions by evaluating which classifications and severities of vulnerabilities appear on an attack surface approximated using crash dump stack traces. In this work, we use crash dump stack traces to approximate the attack surface of Mozilla Firefox. We then generate a dataset of 271 vulnerable files in Firefox, classified using the Common Weakness Enumeration (CWE) system, and use these files as an oracle for evaluating the attack surface generated from crash data. In the Firefox vulnerability dataset, 14 distinct classifications of vulnerabilities appeared at least once. In our study, 85.3% of vulnerable files were on the attack surface generated from crash data, and we found no difference in severity between vulnerabilities on that attack surface and vulnerabilities not on it. Additionally, we discuss lessons learned during the development of this vulnerability dataset.
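The core measurement the abstract describes — treating the set of files that appear in crash dump stack traces as an approximated attack surface, then checking what fraction of known-vulnerable files fall on it — can be sketched in a few lines. The frame format, regex, and file names below are illustrative assumptions, not the study's actual data or tooling:

```python
# Hypothetical sketch: approximate an attack surface from crash stack traces
# and measure how many known-vulnerable files it covers. Frame strings and
# file paths are invented for illustration.
import re

# Assumed frame format: "Namespace::Function (path/to/File.cpp:123)"
FRAME_RE = re.compile(r"\(([^():]+\.(?:c|cc|cpp|h)):\d+\)")

def files_on_attack_surface(stack_traces):
    """Collect the set of source files appearing in any crash stack trace."""
    surface = set()
    for trace in stack_traces:
        for frame in trace:
            m = FRAME_RE.search(frame)
            if m:
                surface.add(m.group(1))
    return surface

def surface_coverage(vulnerable_files, surface):
    """Fraction of known-vulnerable files that lie on the crash-derived surface."""
    if not vulnerable_files:
        return 0.0
    return len(set(vulnerable_files) & surface) / len(set(vulnerable_files))

# Toy example (names are illustrative, not from the study's dataset):
traces = [
    ["nsHtml5Parser::Parse (parser/html/nsHtml5Parser.cpp:88)",
     "mozilla::dom::Element::SetAttr (dom/base/Element.cpp:412)"],
    ["js::RunScript (js/src/vm/Interpreter.cpp:300)"],
]
vuln = ["dom/base/Element.cpp", "js/src/vm/Interpreter.cpp",
        "netwerk/http/Http.cpp"]
surface = files_on_attack_surface(traces)
print(round(surface_coverage(vuln, surface), 3))  # 2 of 3 vulnerable files on surface
```

The coverage number here plays the same role as the paper's 85.3% figure: the share of oracle (known-vulnerable) files that the crash-derived surface would have flagged for prioritized testing.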
