Better together: Comparing vulnerability prediction models

Abstract

Context: Vulnerability Prediction Models (VPMs) are an approach for prioritizing security inspection and testing to find and fix vulnerabilities. VPMs have been built from a variety of metrics and approaches, yet they have not seen widespread adoption in practice. Knowing which VPMs offer strong predictive performance and which have low data requirements and resource usage would help practitioners match VPMs to their project's needs. The low density of vulnerabilities relative to defects is a further obstacle to practical VPMs.

Objective: The goal of this paper is to help security practitioners and researchers choose appropriate features for vulnerability prediction through a comparison of Vulnerability Prediction Models.

Method: We replicated VPMs on Mozilla Firefox, a codebase of 28,750 source code files containing 271 vulnerabilities, using software metrics, text mining, and crash data as features. We then combined the features from each VPM and reran our classifiers.

Results: Combining features from the three types of VPMs and using Naive Bayes as the classifier improved the F-score of the best VPM from 0.20 to 0.28. The strongest features in the combined model were the number of times a file was involved in a crash, the number of outgoing calls from a file, and the string "nullptr".

Conclusion: Our results indicate that further work is needed to develop new features as classifier inputs. In addition, given the low density of vulnerabilities in software (less than 1% for our dataset), new analytic approaches are needed for VPMs to be useful in practical situations.
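The combined-features experiment described in the Method and Results can be illustrated with a minimal sketch. It assumes scikit-learn as the tooling and uses synthetic per-file data with illustrative feature names (outgoing_calls, crash_count, nullptr_count); the actual study extracted software metrics, text-mining term counts, and crash data from Mozilla Firefox.

```python
# Hypothetical sketch of the "combined features" VPM experiment.
# Data and feature names are illustrative, not the study's real data.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_files = 2000                      # stand-in for the 28,750 Firefox files

# Three illustrative feature groups, one column per group:
# a software metric (outgoing calls), crash data (crash count),
# and a text-mining term count (occurrences of "nullptr").
outgoing_calls = rng.poisson(5, n_files)
crash_count    = rng.poisson(1, n_files)
nullptr_count  = rng.poisson(2, n_files)

# Labels with low vulnerability density (about 1% positive class),
# mirroring the class imbalance noted in the abstract.
y = (rng.random(n_files) < 0.01).astype(int)

# Combine the feature groups into a single matrix, as the study does
# when merging features from the three VPM types.
X = np.column_stack([outgoing_calls, crash_count, nullptr_count])

# Naive Bayes classifier, evaluated with a cross-validated F-score.
clf = GaussianNB()
pred = cross_val_predict(clf, X, y, cv=5)
print("F-score:", round(f1_score(y, pred, zero_division=0), 2))
```

With roughly 1% of files labeled vulnerable, accuracy is uninformative, which is why the F-score is the evaluation measure used here and in the abstract.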
