The effect of Bellwether analysis on software vulnerability severity prediction models

Vulnerability severity prediction (VSP) models provide useful insight for vulnerability prioritization and software maintenance. Previous studies have proposed a variety of machine learning algorithms for VSP. However, to the best of our knowledge, no existing study has investigated how a subset of features can be used to improve VSP. To address this gap, this paper presents a general framework for VSP using Bellwether analysis (i.e., exemplary data). First, we apply natural language processing techniques to the textual descriptions of software vulnerabilities. Next, we develop an algorithm, termed Bellvul, to identify and select an exemplary subset of data (referred to as the Bellwether) to serve as the training set, with the aim of improving prediction accuracy over the growing portfolio, within-project cases, and the k-fold cross-validation subset. Finally, we assess the performance of four machine learning algorithms, namely deep neural network, logistic regression, k-nearest neighbor, and random forest, using the sampled instances. The predictions of the proposed models and the benchmark techniques are evaluated using standard classification metrics such as precision, recall, and F-measure. The experimental results show that the Bellwether approach achieves F-measures ranging from 14.3% to 97.8%, an improvement over the benchmark techniques. In conclusion, the proposed approach is a promising research direction for assisting software engineers in predicting which vulnerability records demand the most attention prior to software release.
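For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how precision, recall, and F-measure can be computed for one severity class. The function name `precision_recall_f1`, the severity labels, and the toy predictions are illustrative assumptions and are not taken from the paper's data or implementation.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F-measure for one class label."""
    pairs = list(zip(y_true, y_pred))
    # True positives: the class was predicted and was correct.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: the class was predicted but the true label differs.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: the class was the true label but was not predicted.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical severity labels for six vulnerability records.
y_true = ["high", "high", "low", "medium", "high", "low"]
y_pred = ["high", "low", "low", "high", "high", "low"]

p, r, f = precision_recall_f1(y_true, y_pred, positive="high")
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

In a multi-class severity setting, these per-class values would typically be averaged across classes; which averaging scheme the study uses is not stated in the abstract.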
