Extracting rules for vulnerabilities detection with static metrics using machine learning

Software quality is a primary concern in software engineering, and vulnerabilities are one of the major threats to it. A vulnerability compromises the security of the software and thereby impairs its quality. In this paper, we report experimental research evaluating the utility of machine learning algorithms for detecting vulnerabilities. For this experiment, a set of software metrics was used, and machine learning extracted detection rules from them in an easily accessible form. We considered 32 supervised machine learning algorithms for the 3 most frequently occurring vulnerabilities in a software system, namely Lawofdemeter, BeanMemberShouldSerialize, and LocalVariablecouldBeFinal. With the J48 machine learning algorithm, an accuracy of up to 96% in vulnerability detection was achieved. The results are validated using tenfold cross-validation, and statistical measures such as the ROC curve, the Kappa statistic, recall, and precision are used to analyze them.
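To make the evaluation measures named above concrete, the following is a minimal sketch in Python of how accuracy, precision, recall, and Cohen's Kappa can be computed from a binary confusion matrix, together with a simple tenfold index split. The function names `binary_metrics` and `k_fold_indices` and all counts are hypothetical illustrations; they are not the paper's code or data, and the study itself reports using the J48 algorithm rather than this script.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and Cohen's Kappa
    from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Expected agreement by chance: product of the marginal
    # proportions for each class, summed over both classes.
    p_pos = ((tp + fp) / total) * ((tp + fn) / total)
    p_neg = ((fn + tn) / total) * ((fp + tn) / total)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "kappa": kappa,
    }


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross-validation.
    Sample i is placed in the test set of fold i % k (assumed scheme)."""
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration.
    print(binary_metrics(tp=45, fp=5, fn=5, tn=45))
    for train, test in k_fold_indices(20, k=10):
        print(len(train), len(test))
```

In a tenfold setup, each sample is used for testing exactly once and for training nine times, and the per-fold confusion matrices are typically summed before computing measures such as these.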
