LEOPARD: Identifying Vulnerable Code for Vulnerability Assessment Through Program Metrics

Identifying potentially vulnerable locations in a code base is a critical first step for effective vulnerability assessment: it helps security experts direct their time and effort to where it is needed most. Metric-based and pattern-based methods have been proposed for identifying vulnerable code. The former rely on machine learning and do not work well, either because of the severe imbalance between non-vulnerable and vulnerable code or because they lack features that characterize vulnerabilities. The latter require prior knowledge of known vulnerabilities and can identify only similar vulnerabilities, not new types. In this paper, we propose and implement a generic, lightweight and extensible framework, LEOPARD, to identify potentially vulnerable functions through program metrics. LEOPARD requires no prior knowledge of known vulnerabilities. It works in two steps, combining two sets of systematically derived metrics. First, it uses complexity metrics to group the functions in a target application into a set of bins. Then, it uses vulnerability metrics to rank the functions in each bin and identifies the top ones as potentially vulnerable. Our experimental results on 11 real-world projects demonstrate that LEOPARD covers 74.0% of vulnerable functions while identifying 20% of functions as vulnerable, and that it outperforms machine learning-based and static analysis-based techniques. We further propose three applications of LEOPARD for manual code review and fuzzing, through which we discovered 22 new bugs in real applications such as PHP, radare2 and FFmpeg, eight of which are new vulnerabilities.
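The two-step binning and ranking idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `FunctionMetrics` record, the single aggregate `complexity` and `vulnerability` scores, and the rule of keeping a fixed fraction of each bin are all assumptions made for illustration, since the abstract does not specify how the metrics are computed or how many functions are taken from each bin.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionMetrics:
    """Hypothetical per-function record; both scores are assumed to be precomputed."""
    name: str
    complexity: int      # assumed aggregate of the complexity metrics
    vulnerability: int   # assumed aggregate of the vulnerability metrics


def select_candidates(functions, fraction=0.2):
    """Bin functions by complexity score, rank each bin by vulnerability
    score, and keep the top `fraction` of every bin (an assumed selection rule)."""
    # Step 1: functions with the same complexity score share a bin.
    bins = defaultdict(list)
    for fn in functions:
        bins[fn.complexity].append(fn)

    # Step 2: rank inside each bin and keep the highest-scoring functions.
    selected = []
    for score in sorted(bins):
        ranked = sorted(bins[score], key=lambda fn: fn.vulnerability, reverse=True)
        keep = math.ceil(fraction * len(ranked))
        selected.extend(ranked[:keep])
    return selected
```

Because selection happens per bin, functions at every complexity level contribute candidates, so simple functions with high vulnerability scores are not crowded out by complex ones. For example, `select_candidates(funcs, fraction=0.5)` keeps the better-ranked half of each bin in `funcs`.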
