Delta-Bench: Differential Benchmark for Static Analysis Security Testing Tools

Background: Static analysis security testing (SAST) tools may be evaluated using synthetic micro-benchmarks or benchmarks based on real-world software. Aims: The aim of this study is to address the limitations of existing SAST tool benchmarks: lack of vulnerability realism, uncertain ground truth, and a large number of findings unrelated to the analyzed vulnerability. Method: We propose Delta-Bench, a novel approach for the automatic construction of benchmarks for SAST tools based on differencing vulnerable and fixed versions in Free and Open Source Software (FOSS) repositories. To test our approach, we ran 7 state-of-the-art SAST tools against 70 revisions of four major versions of Apache Tomcat, spanning 62 distinct Common Vulnerabilities and Exposures (CVE) fixes and vulnerable files totalling over 100K lines of code, as the source of ground-truth vulnerabilities. Results: Our experiment allows us to draw interesting conclusions (e.g., tools perform differently depending on the selected benchmark). Conclusions: Delta-Bench allows SAST tools to be automatically evaluated on real-world historical vulnerabilities, scoring only the findings that a tool produced for the analyzed vulnerability.
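
The differencing idea at the core of the Method can be illustrated with a short sketch. The snippet below is a minimal, assumed reconstruction rather than the authors' actual tooling: it treats the lines that a CVE fix removed or replaced in the vulnerable file version as ground truth, and counts only the SAST findings that fall in files touched by that fix. The function names `ground_truth_lines` and `score_findings` are illustrative, not part of Delta-Bench.

```python
# Sketch of benchmark construction by differencing vulnerable and fixed
# versions (assumed reconstruction, not the authors' implementation).
import difflib

def ground_truth_lines(vulnerable_src: str, fixed_src: str) -> set[int]:
    """Return 1-based line numbers in the vulnerable version that the
    fix removed or replaced (treated as ground-truth vulnerable lines)."""
    vuln_lines = vulnerable_src.splitlines()
    fixed_lines = fixed_src.splitlines()
    matcher = difflib.SequenceMatcher(None, vuln_lines, fixed_lines)
    gt: set[int] = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):   # lines touched by the fix
            gt.update(range(i1 + 1, i2 + 1))
    return gt

def score_findings(findings: list[tuple[str, int]],
                   ground_truth: dict[str, set[int]]) -> tuple[int, int]:
    """Count true/false positives for (file, line) SAST findings,
    ignoring findings in files not changed by the analyzed CVE fix."""
    tp = fp = 0
    for path, line in findings:
        if path not in ground_truth:
            continue  # finding unrelated to the analyzed vulnerability
        if line in ground_truth[path]:
            tp += 1
        else:
            fp += 1
    return tp, fp
```

Restricting scoring to the files and lines changed by the fix is what keeps the evaluation focused on the analyzed vulnerability instead of the full set of tool findings.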
