A Machine Learning Approach to Classify Security Patches into Vulnerability Types

With the increasing use of open source software (OSS) in both free and proprietary applications, vulnerabilities embedded in OSS also propagate to the applications built on it. It is critical to find the security patches that fix these vulnerabilities, especially those most important for reducing security risk. Unfortunately, given a security patch, there is currently no way to automatically recognize the type of vulnerability it fixes. In this paper, we first conduct an empirical study of security patches by type (i.e., the type of the vulnerability each patch fixes), using a large-scale dataset collected from the National Vulnerability Database (NVD). Based on the analysis results, we develop a machine learning-based system to help identify the vulnerability type of a given security patch. The evaluation results show that our system achieves good performance.
