SQVDT: A scalable quantitative vulnerability detection technique for source code security assessment

Vulnerability detection and exploitation are becoming an important part of security, especially in malware code delivery, system hacking, patch creation, source code improvement, and software updates. Vulnerabilities in applications, including browsers, media players, online services, and document readers, are often exploited and cause serious damage. In this article, we propose a technique to detect vulnerabilities in software, as well as in shared libraries, at the source code level. We crawled vulnerable source code by tracing and locating patch files from different web sources according to their CVE numbers and built a fingerprint index of 2931 vulnerable files. We then developed a vulnerability detection approach based on code clone detection and detected hundreds of vulnerabilities in thousands of GitHub open source projects that had not previously been reported as vulnerable. We detected vulnerabilities in well‐known, recently released software, including the latest version of Linux, HTC‐kernel, and FindX‐8.1‐kernel, and in 7 TB of C/C++ source code (152,823 open source projects). In this study, we discuss some of the detected vulnerabilities with very high severity levels (CVSS). Furthermore, we performed an empirical evaluation and verification of these vulnerabilities, including intraproject clone vulnerabilities, copied‐kernel clone vulnerabilities, and library‐used clone vulnerabilities. Our technique is fast, efficient, reliable, practical, and scalable, and can be implemented at an industrial level. A comparison with state‐of‐the‐art tools shows the effectiveness of our approach.
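The abstract does not spell out how the fingerprint index is built or queried, so the following is only a minimal sketch of the general idea of file-level clone matching against an index of known-vulnerable files, not the SQVDT implementation. The normalization rules (stripping comments and whitespace, lowercasing), the SHA-256 hash, the function names `normalize`, `fingerprint`, `build_index`, and `scan`, and the placeholder identifier `CVE-XXXX-0001` are all assumptions made for illustration.

```python
import hashlib
import re


def normalize(source: str) -> str:
    """Strip C/C++ comments and all whitespace, then lowercase.

    Assumption: this naive regex-based stripping ignores string literals
    that happen to contain comment markers; a real tool would use a lexer.
    """
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)  # block comments
    source = re.sub(r"//[^\n]*", "", source)                    # line comments
    return re.sub(r"\s+", "", source).lower()


def fingerprint(source: str) -> str:
    """Hash the normalized source so lookups are constant-time."""
    return hashlib.sha256(normalize(source).encode("utf-8")).hexdigest()


def build_index(vulnerable_files: dict) -> dict:
    """Map fingerprint -> vulnerability id for each known-vulnerable file."""
    return {fingerprint(src): vuln_id for vuln_id, src in vulnerable_files.items()}


def scan(index: dict, project_files: dict) -> list:
    """Return (path, vulnerability id) pairs for files whose fingerprint is indexed."""
    hits = []
    for path, src in project_files.items():
        vuln_id = index.get(fingerprint(src))
        if vuln_id is not None:
            hits.append((path, vuln_id))
    return hits


# Hypothetical data: one vulnerable file and a small project to scan.
VULNERABLE = {
    "CVE-XXXX-0001": "int f(char *s) { char b[8]; strcpy(b, s); return 0; }",
}
PROJECT = {
    # Same code with different layout and a comment: still a clone.
    "src/clone.c": "int f(char *s) {\n  char b[8];\n  strcpy(b, s); // copy\n  return 0;\n}",
    # Patched variant: must not match.
    "src/patched.c": "int f(char *s) { char b[8]; strncpy(b, s, 7); b[7] = 0; return 0; }",
}

INDEX = build_index(VULNERABLE)
HITS = scan(INDEX, PROJECT)
```

Hashing a normalized form makes each lookup independent of the size of the index, which is one common way such approaches scale to large corpora. The trade-off is that only exact clones and clones differing in layout or comments are caught, and renamed identifiers or edited statements would need a finer-grained or similarity-based fingerprint.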
