Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning?

Software vulnerabilities impose significant costs on enterprises. Despite extensive efforts in research and development of software vulnerability detection methods, uncaught vulnerabilities continue to put software owners and users at risk. Many current vulnerability detection methods require that code snippets compile and build before detection can be attempted. This, unfortunately, introduces a long latency between the time a vulnerability is introduced and the time it is removed, which can substantially increase the cost of fixing a vulnerability. We recognize that current advances in machine learning can be used to detect vulnerable code patterns in syntactically incomplete code snippets as the developer is writing the code at EditTime. In this paper we present a practical system that leverages deep learning on a large-scale data set of vulnerable code patterns to learn complex manifestations of more than 250 vulnerability types and detect vulnerable code patterns at EditTime. We discuss zero-shot, few-shot, and fine-tuning approaches on state-of-the-art pre-trained Large Language Models (LLMs). We show that our approach improves on state-of-the-art vulnerability detection models by 10%. We also evaluate our approach on detecting vulnerabilities in code auto-generated by code LLMs. Evaluation on a benchmark of high-risk code scenarios shows a vulnerability reduction of up to 90%.
