How Secure is Code Generated by ChatGPT?

In recent years, large language models have driven major advances in the field of artificial intelligence (AI). ChatGPT in particular, an AI chatbot developed and recently released by OpenAI, has taken the field to the next level. The conversational model is able not only to generate human-like text, but also to translate natural language into code. However, the security of programs generated by ChatGPT should not be overlooked. In this paper, we perform an experiment to address this issue. Specifically, we ask ChatGPT to generate a number of programs and evaluate the security of the resulting source code. We further investigate whether ChatGPT can be prodded to improve the security through appropriate prompts, and discuss the ethical aspects of using AI to generate code. Results suggest that ChatGPT is aware of potential vulnerabilities, but nonetheless often generates source code that is not robust to certain attacks.
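To illustrate the kind of weakness such a security evaluation typically flags, the C sketch below is a minimal, hypothetical example (not taken from the paper's own generated programs): it contrasts an unbounded strcpy() into a fixed-size buffer, a classic out-of-bounds write (CWE-787), with a bounded copy that guarantees NUL-termination.

/* Illustrative sketch only: the sort of memory-safety flaw a review of
 * generated C code commonly checks for. Not drawn from the paper. */
#include <stdio.h>
#include <string.h>

/* Unsafe: no bounds check; inputs longer than 15 bytes overflow `buf`. */
void greet_unsafe(const char *name) {
    char buf[16];
    strcpy(buf, name);               /* potential out-of-bounds write */
    printf("Hello, %s\n", buf);
}

/* Safer: copy at most sizeof(buf) - 1 bytes and terminate explicitly. */
void greet_safe(const char *name) {
    char buf[16];
    strncpy(buf, name, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    printf("Hello, %s\n", buf);
}

int main(int argc, char **argv) {
    const char *name = (argc > 1) ? argv[1] : "world";
    greet_safe(name);                /* greet_unsafe(name) would risk an overflow */
    return 0;
}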
