Can ChatGPT Pass An Introductory Level Functional Language Programming Course?

The recent introduction of ChatGPT has drawn significant attention from both industry and academia due to its impressive capabilities in solving a diverse range of tasks, including language translation, text summarization, and computer programming. Its ability to write, modify, and even correct code, together with its ease of use and access, is already dramatically impacting computer science education. This paper explores how well ChatGPT can perform in an introductory-level functional language programming course. In our systematic evaluation, we treated ChatGPT as one of our students and found that it can achieve a grade of B-, ranking 155th out of 314 students overall. Our comprehensive evaluation provides valuable insights into ChatGPT's impact from both student and instructor perspectives. Additionally, we identify several potential benefits that ChatGPT can offer to both groups. Overall, we believe that this study significantly clarifies and advances our understanding of ChatGPT's capabilities and potential impact on computer science education.
