Deep Reinforcement Learning for Programming Language Correction

Novice programmers often struggle with the formal syntax of programming languages. To assist them, we design a novel programming language correction framework amenable to reinforcement learning. The framework allows an agent to mimic human actions for text navigation and editing. We demonstrate that the agent can be trained through self-exploration directly from the raw input, that is, program text itself, without any knowledge of the formal syntax of the programming language. We leverage expert demonstrations for one tenth of the training data to accelerate training. The proposed technique is evaluated on 6975 erroneous C programs with typographic errors, written by students during an introductory programming course. Our technique fixes 14% more programs and 29% more compiler error messages relative to those fixed by a state-of-the-art tool, DeepFix, which uses a fully supervised neural machine translation approach.
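The idea of an agent that navigates and edits program text can be illustrated with a toy environment. The following is a minimal sketch under stated assumptions, not the framework from the paper: the class `EditEnv`, the action names, the reward (drop in error count), and the bracket-matching oracle `count_errors` that stands in for a real compiler are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# Closing bracket -> matching opening bracket.
PAIRS = {")": "(", "}": "{", "]": "["}


def count_errors(tokens: List[str]) -> int:
    """Toy error oracle (assumption): the number of unmatched brackets.

    A real system would query a compiler instead.
    """
    stack: List[str] = []
    errors = 0
    for tok in tokens:
        if tok in PAIRS.values():
            stack.append(tok)
        elif tok in PAIRS:
            if stack and stack[-1] == PAIRS[tok]:
                stack.pop()
            else:
                errors += 1
    return errors + len(stack)


class EditEnv:
    """Hypothetical editing environment: a token list plus a cursor."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = list(tokens)
        self.cursor = 0
        self.errors = count_errors(self.tokens)

    def step(self, action: str, token: Optional[str] = None) -> Tuple[float, bool]:
        """Apply one navigation or edit action; return (reward, done)."""
        if action == "right":
            self.cursor = min(self.cursor + 1, len(self.tokens))
        elif action == "left":
            self.cursor = max(self.cursor - 1, 0)
        elif action == "insert":
            self.tokens.insert(self.cursor, token)
        elif action == "delete":
            if self.cursor < len(self.tokens):
                del self.tokens[self.cursor]
        else:
            raise ValueError(f"unknown action: {action}")
        new_errors = count_errors(self.tokens)
        # Assumed reward: positive when an edit removes errors,
        # negative when it introduces them, zero for pure navigation.
        reward = float(self.errors - new_errors)
        self.errors = new_errors
        return reward, new_errors == 0


# Usage: a program missing its closing brace.
env = EditEnv(["int", "main", "(", ")", "{", "return", "0", ";"])
for _ in range(8):
    env.step("right")              # move the cursor to the end of the text
reward, done = env.step("insert", "}")
print(reward, done, env.tokens)
```

In this sketch, a learned policy would replace the hand-written action sequence in the usage lines, choosing each navigation or edit action from the current program text and cursor position.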
