Neural Program Synthesis with Priority Queue Training

We consider the task of program synthesis in the presence of a reward function over the output of programs, where the goal is to find programs with maximal rewards. We employ an iterative optimization scheme, where we train an RNN on a dataset of the K best programs from a priority queue of the programs generated so far. Then, we synthesize new programs by sampling from the RNN and add them to the priority queue. We benchmark our algorithm, called priority queue training (PQT), against genetic algorithm and reinforcement learning baselines on a simple but expressive Turing-complete programming language called BF. Our experimental results show that our simple PQT algorithm significantly outperforms the baselines. By adding a program length penalty to the reward function, we are able to synthesize short, human-readable programs.
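The loop described in the abstract (sample programs, keep the K best in a priority queue, refit the generator on those K, repeat) can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation. The RNN is swapped for a much simpler per-position categorical model refit by counting, and the interpreter settings (tape size, step limit), the `$` end-of-program token, the reward shape, and every function name and default value are assumptions made for this example.

```python
import heapq
import random

TOKENS = "><+-.,[]"  # the eight BF instructions


def run_bf(code, inp=(), max_steps=1000):
    """Minimal BF interpreter (assumed settings: 30-cell wrapping tape, byte cells).
    Returns the list of output values, or None on unmatched brackets or timeout."""
    stack, jump = [], {}
    for i, c in enumerate(code):
        if c == "[":
            stack.append(i)
        elif c == "]":
            if not stack:
                return None
            j = stack.pop()
            jump[i], jump[j] = j, i
    if stack:
        return None
    tape, ptr, pc, out, inp, steps = [0] * 30, 0, 0, [], list(inp), 0
    while pc < len(code):
        steps += 1
        if steps > max_steps:
            return None
        c = code[pc]
        if c == ">":
            ptr = (ptr + 1) % len(tape)
        elif c == "<":
            ptr = (ptr - 1) % len(tape)
        elif c == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ".":
            out.append(tape[ptr])
        elif c == ",":
            tape[ptr] = inp.pop(0) if inp else 0
        elif c == "[" and tape[ptr] == 0:
            pc = jump[pc]  # skip the loop body
        elif c == "]" and tape[ptr] != 0:
            pc = jump[pc]  # jump back to the loop start
        pc += 1
    return out


def reward(code, target, length_penalty=0.01):
    """Hypothetical reward: fraction of matching outputs minus a length penalty."""
    out = run_bf(code)
    if out is None:
        return -1.0
    hits = sum(a == b for a, b in zip(out, target))
    return hits / max(len(out), len(target), 1) - length_penalty * len(code)


def pqt(target, k=10, iters=50, batch=50, max_len=8, seed=0):
    """Toy priority-queue training with a per-position categorical model
    standing in for the RNN. Returns the best (reward, program) pair found."""
    rng = random.Random(seed)
    vocab = list(TOKENS) + ["$"]  # "$" is an assumed end-of-program token
    counts = [[1.0] * len(vocab) for _ in range(max_len)]
    queue, seen = [], set()  # min-heap of (reward, program); programs are unique
    for _ in range(iters):
        for _ in range(batch):
            prog = ""
            for pos in range(max_len):
                tok = rng.choices(vocab, weights=counts[pos])[0]
                if tok == "$":
                    break
                prog += tok
            if prog in seen:
                continue
            seen.add(prog)
            heapq.heappush(queue, (reward(prog, target), prog))
            if len(queue) > k:
                heapq.heappop(queue)  # evict the lowest-reward program
        # "Training" step: refit the model on the current top-K programs
        # (add-one smoothed counts play the role of maximum-likelihood updates).
        counts = [[1.0] * len(vocab) for _ in range(max_len)]
        for _, prog in queue:
            for pos, tok in enumerate((prog + "$")[:max_len]):
                counts[pos][vocab.index(tok)] += 1.0
    return max(queue)
```

Calling `pqt([3])` searches for a short program whose output is the single value 3. Because of the length penalty, a program such as `+++.` scores higher than a longer program with the same output, which is the mechanism the abstract credits for short, readable results.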

Quoc V. Le | Mohammad Norouzi | Daniel A. Abolafia
