Neural Attribute Machines for Program Generation

Recurrent neural networks (RNNs) have achieved remarkable success at generating sequences with complex structure, thanks to advances that include richer input embeddings and remedies for vanishing gradients. When trained only on sequences from a known grammar, however, they can still struggle to learn the rules and constraints of that grammar. Neural Attribute Machines (NAMs) are equipped with a logical machine that represents the underlying grammar and is used to teach its constraints to the neural machine by (i) augmenting the input sequence and (ii) optimizing a custom loss function. Unlike traditional RNNs, NAMs are exposed to the grammar itself as well as to samples from its language. During generation, NAMs commit significantly fewer violations of the grammar's constraints than RNNs trained only on samples from the language.
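
The two mechanisms named above, input augmentation and a constraint-aware loss, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the grammar (balanced parentheses), the depth counter standing in for the logical machine, the helper names (`allowed_tokens`, `augment`, `constrained_loss`), and the particular penalty, which adds the probability mass placed on grammar-violating tokens to the usual cross-entropy.

```python
import math

# Hypothetical toy vocabulary; the actual NAM vocabulary and grammar differ.
VOCAB = ["(", ")", "<eos>"]


def allowed_tokens(depth):
    """Toy logical machine: tokens that keep a balanced-paren string valid."""
    allowed = {"("}
    if depth > 0:
        allowed.add(")")      # can only close an open parenthesis
    else:
        allowed.add("<eos>")  # can only stop when everything is closed
    return allowed


def step(depth, token):
    """Update the machine state (nesting depth) after reading a token."""
    if token == "(":
        return depth + 1
    if token == ")":
        return depth - 1
    return depth


def augment(tokens):
    """Pair each token with the machine state observed before reading it."""
    depth, out = 0, []
    for tok in tokens:
        out.append((tok, depth))
        depth = step(depth, tok)
    return out


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def constrained_loss(logits, target, depth, weight=1.0):
    """Cross-entropy plus a penalty on probability given to illegal tokens.

    The penalty form is an assumption, not the loss used by the authors.
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[VOCAB.index(target)])
    allowed = allowed_tokens(depth)
    violation_mass = sum(p for tok, p in zip(VOCAB, probs) if tok not in allowed)
    return cross_entropy + weight * violation_mass
```

In this sketch, `augment(["(", ")", "<eos>"])` yields the token sequence annotated with nesting depth, which a model could consume alongside the token embedding, and `constrained_loss` grows whenever the predicted distribution puts weight on a token the toy machine rules out at the current depth.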
