Selecting Representative Examples for Program Synthesis

Program synthesis is a class of regression problems in which one seeks a solution, in the form of a source-code program, that maps the given inputs to their corresponding outputs exactly. Due to its precise and combinatorial nature, program synthesis is commonly formulated as a constraint satisfaction problem, where the input-output examples are encoded as constraints and handed to a constraint solver. A key challenge of this formulation is scalability: while constraint solvers work well with a few well-chosen examples, a large set of examples can incur significant overhead in both time and memory. We describe a method for discovering a subset of examples that is both small and representative: the subset is constructed iteratively, using a neural network to predict the probability of each unchosen example conditioned on the examples already chosen, and greedily adding the least probable example. We empirically evaluate the representativeness of the subsets constructed by our method and demonstrate that such subsets can significantly reduce synthesis time and improve its stability.
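The greedy construction described in the abstract can be sketched in a few lines. This is an illustrative sketch only, not the paper's implementation: `predict_prob` is a hypothetical placeholder for the learned model that scores the probability of an example conditioned on the currently chosen subset, and the budget `k` stands in for whatever stopping criterion the method uses.

```python
def select_representative(examples, predict_prob, k):
    """Greedily pick up to k examples, always adding the one the model
    finds least predictable given the examples chosen so far."""
    chosen, remaining = [], list(examples)
    while remaining and len(chosen) < k:
        # Score every unchosen example conditioned on the current subset.
        probs = [predict_prob(ex, chosen) for ex in remaining]
        # The least probable example is the most "surprising" one, i.e.
        # the one the current subset represents worst, so add it next.
        idx = min(range(len(remaining)), key=lambda i: probs[i])
        chosen.append(remaining.pop(idx))
    return chosen
```

The intuition behind the greedy choice is that an example the model already predicts with high probability adds little new information as a constraint, whereas a low-probability example pins down behavior the current subset fails to capture.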
